Site Reliability Engineer (Internal Engineering) (Remote)
Posted
3 days ago
Please let KnowBe4 know you found this job on RemoteYeah. This helps us grow 🌱.
Description:
The Internal Site Reliability Engineer (SRE) ensures the reliability, scalability, and performance of internal systems and infrastructure.
This role involves monitoring, automation, incident management, and maintaining self-hosted platforms to support smooth development operations.
The Internal SRE works closely with cross-functional teams to manage GitLab CI/CD workflows and cloud infrastructure on AWS.
The position emphasizes proactive problem-solving, automation, and collaboration to continuously improve system stability and efficiency.
Responsibilities include managing and maintaining GitLab environments to ensure high availability and security.
The SRE will design and implement CI/CD pipelines to automate software delivery.
Monitoring and troubleshooting system performance issues using observability tools like Prometheus, Grafana, or Datadog is required.
Collaboration with development teams to align infrastructure efforts with project needs and timelines is essential.
The role involves building and maintaining infrastructure as code (IaC) solutions using tools like Terraform and Ansible.
Managing AWS services, including ECS, S3, API Gateway, DynamoDB, RDS, IAM, and VPC, is part of the job.
Participation in incident response, including root cause analysis and post-incident reviews, is expected.
Automating manual tasks to improve operational efficiency and reduce technical debt is a key responsibility.
Requirements:
A Bachelor’s degree in Computer Science, Information Technology, or a related field is required.
Equivalent work experience in SRE, DevOps, or infrastructure management may substitute for formal education.
Experience managing and securing self-hosted GitLab environments is necessary.
Expertise in designing and maintaining automated pipelines for continuous delivery is required.
Strong knowledge of AWS services, including ECS, S3, API Gateway, DynamoDB, RDS, IAM, VPC, and Lambda, is essential.
Proficiency in Terraform, Ansible, or similar tools for Infrastructure-as-Code is required.
Experience with Prometheus, Grafana, Datadog, or other observability platforms is necessary.
Proficiency in Python, Bash, or other scripting languages to automate tasks is required.
The ability to lead incident response efforts and conduct root cause analysis is essential.
Strong interpersonal skills to work effectively across teams and with stakeholders are required.
Benefits:
KnowBe4 has been recognized for four consecutive years as a best place to work for women, for millennials, and in technology.
The company has been certified as a "Great Place To Work" in 8 countries.
Employees enjoy a welcoming workplace that encourages them to be themselves.
The company promotes continuous professional development and radical transparency.
There are opportunities for team engagement through activities like team lunches, trivia competitions, and local outings.
About the job
Posted on: November 20, 2024
Job type: Full-time
Salary: not listed
Location requirements: not listed
Position: Site Reliability Engineer
Experience level: Mid-level
Technology stack: AWS, GitLab, Lambda, Terraform, DynamoDB, CI/CD, Python, Bash, Ansible, Prometheus
KnowBe4