Please, let Checkmate know you found this job on RemoteYeah.
This helps us grow 🌱.
Description:
The Site Reliability Engineer will ensure the reliability and availability of production systems and services by monitoring, troubleshooting, and responding to incidents.
This role involves developing and maintaining tools and automation for system monitoring, alerting, and incident response to minimize manual intervention.
The engineer will collaborate with development teams to plan for capacity scaling and performance improvements based on usage patterns and growth forecasts.
Collaboration with development and product teams is essential to ensure that new features and services are designed with reliability in mind.
The engineer will also maintain documentation for operational processes, system configurations, and best practices.
Requirements:
A Bachelor's degree in computer science, information technology, or a related field (or equivalent work experience) is required.
Proven experience in software development and/or system administration is necessary.
Strong scripting and coding skills (e.g., Python, Go, Shell) for automation and tool development are essential.
Familiarity with containerization and orchestration technologies like Docker and Kubernetes is required.
Experience with cloud platforms (e.g., AWS, Azure, GCP) and infrastructure as code tools (e.g., Terraform) is necessary.
Proficiency in monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack) is required.
Knowledge of network, security, and database concepts is essential.
Strong problem-solving skills and the ability to work well under pressure are necessary.
An understanding of agile and DevOps methodologies is required.
Excellent communication and collaboration skills are essential for this role.
Availability during US working hours, until 3 pm ET, is essential for this role.
Candidates must have their own computer and work setup for remote work.
Benefits:
The position offers the opportunity to work in a dynamic and collaborative environment.
Employees will have the chance to develop and enhance their skills in a cutting-edge technology landscape.
The role provides flexibility with remote work options.
Employees will be part of a team that values innovation and reliability in production systems.