Remote Site Reliability Engineer (SRE)/DevOps Engineer
This job is closed
This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
Collaborate with development teams to design and implement automated deployment and testing pipelines.
Develop and maintain monitoring and alerting systems to proactively identify and address issues.
Troubleshoot and escalate production incidents to minimize downtime and improve system reliability.
Continuously improve our infrastructure and processes to optimize scalability and efficiency.
Participate in on-call rotations as needed to ensure 24/7 support for our platform.
Perform routine maintenance and upgrades as needed to keep our systems up to date.
Contribute to ongoing efforts to improve our security posture and compliance with industry standards.
Requirements:
Bachelor's degree in Computer Engineering, Computer Science, or related field.
3+ years of experience in a similar role, preferably in a high-traffic, high-availability environment.
Proficiency in at least one programming language (Python, Ruby, Java, Go, etc.).
Strong understanding of cloud infrastructure and related technologies (AWS, GCP, Azure, Kubernetes, Docker, etc.).
Excellent troubleshooting and problem-solving skills.
Experience with automation and configuration management tools (Chef, Ansible, Puppet, Terraform, etc.).
Familiarity with monitoring and alerting tools (Prometheus, Grafana, Nagios, etc.).
Benefits:
Opportunity to work in a fast-paced, high-availability environment.
Chance to collaborate with development teams on innovative projects.
Exposure to a variety of cloud technologies and automation tools.
On-call rotations to ensure 24/7 support for the platform.
Continuous learning and improvement of infrastructure and processes.
Contribution to enhancing security posture and compliance with industry standards.