Description:
Join Replit's Site Reliability Engineering team to ensure the reliability, scalability, and performance of infrastructure serving millions of developers.
Design and implement robust monitoring solutions, automate operational tasks, and improve infrastructure reliability and performance.
Requirements:
4-8 years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering, Infrastructure Engineering).
Strong programming skills in languages commonly used for automation (Python, Go, etc.).
Deep understanding of distributed systems and experience with container orchestration platforms (Kubernetes).
Proven track record of implementing monitoring and observability solutions, along with strong incident management skills.
Experience with infrastructure as code and configuration management tools.