Description:
Join Replit's Site Reliability Engineering (SRE) team to ensure the reliability, scalability, and performance of infrastructure serving millions of developers.
Proactively identify and resolve reliability issues, design observability solutions, lead incident response, and mentor the engineering team.
Requirements:
8-10 years of experience in Site Reliability Engineering or similar roles (DevOps, Systems Engineering).
Strong programming skills in Python or Go, with a deep understanding of distributed systems and container orchestration (Kubernetes).
Proven experience with monitoring and observability solutions, incident management, and infrastructure as code (Terraform, Pulumi).
Excellent communication and interpersonal skills, with a passion for making software creation accessible.
Benefits:
Competitive salary and equity; 401(k) with 4% match (US only); health, dental, vision, and life insurance.
Flexible time off, paid parental leave, wellness stipend, and an autonomous work environment.