Please let PlayOn know you found this job on RemoteYeah. This helps us get more companies to post jobs here for you.
Description:
Seeking an experienced Senior Site Reliability Engineer to enhance system reliability, performance, and scalability.
The role involves building tools, automation, and visibility for resilient software delivery, in collaboration with application engineers, DevOps, and QA teams.
Requirements:
Solid experience in Python and proficiency in at least one of Java, C++, or Go.
Strong understanding of Linux systems, cloud infrastructure (AWS, GCP, or Azure), and modern deployment practices (Docker, Kubernetes, Terraform).
Experience with CI/CD pipelines, version control, and automated testing frameworks.
Familiarity with observability tools (e.g., Prometheus, Grafana, ELK, Datadog) and log/metric analysis.
Proven ability to document Critical User Journeys and translate them into actionable SLAs and SLOs.
Strong collaboration skills and a problem-solving mindset.
Benefits:
Multiple medical insurance plans, dental, vision, life, and disability insurance.
Employee Emergency Fund and company equity (stock options).