Xsolla is seeking a Site Reliability Engineer to ensure high reliability and availability of systems while meeting SLAs and SLOs, as measured by SLIs.
The role involves monitoring systems for issues, responding to incidents, and driving incident resolution to minimize downtime.
Responsibilities include developing comprehensive monitoring solutions, supporting services before they go live, engaging in service capacity planning, and collaborating with development teams to enhance operational stability.
The position requires proven experience in a Site Reliability Engineer or similar role within a large-scale production environment, with a focus on IT operations or development.
The candidate should have proficiency in scripting languages such as Python and Bash, with a strong understanding of Go and PHP being a plus.
Familiarity with monitoring systems like Datadog, Prometheus, and Grafana, as well as experience with Docker, Kubernetes, and infrastructure automation tools like Terraform, is essential.
The role also requires excellent problem-solving skills, communication abilities, and experience with Linux-based infrastructures.
Requirements:
Candidates must have 5 to 10 years of proven experience as a Site Reliability Engineer or in a similar Software Engineering role in a large-scale production environment.
A strong background in IT, either as an Operations or Development professional, is required.
Proficiency in scripting languages such as Python and Bash is necessary, with a strong understanding of Go and PHP being advantageous.
Deep knowledge of monitoring systems such as Datadog, Prometheus, and Grafana is essential.
A good understanding of continuous integration/continuous delivery (CI/CD) processes and platforms, preferably GitLab, is required, along with experience with Helm.
Experience with containers and container orchestration systems, such as Docker and Kubernetes, is necessary.
Familiarity with infrastructure automation tools like Terraform is required.
Candidates should have experience with automation, system administration, and system hardening.
Experience with Linux-based infrastructures and Linux/Unix administration is essential.
Demonstrated problem-solving skills, particularly in debugging and troubleshooting complex software systems, are required.
Excellent communication skills with the ability to articulate and solve complex technical problems are necessary.
Familiarity with Xsolla's technology stack, including Ubuntu, Kubernetes, GitLab, Terraform, and Google Cloud Platform, is a plus.
Benefits:
Xsolla offers a comprehensive Benefits Program that prioritizes the physical, mental, and emotional well-being of employees and their families.
The benefits include 100% company-paid medical, dental, and vision plans.
Employees enjoy unlimited Flexible Time Off to promote work-life balance.
A personalized career roadmap is provided for each employee to support their professional development.
The company invests in training and educational opportunities to ensure team members thrive both personally and professionally.
Xsolla fosters a supportive environment that values creativity, collaboration, and the transformative power of play.