Remote Site Reliability Engineer

at Xsolla

Description:

  • Xsolla is seeking a Site Reliability Engineer to ensure high reliability and availability of systems while meeting SLAs and SLOs, as measured by SLIs.
  • The role involves monitoring systems for issues, responding to incidents, and driving incident resolution to minimize downtime.
  • Responsibilities include developing comprehensive monitoring solutions, supporting services before they go live, engaging in service capacity planning, and collaborating with development teams to enhance operational stability.
  • The position requires proven experience as a Site Reliability Engineer or in a similar role within a large-scale production environment, with a focus on IT operations or development.
  • The candidate should be proficient in scripting languages such as Python and Bash; a strong understanding of Go and PHP is a plus.
  • Familiarity with monitoring systems like Datadog, Prometheus, and Grafana, as well as experience with Docker, Kubernetes, and infrastructure automation tools like Terraform, is essential.
  • The role also requires excellent problem-solving skills, communication abilities, and experience with Linux-based infrastructures.

Requirements:

  • Candidates must have 5 to 10 years of proven experience as a Site Reliability Engineer or in a similar Software Engineering role in a large-scale production environment.
  • A strong background in IT, either as an Operations or Development professional, is required.
  • Proficiency in scripting languages such as Python and Bash is necessary; a strong understanding of Go and PHP is advantageous.
  • Deep knowledge of monitoring systems such as Datadog, Prometheus, and Grafana is essential.
  • A good understanding of continuous integration/continuous delivery processes and platforms, preferably GitLab, along with experience with Helm, is required.
  • Experience with Docker, Kubernetes, or other container orchestration systems is necessary.
  • Familiarity with infrastructure automation tools like Terraform is required.
  • Candidates should have experience with automation, system administration, and system hardening.
  • Experience with Linux-based infrastructures and Linux/Unix administration is essential.
  • Demonstrated problem-solving skills, particularly in debugging and troubleshooting complex software systems, are required.
  • Excellent communication skills with the ability to articulate and solve complex technical problems are necessary.
  • Familiarity with Xsolla's technology stack, including Ubuntu, Kubernetes, GitLab, Terraform, and Google Cloud Platform, is a plus.

Benefits:

  • Xsolla offers a comprehensive Benefits Program that prioritizes the physical, mental, and emotional well-being of employees and their families.
  • The benefits include 100% company-paid medical, dental, and vision plans.
  • Employees enjoy unlimited Flexible Time Off to promote work-life balance.
  • A personalized career roadmap is provided for each employee to support their professional development.
  • The company invests in training and educational opportunities to ensure team members thrive both personally and professionally.
  • Xsolla fosters a supportive environment that values creativity, collaboration, and the transformative power of play.