Remote Senior Site Reliability Engineer

Description:

  • Playson is a leading iGaming supplier founded in 2012, recognized worldwide for its high-end, microservice-based platform as a service.
  • The company processes billions of financial transactions per day and aims for zero latency in its cross-regional setup.
  • The Senior Site Reliability Engineer/DevOps will manage day-to-day alerts, system checks, and issue escalation as necessary.
  • The role includes providing 24x7 on-call support for critical SaaS events and documenting issues and remediation steps.
  • Responsibilities involve proactively creating monitors within the EKS/K8s ecosystem and deploying to EKS/K8s clusters using Terraform and Helm/Flux.
  • The engineer will improve infrastructure health by implementing checks and scripts that address known issues, and will maintain and develop deployment code.
  • The position requires implementing/integrating new technologies into the Cloud Infrastructure and collaborating with other teams for top-notch support.
  • Deployments and updates are planned with a customer focus to keep customer impact minimal.
  • Conducting root cause analysis (RCA) and taking corrective actions to prevent issue recurrence is essential.
  • The engineer will assign alert-related actions to the appropriate team after investigation and handle support requests for environment-specific actions.

Requirements:

  • Strong experience with issue processing, including RCA and postmortems, is required.
  • Proficiency in Kubernetes, including deployment, scaling, and troubleshooting, is necessary.
  • Familiarity with AWS, Terraform, Docker, and CI/CD practices is essential.
  • Experience with monitoring tools such as Datadog, Prometheus, and Grafana, and with logging solutions such as the ELK Stack (Elasticsearch, Logstash, and Kibana) or AWS CloudWatch, is required.
  • A strong understanding of networking concepts and protocols is needed.
  • Proficiency in at least one scripting language, such as Python, JavaScript (Node.js), or Go, is necessary.
  • Experience with GitOps-style configuration management tools like Flux CD or Argo CD is required.
  • Proficiency in Git or other version control systems is essential.
  • Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps is necessary.
  • The candidate should demonstrate ownership, proactiveness, persistence, and a passion for maintaining a high-traffic online platform.

Benefits:

  • Employees are eligible for quarterly bonuses based on transparent and systematic evaluation.
  • A flexible work schedule is offered to accommodate personal needs.
  • Remote work options are available for enhanced flexibility.
  • Comprehensive medical insurance is provided for employees and their significant others.
  • Financial support for life events is included as part of the benefits package.
  • Employees enjoy unlimited paid vacation and unlimited paid sick leave.
  • Reimbursement for professional development courses and training is available to support career growth.