Please, let Playson know you found this job
on RemoteYeah.
This helps us grow 🌱.
Description:
Playson is a leading iGaming supplier founded in 2012, recognized worldwide for providing a high-end micro-service-based platform as a service.
The company processes billions of financial transactions per day and aims for zero latency in its cross-regional setup.
The Senior Site Reliability Engineer/DevOps Engineer will manage day-to-day alerts and system checks, escalating issues as necessary.
The role includes providing 24x7 on-call support for critical SaaS events and documenting issues and remediation steps.
Responsibilities involve proactively creating monitors within the EKS/K8s ecosystem and deploying to EKS/K8s clusters using Terraform and Helm/Flux.
The engineer will enhance infrastructure health by implementing checks and scripts that address known issues, and will maintain and develop deployment code.
The position requires implementing/integrating new technologies into the Cloud Infrastructure and collaborating with other teams for top-notch support.
Deployments and updates are planned with a customer focus to ensure minimal impact.
Conducting root cause analysis (RCA) and taking corrective actions to prevent issue recurrence are essential.
The engineer will assign alert-related actions to the appropriate team after investigation and handle support requests for environment-specific actions.
Requirements:
Strong experience with issue processing, including RCA and Postmortems, is required.
Proficiency in Kubernetes, including deployment, scaling, and troubleshooting, is necessary.
Familiarity with AWS, Terraform, Docker, and CI/CD practices is essential.
Experience with monitoring tools such as Datadog, Prometheus, and Grafana, and with logging solutions such as the ELK Stack (Elasticsearch, Logstash, and Kibana) or AWS CloudWatch, is required.
A strong understanding of networking concepts and protocols is needed.
Proficiency in at least one scripting language, such as Python, NodeJS, or Go, is necessary.
Experience with GitOps continuous delivery tools like FluxCD or ArgoCD is required.
Proficiency in Git or other version control systems is essential.
Familiarity with incident response and management tools like PagerDuty, Opsgenie, or VictorOps is necessary.
The candidate should demonstrate ownership, proactiveness, persistence, and a passion for maintaining a high-traffic online platform.
Benefits:
Employees are eligible for quarterly bonuses based on transparent and systematic evaluation.
A flexible work schedule is offered to accommodate personal needs.
Remote work options are available for enhanced flexibility.
Comprehensive medical insurance is provided for employees and their significant others.
Financial support for life events is included as part of the benefits package.
Employees enjoy unlimited paid vacation and unlimited paid sick leave.
Reimbursement for professional development courses and training is available to support career growth.