Please, let na BHub know you found this job
on RemoteYeah.
This helps us grow 🌱.
Description:
As a key member of the SRE team, you will play a crucial role in providing infrastructure services to enhance platform resilience, ensuring availability and smooth operation without failures.
You will collaborate with the entire engineering team to specify availability, performance, reliability, and efficiency requirements for current and future versions of our product.
You will ensure the availability and resilience of our product's infrastructure.
You will support the team by designing and implementing orchestration solutions that keep our infrastructure scalable and meet the continuous growth needs of the business.
You will implement and manage monitoring systems to proactively identify issues and anomalies in our production environment.
You will build and maintain automations that reduce toil and improve the reliability of our systems as a whole.
You will proactively support continuous improvement within the SRE area.
Requirements:
You must have proven experience as a Site Reliability Engineer, Production Engineer, DevOps Engineer, Software Engineer, or a similar role in a high-scale environment.
You need strong proficiency in cloud infrastructure (AWS), including concepts of operating systems, distributed systems, storage, networking, and security.
You should have strong knowledge of the Linux operating system, including security, troubleshooting, metrics collection, and performance analysis.
You must have experience with the implementation, maintenance, and monitoring of environments.
You should have experience with containerization technologies and container orchestration platforms such as Kubernetes or equivalents.
You need proficiency in infrastructure as code (CDK, Terraform, Pulumi, etc.).
You must possess excellent problem-solving and debugging skills.
You should have strong communication and collaboration skills.
You need knowledge of best security practices in cloud environments.
You should have good knowledge of CI/CD and DevOps.
You must have knowledge of observability, including building alerts and monitoring.
Benefits:
You will receive a meal/food allowance (Flash Card).
You will have access to a health plan.
You will be provided with life insurance.
You will enjoy a day off during your birthday month.
You will have access to Zenklub for mental health support.
You will receive a TotalPass for fitness and wellness activities.