Remote Senior Site Reliability Engineer

Description:

  • ZayZoon is seeking a Senior Site Reliability Engineer to enhance its cloud infrastructure with complex AWS builds, infrastructure-as-code, and observability/logging/APM solutions.
  • The role involves working in an embedded reliability team alongside app and data engineers to monitor, benchmark, and scale ZayZoon’s products.
  • Responsibilities include developing and maintaining infrastructure-as-code CloudFormation templates, focusing on container and serverless resources such as ECS, Fargate, and Lambda.
  • The engineer will perform instrumentation and daily metrics analysis of infrastructure performance and Ruby on Rails applications using AWS tools and third-party observability platforms.
  • Managing deployment pipelines, including blue/green deployments and intelligent auto-scaling, is a key responsibility.
  • The role requires maintaining resource dependencies, particularly for databases, and planning for updates and downtime.
  • The engineer will project costs and implement AWS cost-saving programs, including Reserved Instances.
  • Collaboration with risk and security teams to ensure ongoing SOC-2 and cybersecurity compliance is essential.
  • Extensive collaboration with app developers on shared metrics, database performance, and load testing is expected.
  • The engineer will also work with data engineers to facilitate data warehouse development, ELT, and ETL processes.
  • Participation in the agile development process, including sprint planning, story grooming, and stand-ups, is required.
  • Adherence to SDLC and secure coding practices is mandatory.

Requirements:

  • Candidates must have 5+ years of infrastructure experience.
  • At least 2 years of AWS experience, including certification and experience deploying production applications, is required.
  • Proficiency with Infrastructure as Code (IaC), specifically CloudFormation, is necessary.
  • Experience with containerization technologies such as Docker, ECS, and ECR is essential.
  • Candidates should have experience analyzing and addressing performance issues using observability platforms and tooling such as Datadog, New Relic, and OpenTelemetry (OTel).
  • The ability to build quickly for experimentation and cleanly for core functionality is important.
  • Strong SQL and data analysis skills, along with a willingness to engage in data-driven problem-solving, are required.

Benefits:

  • ZayZoon offers a fully remote work environment across Canada and the US.
  • The company is committed to reviewing every application and providing timely feedback to candidates.
  • Employees can expect a supportive hiring process, ensuring they are kept informed throughout.