Remote Staff Site Reliability Engineer

This job is closed

This job post is closed and the position is probably filled. Please do not apply. The post was closed automatically after the apply link was detected as broken.

Description:

  • Varo's SRE team is well established and designs, builds, and runs large-scale, distributed, fault-tolerant systems that power most of Varo's operations.
  • The team focuses on AWS and Kubernetes, maintaining an open-source-first, results-oriented mindset.
  • The SRE team is focused on automation and observability, striving to automate manual tasks and promote a data-driven approach to scaling the platform.
  • Daily activities include scaling production infrastructure, building CI/CD pipelines, and collaborating with developers to enhance operations.
  • As a Staff Site Reliability Engineer (SRE), you will ensure the reliability, scalability, and performance of cloud-based services.
  • You will drive best practices and contribute to the design and implementation of robust cloud infrastructures while shaping the technical roadmap.

Requirements:

  • A minimum of 12 years of experience as a Site Reliability, DevOps, or Software Engineer with proficiency in one or more high-level languages (such as Python, GoLang, Ruby, Java, or JavaScript) is required.
  • Proven leadership experience in SRE team settings, with a focus on driving and architecting projects, is essential.
  • Expert Linux and troubleshooting skills are required.
  • Experience in building and supporting high-availability cloud environments in AWS is necessary.
  • Expertise in Infrastructure as Code (IaC) and deployment automation with tools such as Terraform, Helm, GitLab, or equivalent is required.
  • Experience running Kubernetes and Istio in production is essential.
  • Advanced observability skills with monitoring, logging, and tracing tools such as Prometheus, Grafana, Jaeger/Tempo, ELK/Loki, and OpenTelemetry are required.
  • Experience instrumenting code (Java/Kotlin, Python, Go, etc.) and creating simple instrumentation frameworks for developers is necessary.
  • Participation in an on-call rotation for after-hours production infrastructure incidents is required.
  • Experience with SDLC, CI/CD, and related tooling is essential.
  • Kafka and message streaming experience is a plus.

Benefits:

  • The salary range for this role is $200,000 - $220,000 per year, based on function, level, and geographic location.
  • Final offer amounts are determined by multiple factors, including candidate experience and expertise, and may vary from the identified range.