Remote Senior Site Reliability Engineer

This job is closed

This job post is closed and the position is probably filled. Please do not apply. It was closed automatically by a robot after the apply link was detected as broken.

Description:

  • Monitor and troubleshoot production incidents proactively, identifying and resolving issues quickly and efficiently.
  • Implement automated monitoring and alerting systems for early detection of potential problems.
  • Collaborate with development teams to perform deployments and rollbacks with minimal disruption.
  • Optimize the performance and scalability of AWS infrastructure, including DynamoDB, MySQL, S3, CloudSearch, OpenSearch, Kafka, Presto, SES, and EC2.
  • Write and maintain infrastructure code using Terraform and scripts to automate tasks and improve operational efficiency.
  • Proactively identify and address potential security vulnerabilities.
  • Participate in incident response and post-mortem analysis activities to identify root causes and prevent future occurrences.
  • Help onboard and mentor junior team members, sharing knowledge and expertise.
  • Stay up to date on the latest cloud technologies and best practices for SRE.
  • Participate in a low-volume on-call rotation with other Site Reliability Engineers.
  • Explore new technologies and innovative solutions to improve service quality and speed to market.
  • Participate in technical discussions and deep dives with other engineering and product teams.

Requirements:

  • 5+ years of experience as a Site Reliability Engineer or similar role.
  • Strong understanding of AWS cloud services, including DynamoDB, MySQL, S3, CloudSearch, OpenSearch, Kafka, Presto, SES, and EC2.
  • Experience with infrastructure automation tools like Ansible, Terraform, or CloudFormation.
  • Experience with monitoring and alerting tools like DataDog, Prometheus, Grafana, Kibana, and PagerDuty.
  • Experience with CI/CD pipelines and deployment strategies.
  • Strong problem-solving and analytical skills.
  • Excellent communication and collaboration skills.
  • Ability to work independently and take ownership of complex tasks.
  • Passion for technology and a desire to learn and grow.
  • Experience with Rancher, Cattleprod, Jenkins, TeamCity, PostgreSQL, and MongoDB.
  • Experience with security best practices and tools.
  • Experience working in a fast-paced, agile environment.

Benefits:

  • The base salary range for this position is $120,000 - $140,000 annually.
  • Comprehensive benefits including medical, dental, vision, life & disability insurance, a 401(k) plan with company match, and an unlimited PTO policy.