Remote L3 Cloud DevOps Engineer / Site Reliability Engineer (SRE)

This job post is closed and the position is probably filled. Please do not apply. The post was closed automatically after its apply link was detected as broken.

Description:

  • We are seeking an experienced L3 Cloud DevOps Engineer with a strong focus on Site Reliability Engineering (SRE) to join our team.
  • This role centers on creating and enhancing monitoring and alerting tools, with a strong emphasis on Grafana, Prometheus, and Datadog.
  • The ideal candidate will have hands-on experience with Python scripting and a solid understanding of user and system monitoring.
  • This role involves proactive dashboard building, cross-functional collaboration, and addressing service issues through monitoring and remediation.

Requirements:

  • Extensive hands-on experience with Python scripting is required.
  • Strong expertise in Site Reliability Engineering (SRE) practices is essential.
  • Proficiency in Grafana, including dashboard creation and modification, is necessary.
  • In-depth knowledge of Prometheus and Datadog tools for monitoring and alerting is required.
  • Experience with user and system monitoring, including the ability to create and enhance dashboards and runbooks, is required.
  • DevOps experience is desirable but secondary.
  • Relevant certifications or courses in Python, SRE, Grafana, and Prometheus are a plus.

Benefits:

  • This is a full-time contractor position that offers the flexibility of remote work.
  • The role provides opportunities for professional growth and development in the field of Cloud DevOps and SRE.
  • You will have the chance to work with cutting-edge monitoring and alerting tools.
  • Collaboration with cross-functional teams will enhance your skills and experience in a dynamic environment.