Remote L3 Cloud DevOps Engineer / Site Reliability Engineer (SRE)
Please let NTD software know you found this job on RemoteYeah.
Description:
We are seeking an experienced L3 Cloud DevOps Engineer with a strong focus on Site Reliability Engineering (SRE) to join our team.
This role centers on creating and enhancing monitoring and alerting tools, with significant emphasis on Grafana, Prometheus, and Datadog.
The ideal candidate will have hands-on experience with Python scripting and a solid understanding of user and system monitoring.
This role involves proactive dashboard building, cross-functional collaboration, and addressing service issues through monitoring and remediation.
Requirements:
Extensive hands-on experience with Python scripting is required.
Strong expertise in Site Reliability Engineering (SRE) practices is essential.
Proficiency in Grafana, including dashboard creation and modification, is necessary.
In-depth knowledge of Prometheus and Datadog for monitoring and alerting is required.
Experience with user and system monitoring, along with the ability to create and enhance dashboards and runbooks, is needed.
DevOps experience is a secondary but desirable skill set.
Relevant certifications or courses in Python, SRE, Grafana, and Prometheus are a plus.
Benefits:
This is a full-time contractor position that offers the flexibility of remote work.
The role provides opportunities for professional growth and development in the field of Cloud DevOps and SRE.
You will have the chance to work with cutting-edge monitoring and alerting tools.
Collaboration with cross-functional teams will enhance your skills and experience in a dynamic environment.