This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
Nivoda is looking for a skilled Site Reliability Engineering (SRE) Manager to ensure the reliability, scalability, and performance of its infrastructure and services.
This fully remote position allows you to collaborate with a global team.
Responsibilities include owning the production estate, incident management, designing incident tracking processes, and developing monitoring and automation tooling.
The role involves building and leading a high-performing SRE team through coaching, mentoring, and fostering a culture of collaboration and innovation.
Requirements:
Proven experience in a senior or lead SRE role with a strong track record in maintaining reliable infrastructure.
Expertise in incident management, monitoring tools like Prometheus and Grafana, and cloud platforms such as AWS, Azure, or GCP.
Proficiency in scripting languages like Python, Bash, or Go, and infrastructure as code tools like Terraform or CloudFormation.
Excellent communication and collaboration skills to work effectively in a remote, cross-functional team.
Demonstrated leadership capabilities with a passion for mentoring and developing team members.
Benefits:
Dynamic working environment in a fast-growing company.
Intellectually challenging work with a significant impact on Nivoda's success.
Opportunity to connect with peers globally in a decentralized team.
Collaborative and supportive work environment with flexible working hours.