This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
The Site Reliability Engineer (SRE) Manager will be responsible for ensuring the reliability, scalability, and performance of the infrastructure and services through technical contributions, team building, and management.
This position is fully remote, allowing for collaboration with a global team.
Nivoda is an industry-leading B2B diamond and gemstone marketplace connecting jewelry retailers to gemstone suppliers.
The SRE Manager will take ownership of the production estate, drive automation initiatives, and build a first-class SRE team.
Responsibilities include incident management, monitoring, automation, and fostering a culture of collaboration and continuous improvement.
Requirements:
Proven experience in a senior or lead SRE role with a strong track record of building and maintaining highly reliable infrastructure and services.
Expertise in incident management, monitoring, alerting, and observability tools like Prometheus, Grafana, ELK stack, or Datadog.
Experience with cloud platforms such as AWS, Azure, or GCP, and infrastructure as code tools like Terraform or CloudFormation.
Strong scripting and automation skills in languages such as Python, Bash, or Go.
Excellent communication and collaboration skills to work effectively with cross-functional teams in a remote environment.
Demonstrated leadership capabilities with a passion for mentoring and developing team members.
Benefits:
Dynamic working environment in a fast-growing company.
Intellectually challenging work with a significant role in Nivoda’s success and scalability.
Global peer connections in a decentralized team.
Collaborative and supportive work environment with minimal hierarchy.