The position is for a Senior Site Reliability Engineer at Dremio. It is a remote role based in India.
The role involves maintaining and improving mission-critical systems in a cloud-native environment.
Responsibilities include designing reliable infrastructure, automating deployment processes, and ensuring services scale across multiple cloud providers.
The position offers deep technical engagement with Kubernetes, service meshes, and observability tools.
The engineer will promote a culture of resilience and continuous improvement.
Key accountabilities include leading improvements in Kubernetes usage, extending cross-cloud networking solutions, and collaborating with engineering teams to ensure production readiness.
The role requires defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
The engineer will drive observability efforts, optimize and debug code, and advocate for reliability engineering practices.
The engineer is expected to participate in an on-call rotation and lead incident response.
The position also involves promoting scalable practices and supporting continuous delivery transformation.
Requirements:
Candidates must have 10+ years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure, with deep exposure to distributed systems.
Advanced proficiency in Kubernetes, Istio, Terraform, Terragrunt, and ArgoCD/Flux is required.
A strong understanding of cloud-native networking, VPNs, and multi-cloud connectivity solutions is necessary.
Demonstrated hands-on experience with cloud platforms including GCP, AWS, and Azure is essential.
Candidates should be skilled in Python or Go, with the ability to debug and review Java when necessary.
Proven ability to design, analyze, and troubleshoot large-scale distributed architectures is required.
Strong communication, ownership, and problem-solving abilities are essential, with a focus on resilience and automation.
Experience managing Kubernetes clusters at large scale (1,000+ nodes) and developing production-grade SLIs/SLOs is a plus.
Benefits:
The position offers a competitive compensation package.
A flexible hybrid work environment is provided, with Workplace Wednesdays to promote team connection and collaboration.
Employees receive catered lunches or meal credits on in-office days, along with local social events.
Generous paid time off and wellness initiatives are available.
Comprehensive healthcare coverage, including medical, dental, and vision, is included.
Professional development opportunities and support for continued learning are offered.
The company promotes a collaborative, fast-paced culture driven by innovation, ownership, and accountability.