Senior Site Reliability Engineer (Remote - India)

at Jobgether

Posted 3 days ago · 0 applied

Description:

  • This is a remote Senior Site Reliability Engineer position at Dremio, based in India.
  • The role involves maintaining and improving mission-critical systems in a cloud-native environment.
  • Responsibilities include designing reliable infrastructure, automating deployment processes, and ensuring services scale across multiple cloud providers.
  • The position offers deep technical engagement with Kubernetes, service meshes, and observability tools.
  • The engineer will promote a culture of resilience and continuous improvement.
  • Key accountabilities include leading improvements in Kubernetes usage, extending cross-cloud networking solutions, and collaborating with engineering teams to ensure production readiness.
  • The role requires defining and implementing Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
  • The engineer will drive observability efforts, optimize and debug code, and advocate for reliability engineering practices.
  • Participation in an on-call rotation and leading incident response is expected.
  • The position also involves promoting scalable practices and supporting continuous delivery transformation.

Requirements:

  • Candidates must have 10+ years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure, with deep exposure to distributed systems.
  • Advanced proficiency in Kubernetes, Istio, Terraform, Terragrunt, and ArgoCD/Flux is required.
  • A strong understanding of cloud-native networking, VPNs, and multi-cloud connectivity solutions is necessary.
  • Demonstrated hands-on experience with cloud platforms including GCP, AWS, and Azure is essential.
  • Candidates should be skilled in Python or Go, with the ability to debug and review Java when necessary.
  • Proven ability to design, analyze, and troubleshoot large-scale distributed architectures is required.
  • Strong communication, ownership, and problem-solving abilities are essential, with a focus on resilience and automation.
  • Bonus points for experience managing Kubernetes clusters at large scale (1,000+ nodes) and developing production-grade SLIs/SLOs.

Benefits:

  • The position offers a competitive compensation package.
  • A flexible hybrid work environment is provided, with Workplace Wednesdays to promote team connection and collaboration.
  • Employees receive catered lunches or meal credits on in-office days, along with local social events.
  • Generous paid time off and wellness initiatives are available.
  • Comprehensive healthcare coverage, including medical, dental, and vision, is included.
  • Professional development opportunities and support for continued learning are offered.
  • The company promotes a collaborative, fast-paced culture driven by innovation, ownership, and accountability.