Please, let Guidewire Software know you found this job
on RemoteYeah.
This helps us grow 🌱.
Description:
Guidewire delivers software for Property and Casualty (P&C) insurance companies, helping them protect customers against crises, natural disasters, accidents, and cyber risks.
The Senior Site Reliability Engineer (SRE) will be part of the SRE-Application team, responsible for building and evolving the SRE practice for applications on the Guidewire Cloud Platform.
The role involves applying expertise in automation, software engineering, and operational discipline to ensure the reliability, performance, and scalability of cloud-based solutions.
Responsibilities include collaborating with development teams to troubleshoot problems, developing automated runbooks, monitoring and enhancing application reliability, and documenting incidents for process improvement.
The position requires participation in on-call rotations to ensure service availability and reliability.
Requirements:
Proven experience as a Senior SRE or similar role, with a track record of improving system reliability.
Strong problem-solving skills and the ability to analyze complex systems and devise effective solutions.
Excellent collaboration and communication abilities to work cross-functionally and document processes clearly.
Experience with automation, monitoring, and performance optimization tools and techniques.
Dedication to maximizing uptime, scalability, and delivering an exceptional end-user experience.
A passion for technology and a strong desire to continuously learn and grow skills.
Alignment with Guidewire's mission to leverage technology to help protect and support others.
Proven experience designing and deploying SLIs, SLOs, and Error Budgets.
Proven experience leveraging application performance monitoring (APM) and telemetry tools.
Proven experience triaging and debugging distributed systems on cloud infrastructure.
Proven experience in designing and engineering CI/CD pipelines within Kubernetes and legacy ecosystems.
Proven experience in designing and engineering monitors, dashboards, and synthetic transactions in Datadog.
Proven experience in building, deploying, and running scalable infrastructure within AWS and Kubernetes using Terraform.
Proven experience in managing infrastructure configuration at scale using GitOps practices or tools such as Puppet or Ansible.
Good understanding of AWS cloud networking and security with hands-on experience remediating infrastructure vulnerabilities.
Proficiency with Linux system administration and programming/scripting using Python, Go, Java, shell, or equivalent.
Benefits:
Opportunity to join a mission-driven company and make a real impact in the lives of people facing challenges.
Work with cutting-edge technology and collaborate with talented peers.
Grow skills in a culture that values innovation, teamwork, and work-life balance.
Competitive compensation and comprehensive benefits.
Opportunities for career development.
Participation in mandatory on-call rotations to ensure service availability and reliability, including responding to incidents outside of regular business hours.