Remote Site Reliability Engineer III (SRE) - Guidewire Cloud Platform (Application)
Please let Guidewire Software know you found this job on RemoteYeah. This helps us grow 🌱.
Description:
We are seeking a Site Reliability Engineer III who is eager to contribute to the transformation of the insurance industry with our leading cloud platform.
As a member of the SRE-Application team, you'll play a critical role in ensuring the reliability, performance, and scalability of applications running on our Guidewire Cloud Platform.
This position offers a unique opportunity to apply your skills in automation, software engineering, and operational discipline to support our cloud-based solutions.
You will work with development teams to troubleshoot and resolve issues, minimizing customer impact.
You will develop and maintain automated runbooks to manage issues proactively.
You will apply engineering principles and automation to enhance our operating environments.
You will monitor and improve the reliability and performance of applications on the Guidewire Cloud Platform.
You will use your software engineering expertise to optimize systems and reduce manual toil.
You will document incidents and develop processes to prevent future occurrences.
You will stay current with industry trends, tools, and best practices in site reliability engineering.
You will foster a culture of innovation, learning, and continuous improvement.
You will participate in on-call rotations to ensure the availability and reliability of our services.
Requirements:
You must have experience as an SRE or in a similar role, with a focus on improving system reliability.
You should possess strong problem-solving skills and the ability to analyze complex systems and devise effective solutions.
Effective collaboration and communication skills are required to work cross-functionally and document processes clearly.
You need experience with automation, monitoring, and performance optimization tools and techniques.
A commitment to maximizing uptime and scalability and to delivering an exceptional end-user experience is essential.
You should have a passion for technology and a desire to continuously learn and grow your skills.
You must align with Guidewire's mission to leverage technology to help protect and support others.
You need experience designing and implementing SLIs, SLOs, and error budgets.
You need familiarity with application performance monitoring (APM) and telemetry tools used to maintain expected service levels for applications.
Proficiency with Linux system administration and the ability to program/script using Python, Go, Java, shell, or equivalent is required.
You should have experience troubleshooting and debugging distributed systems on cloud infrastructure.
Experience with CI/CD pipelines within Kubernetes (K8s) and legacy ecosystems is necessary.
You need experience creating monitors, dashboards, and synthetic transactions in monitoring tools like Datadog.
Experience deploying and managing scalable infrastructure within AWS and Kubernetes ecosystems using Terraform and other cloud-native approaches is required.
You should have experience with infrastructure configuration management using GitOps practices or tools such as Puppet or Ansible.
An understanding of AWS cloud networking and security, with some hands-on experience remediating infrastructure vulnerabilities, is necessary.
Equal Opportunity and Accommodation:
Guidewire offers equal employment opportunities to all applicants for employment and prohibits discrimination and harassment of any type.
All offers are contingent upon passing a criminal history and other background checks where applicable to the position.
Individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment.