Granicus is seeking an experienced and highly skilled Senior Site Reliability Engineer (SRE) to join its SRE team.
The role involves ensuring the reliability, scalability, and performance of services.
Responsibilities include providing on-call production support, working on customer and internal engineering tickets, and managing the SRE backlog.
The SRE will continuously monitor the health and performance of services, respond to alerts and incidents, and develop automation scripts to streamline operations.
The position also involves helping troubleshoot incidents, participating in system improvements, and collaborating with software engineers on application requirements.
Documenting processes and carrying out capacity planning are also key responsibilities.
The SRE must implement and adhere to security best practices to protect systems and data.
Requirements:
Candidates must have 5+ years of experience in site reliability engineering, system administration, or a similar role, with a proven track record of managing large-scale, high-availability systems.
Experience supporting AI/ML infrastructure, including model deployment and integration with services like AWS Bedrock, is highly desirable.
Expertise in Linux/Unix systems and cloud platforms such as AWS, Azure, or Google Cloud is required.
Strong proficiency in scripting languages (Python, Bash, Ruby) and programming languages (Go, Java, C++) is necessary.
Familiarity with AI/ML operations, including model lifecycle management and inference performance tuning, is expected.
Experience with the ELK Stack for centralized logging and monitoring, as well as configuration management tools like Ansible, Chef, or Puppet, is required.
Relevant certifications such as AWS Certified DevOps Engineer or AWS Certified Machine Learning – Specialty are a plus.
Benefits:
Granicus offers a competitive benefits package that allows employees to tailor benefits to their needs.
Benefits include flexible time off, medical (with a 100% employer-paid option), dental and vision insurance, and a 401(k) plan with matching contributions.
Employees receive paid parental leave and employer-paid short- and long-term disability insurance, group term life insurance, and AD&D insurance.
Group legal coverage is also provided, along with additional benefits.