This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
The Site Reliability Engineer will be part of the Everbridge Kubernetes Platform team, focusing on ensuring the overall service quality and availability of Everbridge's solutions.
The role involves designing, deploying, and managing Kubernetes at scale, while promoting Kubernetes and SRE best practices.
Responsibilities include maintaining the Kubernetes infrastructure within AWS, including services such as VPCs, EC2, Transit Gateways, IAM roles and policies, Route53, S3, Security Groups, and NACLs.
The engineer will enhance operational availability, security, scalability, efficiency, monitoring, instrumentation, and overall service reliability of Everbridge's Kubernetes solutions.
Collaboration with Agile teams, including Architects, Developers, Quality, Data, Security, and other engineers, is essential for designing and implementing highly reliable solutions.
The role requires researching and implementing SRE and Kubernetes best practices, creating automation, fostering cross-functional collaboration, and making data-driven decisions to ensure system integrity and reliability.
The engineer will manage Windows workstations and servers and participate in an on-call rotation to resolve production escalations.
Requirements:
Candidates must have 2+ years of technical AWS experience, managing and owning systems in a production environment.
A minimum of 1 year of Kubernetes experience (EKS, AKS, GKE, or self-managed) is required.
Candidates should have 2+ years of experience with Terraform or similar Infrastructure as Code (IaC) tools.
A minimum of 3 years of Microsoft Windows System Administration experience is necessary.
Experience with tooling such as GitLab CI/CD, Packer, Docker, EKS, Kubernetes, Spinnaker, Helm, Argo, and Jenkins is required.
Familiarity with telemetry tools like Datadog, Sumo Logic, Grafana, and Prometheus is essential.
Candidates should have experience writing automation in languages such as Python, Go, Bash, or Java.
Experience with configuration management tools such as Salt, Ansible, or AWS user_data is required.
A background in a DevOps/SRE production environment and Agile practices is necessary.
UNIX/Linux experience is required.
Experience working on DoD programs is preferred.
Candidates must currently hold a Secret Clearance or be a US citizen with the ability to obtain a Secret Clearance.
Candidates must hold, or be able to obtain and maintain, a DoD 8140 “Intermediate” level or higher certification (formerly DoD 8570 IAM Level II).
Benefits:
The estimated salary for this role ranges from $90,000 to $130,000, with potential variable compensation based on skills, qualifications, and experience.
Everbridge offers a comprehensive and inclusive range of employee benefits, including healthcare and dental coverage.
Benefits also include parental planning and mental health support, disability income benefits, life and AD&D insurance, and a 401(k) plan with matching contributions.
Employees receive paid time off and fitness reimbursements as part of the benefits package.