Remote Sr. Site Reliability Engineer at Cleo

Description:

The Sr. Site Reliability Engineer at Cleo will demonstrate passion and leadership in application deployment, reliability, and performance.
This role involves investigating and troubleshooting issues across the platform.
The engineer will analyze, develop, and enhance automated deployments.
Writing scripts to automate complex tasks is a key responsibility.
The engineer will lead from an engineering perspective in sprint meetings, helping to estimate stories and raise awareness regarding implementation details.
Monitoring an APM tool (e.g., Datadog) in production and alerting the team to issues is required.
The role includes controlling application log collection and analysis, as well as application and instance alerts related to site reliability.
The engineer will lead discussions on infrastructure architecture and contribute to technical discussions focused on reliability.
Maintaining, evaluating, and upgrading the platform's base infrastructure is essential.
The engineer will lead application code deployment methods and mentor other engineers.
Demonstrating ownership of automation across the team and resolving production issues is expected.
Performing code reviews and identifying strategies to resolve technical debt are part of the responsibilities.
Ensuring infrastructure scalability and creating user stories that address production issues, technical debt, and opportunities for improvement are also required.

Requirements:

Candidates should have 4+ years of experience in DevOps, Systems, or Software Engineering.
Strong experience running production application workloads in the cloud (AWS) is necessary.
An understanding of public cloud networking, including VPC peering, is required.
Candidates should have 3+ years of experience using cloud computing services (EC2, SNS/SQS, RDS).
Experience with containers and orchestration (Docker, Kubernetes, ECS) is essential.
Familiarity with administering technologies at scale such as Elasticsearch, Postgres, and Redis is required.
Candidates should have experience with monitoring tools such as Datadog and those provided by AWS.
Experience managing and automating Continuous Integration and Continuous Delivery (CI/CD) using GitLab, CircleCI, TeamCity, Jenkins, or similar is necessary.
Provisioning and configuration management experience with Terraform is required.
Candidates should have Linux or Windows server administration experience.
A strong desire to influence the direction of Cleo’s DevOps practices is essential.
Familiarity with scripting languages such as Python, JS, or TS is required.
Experience integrating third-party tooling (container scanning, static and dynamic analysis) into the pipeline is necessary.
Candidates should be able to weave monitoring, logging, and alerting into everything they build.
A quality- and security-conscious approach when implementing solutions is required.
The ability to debug complicated issues in collaboration with peers is essential.
Experience working within and upholding compliance with HIPAA and other standards (SOC 2 and HITRUST) is necessary.

Benefits:

The base salary range for this position is $165,000 - $180,000 annually.
Cleo offers health insurance, including medical, dental, and vision coverage.
Employees receive 15 paid holidays and a 5-day winter break.
Flexible vacation time and sick time are provided.
The company offers 16 weeks of paid parental leave.
A 401(k) plan is available for employees.
Disability insurance and life insurance are included in the benefits package.
Wellness perks and additional benefits are also offered.