This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
As a Senior Site Reliability Engineer at Rise8, you will develop and deliver configuration and deployment automation required for continuously improving the functionality, availability, and manageability of platform and associated services.
You will design, build, and maintain scalable and reliable infrastructure in cloud environments.
The role involves automating the deployment process and infrastructure management.
You will develop and implement Service Level Objectives (SLOs) to measure and maintain system reliability.
You will lead incident response, including root cause analysis and post-mortems, to continuously improve system resilience.
You will work closely with developers and platform engineers in troubleshooting issues.
You are expected to take operational responsibility for improving the services that your team supports.
You will work in an environment that supports your individual growth.
Continuously enhancing the product and delivering software that resonates with users and meets their needs is a key aspect of the role.
Participation in on-call rotations to ensure high availability of critical systems is required.
Requirements:
A background of 6 to 10 years in cloud/platform operations or related roles across diverse environments is required.
Strong proficiency in administering Azure or AWS GovCloud environments is necessary.
Expertise in container orchestration, specifically Kubernetes, is essential.
You must have expertise with monitoring and observability platforms.
Proven experience in incident management and troubleshooting large-scale distributed systems is required.
Strong proficiency in the use of Infrastructure as Code (IaC) tools, such as Terraform, is necessary.
Subject-matter expertise in Linux operating system administration is required.
Strong skills in developing automation solutions using scripting languages to streamline recurring tasks are essential.
An excellent understanding of networking concepts and practical experience involving technologies like Load Balancers, DNS, SSL, Firewalls, NAT, and NTP is necessary.
Proven ability to troubleshoot operating systems and applications effectively is required.
Experience handling large-scale production systems and adeptness in resolving production-related challenges are necessary.
Experience implementing SLOs and other reliability practices is required.
Excellent communication skills and a keen interest in collaborative work environments are essential.
Enthusiasm for skill growth and a penchant for tackling intriguing tasks and complex issues are necessary.
A degree (BA/BS) in Computer Science or a related field, or equivalent practical experience, is required.
Benefits:
Rise8 offers a flexible schedule in a 100% distributed workforce.
Premium insurance coverage includes up to 100% of the employee premium and up to 80% of the combined dependent premium on the base health plan, along with 100% coverage for employee and dependent Dental and Vision, as well as employee premiums for Life and Disability coverage.
A 401(k) match at 10% of gross pay is provided for retirement.
Paid time off (PTO) includes 4 weeks of combined accrued vacation and sick leave, 10 Federal holidays, your birthday, jury duty, and bereavement.
An accrued budget of up to $3,500 per year for education and training classes, travel, events, and materials is available.
Rise8 offers $750 per year for home office technology and equipment, as well as $100 per year for Rise8 merchandise from its Swag Store.
A wellness budget covers 100% reimbursement on a variety of wellness activities and products, up to $500 per calendar year, or alternatively, a $75 monthly credit towards a Life Time membership.
Equipment provided includes a MacBook Pro and multi-port adapter.