Please let Crisis Text Line, Inc. know you found this job on RemoteYeah.
This helps us grow.
Description:
The Staff Site Reliability Engineer (SRE) is a remote position that reports to the Senior Engineering Manager of SRE/Infrastructure.
This role is a key technical leader responsible for ensuring the reliability, scalability, and security of the platform.
The SRE will architect, build, and maintain tooling that empowers software engineering teams and manage the infrastructure supporting the Crisis Text Line service.
Collaboration with developers is essential to drive performance optimization, implement best practices, and ensure a secure environment.
The position focuses on enhancing engineer productivity through automation and streamlined workflows.
Key responsibilities include leading and mentoring a team of 5 SREs, enforcing security best practices, designing and maintaining AWS infrastructure, optimizing application performance, developing monitoring systems, automating tasks, responding to incidents, and conducting security audits.
The SRE will also communicate expectations and progress clearly, provide mentorship, write high-quality code, manage time effectively, and participate in retrospectives to improve processes.
Requirements:
A Bachelor's degree in Computer Science, Engineering, or a related field is required; a Master's degree is preferred.
Proven experience as a Staff SRE or in a similar role, with a strong focus on infrastructure and DevOps in a software delivery capacity, is necessary.
Experience maintaining the reliability of online SaaS/PaaS services on a 24/7 schedule is required.
Proficiency in AWS and infrastructure as code tools such as Terraform or CloudFormation is essential.
Strong scripting and automation skills, particularly in Python, along with knowledge of containerization and orchestration tools like Docker and Kubernetes, are required.
Experience implementing CI/CD pipelines and observability tools, such as GitHub Actions and Datadog, is necessary.
A commitment to ethical practices, data privacy, and security is essential.
Solid understanding of network protocols, security principles, and best practices is required.
Excellent problem-solving skills and the ability to work under pressure, along with strong communication skills for effective collaboration with cross-functional teams, are necessary.
The ability to learn quickly and manage time effectively by focusing on priorities and meeting deadlines is required.
An understanding of essential computer science principles, including basic data structures and control structures, is necessary.
Benefits:
Crisis Text Line offers 20 paid holidays, including federal holidays, Election Day, a holiday break from December 24 through January 1, 2 renewal days, and 2 floating holidays.
Employees receive flexible paid time off, which includes 15 vacation days, 3 personal days, and 7 sick days.
Medical, dental, and vision benefits are provided for the staff member and their family at no cost to the employee.
A 403(b) retirement plan with a 3% contribution by Crisis Text Line is available to support financial wellness.
Employees are entitled to 12 weeks of paid parental leave after 6 months of employment.
A student loan repayment program is available after 2 years of continuous full-time service.
Family support is offered through a virtual childcare platform.
Stipends and allowances include monthly mental health support, internet service, annual professional development, wellness, and a one-time home office setup allowance in the first year.
Note that benefits are only for US-based employees, and international benefits may differ.