This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
Replicant is seeking a Senior Site Reliability Engineer to enhance the infrastructure and systems that support its AI platform for customer service automation.
The role involves ensuring the smooth operation and high availability of production systems, monitoring performance, identifying bottlenecks, and implementing optimizations.
Responsibilities include developing and maintaining tools for incident resolution, collaborating with engineering teams to improve application reliability and scalability, and participating in on-call rotations.
The engineer will contribute to infrastructure design with a focus on scalability, security, and cost-effectiveness, while staying updated on industry best practices in SRE and DevOps.
The core technology stack includes TypeScript/Node.js, Python, and Kubernetes on GCP, along with tools such as Helm, Terraform, Datadog, and Prometheus.
Requirements:
Candidates must have proven experience in managing and troubleshooting complex, distributed systems in a production environment.
A strong understanding of cloud platforms, preferably GCP, and containerization technologies like Kubernetes is required.
Proficiency in scripting languages and automation tools such as Python, Bash, and Terraform is essential.
Experience with monitoring and observability systems like Datadog and Prometheus is necessary.
Excellent problem-solving skills and a proactive approach to identifying potential issues are required.
Strong communication and collaboration skills are essential for effective teamwork.
A passion for ensuring the reliability and performance of critical systems is expected.
Benefits:
The position offers a remote working environment that respects time zone differences.
Employees receive highly competitive salaries and equity, and US employees also receive a 401(k) plan.
Top-of-the-line healthcare benefits, including medical, vision, and dental coverage, are provided.
A health and wellness perk is included to support employee well-being.
An equipment stipend is available to ensure employees have the necessary tools for their work.
The company has a flexible vacation policy to promote work-life balance.
Employees can participate in team trips and offsites, fostering a strong team culture.
After 4.5 years of service, employees are eligible for a 5-week sabbatical.