This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
The Senior DevOps Engineer will be responsible for ensuring high availability, reliability, and scalability of systems.
Architect, build, and monitor cloud-native architectures using Kubernetes, particularly focusing on machine learning and AI workloads.
Collaborate with data scientists and ML engineers to streamline the build and deployment process for ML models and LLMs in Kubernetes.
Manage infrastructure for continuous integration, delivery, and monitoring of ML models and AI services.
Optimize infrastructure for efficient training, deployment, and scaling of ML models and LLMs, utilizing Kubernetes and cloud-native tools like AWS SageMaker.
Develop and maintain monitoring and alerting solutions tailored to ML and AI workloads.
Troubleshoot and resolve production incidents with minimal downtime.
Ensure security and compliance of production systems, particularly protecting sensitive AI and ML data.
Mentor and coach junior DevOps engineers.
Requirements:
Bachelor's degree in Computer Science, Engineering, or related field.
Minimum 7 years of experience in maintaining optimal performance of online production environments.
At least 4 years of experience managing production Kubernetes infrastructure.
Strong experience with Docker for containerization.
Deep understanding of the machine learning lifecycle, including model training, deployment, monitoring, and scaling.
Experience with MLOps tools and frameworks such as Kubeflow and MLflow.
Familiarity with LLMOps and scripting languages such as Python.
Proficiency in infrastructure deployment and automation tools such as Terraform and CloudFormation.
Expertise in monitoring and logging solutions such as Prometheus and Grafana.
Strong knowledge of Linux systems, networking, and security concepts.
Excellent communication and collaboration skills.
Experience working in an agile environment.
Certifications like CKA or CKAD are a plus.
Benefits:
Great team camaraderie and collaboration.
Opportunity to work remotely.
Competitive salary ranging from $130,000 to $175,000 a year.
3 weeks of paid vacation.
Generous medical, dental, and vision plans.
Sick leave and paid holidays.
Modern technologies and tools for continuous learning.
Supportive and self-managing team environment.
Amenities such as a stocked kitchen, sit/stand workstations, and a casual work environment.