The Hendrix ML Platform team is focused on developing a robust platform for training and serving machine learning models across Spotify.
This platform aims to streamline the productionization of AI and ML models by reducing the complexities involved in creating backend services for serving predictions and training models.
Responsibilities include managing and maintaining large-scale production Kubernetes clusters for ML workloads, covering both the ML platform infrastructure and the necessary DevOps work.
The role involves contributing to the Spotify ML Platform SDK and building tools for various ML operations.
Collaboration with Machine Learning Engineers (MLEs), researchers, and product teams is essential to deliver scalable ML platform tooling that meets timelines and specifications.
The position requires working independently and collaboratively on squad projects, often necessitating the learning and application of new technologies.
The engineer will design, document, and implement reliable, testable, and maintainable solutions for ML infrastructure capabilities.
Requirements:
Candidates must have 3+ years of hands-on experience implementing production ML infrastructure at scale using Python, Go, or similar languages.
At least 3 years of experience working with a public cloud provider such as GCP, AWS, or Azure is required, with a preference for GCP.
Knowledge of deep learning fundamentals, algorithms, and open-source tools such as Hugging Face, Ray, PyTorch, or TensorFlow is necessary.
An understanding of distributed training leveraging GPUs and Kubernetes is considered a nice-to-have.
A general understanding of data processing for ML is required.
Experience with agile software processes and modular code design following industry standards is essential.
Benefits:
This role is based in Toronto, which serves as the location for in-person meetings while still allowing you to work from home.
The company offers the flexibility to work where you are most productive, accommodating both remote and in-office arrangements.