Please let Spotify know you found this job on RemoteYeah. This helps us grow.
Description:
The Personalization team at Spotify focuses on enhancing user experience by making music and podcast recommendations more enjoyable.
The role involves collaborating with Speak, the Text-to-Speech (TTS) team, to create generated voice audio that enriches user experiences.
Responsibilities include optimizing machine learning models for production use cases, ensuring efficiency and scalability.
The position requires designing and building efficient serving infrastructure for machine learning models to support large-scale deployments.
The engineer will optimize machine learning models in PyTorch or other libraries for real-time serving and production applications.
The role includes leading the transition of machine learning models from research and development into production.
Building and maintaining scalable Kubernetes clusters for managing and deploying machine learning models is essential.
The engineer will implement and monitor logging metrics, diagnose infrastructure issues, and contribute to an on-call schedule for production stability.
The position involves influencing technical design, architecture, and infrastructure decisions for diverse machine learning architectures.
Collaboration with stakeholders to drive initiatives related to serving and optimizing machine learning models at scale is required.
Requirements:
A passion for speech, audio, and/or generative machine learning is essential.
Candidates must have expertise in optimizing machine learning models for production use cases and extensive experience with frameworks like PyTorch.
Experience in building efficient, scalable infrastructure for serving machine learning models and managing Kubernetes clusters in multi-region setups is required.
A strong understanding of transitioning machine learning models from research to production is necessary.
Familiarity with writing logging metrics and diagnosing production issues is important, along with a willingness to participate in an on-call schedule.
A collaborative mindset and enjoyment in working closely with research scientists, machine learning engineers, and backend engineers are crucial.
Candidates should thrive in environments that require solving complex infrastructure challenges, including scaling and performance optimization.
Experience with low-level machine learning libraries (e.g., Triton, CUDA) and performance optimization for custom components is a bonus.
Benefits:
The position offers flexibility to work from anywhere within the European region, excluding France due to on-call restrictions.
The team collaborates within the GMT/CET time zones, allowing for balanced work-life integration.
Employees are encouraged to work in environments that suit them best, promoting productivity and comfort.
Apply now