The Diffuse Project is seeking a Machine Learning Infrastructure Engineer to lead the development of robust, scalable backend systems that power machine learning–driven discoveries in structural biology.
The role involves working at the intersection of scientific research and software engineering, collaborating with researchers to train, test, and deploy ML models directly on experimental data from X-ray crystallography and cryo-EM.
This position is a 6-month assignment with the potential for extension.
Responsibilities:
Architect, build, and maintain ML infrastructure pipelines for model training, validation, and deployment across diverse experimental datasets.
Design and manage data ingestion and preprocessing workflows for structural biology data.
Develop and maintain backend services and APIs.
Support GPU/accelerated training.
Implement data versioning and model tracking tools.
Collaborate with scientists, ML researchers, and experimentalists throughout this work.
Requirements:
Strong programming skills in Python, ideally with experience in PyTorch.
A deep understanding of machine learning infrastructure, including model training pipelines, GPU utilization, experiment tracking, and deployment.
Proficiency in backend development, including REST APIs, containerization with Docker, workflow management, and data engineering tools.
Experience with distributed compute environments.
A solid understanding of scientific computing workflows, version control, and reproducibility principles.
At least two years of experience working on ML models.
Familiarity with structural biology data formats is a bonus.
Experience designing systems for diffusion-based models is also a bonus.
The ability to work effectively in a multidisciplinary team environment is essential.
Benefits:
This is a W-2, fixed-term position: a 6-month assignment with the potential for extension based on performance and business needs.
The role is remote, with access to an office located in Emeryville, CA, and may require some travel for in-person collaboration.
The successful candidate will receive a competitive compensation package, commensurate with their experience and location.