Nebius is a cloud-computing company built to serve the global AI economy, providing the tools and resources its customers need to solve real-world challenges.
The company is headquartered in Amsterdam and has a global presence with R&D hubs across Europe, North America, and Israel.
The team consists of over 800 employees, including more than 400 engineers with expertise across hardware and software.
The role sits within AI Studio, part of Nebius Cloud, which operates one of the largest GPU clouds, with tens of thousands of GPUs.
The position involves building an inference and fine-tuning platform for various foundation models, including text, vision, audio, and multimodal architectures.
Responsibilities include enhancing fine-tuning methodologies for LLMs, researching and implementing inference optimization techniques, and re-implementing open-source LLM architectures in JAX.
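For context, the sketch below shows the flavor of the JAX re-implementation work described above: a causal self-attention block written in pure JAX. Every name and shape (wq, wk, wv, wo, d_model, n_heads) is an illustrative assumption, not Nebius code.

    # Illustrative only: a minimal causal self-attention block in pure JAX.
    import jax
    import jax.numpy as jnp

    def causal_self_attention(params, x, n_heads):
        # x: (seq_len, d_model); params hold the four projection matrices.
        seq_len, d_model = x.shape
        head_dim = d_model // n_heads

        def project(w):
            return (x @ w).reshape(seq_len, n_heads, head_dim)

        q, k, v = project(params["wq"]), project(params["wk"]), project(params["wv"])
        # Scaled dot-product attention with a causal mask.
        scores = jnp.einsum("qhd,khd->hqk", q, k) / jnp.sqrt(head_dim)
        mask = jnp.tril(jnp.ones((seq_len, seq_len), dtype=bool))
        scores = jnp.where(mask, scores, -jnp.inf)
        weights = jax.nn.softmax(scores, axis=-1)
        out = jnp.einsum("hqk,khd->qhd", weights, v).reshape(seq_len, d_model)
        return out @ params["wo"]

    # Tiny usage example with random weights (shapes are assumptions).
    key = jax.random.PRNGKey(0)
    d_model, n_heads, seq_len = 64, 4, 16
    params = {name: 0.02 * jax.random.normal(key, (d_model, d_model))
              for name in ("wq", "wk", "wv", "wo")}
    y = causal_self_attention(params, jnp.ones((seq_len, d_model)), n_heads)
    print(y.shape)  # (16, 64)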
Requirements:
Candidates must have a deep understanding of the theoretical foundations of machine learning and reinforcement learning.
Deep expertise in modern deep learning for language processing and generation is required.
Substantial experience training large models across multiple compute nodes is necessary.
A working understanding of the performance characteristics of large-scale neural network training, including sharding strategies and custom kernels, is expected (a toy sharding sketch follows the requirements for illustration).
Strong software engineering skills, particularly in Python, are essential.
Deep experience with modern deep learning frameworks, specifically JAX, is required.
Proficiency in contemporary software engineering practices, including CI/CD, version control, and unit testing, is necessary.
Strong communication and leadership abilities are expected.
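As a minimal illustration of the sharding strategies mentioned above, the toy example below shards the batch dimension of a matmul across the available devices with jax.sharding. The mesh axis name "data" and all shapes are assumptions made for the sketch, not the team's actual setup.

    # Illustrative only: data-parallel sharding of a matmul with jax.sharding.
    import jax
    import jax.numpy as jnp
    from jax.experimental import mesh_utils
    from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

    n_devices = len(jax.devices())
    mesh = Mesh(mesh_utils.create_device_mesh((n_devices,)), axis_names=("data",))

    # Shard the batch dimension of the activations across the "data" axis;
    # replicate the weights on every device.
    x = jax.device_put(jnp.ones((8 * n_devices, 512)), NamedSharding(mesh, P("data", None)))
    w = jax.device_put(jnp.ones((512, 512)), NamedSharding(mesh, P(None, None)))

    @jax.jit
    def forward(x, w):
        # jit propagates the input shardings, so the matmul runs data-parallel.
        return jnp.dot(x, w)

    y = forward(x, w)
    print(y.sharding)  # NamedSharding partitioned over the "data" axis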
Benefits:
The position offers a competitive salary and a comprehensive benefits package.
There are opportunities for professional growth within Nebius.
Hybrid working arrangements are available.
The work environment is dynamic and collaborative, valuing initiative and innovation.