Our client is a publicly traded company leading the AI revolution with an AI-centric cloud platform.
The company provides advanced infrastructure, including large-scale GPU clusters, cloud platforms, tools, and services for developers.
The mission is to democratize access to AI infrastructure and empower organizations to create, optimize, and deploy AI solutions at any scale.
We are seeking a Senior AI/ML Specialist Solutions Architect to design and implement scalable AI solutions for AI-focused customers.
The role involves working with state-of-the-art technologies and contributing to one of the most powerful commercially available supercomputers.
Responsibilities:
Architect and optimize distributed training and inference systems.
Design customer-focused solutions.
Lead the transition of ML pipelines.
Build long-term customer relationships.
Create whitepapers.
Provide technical leadership.
Collaborate with engineering and product teams.
Requirements:
Candidates must have 5+ years of experience with cloud technologies and infrastructure, ideally in senior MLOps or Solutions Architect roles.
Proven expertise in scaling and optimizing AI workloads across multi-node and multi-GPU environments is required.
Demonstrated success in delivering ML products from POC to production is essential.
Deep knowledge of ML frameworks like PyTorch and JAX is necessary.
A strong background in the NVIDIA HPC ecosystem (CUDA, NCCL, InfiniBand) is required.
Exceptional communication skills to engage both technical teams and business stakeholders are a must.
Legal authorization to work in the United States on a full-time basis without sponsorship is required.
Benefits:
The position offers competitive compensation ranging from $180,000 to $300,000 per year, negotiable based on experience and location.
Health benefits include 100% company-paid medical, dental, and vision coverage for employees and their families.
A 401(k) plan with a 4% match is provided.
Employees can participate in a stock options plan.
The company offers a flexible remote work environment.
Company-paid short-term disability, long-term disability, and life insurance coverage is included.
Paid parental leave is available, with 20 weeks for primary caregivers and 12 weeks for secondary caregivers.
Up to $85/month is provided for mobile and internet expenses.
Employees will work with state-of-the-art AI and cloud technologies, including the latest NVIDIA GPUs.
Employees will be part of a team operating one of the most powerful commercially available supercomputers.
Employees will contribute to sustainable AI infrastructure, with energy-efficient data centers that recover waste heat to warm nearby residential buildings.