Serve Robotics is seeking a highly skilled ML Performance Engineer to join its robotics team.
This role bridges the gap between ML research and real-time deployment, enabling advanced ML models to run efficiently on edge hardware such as NVIDIA Jetson platforms.
The engineer will work closely with ML researchers, embedded systems engineers, and robotics software teams to ensure optimal performance of state-of-the-art models on robotic platforms.
Responsibilities include owning the full lifecycle of ML model deployment, converting and optimizing trained models for Jetson platforms, developing CUDA kernels for low-latency inference, and profiling existing ML workloads.
The engineer will also identify and remove compute and memory bottlenecks, design strategies for model compression, manage memory layout and concurrency, build benchmarking pipelines, and collaborate with QA teams to validate model behavior.
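The benchmarking work described above can be illustrated with a minimal latency-measurement helper. This is a hedged sketch in stdlib Python only; the function name `benchmark_latency` and its parameters are illustrative, and in a real deployment the `infer` callable would wrap inference on a TensorRT engine or similar runtime rather than the stand-in workload shown here.

```python
# Minimal latency benchmarking sketch (illustrative helper, stdlib only).
# In a real pipeline, `infer` would execute a model on the target hardware;
# here it is any no-argument callable.
import statistics
import time

def benchmark_latency(infer, iters=100, warmup=10):
    """Time `infer` over `iters` runs after `warmup` untimed runs.

    Returns latency statistics in milliseconds.
    """
    for _ in range(warmup):  # untimed runs to warm caches and clocks
        infer()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1e3)  # ms
    cuts = statistics.quantiles(samples, n=100)  # percentiles 1..99
    return {
        "mean_ms": statistics.fmean(samples),
        "p50_ms": cuts[49],
        "p99_ms": cuts[98],
    }

# Example: benchmark a stand-in CPU workload.
stats = benchmark_latency(lambda: sum(range(10_000)), iters=50)
```

Reporting tail latency (p99) alongside the mean matters on edge hardware, where thermal throttling and contention can make worst-case inference times diverge sharply from the average.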
Requirements:
A Bachelor’s degree in Computer Science, Robotics, Electrical Engineering, or a related field is required.
Candidates must have 3+ years of experience in deploying ML models on embedded or edge platforms, preferably in robotics.
At least 2 years of experience with CUDA, TensorRT, and other NVIDIA acceleration tools is necessary.
Proficiency in Python and C++ is required, especially for performance-sensitive systems.
Experience with NVIDIA Jetson platforms and edge inference tools is essential.
Familiarity with model conversion workflows, such as PyTorch to ONNX to TensorRT, is required.
Benefits:
Serve Robotics offers a collaborative and respectful work environment with a diverse team of tech industry veterans.
Employees will have the opportunity to work on cutting-edge technology in robotics and machine learning.
The company is focused on solving real-world problems and improving the end-to-end user experience.
There are opportunities for professional growth and influence over model architectures for edge deployability.