Description:
The Senior Software Engineer, AI Inference position at Deepgram involves implementing and optimizing inference code for speech AI models.
Responsibilities include developing, testing, and deploying application code for massive-scale production services.
The role requires debugging complex system issues related to networking, scheduling, and high-performance computing interactions.
Building internal tooling for analysis and benchmarking to drive efficiency improvements is a key aspect of the job.
Experimenting with optimization techniques for machine learning workloads on NVIDIA GPUs and implementing successful strategies in production is essential.
Requirements:
Ability to work collaboratively in a fast-paced environment and adapt to changing priorities.
Proven industry experience in building and shipping production services.
Strong proficiency in a lower-level language such as C, C++, or Rust.
Experience in breaking down large projects into smaller experiments or incremental improvements.
Expertise in a machine learning framework such as Torch or TensorFlow.
Familiarity with GPU programming using CUDA or libraries such as cuDNN and cuBLAS.
Benefits:
Opportunity to work with a trailblazing research team on novel model architectures.
Ownership of features, from collaboration with researchers through testing in production.
In-depth involvement with profilers, hardware architectures, and inference algorithms.
Collaborative work environment within a humble team focused on mission-critical production services.
Professional growth and impact in the AI industry with cutting-edge technology.