Remote Machine Learning Engineer — Inference Optimization

Posted 2 months ago

Company: Featherless AI (posted on RemoteYeah)

Description:

  • Own and optimize inference performance at scale for large ML models.
  • Work at the intersection of research and production to create fast, reliable, and cost-efficient systems.

Requirements:

  • Strong experience in ML inference optimization or high-performance ML systems.
  • Solid understanding of deep learning internals (attention, memory layout, compute graphs).
  • Hands-on experience with PyTorch (or similar) and model deployment.
  • Familiarity with GPU performance tuning (CUDA, ROCm, Triton, or kernel-level optimizations).
  • Experience scaling inference for real users.
  • Comfortable with ownership and ambiguity in fast-moving startup environments.

Benefits:

  • Real ownership over performance-critical systems.
  • Direct impact on product reliability and unit economics.
  • Close collaboration with research, infra, and product teams.
  • Competitive compensation + meaningful equity at Series A.
  • A team that values engineering quality over hype.

Degree requirement

No degree required
