The company aims to embed empathetic, low-latency voice into applications to enhance human-computer interaction.
The vision is to create a world where AI voice companions can understand and reflect human emotions seamlessly at scale.
The culture is characterized by a self-driven, small-core team that prioritizes ownership, fast iteration, and minimal bureaucracy.
Key responsibilities include optimizing transformer-based voice inference for ultra-low latency, fine-tuning models for emotion understanding and synthesis, and profiling and reducing bottlenecks in streaming ML pipelines (a brief profiling sketch follows below).
The role involves designing and building SDKs for voice integration in consumer applications and collaborating with founders on architecture and customer feedback.
The engineer will own the end-to-end ML system, from model design to infrastructure deployment.
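For context on the latency and profiling work described above, here is a minimal sketch of how a single streaming inference step might be profiled with torch.profiler. The model, chunk shape, and step count are illustrative placeholders, not details from the posting; the actual voice stack is not specified.

```python
import torch
from torch.profiler import profile, record_function, ProfilerActivity

# Placeholder model standing in for a transformer-based voice decoder.
model = torch.nn.TransformerEncoder(
    torch.nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=6,
).eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# One "streaming" chunk: a short window of audio features (sizes are illustrative).
chunk = torch.randn(1, 40, 512, device=device)

with torch.inference_mode(), profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    record_shapes=True,
) as prof:
    for _ in range(20):  # simulate a run of consecutive streaming steps
        with record_function("decode_chunk"):
            model(chunk)

# Sort by GPU time to surface the operations dominating per-chunk latency.
print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))
```

A report like this is typically the starting point for reducing per-chunk latency, since it shows which operators dominate each streaming step.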
Requirements:
Candidates must have experience with PyTorch and CUDA.
Familiarity with vLLM, SGLang, and streaming technologies is required.
Proficiency in Docker and Kubernetes is necessary.
The position requires working closely with the founders on a small, hardcore team of four.
A strong background in machine learning infrastructure is essential, since the engineer will own the ML infrastructure from day one.
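As a point of reference for the vLLM requirement above, the following is a minimal sketch of serving a model with vLLM's offline API and timing a generation call. The model name, prompt, and sampling settings are placeholders; a production low-latency streaming path would typically use vLLM's async engine rather than this batch call.

```python
import time
from vllm import LLM, SamplingParams

# Placeholder model; the actual model used for voice is not specified in the posting.
llm = LLM(model="facebook/opt-125m")

params = SamplingParams(temperature=0.7, max_tokens=64)
prompts = ["Describe the weather in a warm, friendly tone."]

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

for out in outputs:
    print(f"{elapsed * 1000:.1f} ms total:", out.outputs[0].text)
```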
Benefits:
The position offers up to 2% equity along with competitive compensation.
Employees will have the opportunity to work on a high-impact consumer product that is reshaping voice interaction.
The role provides a chance to be part of a team experiencing massive early traction with top consumer AI pipelines.