Remote Machine Learning Engineer (Voice Cloning and Speech Synthesis)

Description:

  • Factored is seeking an experienced Machine Learning Engineer with expertise in text-to-speech (TTS) models and voice cloning technologies.
  • The role involves developing and optimizing ML models that improve the experience of voice actors generating content in multiple languages.
  • Responsibilities include designing, developing, and optimizing TTS models while maintaining the style and authenticity of original voice actors.
  • The engineer will implement real-time, scalable voice cloning systems with inference times under one second.
  • Collaboration with teams on audio datasets, including voice recordings and multilingual transcriptions, is essential.
  • The position requires experimentation with models like StyleDiffusion and exploring advanced approaches for realistic speech synthesis.
  • A key task is keeping performance reliable across millions of users by scaling systems for high-demand scenarios.
  • The engineer will handle audio data preparation, including splitting, upsampling/downsampling, and file management, using tools like Whisper.
  • Integration of models into a cloud environment (e.g., AWS) for deployment and monitoring is also part of the role.

Requirements:

  • Candidates must have strong proficiency in Python and experience with machine learning frameworks such as TensorFlow or PyTorch.
  • Proven expertise in speech synthesis models and TTS technologies, focusing on realistic, human-like outputs, is required.
  • Experience with voice cloning and familiarity with models like StyleDiffusion or similar is necessary.
  • The ability to deliver real-time solutions with high-performance reliability in production environments is essential.
  • Experience working with audio datasets, including data preprocessing, splitting, upsampling/downsampling, and file management, is required.
  • Familiarity with multilingual models and working with transcriptions in multiple languages is expected.
  • Proficiency in cloud platforms like AWS and experience deploying machine learning models in production environments are necessary.
  • Experience with Whisper or similar tools for handling audio datasets is required.
  • Knowledge of traditional ML techniques for model optimization, such as gradient boosting (e.g., XGBoost), is a plus.

Benefits:

  • Factored offers a transparent workplace where every employee has a voice in building the company.
  • The company is committed to investing in employees' career and professional growth in meaningful ways.
  • Employees work alongside passionate and intelligent colleagues in a collaborative environment.
  • The company values honesty, diligence, and kindness, creating a positive workplace culture.
  • Employees have opportunities for learning and growth based on merit, not just experience.
  • Factored promotes a fun and engaging work environment, with activities such as making music, playing sports, and hosting parties.