We are seeking a highly skilled and independent Data Scientist with 4-10 years of experience to join our team on a contractual basis (3-12 months).
The focus of this role will be on advancing our generative AI capabilities, particularly in content creation pipelines.
Responsibilities include developing and optimizing generative AI models for audio generation and lip-sync, ensuring high fidelity and natural delivery.
The candidate will extend current language models to support regional languages beyond US/UK English for audio and content generation.
Implementing features for emotional delivery in audio (e.g., shouting, crying) to enhance realism is also required.
The role involves integrating and synchronizing background scores seamlessly with generated video content.
The candidate will work towards achieving video quality standards comparable to Veo 3/Sora.
Fine-tuning Flux models on existing generative datasets is part of the responsibilities.
Ensuring consistency in scenes and character generation across multiple outputs is essential.
The candidate will design and implement an automated evaluation framework to replace manual review processes.
Developing robust filters for pre-screening content (cover images, video frames, audio) before human review is required.
The role demands driving initiatives independently, with high agency and accountability.
A research-first approach with rapid experimentation in the fast-evolving Generative AI space is expected.
Strong first-principles thinking to tackle complex challenges is necessary.
Requirements:
The candidate must have 4-10 years of experience in Data Science, with a strong focus on Generative AI.
Proven expertise in developing and deploying models for audio and video generation is required.
Demonstrated experience with natural language processing (NLP), especially for regional language adaptation, is essential.
Familiarity with state-of-the-art models in generative AI (e.g., Flux, diffusion models, GANs) is necessary.
Experience with model fine-tuning and optimization techniques is required.
Some experience in ML deployments and FastAPI is preferred.
Strong programming skills in Python and relevant deep learning frameworks (e.g., TensorFlow, PyTorch) are essential.
Experience in designing and implementing automated evaluation metrics for generative content is required.
A portfolio or demonstrable experience in projects related to content generation, lip-sync, or emotional AI is a plus.
The candidate must be able to work independently and must initially commit to working on-site in Bengaluru.
Exceptional problem-solving skills and a proactive approach to research and experimentation are necessary.
Benefits:
The position offers a best-in-class salary, as we hire only the best and pay accordingly.
Employees will have the opportunity to participate in Proximity Talks, where they can meet other designers, engineers, and product geeks, and learn from experts in the field.
The role provides a chance to work with a world-class team, be challenged constantly, and learn something new every day.