Remote Machine Learning Engineer (RAGs) at Factored

Description:

Factored is seeking a skilled Machine Learning Engineer with a focus on Retrieval-Augmented Generation (RAG) models to join its team.
The role involves designing, developing, and optimizing RAG models that integrate retrieval-based and generation-based approaches to solve complex problems for high-profile clients.
Responsibilities include improving RAG model performance through advanced algorithms and model fine-tuning.
The engineer will collaborate with client Data and Engineering teams to build robust machine learning infrastructure.
The position requires working closely with client leadership to identify AI/ML opportunities for transformative solutions.
The engineer will fine-tune large language models (LLMs) within the RAG framework for specific tasks and domains.
The role includes deploying RAG models into production environments and ensuring seamless integration.
The engineer will apply advanced machine learning techniques to develop AI solutions tailored to client needs.
The engineer must write clean, maintainable, and scalable code, ensuring all development is well-documented and testable.
The engineer will prioritize user experience and customer needs in all product development efforts.
The engineer will design and develop frameworks for GenAI products, such as search interfaces, chatbots, and summarization tools.
The role contributes to client growth and success by delivering innovative, AI-driven solutions and by providing technical leadership in identifying AI/ML opportunities.

A Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, or a related field is required.
Candidates must have 5+ years of hands-on experience developing and deploying machine learning models in production environments.
A minimum of 4 years of experience with production NLP and deep learning models using frameworks like PyTorch and TensorFlow is necessary.
At least 1 year of experience with Retrieval-Augmented Generation (RAG) and advanced techniques to optimize model performance is required.
Proven experience writing production-level code with strong proficiency in Python is essential.
Expertise in working with large language models (LLMs) such as GPT, Gemini, and Claude, along with proficiency in LLM frameworks like LangChain, is required.
A strong understanding of prompting techniques and the trade-offs between prompting and fine-tuning is necessary.
Experience with cloud platforms such as AWS or GCP (AWS preferred), or with equivalent on-premises platforms, is required.
Nice to have: experience with cloud data warehouses (e.g., Snowflake, BigQuery) and relational databases (e.g., PostgreSQL, MySQL).
Knowledge of building recommender systems is also a plus.

Factored offers a transparent workplace where everyone has a voice in building the company.
The company is committed to supporting career and professional growth in meaningful ways.
Employees are recognized for their intelligence and passion, with a focus on collaboration and kindness.
The work environment encourages learning and growth based on merit, not just resumes.
Factored invests in its employees, aiming to create a high-performing, fast-growing business that positively impacts the perception of technical talent in Latin America.
The company promotes a culture of fun and camaraderie, with opportunities to engage in activities like music, sports, games, and cooking together.