Remote Senior Data Scientist with LLM experience

at Fusemachines

Posted 9 hours ago 1 applied

Description:

  • Fusemachines is a leading provider of AI strategy, talent, and education services, founded by Dr. Sameer Maskey.
  • The company aims to democratize AI and has a presence in four countries: Nepal, the United States, Canada, and the Dominican Republic.
  • This is a remote role offered as a 6-month contract, with the potential for full-time employment directly with the client afterward.
  • As a Data Scientist, you will contribute to new product development in a collaborative, small-team environment.
  • Your responsibilities will include writing production code for both run-time and build-time applications.
  • You will design and implement data-driven solutions for complex business challenges by working with large-scale natural language datasets.
  • The role involves prototyping new ideas and collaborating with data scientists, product designers, data engineers, front-end developers, and domain experts.
  • You will work in a fast-paced, start-up-like culture while leveraging the resources and scale of an established company.

Requirements:

  • You must have practical experience with large language models (LLMs), prompt engineering, fine-tuning, RAG-based applications, and benchmarking, using frameworks like LangChain.
  • A strong background in natural language processing (NLP) is required, with experience using tools such as spaCy, word2vec, Flair, and BERT.
  • Formal training in machine learning is necessary, including knowledge of dimensionality reduction, clustering, embeddings, and sequence classification algorithms.
  • Proficiency in Python and experience with ML frameworks like PyTorch, TensorFlow, and Hugging Face Transformers are essential.
  • You should have experience with cloud platforms such as AWS, GCP, or Azure.
  • An understanding of data modeling principles and complex data architectures is required.
  • Experience working with relational databases, NoSQL databases, and vector stores (e.g., MySQL, Postgres, Solr, Elasticsearch, OpenSearch) is necessary.
  • Familiarity with distributed computing technologies such as Spark, Scala, or Ray is highly preferred.
  • Knowledge of API development, containerization (Docker, Kubernetes), and ML deployment is highly preferred.
  • Hands-on experience with MLOps/AIOps, including experiment tracking tools like Langfuse and DVC, is required.
  • A Master's degree in Data Science, Computer Science, Statistics, Machine Learning, or a related field is preferred.
  • You should have at least 5 years of relevant work experience.

Benefits:

  • The position is remote, giving you flexibility in your work environment.
  • You will have the chance to contribute to innovative projects in a collaborative team setting.
  • The role allows for professional growth and the potential for full-time employment after the initial contract period.
  • You will be part of a diverse and inclusive workplace that values contributions from all qualified individuals.