Remote Member of Technical Staff, Data Engineer

at Odyssey

Posted 1 week ago | 0 applied

Description:

  • Odyssey is seeking a data engineer to own its ML/data platform, which spans infrastructure, tooling, and data pipelines.
  • The role will enable researchers to efficiently work with multimodal data, conduct experiments, and move models to production.
  • The position offers significant autonomy in technical decisions and opportunities for growth into a technical leadership role.
  • Responsibilities include designing and implementing scalable data pipelines, collaborating with ML researchers, making architectural decisions, and improving Kubernetes-based infrastructure.
  • A typical week involves designing scalable data pipelines, optimizing data preprocessing, and improving data platform infrastructure.

Requirements:

  • Candidates must have 5+ years of software engineering experience, particularly in data platforms.
  • Strong Python development and system design expertise is required.
  • Deep experience with data pipeline development and ETL processes is essential.
  • Production Kubernetes experience and container orchestration expertise are necessary.
  • Hands-on experience with data-oriented ML infrastructure tools, such as experiment tracking and feature stores, is required.
  • Proficiency with cloud platforms like AWS, GCP, or Azure is needed.
  • Experience with data versioning and experiment tracking systems is important.
  • A solid understanding of ML workflows and researcher needs is required.

Benefits:

  • The position offers the opportunity to help build the data platform engineering team as the company scales.
  • Employees will have the chance to define the technical strategy for the data platform and infrastructure.
  • There are opportunities to establish partnerships with open-source data platform framework projects and vendors.
  • The role allows for shaping the technical hiring strategy and engaging with the broader data and ML infrastructure community.