Remote Member of Technical Staff (Data Engineer) at Reka

Description:

We are seeking highly skilled Data Engineers to join our team.
In this role, you will take ownership of the entire data lifecycle.
You will collaborate closely with researchers and engineers to ensure that data is reliable, accessible, and optimized for model training and evaluation.
Additionally, you will have the opportunity to explore innovative data augmentation techniques.
You will gain firsthand experience in how data is used to develop cutting-edge multimodal foundation models.

You should have experience in pre-processing datasets for AI training.
Proven data engineering experience, including a strong background in building and managing scalable data pipelines, is required.
You must have worked on one or more modalities other than text.
Proficiency in Python is essential.
You should keep up with state-of-the-art techniques for preparing language model training data.
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and data-related services (e.g., S3, BigQuery, Redshift) is necessary.
Familiarity with machine learning workflows, particularly in the context of model training and evaluation, is required.
Preferred qualifications include experience with containerization technologies (e.g., Docker) and orchestration tools (e.g., Kubernetes).

You will have the opportunity to work with a collaborative, mission-driven team on cutting-edge AI technology.
The work environment is open and inclusive.
Employees receive 4 weeks of paid leave.
Visa support is provided, including H-1B and OPT transfers for US employees.
Healthcare benefits, including vision and dental, are offered.