Description:
Take ownership of the entire data lifecycle, collaborating closely with researchers and engineers.
Ensure data is reliable, accessible, and optimized for model training and evaluation.
Explore innovative data augmentation techniques.
Gain firsthand experience in developing cutting-edge multimodal foundation models.
Requirements:
Proven experience in data engineering with a strong background in building and managing scalable data pipelines.
Proficiency in Python and experience working with big data technologies such as Hadoop or Spark.
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and data-related services (e.g., S3, BigQuery, Redshift).
Strong understanding of data storage, processing, and retrieval methods for structured and unstructured data.
Familiarity with machine learning workflows, especially in model training and evaluation.
Experience with containerization technologies (e.g., Docker) and orchestration tools (e.g., Kubernetes) is preferred.
Strong problem-solving skills and a proactive approach to implementing new technologies.
Benefits:
Work with a collaborative, mission-driven team on cutting-edge AI technology.
Open and inclusive work environment.
4 weeks paid leave.
Visa support (H-1B and OPT transfer for US employees).