Remote Sr. Data Engineer

at H1

Posted 2 weeks ago

Description:

  • H1 is seeking a Senior Data Engineer to design and scale the systems and pipelines behind its data platform.
  • The role involves developing and maintaining scalable data extraction frameworks for structured and unstructured data from diverse sources.
  • Responsibilities include building and optimizing ETL/ELT pipelines using big data technologies, particularly Apache Spark on cloud platforms like AWS EMR.
  • The engineer will improve the efficiency, reliability, and performance of data processing systems through thoughtful design and continuous optimization.
  • The position requires transforming, cleaning, and normalizing complex datasets to ensure high standards of data quality and consistency.
  • The Senior Data Engineer will partner with senior engineers to evolve H1’s data architecture and infrastructure to support product and platform scalability.
  • Leading data integration efforts across multiple systems, ensuring data accuracy, and collaborating seamlessly across teams are essential.
  • The role includes monitoring and troubleshooting data flows and pipelines, proactively identifying and resolving performance issues.
  • Clear documentation of systems, workflows, and processes is necessary to promote transparency and operational excellence.
  • Participating in code reviews and promoting a culture of engineering excellence, mentorship, and continuous improvement are expected.
  • Collaboration with cross-functional teams to align technical execution with business goals is a key aspect of the role.

Requirements:

  • Candidates must have 6+ years of experience in data engineering, specifically with large-scale data systems and pipelines.
  • Proficiency in programming languages such as Python, Java, or similar is required.
  • Strong SQL skills are necessary, including the ability to write optimized, complex queries over large datasets using advanced SQL constructs such as GROUP BY, HAVING, window functions, and complex joins.
  • Experience with big data tools like Apache Spark, particularly on cloud platforms, is essential, with a preference for AWS EMR.
  • Familiarity with Docker or other containerization technologies is required.
  • An understanding of Large Language Models (LLMs) and their applications is preferred.
  • Familiarity with model training and fine-tuning, especially in NLP contexts, is a bonus.
  • Basic knowledge of networking and security protocols such as HTTP, HTTPS, and TLS is necessary.
  • Strong analytical and problem-solving skills with a focus on data quality and performance optimization are required.
  • Candidates should have a passion for writing clean, efficient code and following best practices.

Benefits:

  • H1 offers a full suite of health insurance options, along with generous paid time off.
  • Employees can enjoy pre-planned company-wide wellness holidays.
  • Retirement options are available for employees.
  • Health and charitable donation stipends are provided.
  • Employees can participate in impactful Business Resource Groups.
  • Flexible work hours and the opportunity to work from anywhere are offered.
  • The position provides the chance to work with leading biotech and life sciences companies in an innovative industry focused on improving global healthcare.