Please let H1 know you found this job on RemoteYeah. This helps us grow 🌱.
Description:
H1 is seeking a Senior Data Engineer to design and scale systems and pipelines for their data platform.
The role involves developing and maintaining scalable data extraction frameworks for structured and unstructured data from diverse sources.
Responsibilities include building and optimizing ETL/ELT pipelines using big data technologies, particularly Apache Spark on cloud platforms like AWS EMR.
The engineer will improve the efficiency, reliability, and performance of data processing systems through thoughtful design and continuous optimization.
The position requires transforming, cleaning, and normalizing complex datasets to ensure high standards of data quality and consistency.
The Senior Data Engineer will partner with senior engineers to evolve H1’s data architecture and infrastructure to support product and platform scalability.
Leading data integration efforts across multiple systems, ensuring data accuracy, and collaborating seamlessly across teams are essential.
The role includes monitoring and troubleshooting data flows and pipelines, proactively identifying and resolving performance issues.
Clear documentation of systems, workflows, and processes is necessary to promote transparency and operational excellence.
The engineer is expected to participate in code reviews and promote a culture of engineering excellence, mentorship, and continuous improvement.
Collaboration with cross-functional teams to align technical execution with business goals is a key aspect of the role.
Requirements:
Candidates must have 6+ years of experience in data engineering, specifically with large-scale data systems and pipelines.
Proficiency in programming languages such as Python, Java, or similar is required.
Strong SQL skills are necessary, including the ability to write optimized, complex queries for large datasets using features such as GROUP BY and HAVING clauses, window functions, and complex joins.
Experience with big data tools like Apache Spark, particularly on cloud platforms, is essential, with a preference for AWS EMR.
Familiarity with Docker or other containerization technologies is required.
An understanding of Large Language Models (LLMs) and their applications is preferred.
Familiarity with model training and fine-tuning, especially in NLP contexts, is a bonus.
Basic knowledge of networking, security, and encryption protocols such as HTTP, HTTPS, and TLS is necessary.
Strong analytical and problem-solving skills with a focus on data quality and performance optimization are required.
Candidates should have a passion for writing clean, efficient code and following best practices.
Benefits:
H1 offers a full suite of health insurance options, along with generous paid time off.
Employees can enjoy pre-planned company-wide wellness holidays.
Retirement options are available for employees.
Health and charitable donation stipends are provided.
Employees can participate in impactful Business Resource Groups.
Flexible work hours and the opportunity to work from anywhere are offered.
The position provides the chance to work with leading biotech and life sciences companies in an innovative industry focused on improving global healthcare.