Description:
The Data Engineer plays a critical role in designing, implementing, and maintaining the data infrastructure that drives business intelligence, analytics, and data science initiatives.
This position requires expertise in Databricks, SQL, Python, Spark, and other Big Data tools, with a strong emphasis on ELT/ETL processes.
Responsibilities include designing, developing, and maintaining scalable ETL/ELT pipelines using Databricks and other big data technologies.
The engineer will optimize data workflows to handle large volumes of data efficiently, and will build and manage data warehouses and data lakes to store structured and unstructured data.
The role involves utilizing SQL, Python, and Spark for data extraction, transformation, and loading processes.
The Data Engineer will work closely with data analysts and data scientists to understand their data needs and ensure the availability of clean, reliable data.
Integrating data from various sources and implementing data quality checks to ensure data accuracy, completeness, and consistency are key tasks.
The engineer will develop and enforce data governance policies and procedures to maintain high data quality standards.
Developing and supporting BI tools and dashboards that provide business insights and support data-driven decision-making is also part of the role.
Continuous monitoring and improvement of data pipeline performance, addressing bottlenecks, and optimizing resources are essential responsibilities.
Documentation of data processes, workflows, and architecture for future reference and knowledge sharing is required.
Ensuring compliance with data security and privacy regulations, such as GDPR and HIPAA, is crucial.
Requirements:
A Bachelor's degree in Computer Science, Software Engineering, Mathematics, or a related technical field is preferred.
Candidates should have 4-6 years of technical experience with a strong focus on Big Data technologies in areas such as software engineering, integrations, data warehousing, data analysis, or business intelligence, preferably at a technology or biotech/pharma company.
Proficiency in Databricks for data engineering tasks is required.
Advanced knowledge of SQL for complex queries, data manipulation, and performance tuning is necessary.
Strong programming skills in Python for scripting and automation are essential.
Experience with Big Data tools (e.g., Spark, Hadoop) and data processing frameworks is required.
Familiarity with BI tools (e.g., Tableau, Power BI) and experience in developing dashboards and reports are necessary.
Experience with cloud platforms and tools like Azure ADF or Databricks is preferred.
Familiarity with data modeling and data architecture design is required.
An understanding of machine learning concepts and their application in data engineering is beneficial.
Candidates should possess keen attention to detail, excellent organizational skills, and the ability to multi-task.
Strong interpersonal skills with the ability to work effectively with a wide variety of professionals are essential.
Benefits:
All job offers will be based on a candidate’s location, skills, prior relevant experience, applicable degrees/certifications, as well as internal equity and market data.
Regular, full-time or part-time employees working 30 or more hours per week are eligible for comprehensive benefits including Medical, Dental, Vision, Life, STD/LTD, and 401(K).
Employees will receive paid time off (PTO) or flexible time off (FTO), and a company bonus where applicable.
Endpoint Clinical is an equal opportunity employer (AA/M/F/Veteran/Disability).
Pursuant to the San Francisco Fair Chance Ordinance, qualified applicants with arrest and conviction records will be considered for employment.