Remote Data Engineer

at People Data Labs

Description:

  • People Data Labs (PDL) is a provider of people and company data, focusing on data collection and standardization to help customers build and scale data solutions.
  • The company seeks individuals who can balance extreme ownership with a collaborative mindset, as the Data Engineering Team is crucial to its operations.
  • The role involves building infrastructure for data ingestion, transformation, and loading using technologies like Spark, SQL, AWS, and Databricks.
  • Responsibilities include creating an entity resolution framework, developing CI/CD pipelines, and solving undefined data engineering problems.

Requirements:

  • Candidates should have 4-6+ years of industry experience with examples of strategic technical problem-solving and implementation.
  • Strong software development fundamentals and experience with Python are required.
  • Expertise in Apache Spark (Java, Scala, and/or Python-based) and experience with SQL are necessary.
  • Applicants must have experience building scalable data processing systems and using data pipeline orchestration tools like Airflow.
  • Knowledge of modern data design and storage patterns, experience in Databricks, and familiarity with cloud computing services (preferably AWS) are essential.
  • Understanding of data warehousing and modern data storage formats is also required.
  • Candidates should demonstrate the ability to balance ownership and autonomy while collaborating effectively, manage remote work proactively, and communicate well in writing.

Benefits:

  • Employees receive stock options and competitive salaries.
  • The company offers unlimited paid time off and comprehensive medical, dental, and vision insurance.
  • Health, fitness, and office stipends are provided.
  • Employees can work remotely and flexibly on a permanent basis.