People Data Labs (PDL) is a provider of people and company data, focusing on data collection and standardization to help customers build and scale data solutions.
The Data Engineering Team is crucial to the company's operations, so PDL seeks individuals who can balance extreme ownership with a collaborative mindset.
The role involves building infrastructure for data ingestion, transformation, and loading using technologies like Spark, SQL, AWS, and Databricks.
Responsibilities include creating an entity resolution framework, developing CI/CD pipelines, and solving undefined data engineering problems.
Requirements:
Candidates should have 4-6+ years of industry experience with examples of strategic technical problem-solving and implementation.
Strong software development fundamentals and experience with Python are required.
Expertise in Apache Spark (Java, Scala, and/or Python-based) and experience with SQL are necessary.
Applicants must have experience building scalable data processing systems and using data pipeline orchestration tools like Airflow.
Knowledge of modern data design and storage patterns, experience in Databricks, and familiarity with cloud computing services (preferably AWS) are essential.
Understanding of data warehousing and modern data storage formats is also required.
Candidates should demonstrate the ability to balance ownership and autonomy while collaborating effectively, manage remote work proactively, and communicate well in writing.
Benefits:
Employees receive stock options and competitive salaries.
The company offers unlimited paid time off and comprehensive medical, dental, and vision insurance.
Health, fitness, and office stipends are provided.
Employees can work remotely and flexibly on a permanent basis.