Description:
Evooq is seeking a talented Data Engineer to join its team and help build its data platform using technologies such as Dagster, Starburst, and S3.
This is a fully remote position aligned with Singapore hours; candidates must be based in a similar timezone.
Responsibilities include designing, developing, and maintaining ELT pipelines for data ingestion, processing, and storage.
The preferred tools are Dagster for orchestration, Starburst Galaxy as the query engine, dbt for transformations, and Amazon S3 with Apache Iceberg for an open data lakehouse (a brief illustrative sketch of this wiring follows the description).
The role involves implementing and managing a data platform on a modern data stack, centralizing data from diverse sources, collaborating with data science and analytics teams, ensuring data quality, and optimizing the performance of data management systems.
It also includes automating data processes and setting up CI/CD for data workflows.
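For a flavor of the stack, here is a minimal, hypothetical sketch of how an ELT step might be expressed as Dagster software-defined assets. The names and data (raw_orders, stg_orders) are invented for illustration and are not Evooq's actual pipeline; in practice the transform step would typically be a dbt model run against Starburst Galaxy, with data landing in S3 as Iceberg tables.

from dagster import asset, materialize

@asset
def raw_orders():
    # Extract/load step: a real pipeline would land source data in S3;
    # here we return a tiny in-memory sample instead.
    return [{"order_id": 1, "amount": 120.0}, {"order_id": 2, "amount": 80.0}]

@asset
def stg_orders(raw_orders):
    # Transform step: in the stack described above this logic would live
    # in a dbt model executed on Starburst Galaxy, not in Python.
    return [o for o in raw_orders if o["amount"] > 100.0]

if __name__ == "__main__":
    # Materialize both assets locally as a quick smoke test.
    materialize([raw_orders, stg_orders])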
Requirements:
Hands-on experience in developing data pipelines using a modern data stack and data lakehouse.
Proficiency in SQL and Python is required.
Knowledge of open table formats such as Apache Iceberg, file formats such as Parquet, and both relational and non-relational database management systems (a short Parquet example appears after this list).
Experience with dbt and cloud-based infrastructure, particularly AWS, is necessary.
Familiarity with object storage systems like S3 and understanding of data governance principles are essential.
Nice to have: experience with data processing technologies such as Apache Spark and Hadoop, orchestration tools such as Airflow, Trino-based query engines, centralizing data into a unified data lakehouse, data visualization tools, and DevOps practices.
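As a quick illustration of the file-format side of these requirements, here is a minimal sketch of writing and reading a Parquet file with pyarrow. The library choice and the sample data are assumptions made for this example; in an Iceberg lakehouse such files would be registered in a table's metadata through a catalog rather than written ad hoc.

import pyarrow as pa
import pyarrow.parquet as pq

# Build a small in-memory table (columnar, like the lakehouse data itself).
table = pa.table({"order_id": [1, 2], "amount": [120.0, 80.0]})

# Parquet is a compressed, columnar on-disk format.
pq.write_table(table, "orders.parquet")

# Read it back and show the contents as plain Python structures.
print(pq.read_table("orders.parquet").to_pydict())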
Benefits:
This position offers fully remote work, following Singapore hours.
You will be part of a team building a cutting-edge data platform with advanced technologies, collaborating closely with data science and analytics teams.