Remote Data Engineer

This job is closed

This job post is closed and the position has likely been filled. Please do not apply. The posting was closed automatically after the apply link was detected as broken.

Description:

  • The Data Engineer will report to the Senior Manager of AI & Data Platform and will be responsible for building tools and infrastructure to support the Data Products and Insights & Innovation teams.
  • This role requires a talented, curious self-starter who is driven to solve complex problems and can manage multiple domains and stakeholders.
  • The Data Engineer will collaborate with all levels of the Data and AI team and various engineering teams to develop data solutions, scale data infrastructure, and advance the organization towards a data-centric model.
  • Key responsibilities include designing, building, and deploying components of a modern data stack, including CDC ingestion using Debezium, a centralized Hudi data lake, and various data pipelines (an illustrative sketch follows this list).
  • The role involves maintaining legacy Python ELT scripts while transitioning to dbt models in Redshift, ensuring operational stability while enabling innovation.
  • Collaboration within a cross-functional team is essential for planning and rolling out data infrastructure and processing pipelines that support analytics, machine learning, and GenAI services.
  • The Data Engineer must thrive in ambiguous conditions, independently identifying opportunities to optimize pipelines and improve data workflows under tight deadlines.
  • Responsibilities also include responding to PagerDuty alerts, implementing monitoring solutions, and ensuring high availability and reliability of data systems.
  • Strong communication skills are necessary to communicate with technical and non-technical audiences and to help internal teams surface actionable insights that enhance customer satisfaction.
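
To give a flavour of the CDC-to-lake work described above, here is a minimal sketch of writing change records into a Hudi table on S3 with PySpark. It assumes the Hudi bundle is on the Spark classpath and that the Debezium change events have already been read and flattened into a DataFrame; the table name, key fields, and S3 paths are hypothetical placeholders, not details from the posting.

    # Minimal sketch: upserting CDC records into a Hudi table on S3.
    # Assumes PySpark with the Hudi bundle available; paths and fields are hypothetical.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("cdc-to-hudi")
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
    )

    # In practice this DataFrame would come from the Debezium topic (e.g. via
    # Structured Streaming against MSK); shown here as a batch read for brevity.
    cdc_df = spark.read.json("s3://example-staging/cdc/orders/")  # hypothetical input

    hudi_options = {
        "hoodie.table.name": "orders",                             # hypothetical table
        "hoodie.datasource.write.recordkey.field": "order_id",     # primary key of the source table
        "hoodie.datasource.write.precombine.field": "updated_at",  # latest change wins on upsert
        "hoodie.datasource.write.partitionpath.field": "order_date",
        "hoodie.datasource.write.operation": "upsert",
    }

    (
        cdc_df.write
        .format("hudi")
        .options(**hudi_options)
        .mode("append")
        .save("s3://example-data-lake/orders/")  # hypothetical lake location
    )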

Requirements:

  • Candidates must have 3+ years of experience in building data pipelines and managing a secure, modern data stack, including CDC streaming ingestion using tools like Debezium into a Hudi data lake.
  • At least 3 years of experience with AWS cloud infrastructure is required, including Kafka (MSK), Spark/AWS Glue, and infrastructure as code (IaC) using Terraform.
  • Strong coding skills in Python, SQL, and dbt are necessary, with the ability to write and review high-quality, maintainable code.
  • Prior experience in building data lakes on S3 using Apache Hudi with various file formats such as Parquet, Avro, JSON, and CSV is essential.
  • Candidates should have experience building and managing multi-stage workflows using serverless Lambdas and AWS Step Functions (see the sketch after this list).
  • Familiarity with data governance practices, including data quality, lineage, and privacy, is required, along with experience using cataloging tools.
  • Experience developing and deploying data pipeline solutions using CI/CD best practices is necessary.
  • Working knowledge of data integration tools such as Stitch and Segment CDP is required.
  • Practical knowledge and experience with analytical and machine learning tools like Athena, Redshift, or SageMaker Feature Store is a bonus.
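
As a rough illustration of the Lambda/Step Functions requirement above, the following sketch kicks off a multi-stage workflow from Python with boto3. It assumes credentials and region are already configured and that a state machine chaining several Lambdas exists; the ARN, execution name, and input payload are hypothetical placeholders.

    # Minimal sketch: starting an existing Step Functions workflow from Python.
    # Assumes boto3 credentials/region are configured; ARN and payload are hypothetical.
    import json
    import boto3

    sfn = boto3.client("stepfunctions")

    response = sfn.start_execution(
        stateMachineArn="arn:aws:states:us-east-1:123456789012:stateMachine:nightly-elt",  # hypothetical
        name="nightly-elt-2024-01-01",  # execution names must be unique per state machine
        input=json.dumps({"run_date": "2024-01-01", "full_refresh": False}),
    )

    print(response["executionArn"])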

Benefits:

  • Employees have the flexibility to work from the office in downtown Toronto or remotely, allowing them to choose their preferred work environment.
  • The company offers diverse learning experiences, educational allowances, mentorship, and support for personal growth.
  • The company invests strongly in health and wellness, addressing body, mind, and soul.
  • Fair compensation and various office perks are provided, along with the expected benefits of a growing tech company.
  • Wave promotes a diverse and inclusive culture, valuing individuality and encouraging open feedback, fostering an environment where innovation flourishes.
  • The company has been recognized as one of Canada's Top Ten Most Admired Corporate Cultures and one of Canada’s Great Places to Work in various categories.