The Cloud Data Engineer is responsible for optimizing data architecture and pipelines and for ensuring efficient data flow and collection for cross-functional teams.
This role involves engaging throughout the entire project lifecycle, from data mapping and modeling to consumption.
The Data Engineer must be self-directed and capable of supporting the data needs of Project Kitty Hawk (PKH) and the AWS Data Lake by working on both small- and moderate-scope projects.
The position reports to the VP of Information Technology and is exempt from the overtime provisions of the Fair Labor Standards Act (FLSA).
Key responsibilities include implementing and managing infrastructure as code with AWS CDK, ensuring scalable and robust Data Lake solutions, and using AWS Lake Formation, the AWS Glue Data Catalog, and Amazon S3 to implement the Data Lake.
The role also involves implementing CI/CD pipelines using services such as GitHub, GitHub Actions, and AWS CodeDeploy, designing and managing pipelines to load data into the Data Lake from multiple sources, and identifying improvements to data engineering practices.
The Data Engineer will manage and optimize Data Lake storage on AWS for efficient data access and performance, collaborate with architects to maintain compliance with enterprise security and access controls, and automate testing and deployment processes for data jobs.
Additional responsibilities include developing BI datasets, reports, and dashboards, ensuring compliance with change control procedures, documenting all phases of work, and collaborating with business analysts to improve reporting and analysis processes.
Requirements:
A Bachelor’s degree in computer science or a related technical field; equivalent experience may be considered.
A minimum of 5 years of experience in data engineering, focusing on data modeling, Data Lakes, and dashboard development using tools such as QuickSight, Power BI, or Tableau.
2-5 years of previous analytics and/or quantitative work experience in a higher education setting.
3-5 years of experience as a data engineer within an AWS cloud environment.
3-5 years of experience implementing and using CI/CD pipelines with GitHub, GitHub Actions, and AWS CodeDeploy.
3-5 years of experience working with infrastructure-as-code (IaC) using AWS CloudFormation with AWS CDK.
Creative problem solver with a practical, solution-oriented approach and a passion for technology and software development.
Experience developing custom scripts, complex SQL queries, and DataFrames with PySpark and with Python and pandas.
Excellent oral and written communication skills, with the ability to present complex information effectively to a wide range of audiences.
Ability to prioritize multiple projects and assignments, work independently as well as in a team environment, and deal effectively with complex assignments and projects.
Proficiency in Python and TypeScript.
Experience in Agile environments and familiarity with business intelligence tools (e.g., Tableau, Power BI, QuickSight).
Proven ability to innovate and implement solutions to complex data challenges.
Benefits:
Project Kitty Hawk offers a comprehensive benefits package that includes full medical, dental, and vision coverage.
Employees receive a 401(k) with employer match and a generous time-off policy that includes paid volunteer time.
The organization is committed to providing competitive salaries with the potential for performance bonuses.
Remote working flexibility is available.
Project Kitty Hawk is an Equal Opportunity Employer and is dedicated to non-discrimination in all employment practices.