Description:
Build and manage a data lake in AWS, enhancing the existing AWS Lake Formation-based architecture.
Develop and maintain data pipelines using PySpark and AWS Glue, ingesting from sources such as streaming datasets, APIs, and data stores.
Create datasets from the data lake to support use cases such as business analytics, dashboards, reports, and machine learning.
Make technical decisions on how to serve data consumers efficiently.
Work with existing AWS architectures and build new ones using the AWS CDK.
Automate data workloads in AWS, including pipeline execution and monitoring.
Collaborate with cross-functional teams to understand business needs and design appropriate data flows.
Requirements:
Bachelor’s degree in computer science, a related technical field, or equivalent practical experience.
Minimum 3 years of hands-on experience developing data solutions in a modern cloud environment.
Proficiency in Python.
Experience in creating and maintaining ETL jobs (experience with PySpark is a plus).
Familiarity with relational and non-relational data stores.
Knowledge of the AWS ecosystem, infrastructure-as-code practices, and the AWS CDK.
Ability to manage production data workloads effectively, including issue detection, diagnosis, and monitoring.
Passion for excellence and contributing to a sustainable world.
Benefits:
Full-time position.
Compensation based on experience.
Opportunity to join a company focused on creating a circular supply chain for electric vehicles and clean energy products.
Chance to work on critical growth projects with a significant impact on day-to-day operations and scalability.