We are seeking a highly skilled Lead Data Engineer with strong expertise in PySpark, SQL, Python, Azure Data Factory, Synapse, Databricks, and Fabric, as well as a solid understanding of end-to-end ETL and data warehousing principles.
The ideal candidate will have a proven track record of designing, building, and maintaining scalable data pipelines in a collaborative and fast-paced environment.
Key responsibilities:
Design and develop scalable data pipelines using PySpark to support analytics and reporting needs.
Write efficient SQL and Python code to transform, cleanse, and optimize large datasets.
Collaborate with machine learning engineers, product managers, and developers to understand data requirements and deliver solutions.
Implement and maintain robust ETL processes to integrate structured and semi-structured data from various sources.
Ensure data quality, integrity, and reliability across pipelines and systems.
Participate in code reviews, troubleshooting, and performance tuning.
Work independently and proactively to identify and resolve data-related issues.
Contribute to Azure-based data solutions and support cloud migration initiatives and DevOps practices.
Provide guidance on best practices and mentor junior team members as needed.
Requirements:
Candidates must have 8+ years of overall experience working with cross-functional teams, including machine learning engineers, developers, product managers, and analytics teams.
At least 3 years of hands-on experience developing and managing data pipelines using PySpark is required.
Candidates should have 3 to 5 years of experience with Azure-native services, including Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), Databricks, and Azure Synapse Analytics, Azure SQL DB, or Fabric.
Strong programming skills in Python and SQL are essential.
Solid experience delivering end-to-end ETL, data modeling, and data warehousing solutions is necessary.
Candidates must be self-driven, resourceful, and comfortable working in dynamic, fast-paced environments.
Advanced written and spoken English is a must-have for this position (B2, C1, or C2 only).
Strong communication skills are also required.
Nice-to-have qualifications include Databricks certification; knowledge of DevOps, CI/CD pipelines, and cloud migration best practices; familiarity with Event Hub, IoT Hub, Azure Stream Analytics, Azure Analysis Services, and Cosmos DB; a basic understanding of SAP HANA; and intermediate-level experience with Power BI.
Benefits:
The position is 100% remote for nearshore candidates located in Central or South America.
The contract type is independent contractor; it does not include PTO, tax deductions, or insurance, and the monthly payment is based on hours worked.
The initial contract/project duration is 6 months, with the possibility of extension based on performance.
Full-time working hours are Monday to Friday (8 hours per day, 40 hours per week), from 8:00 AM to 5:00 PM PST (U.S. time zone).
Contractors are required to use their own laptop/PC.
Payment methods include international bank transfer, PayPal, Wise, Payoneer, etc.
Joining the team offers the chance to be part of an innovative group shaping the future of technology, work in a collaborative and inclusive environment, and pursue professional development and career growth.