Remote Sr. Data Engineer (Azure Databricks)

Description:

  • This is a remote contract position responsible for designing, building, and maintaining the infrastructure required for data integration, storage, processing, and analytics (BI, visualization, and advanced analytics).
  • The role requires a skilled Senior Data Engineer with a strong background in Python, SQL, PySpark, Azure, Databricks, Synapse, Azure Data Lake, DevOps, and cloud-based, large-scale data applications.
  • The ideal candidate will develop in an Agile environment, contributing to the architecture, design, and implementation of data products in the aviation industry, including migration from Synapse to Azure Data Lake.
  • Responsibilities include hands-on coding, mentoring junior staff, and collaborating with multi-disciplined teams to achieve project objectives.
  • The position involves architecting, designing, developing, testing, and maintaining high-performance, large-scale, complex data architectures that support data integration, storage, processing, orchestration, and infrastructure.

Requirements:

  • A Bachelor's degree in Computer Science or a similar field, completed as a full-time program, is required.
  • At least 5 years of experience as a data engineer, with strong expertise in Databricks, DevOps, and Azure or other hyperscalers, is necessary.
  • A minimum of 5 years of experience with Azure DevOps and GitHub is required.
  • Proven experience delivering large-scale projects and products for Data and Analytics, including migrations, is essential.
  • Candidates must hold the following certifications: Databricks Certified Associate Developer for Apache Spark, Databricks Certified Data Engineer Associate, Microsoft Certified: Azure Fundamentals, and Microsoft Certified: Azure Data Engineer Associate. The Microsoft certification Designing and Implementing Microsoft DevOps Solutions is a nice-to-have.
  • Strong programming skills in Python (must-have) and Scala, along with proficiency in writing efficient, optimized code for data integration, migration, storage, processing, and manipulation, are required.
  • A strong understanding of and experience with SQL, including writing advanced SQL queries, are necessary.
  • Candidates should have a thorough understanding of big data principles, techniques, and best practices.
  • Strong experience with scalable and distributed Data Processing Technologies such as Spark/PySpark, DBT, and Kafka is required.
  • Solid Databricks development experience with significant Python, PySpark, Spark SQL, Pandas, and NumPy in an Azure environment is essential.
  • Experience in designing and implementing efficient ELT/ETL processes in Azure and Databricks is required.
  • Proficiency with relational databases and NoSQL databases is necessary.
  • A good understanding of Data Modeling and Database Design Principles is required.
  • Strong experience in designing and implementing data warehousing, data lake, and data lakehouse solutions in Azure and Databricks is necessary.
  • Knowledge of Delta Lake, Unity Catalog, Delta Sharing, and Delta Live Tables (DLT) is required.
  • Strong understanding of the software development lifecycle (SDLC), especially Agile methodologies, is necessary.
  • Knowledge of SDLC tools and technologies such as Azure DevOps and GitHub is required.
  • A strong understanding of DevOps principles, including CI/CD, infrastructure as code, and performance tuning, is necessary.
  • Knowledge of cloud computing, specifically Microsoft Azure services related to data and analytics, is required.
  • Experience in Orchestration using technologies like Databricks workflows and Apache Airflow is necessary.
  • Strong analytical skills to identify and address technical issues and performance bottlenecks are required.
  • Proficiency in debugging and troubleshooting issues in complex data and analytics environments is necessary.
  • A good understanding of Data Quality and Governance is required.
  • Experience with BI solutions, including Power BI, is a plus.
  • Strong written and verbal communication skills are necessary for collaboration with cross-functional teams.
  • Ability to document processes, procedures, and deployment configurations is required.
  • Understanding of security practices and the ability to implement security controls is necessary.
  • Self-motivation, along with the ability to work well in a team and mentor junior members, is required.
  • A willingness to stay updated with the latest services and trends in Data Engineering is necessary.
  • Comfort with picking up new technologies independently and working in a rapidly changing environment is required.

Benefits:

  • The position offers the opportunity to work remotely, providing flexibility in work location.
  • Candidates will have the chance to work on large-scale, complex data architectures in a leading AI strategy company.
  • The role includes opportunities for professional growth and development through mentoring and collaboration with experienced teams.
  • Employees will be part of an Agile team, promoting continuous improvement and innovation in data engineering practices.
  • The company values diversity and is an Equal Opportunity Employer, ensuring a supportive work environment for all employees.