
Lead Data Engineer – Remote Job

at Bertoni Solutions


Description:

  • We are seeking a highly skilled Lead Data Engineer with strong expertise in PySpark, SQL, and Python, as well as a solid understanding of ETL and data warehousing principles.
  • The ideal candidate will have a proven track record of designing, building, and maintaining scalable data pipelines in a collaborative and fast-paced environment.
  • Key responsibilities include designing and developing scalable data pipelines using PySpark to support analytics and reporting needs (see the brief sketch after this list).
  • The candidate will write efficient SQL and Python code to transform, cleanse, and optimize large datasets.
  • Collaboration with machine learning engineers, product managers, and developers to understand data requirements and deliver solutions is essential.
  • The role involves implementing and maintaining robust ETL processes to integrate structured and semi-structured data from various sources.
  • Ensuring data quality, integrity, and reliability across pipelines and systems is a critical responsibility.
  • Participation in code reviews, troubleshooting, and performance tuning is expected.
  • The candidate will work independently and proactively to identify and resolve data-related issues.
  • The role may also involve contributing to Azure-based data solutions, including Azure Data Factory (ADF), Azure Synapse Analytics, Azure Data Lake Storage (ADLS), and other services.
  • Supporting cloud migration initiatives and DevOps practices may also be part of the role.
  • Providing guidance on best practices and mentoring junior team members when needed is part of the job.
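
For illustration only, the kind of PySpark ETL pipeline described above might look roughly like the minimal sketch below. The storage paths, the sales_events dataset, and the column names are hypothetical assumptions for the example, not details taken from this posting.

    # Minimal, illustrative PySpark ETL sketch; paths, dataset, and columns are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: read semi-structured JSON events from a hypothetical landing zone.
    raw = spark.read.json("abfss://landing@example.dfs.core.windows.net/sales_events/")

    # Transform: cleanse nulls, normalize types, and deduplicate on the event key.
    clean = (
        raw.filter(F.col("event_id").isNotNull())
           .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["event_id"])
    )

    # SQL step: aggregate for reporting through a temporary view.
    clean.createOrReplaceTempView("sales_clean")
    daily = spark.sql("""
        SELECT event_date, customer_id, SUM(amount) AS total_amount
        FROM sales_clean
        GROUP BY event_date, customer_id
    """)

    # Load: write partitioned Parquet to a hypothetical curated zone.
    (daily.write
          .mode("overwrite")
          .partitionBy("event_date")
          .parquet("abfss://curated@example.dfs.core.windows.net/daily_sales/"))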

Requirements:

  • Candidates must have 8+ years of overall experience working with cross-functional teams, including machine learning engineers, developers, product managers, and analytics teams.
  • A minimum of 3+ years of hands-on experience developing and managing data pipelines using PySpark is required.
  • Strong programming skills in Python and SQL are essential.
  • A deep understanding of ETL processes and data warehousing fundamentals is necessary.
  • Candidates should be self-driven, resourceful, and comfortable working in dynamic, fast-paced environments.
  • Advanced written and spoken English fluency is required (CEFR level B2, C1, or C2).
  • Additional nice-to-have qualifications include Databricks certification and experience with Azure-native services such as Azure Data Lake Storage (ADLS), Azure Data Factory (ADF), and Azure Synapse Analytics.
  • Familiarity with Event Hub, IoT Hub, Azure Stream Analytics, Azure Analysis Services, and Cosmos DB is beneficial.
  • A basic understanding of SAP HANA and intermediate-level experience with Power BI is preferred.
  • Knowledge of DevOps, CI/CD pipelines, and cloud migration best practices is also advantageous.
  • Mandatory requirements: 3+ years of experience with PySpark/Python, ETL, and data warehousing processes; proven leadership experience; and location in Central or South America.

Benefits:

  • The position is 100% remote for nearshore candidates located in Central or South America.
  • The engagement is an independent contractor agreement; it does not include PTO, tax deductions, or insurance, and payment is made monthly based on hours worked.
  • The initial contract/project duration is 6 months, with the possibility of extension based on performance.
  • Full-time working hours are Monday to Friday, 8 hours per day, 40 hours per week, from 8:00 AM to 5:00 PM PST (U.S. time zone).
  • Contractors are required to use their own laptop/PC.
  • The expected start date is as soon as possible.
  • Payment methods include international bank transfer, PayPal, Wise, Payoneer, etc.
  • Joining the team offers the opportunity to be part of an innovative group shaping the future of technology, work in a collaborative and inclusive environment, and access opportunities for professional development and career growth.