Remote Senior Data Engineer

This job post is closed and the position is probably filled. Please do not apply. Automatically closed by a robot after the apply link was detected as broken.

Description:

  • We are currently seeking a Senior Data Engineer with 5-7 years of experience.
  • The ideal candidate will be able to work independently within an Agile working environment.
  • Experience working with cloud infrastructure and tools such as Apache Airflow, Databricks, dbt, and Snowflake is required.
  • Familiarity with real-time data processing and AI implementation, including generative AI, is highly advantageous.
  • Responsibilities include designing, building, and maintaining scalable and robust data pipelines to support analytics and machine learning models, ensuring high data quality and reliability for both batch and real-time use cases.
  • The candidate will design, maintain, and optimize data models and data structures in tools such as Snowflake and Databricks.
  • They will use Databricks and cloud-native solutions for big data processing, ensuring efficient management of Spark jobs and seamless integration with other data services.
  • The role involves utilizing PySpark and/or Ray to build and scale distributed computing tasks, enhancing the performance of machine learning model training and inference processes.
  • Monitoring, troubleshooting, and resolving issues within data pipelines and infrastructure is essential, as is implementing best practices for data engineering and continuous improvement.
  • The candidate will integrate generative AI capabilities into data pipelines and workflows to support advanced use cases such as data enrichment, automated content generation, and natural language processing.
  • Collaborating with machine learning engineers to optimize generative AI workflows and ensure seamless deployment and scalability in production environments is required.
  • Developing APIs and tools to enable internal teams to consume generative AI models and services efficiently is part of the role.
  • Staying informed about advancements in generative AI technologies and recommending their adoption to improve business processes and analytics capabilities is expected.
  • The candidate will document data engineering workflows and generative AI integrations using diagrams.
  • Collaborating with other Data Engineers, Product Owners, Software Developers, and Machine Learning Engineers to understand their needs and deliver new product features on time is crucial.

Requirements:

  • A minimum of 5 years of experience deploying enterprise-level scalable data engineering solutions is required.
  • Strong examples of data pipelines developed independently end to end, from problem formulation and raw data through implementation, optimization, and results, are necessary.
  • A proven track record of building and managing scalable cloud-based infrastructure on AWS (including S3, DynamoDB, EMR) is essential.
  • Experience implementing and managing AI model lifecycles in production, including generative AI models, is required.
  • Familiarity with tools such as the OpenAI API, Hugging Face Transformers, or equivalent platforms for generative AI is advantageous.
  • Strong experience using Apache Airflow (or equivalent), Snowflake, and Lucene-based search engines is necessary.
  • Advanced SQL and Python knowledge with associated coding experience is required.
  • Experience with Databricks (Delta format, Unity Catalog) is essential.
  • Strong experience with DevOps practices for continuous integration and continuous delivery (CI/CD) is necessary.
  • Experience wrangling structured and unstructured file formats (Parquet, CSV, JSON) is required.
  • Understanding and implementing best practices within ETL and ELT processes is essential.
  • Experience implementing data quality best practices with tools such as Great Expectations is necessary.
  • Real-time data processing experience using Apache Kafka (or equivalent) is advantageous.
  • Knowledge of generative AI model architectures and their integration into scalable systems is required.
  • A proven ability to work independently with minimal supervision is essential.
  • The candidate should take initiative and be action-focused.
  • Mentoring and sharing knowledge with junior team members is expected.
  • A strong ability to collaborate within cross-functional teams is necessary.
  • Excellent communication skills, including the ability to communicate with stakeholders across varying interest groups, are required.
  • Fluency in spoken and written English is essential.

Benefits:

  • The position offers the opportunity to work in a global, multidisciplinary research, analytics, and data consultancy.
  • Employees will be part of a team dedicated to building trusting relationships with people through data and intelligence.
  • The company promotes a diverse, inclusive, and authentic workplace.
  • Candidates are encouraged to apply even if their experience does not perfectly align with every qualification, as they may be the right fit for this or other roles.