Fusemachines is an AI company with more than 10 years of experience, focused on delivering advanced AI products and solutions across various industries.
The company is dedicated to democratizing AI and leveraging global AI talent from underserved communities.
This is a remote, full-time contract position in the Travel & Hospitality industry.
The role involves designing, building, testing, optimizing, and maintaining infrastructure and code for data integration, storage, processing, pipelines, and analytics.
Responsibilities include implementing data flow controls and ensuring high data quality and accessibility for analytics and business intelligence.
The ideal candidate should have a strong foundation in programming and effective data management across various storage systems and technologies.
The position requires someone who can ramp up quickly, contribute immediately, and work independently or alongside junior team members with minimal oversight.
The candidate should have a strong background in Python, SQL, PySpark, Redshift, and large-scale, cloud-based data solutions on AWS, with a focus on data quality, performance, and cost optimization.
The role is suited for individuals passionate about using data to drive insights and support organizational strategic goals through innovative data engineering solutions.
Requirements:
A full-time Bachelor's degree in Computer Science, Information Systems, Engineering, or a related field is required.
A minimum of 5 years of hands-on data engineering experience on AWS is necessary, with relevant certifications preferred.
Strong expertise in Python, SQL, PySpark, and AWS in an Agile environment is essential, along with a proven track record of building and optimizing data pipelines and architectures.
The candidate must be able to understand requirements and design end-to-end solutions with minimal oversight.
Strong programming skills in one or more languages such as Python or Scala are required, with proficiency in writing efficient and optimized code for data integration and processing.
Knowledge of SDLC tools and technologies, including project management software (Jira), source code management (GitHub), CI/CD systems, and binary repository managers is necessary.
A good understanding of data modeling and database design principles is required, including the ability to design efficient database schemas.
Strong SQL skills and experience with complex data sets, Enterprise Data Warehouse, and advanced SQL queries are essential.
Experience in data integration from various sources, including APIs and databases, is required.
The candidate should have strong experience in implementing data pipelines and efficient ELT/ETL processes in AWS.
Experience with scalable and distributed data technologies such as Spark/PySpark, dbt, and Kafka is necessary.
Familiarity with stream-processing systems is a plus.
Strong experience in designing and implementing Data Warehousing solutions in AWS with Redshift is required.
Experience in orchestration using Apache Airflow is necessary.
Deep knowledge of AWS services such as Lambda, Kinesis, S3, and EMR is essential.
A good understanding of Data Quality and Governance is required.
Familiarity with BI solutions, including Looker and LookML, is a plus.
Strong knowledge of DevOps principles and tools is necessary.
Good problem-solving skills and the ability to troubleshoot data processing pipelines are required.
Strong leadership, project management, and organizational skills are essential.
Excellent communication skills are necessary for collaborating with cross-functional teams.
The ability to document processes, procedures, and deployment configurations is required.
Benefits:
The position offers the opportunity to work remotely, providing flexibility in work arrangements.
Employees will be part of a diverse and inclusive work environment, committed to equal opportunity.
The role allows for professional growth and the chance to work with cutting-edge technologies in the field of data engineering.
Employees will have the opportunity to mentor and guide junior team members, enhancing leadership skills.
The company fosters a culture of continuous learning and improvement, encouraging employees to expand their skills in data engineering and cloud platforms.