Description:
Design, build, and maintain scalable data pipelines to transform raw data into analytics-ready datasets.
Ensure optimal performance, reliability, and efficiency of the data pipelines.
Integrate machine learning models into data pipelines to enhance analytics capabilities.
Collaborate with data scientists to deploy and monitor ML models in production.
Ensure the scalability and reliability of ML workflows and infrastructure.
Develop and optimize ML models for predictive analytics and data-driven decision-making.
Monitor the data infrastructure for performance bottlenecks and implement optimizations as necessary.
Collaborate with other engineering teams to ensure seamless data integration with high availability.
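As a rough illustration of the pipeline work described above (transforming raw data into an analytics-ready dataset and integrating a model into that flow), here is a minimal, self-contained Python sketch. The column names, the pandas/scikit-learn usage, and the toy model are illustrative assumptions only, not a description of Kiddom's actual stack.

```python
# Minimal sketch of a pipeline step that turns raw event rows into an
# analytics-ready table and attaches a model score. Column names, pandas,
# and scikit-learn are illustrative assumptions, not the actual stack.
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Raw, unaggregated events (stand-in for data landed in S3 or a warehouse).
raw_events = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 3],
    "event": ["login", "quiz", "login", "login", "quiz"],
    "duration_sec": [30, 120, 15, 45, 200],
})

# Transform: aggregate raw events into one analytics-ready row per user.
features = (
    raw_events.groupby("user_id")
    .agg(events=("event", "count"), total_time=("duration_sec", "sum"))
    .reset_index()
)

# Integrate a (toy) ML model into the pipeline to enrich the dataset.
labels = [0, 0, 1]  # illustrative training labels, one per user
model = LogisticRegression().fit(features[["events", "total_time"]], labels)
features["engagement_score"] = model.predict_proba(
    features[["events", "total_time"]]
)[:, 1]

print(features)
```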
Requirements:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
3+ years of experience as a data engineer, and 8+ years of software engineering experience (including data engineering).
Expertise in using Amazon SageMaker for building, training, and deploying machine learning models.
Knowledge of AWS Lambda for serverless execution of code, especially for model inference and lightweight processing tasks.
Familiarity with AWS Glue or similar ETL (Extract, Transform, Load) tools.
Familiarity with Snowflake, RDS, and Cassandra database services for structured data storage and querying.
Proficiency in using Amazon S3 for data storage and retrieval, especially for large datasets used in machine learning.
Knowledge of AWS EC2 for scalable computing resources and ECS for containerized application deployment, useful for training and deploying models.
Understanding of AWS Identity and Access Management (IAM) for managing permissions and security.
Familiarity with Amazon Kinesis for real-time data streaming and processing.
Skills in preprocessing and transforming raw data into a format suitable for machine learning using dbt.
Experience with CI/CD tools and practices for automating the deployment and monitoring of machine learning models.
Knowledge of AWS CloudWatch and AWS CloudTrail for monitoring model performance and logging events.
Proficiency in using AWS CloudFormation or Terraform to manage and provision AWS resources programmatically.
Strong programming skills in Python.
Proficiency in SQL for querying databases and manipulating structured data.
Understanding of security best practices in AWS, including data encryption and network security.
Knowledge of AWS cost management and optimization strategies to ensure efficient use of resources.
Experience in developing and deploying APIs for model inference and interaction with other systems using AWS API Gateway and AWS Lambda.
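For the last item above, here is a minimal sketch of a model-inference API: an AWS Lambda handler behind an API Gateway proxy integration that forwards features to a SageMaker endpoint. The endpoint name and payload shape are hypothetical placeholders, not part of the actual system.

```python
# Minimal sketch of a Lambda handler for model inference behind API Gateway.
# Assumes an API Gateway proxy integration (JSON request body) and a
# hypothetical SageMaker endpoint named "example-model-endpoint".
import json
import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime")

ENDPOINT_NAME = "example-model-endpoint"  # hypothetical endpoint name


def lambda_handler(event, context):
    # API Gateway proxy integration delivers the request body as a JSON string.
    payload = json.loads(event.get("body") or "{}")

    # Forward the features to the SageMaker endpoint for inference.
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    prediction = json.loads(response["Body"].read())

    # Return an API Gateway-compatible response.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": prediction}),
    }
```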
Benefits:
Competitive salary.
Meaningful equity.
Health benefits: medical (various PPO/HMO/HSA plans), dental, vision, disability, and life insurance.
10 paid sick days per year.
Unlimited vacation time policy (subject to internal approval); employees take an average of four weeks off per year.
Paid family leave for eligible employees.