Remote LLM Ops Engineer - Serverless & CI/CD (AWS)

at Expedite Commerce

Posted 1 day ago 0 applied

Description:

  • This role is for an LLM Ops Engineer focused on Serverless & CI/CD within AWS, specifically for Agentic AI systems that enhance enterprise SaaS.
  • The position involves engineering the AWS serverless infrastructure that supports conversational interfaces, dynamic UIs, and intelligent agents.
  • The engineer will build and maintain CI/CD infrastructure for Agentic AI solutions using Terraform on AWS.
  • Responsibilities include developing, deploying, and debugging intelligent agents and their associated tools to ensure scalable and cost-effective delivery in production environments.
  • The role requires designing, implementing, and maintaining CI/CD pipelines for Agentic AI applications using tools like AWS CodePipeline and CodeBuild.
  • The engineer will collaborate with ML/NLP engineers to develop modular AI agents and create debuggable agent architectures.
  • Monitoring and managing cost metrics associated with agentic operations is also a key responsibility.
  • The position emphasizes collaboration with product, backend, and AI teams to improve agentic infrastructure design and tool orchestration workflows.

Requirements:

  • A minimum of 2 years of experience in DevOps, MLOps, or Cloud Infrastructure with exposure to AI/ML systems is required.
  • Deep expertise in AWS serverless architecture is required, including hands-on experience with AWS Lambda, Amazon API Gateway, and Step Functions.
  • Strong proficiency in Terraform for building and managing serverless AWS environments is required.
  • Experience deploying and managing CI/CD pipelines for serverless applications using AWS CodePipeline, CodeBuild, or GitHub Actions is necessary.
  • Hands-on experience with agent and tool development in Python, including debugging and performance tuning, is required.
  • A solid understanding of IAM roles and policies, VPC configuration, and least-privilege access control is essential.
  • Knowledge of monitoring, alerting, and distributed tracing systems is required.
  • The ability to manage environment parity across development, staging, and production using automated infrastructure pipelines is necessary.
  • Excellent debugging, documentation, and cross-team communication skills are essential.

Benefits:

  • The position offers an equity participation program.
  • Health insurance, paid time off (PTO), and leave time are provided.
  • Ongoing paid professional training and certifications are available.
  • The role allows fully remote work.
  • A strong onboarding and training program is included to support new hires.