Remote Site Reliability Engineer

Please, let Baseten know you found this job on RemoteYeah. This helps us grow 🌱.

Description:

  • As a Site Reliability Engineer, you will envision and build robust systems and processes that ensure our infrastructure is scalable, reliable, and efficient.
  • Your responsibilities will include automating deployments, monitoring systems, optimizing performance, and managing incidents.
  • You will work closely with users to learn from their past struggles in operationalizing machine learning and help onboard them onto our platform.
  • You will build and maintain scalable infrastructure to support the deployment and operation of machine learning models.
  • Establishing standards and best practices for reliability and performance across the infrastructure will be part of your role.
  • You will automate processes, particularly for managing CI/CD pipelines.
  • You will own products and projects end-to-end, functioning as both an engineer and a project manager, focusing on user empathy, project specification, and execution.
  • Collaborating with cross-functional teams to understand project requirements and translating them into technical solutions will be essential.
  • Mentoring junior team members and contributing to knowledge sharing within the organization will be expected.
  • You will navigate ambiguity and exercise good judgment on tradeoffs and tools needed to solve problems, avoiding unnecessary complexity.
  • Demonstrating pride, ownership, and accountability for your work, while expecting the same from your teammates, is crucial.

Requirements:

  • A Bachelor's, Master's, or Ph.D. degree in Computer Science, Engineering, Mathematics, or a related field is required.
  • You must have 3+ years of professional work experience in a fast-paced, high-growth environment.
  • Extensive experience with Kubernetes is necessary.
  • You should have experience in building and maintaining scalable infrastructure.
  • Experience with infrastructure-as-code tools (e.g., Terraform, CloudFormation, Pulumi) and CI/CD tooling (e.g., GitHub Actions, GitLab CI, CircleCI, Jenkins) is required.
  • Relevant open-source observability experience (Prometheus, ELK stack, Grafana stack, OpenTelemetry) is a plus.
  • You must have the ability to own projects end-to-end, from project specification to execution.
  • No prior machine learning experience is required, but you should be open to learning about it.

Benefits:

  • You will receive a competitive compensation package that includes unlimited PTO, a 401(k) plan, and covered healthcare premiums.
  • This position offers a unique opportunity to be part of a rapidly growing startup in one of the most exciting engineering fields of our era.
  • You will be part of an inclusive and supportive work culture that fosters learning and growth.
  • Exposure to a variety of machine learning startups will provide you with unparalleled learning and networking opportunities.

Salary:

  • $150,000 - $250,000 USD / year
