Remote Senior Cluster Site Reliability Engineer

Posted 1 month ago

Description:

  • Voleon, a technology company that applies machine learning to finance, is seeking a Senior Cluster Site Reliability Engineer (SRE).
  • The role involves scaling research compute clusters, ensuring uptime and reliability, and supporting both on-prem and cloud infrastructure.

Requirements:

  • 5+ years of experience in SRE or DevOps roles, preferably as a senior engineer or tech lead.
  • Knowledge of HPC/batch compute frameworks and machine learning training systems.
  • Proficiency in scripting languages (Python, Ruby, etc.) and infrastructure-as-code tools (Terraform, Ansible).
  • Experience with cloud infrastructure (AWS or GCP) and modern observability stacks.
  • Familiarity with distributed storage technologies and a systematic engineering mindset.
  • Bachelor's degree in computer science.

Benefits:

  • Opportunity to work with cutting-edge technology in a leading asset management firm.
  • Potential for a referral bonus through the "Friends of Voleon" Candidate Referral Program.

Job type

Experience level

Required experience

5 years

Salary

$205,000–$235,000 / year

Degree requirement

Degree required

Location requirements
