Remote Site Reliability Engineer (The Reliability Guardian)

Please, let Unreal Gigs know you found this job on RemoteYeah. This helps us grow 🌱.

Description:

  • The Site Reliability Engineer (The Reliability Guardian) will be responsible for building and maintaining resilient systems that ensure high availability and performance.
  • This role involves automating processes, troubleshooting complex issues, and creating systems that scale smoothly.
  • The engineer will collaborate with developers, DevOps engineers, and IT specialists to enhance system reliability, implement automation, and support a seamless user experience.
  • Key responsibilities include designing strategies to enhance system reliability and performance, developing automation scripts, implementing monitoring solutions, leading incident response efforts, collaborating on system architecture, integrating security practices, and maintaining CI/CD pipelines.

Requirements:

  • Candidates must have strong experience in ensuring system reliability and performance in complex, distributed environments.
  • Proficiency in automating tasks using scripting languages such as Python, Bash, or PowerShell is required, along with experience with automation tools like Ansible, Chef, or Puppet.
  • Familiarity with monitoring tools such as Prometheus, Grafana, ELK Stack, or Datadog is essential, as well as skills in setting up monitoring dashboards, alerts, and automated incident responses.
  • Experience in maintaining and optimizing CI/CD pipelines using tools like Jenkins, GitLab CI/CD, or CircleCI is necessary.
  • Knowledge of integrating security standards and practices into site reliability processes is required.
  • A Bachelor’s or Master’s degree in Computer Science, IT, or a related field is required; equivalent experience in reliability engineering or systems engineering will be considered.
  • Candidates should have 5+ years of experience in site reliability engineering, DevOps, or a similar field, with hands-on experience in building and managing high-availability and distributed systems.
  • Familiarity with cloud platforms (AWS, GCP, Azure) and container orchestration tools such as Kubernetes is highly desirable.

Benefits:

  • The position offers comprehensive medical, dental, and vision insurance plans with low co-pays and premiums.
  • Employees receive competitive vacation and sick leave, plus 20 paid holidays per year.
  • Flexible work schedules and telecommuting options promote work-life balance.
  • Opportunities for training, certification reimbursement, and career advancement programs are available for professional development.
  • Access to wellness programs, including gym memberships, health screenings, and mental health resources, is provided.
  • Life insurance and short-term/long-term disability coverage are included.
  • The Employee Assistance Program (EAP) offers confidential counseling and support services for personal and professional challenges.
  • Financial assistance for continuing education and professional development is available through tuition reimbursement.
  • Employees can participate in community service and volunteer activities.
  • Recognition programs are in place to celebrate employee achievements and milestones.