Remote Senior Site Reliability Engineer - IOE: Cardano
Please let IO Global know you found this job on RemoteYeah. This helps us grow 🌱.
Description:
IOHK is a technology company focused on Blockchain research and development, emphasizing a scientific approach to ensure security, scalability, and sustainability.
The Senior Site Reliability Engineer (SRE) will be an integral part of the open-source project, ensuring the reliability, availability, and performance of production systems.
This role combines service operation, systems engineering, and software engineering principles to operate and monitor services, as well as create or maintain tools, automations, and infrastructure code.
Responsibilities include breaking down large tasks into manageable work packages, coaching and mentoring junior engineers, and taking ownership of projects to ensure timely delivery.
The SRE will design, write, and deliver tools and software primarily using Python, Bash, Terraform, or Nix to improve service availability, scalability, and efficiency.
The role involves engaging in the whole lifecycle of services, practicing sustainable incident response, and collaborating with development teams to enhance customer experience.
The SRE will analyze system performance and reliability, develop service-level objectives (SLOs), and participate in on-call rotations to respond to service interruptions.
Requirements:
A minimum of 10 years of experience in DevOps, with at least 3 years in blockchain.
Proficiency in Python, Bash, Terraform, and Nix for DevOps services is required.
An excellent understanding of, and experience with, infrastructure as code is required, particularly with tools like Terraform and Helm.
Extensive experience with AWS services, specifically EKS and RDS, is essential.
Familiarity with container orchestration, particularly Kubernetes, is required.
Hands-on experience with PostgreSQL and its deployment on RDS is necessary.
Knowledge of monitoring tools such as Prometheus, Grafana, and Loki is important.
Solid troubleshooting and performance tuning capabilities are required.
An understanding of the needs of real-time critical systems is essential.
Exceptional communication skills and a strong team collaboration ethic are necessary.
Experience with CI/CD tools like GitHub Actions, Hydra, or Earthly is required.
Strong analytical and troubleshooting skills are essential for the role.
Familiarity with on-prem and cloud infrastructure and multi-tenant application deployment is important.
The ability to quickly learn new technologies and adapt to changing environments is required.
High attention to detail is necessary to ensure system reliability and performance.
Benefits:
The position offers remote work flexibility.
Employees receive a laptop reimbursement and a new starter package to buy hardware essentials such as headphones and monitors.
There are opportunities for learning and development to enhance skills.
Competitive paid time off (PTO) is provided to employees.