Remote Senior Site Reliability Engineer - Midnight
Please, let IO Global know you found this job
on RemoteYeah.
This helps us grow 🌱.
Description:
IOG is a technology company focused on blockchain research and development, emphasizing peer-reviewed research and formal methods for security, scalability, and sustainability.
The company aims to advance the capabilities and adoption of blockchain technology globally through projects in decentralized finance (DeFi), governance, and identity management.
As a Senior Site Reliability Engineer (SRE), you will shape the reliability and performance of systems across cloud infrastructure.
You will design and implement solutions to improve service reliability, automate routine tasks, and facilitate collaboration between development and operations teams.
Responsibilities include designing, building, and maintaining scalable systems on AWS, managing Kubernetes clusters, automating deployments using GitOps principles, and implementing CI/CD pipelines.
You will also implement monitoring solutions, participate in on-call rotations, lead incident response efforts, and collaborate with development teams to define service level objectives and indicators (SLOs/SLIs).
The role requires the problem-solving skills to distill vague challenges into actionable plans and the ability to communicate technical solutions clearly to stakeholders.
Continuous improvement and innovation are key, with an emphasis on evaluating new technologies and documenting best practices.
Requirements:
You must have 7+ years of experience in SRE, DevOps, or a related role.
A strong understanding of SRE best practices, architectures, and methods is required.
Good knowledge of resiliency patterns and cloud security is essential.
Proficiency in programming languages such as Python, Golang, or JavaScript is necessary, with Rust experience being advantageous.
Demonstrated experience with AWS and modern cloud architectures is required.
Proficiency in Helm, Terraform, and CI/CD tools like GitHub Actions and ArgoCD is necessary.
Hands-on experience with Kubernetes/EKS and GitOps methodologies is required.
A proven track record with monitoring tools such as Prometheus and OpenTelemetry is essential, along with familiarity with the LGTM stack (Loki, Grafana, Tempo, Mimir) or comparable tools.
Blockchain experience is advantageous, providing a unique perspective on distributed systems and security.
Exceptional problem-solving skills and the ability to translate vague requirements into clear plans are necessary.
You should be able to engage in technical discussions and participate in decision-making processes.
Experience working in an Agile environment and with distributed teams is required.
Strong communication and collaboration abilities are essential for working across different teams.
A proactive and innovative mindset, with a passion for continuous improvement and operational excellence, is necessary.
Benefits:
The position offers remote work flexibility.
There is a laptop reimbursement program available.
New starters receive a package to buy hardware essentials such as headphones and monitors.
Learning and development opportunities are provided to enhance skills.
Competitive paid time off (PTO) is offered to employees.