Remote Senior Site Reliability Engineer - Midnight

at IO Global

Description:

  • IOG is a technology company focused on blockchain research and development, emphasizing peer-reviewed research and formal methods for security, scalability, and sustainability.
  • The company aims to advance the capabilities and adoption of blockchain technology globally through projects in decentralized finance (DeFi), governance, and identity management.
  • As a Senior Site Reliability Engineer (SRE), you will shape the reliability and performance of systems across cloud infrastructure.
  • You will design and implement solutions to improve service reliability, automate routine tasks, and facilitate collaboration between development and operations teams.
  • Responsibilities include designing, building, and maintaining scalable systems on AWS, managing Kubernetes clusters, automating deployments using GitOps principles, and implementing CI/CD pipelines.
  • You will also develop automation tools, implement monitoring solutions, participate in on-call rotations, and lead incident response efforts.
  • The role requires effective communication of technical solutions and incident retrospectives to both technical and non-technical stakeholders.
  • You will evaluate and adopt new technologies, document processes, and strive for continuous improvement in delivery and standards.

Requirements:

  • You must have 7+ years of experience in SRE, DevOps, or a related role.
  • A strong understanding of SRE best practices, architectures, and methods is required.
  • Good knowledge of resiliency patterns and cloud security is essential.
  • Strong programming proficiency in Python, Golang, or JavaScript is necessary, with Rust experience being advantageous.
  • Demonstrated experience with AWS and modern cloud architectures is required.
  • Proficiency in Helm, Terraform, and CI/CD tools like GitHub Actions and Argo CD is necessary.
  • Hands-on experience with Kubernetes/EKS and GitOps methodologies is required.
  • Proven track record with monitoring tools such as Prometheus and OpenTelemetry is essential.
  • Blockchain experience is advantageous, as it provides a useful perspective on distributed systems and security.
  • Exceptional problem-solving skills and the ability to translate vague requirements into clear plans are necessary.
  • You should be able to engage in technical discussions and participate in decision-making processes.
  • Experience working within an Agile environment and with a distributed team is required.
  • Strong communication and collaboration abilities are essential for working across different teams.
  • A proactive and innovative mindset with a passion for continuous improvement and operational excellence is necessary.

Benefits:

  • The position offers remote work flexibility.
  • Laptop reimbursement is provided to support your work setup.
  • A new starter package is available to buy hardware essentials such as headphones and monitors.
  • Learning and development opportunities are offered to enhance your skills.
  • Competitive paid time off (PTO) is provided to ensure work-life balance.
