Please, let Kontakt.io know you found this job
on RemoteYeah.
This helps us grow 🌱.
Description:
Kontakt.io is building a platform that enhances care operations by reducing waste, cutting costs, and improving revenue through better throughput, asset utilization, and staff productivity.
The platform uses AI, RTLS, and EHR data to create self-learning agents that automate workflows and adapt in real time.
The Lead Site Reliability Engineer will own the reliability, performance, and automation of the cloud-based, real-time platform.
This role focuses on maintaining 24/7 platform operation, minimizing downtime, and improving observability, incident response, and self-healing automation.
The engineer will lead and scale the SRE team to ensure infrastructure efficiency and meet the needs of growing healthcare customers.
Responsibilities include ensuring 99.99% uptime, writing maintainable code, designing self-healing systems, defining SLIs, SLOs, and SLAs, managing scalable cloud infrastructure, optimizing containerized environments, and leading incident response operations.
Requirements:
Candidates must have 10+ years of experience in Site Reliability Engineering or Cloud Infrastructure.
At least 2 years of experience as a software engineer is required.
Proven success in scaling high-traffic, mission-critical platforms in SaaS, IoT, or healthcare is essential.
Deep expertise in cloud platforms (AWS), Kubernetes, and distributed systems is necessary.
Strong background in monitoring, logging, and observability with tools like Prometheus and OpenTelemetry is required.
Hands-on experience with incident management and building resilient systems is expected.
Candidates should have deep knowledge of CI/CD automation, GitOps, and infrastructure as code (Terraform).
A mature leadership approach with the ability to drive technical strategy and mentor a high-performance SRE team is essential.
Strong understanding of network security, access management, and compliance frameworks (HIPAA, SOC 2) is required.
Bonus points for experience with healthcare IT, real-time distributed systems, and leading on-call rotations.
Benefits:
The role offers the opportunity to ensure mission-critical reliability for healthcare facilities with a 99.99% uptime platform.
Engineers will work on real-time automation and self-healing cloud systems that orchestrate care delivery.
The position allows for a significant impact in healthcare by optimizing resources and improving patient care with technology that delivers 10X ROI.
An automation-first culture keeps manual operations to a minimum through cutting-edge automation and incident response strategies.
The engineer will join a high-performing team of top engineers, AI experts, and healthcare innovators.