Description:
Xenon7 collaborates with leading enterprises and innovative startups on cutting-edge IT projects across various domains including Data, Web, Infrastructure, and AI.
The Senior Site Reliability Engineer (SRE) will design, implement, and maintain highly available, scalable, and secure infrastructure for critical banking applications, specifically the Mobile Banking and Internet Banking platforms hosted on-premise.
This role involves leading SRE initiatives, mentoring junior engineers, and driving continuous improvement in production support.
The SRE will implement observability strategies using OpenShift, Kubernetes, Prometheus, Grafana, and the ELK Stack in an on-premise data center environment.
Responsibilities include designing and architecting infrastructure, leading monitoring strategies, overseeing logging infrastructure, mentoring engineers, defining SLIs, SLOs, and SLAs, and managing incident response strategies.
The position requires participation in a 24/7 on-call rotation for critical production incidents and ensuring compliance and security for financial systems.
Requirements:
A Bachelor's degree in Computer Science, Information Technology, Software Engineering, or a related field is required.
Candidates must have 5+ years of hands-on experience in SRE, DevOps, or Production Engineering.
A minimum of 3 years of experience leading SRE teams or managing production support operations is necessary.
Candidates should have 3+ years of hands-on experience managing on-premise OpenShift and Kubernetes infrastructure.
Expert-level experience with Prometheus for monitoring and alerting in production is required.
Candidates must have expert-level experience with Grafana for creating monitoring dashboards.
Advanced experience with the ELK Stack (Elasticsearch, Logstash, Kibana) for logging and log analysis is essential.
Proven experience in designing and scaling production systems for high-traffic banking applications is necessary.
Deep expertise in Linux/Unix system administration and container networking is required.
Advanced knowledge of CI/CD automation and deployment strategies is essential.
Hands-on experience with database management, tuning, and optimization on-premises is required.
Strong experience with infrastructure automation and Infrastructure as Code is necessary.
Proven 24/7 production support experience in mission-critical environments is required.
Experience managing on-premise data center infrastructure is necessary.
Candidates must demonstrate proven leadership skills and the ability to mentor junior engineers.
Excellent communication skills and the ability to present to executive stakeholders are required.
Experience in the financial services or banking sector is highly preferred.
Benefits:
The position offers the opportunity to work with elite tech talent and engage in transformative initiatives that drive innovation and business growth.
Employees will have the chance to collaborate with a premier financial institution known for its extensive suite of banking services.
The role provides a platform for professional growth through mentoring and leadership opportunities.
Employees will be involved in a groundbreaking digital transformation journey, leveraging the latest technologies.
The position includes participation in a 24/7 on-call rotation, ensuring a dynamic and engaging work environment.
The company is committed to delivering advanced, impactful solutions that meet complex challenges, providing a fulfilling work experience.