Remote Site Reliability Engineer III at Emburse

Description:

Develop software and software fixes to integrate internal systems.
Ensure code quality, test and distribute code updates, and monitor the health and stability of the servers.
Meet or exceed Key Performance Indicators and SLAs; maintain an error budget and adhere to it.
Identify, evaluate, and execute preventative measures to minimize and avoid impact to the customer experience.
Employ deep troubleshooting skills to improve availability, performance, and security for CR and Emburse, ensuring services are designed for 24/7 availability with operational readiness and rigor.
Code and automate applications on cloud platforms.
Work with engineering leadership to build shared services that meet the requirements and needs of the platform and application teams.
Collaborate with cloud platform and operations leaders to develop narratives, backlog grooming, epic planning, and overall sprint planning processes.
Ensure the platform holds a high degree of reliability, achieving at least four 9s (99.99% availability).
Define non-functional requirements as part of the product lifecycle to influence new designs, standards, and methods for scalable, highly available distributed systems.
Own technically intricate issues that span DevOps, databases, networking, code, infrastructure, and people; drive them to satisfactory completion.
Work closely with product stakeholders to align operational priorities and planning with the product and engineering roadmap.
Prepare and present engineering-related documents to key stakeholders.
Provide recommendations and feedback in design reviews and review sessions.
Mentor SRE I and II engineers.
Assist in guiding more junior engineers in best practices.
Conduct and assist with investigation, testing, and deployment activities, identifying and mitigating risks in development activities.

Requirements:

A Bachelor’s degree in Computer Science or a STEM field is required.
A minimum of 7 years’ experience in an engineering role is required.
A deep understanding of infrastructure as code, scripting, self-healing, containers, and DevOps tooling is highly desired.
Experience working with Ansible and Terraform tools is highly desirable.
Excellent written and verbal communication skills in English are required.
Experience with the full lifecycle of SaaS implementations as well as infrastructure as code is necessary.
Excellent follow-up and project management skills are essential.
Proven ability to create and maintain new tools is required.
Excellent troubleshooting skills are necessary.
Excellent technical skills are required, with up to 70% of the job being hands-on in a distributed Linux environment.
Strong scripting skills are necessary, and object-oriented programming (OOP) is a plus.
Ability to liaise between teams to help set and align priorities is required.
Experience working with an offshore team is necessary.

Benefits:

The position offers the opportunity to work in a remote environment.
Employees will have the chance to mentor junior engineers and contribute to their professional development.
The role provides the opportunity to work with cutting-edge technologies in cloud platforms and DevOps.
Employees will be part of a team that values collaboration and alignment with product and engineering roadmaps.
The position allows for significant hands-on experience in a distributed Linux environment, enhancing technical skills.
The role includes the potential for career growth and advancement within the organization.