The Senior Site Reliability Engineer (SRE) will be responsible for the continuous improvement and support of the Windows Server environment, including Windows Server 2012, 2016, 2019, and 2022.
Knowledge of virtualization with VMware and HPE hardware products is required.
Experience with Ansible or BladeLogic is also necessary.
The role involves lifecycle management from deployment to retirement of the environment, ensuring incidents and problems are resolved swiftly in accordance with service level agreements and policies.
The SRE will act as a contact point for issues and provide mentorship to the team.
The SRE will provide onsite support for all changes and incidents.
The goal is to manage the server estate, ensuring supportability while improving the environment.
Responsibilities include creating automation plans, managing incident resolution, supporting post-incident processes, and ensuring permanent resolutions are implemented.
The SRE will provide expert technical input during the resolution of complex systems problems and mentor third-party engineers.
The role requires maintaining the server environment to known standards and identifying gaps in monitoring.
The SRE will develop and implement new support services and performance recommendations to meet user requirements.
Client queries must be handled reliably and efficiently to meet expectations.
Administrative and business-as-usual work will be undertaken on behalf of technical and business teams.
The SRE will remediate HPE hardware, software, and firmware issues to improve performance.
Patching and vulnerability management of the server environment are also part of the role.
The SRE will review, maintain, and test upgrades to vendor software supporting the servers.
ITIL standards and processes must be followed, and the SRE will work outside normal hours as part of an on-call rotation.
Requirements:
A minimum of 5 years of experience working in a large Windows environment is required.
Proficiency in all versions of Windows is essential.
Experience working in a large shared environment spanning server, converged infrastructure, network, and VMware platforms is necessary.
Knowledge of VMware virtualization and the vRealize Suite is required.
Experience with at least one of the following scripting or automation technologies is needed: Bash, Python, Perl, Ansible, PowerShell, or VBScript.
Familiarity with HPE Server Hardware, specifically the ProLiant range and Synergy/OneView, is required.
Desirable skills include experience with Rapid7 vulnerability management tools, McAfee security products, SAN storage and switches, and cloud technologies (AWS, Azure, Oracle, Google).
Knowledge of monitoring tools such as Dynatrace is a plus.
Understanding of Agile methodologies and ITIL processes is required, along with experience using ticket tracking software, specifically ServiceNow.
Excellent English verbal and written communication skills are essential.
The ability to implement technology and process improvements, innovate, and work without direction is necessary.
Flexibility in working hours, including on-call and out-of-hours delivery, is required.
Benefits:
Experian offers a people-first approach, focusing on diversity, equity, and inclusion, work/life balance, and employee development.
The company has received numerous awards, including recognition as one of the World's Best Workplaces™ and Great Place To Work™ in multiple countries.
Employees are encouraged to bring their whole selves to work, with support for those with disabilities or special needs.
The culture at Experian emphasizes collaboration, wellness, and recognition, contributing to a positive work environment.