Remote Site Reliability Technical Lead

This job is closed

This job post is closed and the position is probably filled. Please do not apply. It was closed automatically after the apply link was detected as broken.

Description:

  • The Site Reliability Technical Lead will independently guide the team's technical direction and implementation within the defined architecture, through all stages from conceptualization to deployment.
  • This role involves evaluating trade-offs between correctness, robustness, performance, and customer impact to ensure the development of the right solution, with client success at the forefront.
  • The lead will establish and own the team's technical documentation and repository management practices, such as branching, pull requests, and merges.
  • Collaboration with product, design, and engineering teams is essential to provide necessary oversight of architecture and dependencies influencing product strategy and direction.
  • The individual will contribute to code reviews, documentation, and complex bug fixes, with a focus on security, performance, and reliability.
  • The Site Reliability Technical Lead will be an active leader in the Engineering Practice community, mentoring Senior Engineers and others through Communities of Practice (CoPs) or on project teams, supporting the growth of technical capabilities.

Requirements:

  • A Bachelor's Degree in Information Technology, Computer Science, or equivalent work experience is required.
  • The candidate must have 3+ years of experience in an SRE role supporting highly available production systems in cloud environments.
  • Experience designing and implementing SRE functions that follow mature SRE best practices is necessary.
  • The candidate should have experience with defining SRE standards and supporting the implementation and adoption of these standards.
  • Proficiency in using and enabling monitoring and alerting tools and services, with a preference for Datadog, is required.
  • An understanding of cloud architectures, microservices, and distributed systems is essential.
  • The candidate must be adept in the development of automated tools, systems, and services in multiple technology domains.
  • Advanced knowledge of one or more infrastructure components (e.g., networking, cloud services, orchestration tools, containerization, compute, and storage systems) is required.
  • Proficiency in making service-level changes to a system and troubleshooting its components is necessary.
  • Experience in managing SLAs and incident response calls is required.

Benefits:

  • 3Pillar Global offers a culture that emphasizes teamwork, open collaboration, and a commitment to building breakthrough software solutions.
  • The company has been recognized on the Inc. 5000 list for ten consecutive years and has won the Washington Post Top Workplaces Award three times.
  • Employees are part of an innovative software development partner that drives rapid revenue, market share, customer growth, and employee efficiency for industry leaders.
  • The company promotes a business-minded approach to agile development, ensuring alignment with client goals from the earliest conceptual stages through launch and beyond.
  • 3Pillar Global values continuous improvement and provides opportunities for professional growth and mentorship within the Engineering Practice community.