Remote Site Reliability Engineer (Remote - US)

at Jobgether

Posted 1 day ago

Description:

  • The Site Reliability Engineer position is a remote role based in the United States, posted by Jobgether on behalf of McAfee.
  • The engineer will maintain high service levels for availability, latency, and reliability to meet customer needs, while reducing friction in change management.
  • Responsibilities include collaborating closely with DevOps, Engineering, and support teams to ensure services are scalable, secure, and performant.
  • The role involves monitoring critical production environments, troubleshooting incidents, automating processes, and continuously improving service reliability.
  • The engineer will support mission-critical applications with a focus on observability, incident response, and seamless integration with IT service operations.
  • Key accountabilities include proactively monitoring production environments, troubleshooting and escalating problems, managing the incident lifecycle, and collaborating with teams to maintain service reliability.
  • The engineer will automate processes to reduce Mean Time to Detect (MTTD) and Mean Time to Restore (MTTR), maintain security event responsiveness, and participate early in the software development lifecycle.
  • Documentation of processes and regular updates to operational knowledge bases are required.
  • Effective communication with stakeholders and leadership regarding high-priority incidents and service status is essential.

Requirements:

  • Candidates should have 1 to 3+ years of experience in software development, SRE, DevOps, or systems engineering roles.
  • A proven track record of managing large-scale, highly available production systems with an SLA above 99.95%, preferably in cloud environments, is required.
  • Strong troubleshooting, debugging, and root cause analysis skills are necessary.
  • Experience with monitoring, logging, and application performance management tools such as Grafana, CloudWatch, or similar is expected.
  • Familiarity with version control and CI/CD tools such as Git, Jenkins, or Harness is required.
  • Hands-on experience with container technologies, including Kubernetes and Docker, is essential.
  • Candidates should be comfortable working with both Windows and Linux operating systems.
  • A solid understanding of AWS cloud services, including serverless and containerized workloads, is necessary.
  • Excellent communication skills and the ability to collaborate across teams and time zones are required.
  • Preferred certifications include ITIL, HDI, AWS, and other cloud-related credentials.
  • Willingness to work some non-standard hours to support global teams is expected.

Benefits:

  • The position offers competitive compensation and a bonus program.
  • Comprehensive medical, dental, and vision coverage is provided.
  • Paid time off and paid parental leave are included in the benefits package.
  • Pension and retirement plans are available for employees.
  • Flexible work hours and support for community involvement are offered.
  • The company promotes an inclusive work environment that embraces diversity and encourages authenticity.