Remote Site Reliability Engineer

This job is closed

This job post is closed and the position is probably filled. Please do not apply. The post was closed automatically after the apply link was detected as broken.

Description:

  • Egen is a fast-growing and entrepreneurial company with a data-first mindset.
  • The company focuses on bringing together the best engineering talent to work with advanced technology platforms, including Google Cloud and Salesforce.
  • Egen aims to help clients drive action and impact through data and insights.
  • The company is committed to being a place where top talent can apply their engineering and technology expertise.
  • Egen is dedicated to learning, solving tough problems, and innovating for fast, effective results.
  • The Site Reliability Engineer will ensure system reliability and infrastructure support.
  • Responsibilities include delivering scalability, performance optimization, incident management, and analysis.
  • The engineer will maintain application reliability and uptime in line with SLAs.
  • Monitoring system performance metrics and determining optimization approaches is essential.
  • Leading incident management efforts and documenting Root Cause Analysis (RCA) and lessons learned is required.
  • The role involves working closely with DevOps and Application teams to align priorities and drive continuous improvement initiatives.
  • Prioritizing response efforts based on issue severity and potential impact on users is crucial.
  • The engineer will evaluate and approve changes to production systems, balancing innovation with stability and reliability.
  • Optimizing resource usage and managing costs by identifying inefficiencies and implementing cost-saving measures is part of the job.

Requirements:

  • A minimum of 3 years of Site Reliability Engineering experience with Azure and/or AWS is required.
  • A Bachelor’s Degree is preferred, but relevant experience will be considered as an equivalent.
  • Proficiency in Java (including Spring Boot), SQL, and Bash is necessary.
  • Experience with monitoring tools like Datadog, Splunk, and Grafana is required.
  • Familiarity with Docker, Kubernetes, and Linux is essential.
  • Knowledge of incident and alerts management tools such as VictorOps and PagerDuty is needed.
  • Experience with Git and repository hosting platforms such as Bitbucket is required.
  • The ability to troubleshoot complex, intertwined distributed services is necessary.
  • Attention to detail is crucial for this role.
  • Skills in testing, monitoring, logging, alerting, and documentation are required.
  • Experience in incident management is essential.

Benefits:

  • Egen offers a dynamic and innovative work environment that encourages learning and growth.
  • Employees have the opportunity to work with advanced technology platforms and top engineering talent.
  • The company promotes a culture of continuous improvement and problem-solving.
  • Egen provides a flexible work environment, including remote work options.
  • Employees can expect to be part of a team that values their expertise and contributions.
  • The company is committed to helping employees envision how data and platforms can change the world for the better.