Remote Lead Site Reliability Engineer (AZURE) - Empower Product Group

at Hitachi Solutions

Posted 12 hours ago

Description:

  • This is a full-time role in the product organization for an expert in systems design with considerable skill and expertise in large-scale software development in an Azure development environment.
  • The position involves designing and implementing Continuous Integration/Continuous Deployment (CI/CD) tooling using GitHub Actions, Azure DevOps, and related technologies.
  • Responsibilities include defining and implementing build and test pipelines for containerized architectures, infrastructure as code (IaC) for the stateful deployment of environments, Role-Based Access Control (RBAC), linting and other code quality controls, GitOps and Kubernetes pipelines, and managing SaaS deployment APIs.
  • The individual will assist in the design, engineering, development, planning, and administration of Azure Kubernetes Service (AKS) clusters for critical business applications.
  • This role requires close collaboration with application, engineering, security, and operations teams to engineer and build Kubernetes and Azure PaaS & IaaS solutions within an agile and modern enterprise-grade operating model.

Requirements:

  • Candidates must have a strong background as a Site Reliability Engineer (SRE) supporting a 24x7 highly available production environment for a SaaS or cloud service provider.
  • Solid experience with monitoring, APM, and observability tools such as Datadog, Application Insights, Prometheus, and Grafana is required.
  • A strong background with Azure resources such as Key Vault, Data Factory, Azure Databricks, and Storage Accounts is necessary.
  • Experience implementing observability plans around logs, metrics, and traces is essential.
  • Candidates should have experience in an agile development team developing software and implementing best practices for CI/CD.
  • Experience with cloud infrastructure environments, preferably Azure, and infrastructure as code (Terraform, Bicep, ARM) is required.
  • Strong experience with containerization technology and/or Kubernetes is necessary.
  • Candidates should have experience with release automation, system administration, and configuration management.
  • Proficiency in programming languages such as Python and Go is required.
  • A strong understanding of Linux, Windows, software development, systems, networking, and cloud concepts is essential.
  • Strong interpersonal and teamwork skills are necessary, with the ability to set and enforce processes and influence engineers who are not direct reports.
  • Strong analytical and programming skills are required.
  • Bonus points for experience with MLflow and other MLOps pipeline technologies.

Benefits:

  • The base salary pay range for this role is $142,500 - $198,750 USD.
  • In addition to the base salary, the successful candidate may be eligible to participate in a bonus plan.
  • Medical, dental, and vision coverage is provided.
  • Life insurance and disability programs are included.
  • Retirement savings with company match are offered.
  • Paid time off is available.
  • Flexible work arrangements, including remote work, are provided.