Remote Senior Site Reliability Engineer (UK Remote)

This job is closed

This job post is closed and the position is probably filled. Please do not apply. Automatically closed by a robot after the apply link was detected as broken.

Description:

  • Join the NuOrder by Lightspeed team to support cross-cutting concerns such as cloud infrastructure, reliability, incident management, data warehousing, and analytics.
  • Help development teams scale by providing the infrastructure and tools they need.
  • Build and maintain multi-region infrastructure and networks for reliable, efficient, and secure product operation.
  • Implement DevOps principles to ensure product reliability and security.
  • Design, develop, and manage robust infrastructure on GCP using cloud-native technologies.
  • Establish CI/CD pipelines for efficient deployment and release processes.
  • Collaborate with development teams to monitor software health, define reliability metrics, and manage error budgets.
  • Apply software engineering principles to enhance software reliability and speed up delivery.
  • Support incident management and conduct post-mortem analysis to prevent future outages.
  • Mentor junior SREs and developers on cloud architecture, data management, and software development best practices.
  • Manage infrastructure changes through infrastructure as code (IaC) using Terraform.
  • Participate in the on-call rotation and stay updated on industry trends and emerging technologies.

Requirements:

  • Bachelor’s degree in Computer Science, Engineering, or equivalent experience.
  • 6+ years of experience in site reliability engineering, systems administration, or software engineering.
  • Proficiency in Kubernetes and in relational (e.g., PostgreSQL, MySQL) and NoSQL (e.g., MongoDB, Cassandra, Redis) databases.
  • Familiarity with network protocols, IP networking, and network troubleshooting.
  • Strong programming skills in languages such as Bash, Python, and Go.
  • Experience managing large-scale cloud infrastructure in Google Cloud, AWS, or Azure.
  • Knowledge of monitoring tools (e.g., Prometheus, Grafana, Datadog) and logging solutions (e.g., ELK stack).
  • Understanding of security best practices.
  • Excellent problem-solving skills and ability to troubleshoot complex issues under pressure.
  • Effective communication skills for collaboration with cross-functional teams.
  • Eagerness to learn and embrace challenges.

Benefits:

  • Opportunity to work in a talented global team with growth prospects.
  • Flexible working policy.
  • Lightspeed share scheme participation.
  • Company pension program.
  • Private medical insurance.
  • Health and wellness benefits.
  • Mental health online platform, counseling, and coaching services.
  • Paid leave and assistance for new parents.
  • Language classes and LinkedIn Learning license.
  • Volunteer day.