Remote Site Reliability Engineer - Observability

at Flinks

Posted 4 hours ago 1 applied

Description:

  • Flinks is a company focused on simplifying access to financial data and helping businesses create better financial products and experiences.
  • The Observability Site Reliability Engineer (SRE) will own the end-to-end observability, monitoring, and reliability strategy across all Flinks product lines.
  • Responsibilities include defining and maintaining an observability framework, ensuring coverage for various services, and establishing SLIs/SLOs aligned to client expectations.
  • The role involves building consistent alerting rules, integrating observability into incident management workflows, and leading cross-product root cause analysis.
  • The SRE will deliver reliability scorecards linking reliability to client outcomes, analyze trends, and translate data into executive insights.
  • Automation of anomaly detection and self-healing processes is a key responsibility, along with collaboration across teams to champion observability practices.

Requirements:

  • Candidates should have 5–8 years of experience in SRE, Observability, or Reliability roles, ideally in fintech, SaaS, or data platforms.
  • Strong technical skills in observability tooling such as Grafana, Prometheus, OpenTelemetry, and ELK are required.
  • Hands-on experience with tracing and profiling tools, distributed systems, APIs, and data pipelines is essential.
  • Strong automation skills, particularly with Kubernetes, are necessary.
  • Proficiency in at least one programming language is required, with C# and Go being preferred.
  • Candidates should have a systems-thinking mindset and be business-aware, proactive, and collaborative.

Benefits:

  • The role ensures consistent reliability and observability standards across all products.
  • It provides a single source of truth for performance and reliability metrics within the organization.
  • The position directly contributes to improving client trust, profitability, and operational efficiency.
  • It enables proactive stability management across Flinks’ core product lines.
  • The role supports the company's shift to a cohesive, reliable, platform-first mindset at scale.
