@tarunsingh - Senior DevOps Engineer/Site Reliability Engineer

DevOps Engineer

Available for hire

Years of experience

5+ years

Experience level

Mid-level

Available for

Full-time, Part-time, Contract, Freelance

Available from

03 Aug 2026

I keep production systems alive and make sure they stay that way.

5+ years in DevOps and SRE, mostly in FinTech, IoT, and regulated enterprise environments where things breaking at 2am actually costs money. I don't just set up pipelines and walk away. I stick around for the on-call, the incident reviews, and the "why did this alert fire 47 times last night?" conversations.

What I've shipped that I'm proud of:

⚙️ Cut MTTR by 35% by writing incident runbooks that people actually follow, cleaning up alert noise, and running blameless postmortems

💰 Saved 20% on infrastructure spend through capacity planning, autoscaling tuning, and reserved instance strategies

🏗️ Standardized production AWS + Kubernetes environments with Terraform across 5+ enterprise programs in FinTech, IoT, and Litigation

🔐 Designed SOC 2-aligned architectures with proper access controls, centralized logging, and DR readiness

🤖 Built MLOps pipelines for model deployment and inference monitoring in production

📊 Set up observability stacks (CloudWatch, Prometheus, Grafana, Datadog) that caught problems before customers did

🚀 Governed CI/CD across multiple teams with compliance gates, security scans, and change traceability baked in

👥 Mentored 4 junior engineers on production readiness and on-call practices

Stack: AWS · Kubernetes · Docker · Terraform · GitHub Actions · GitLab CI · Jenkins · Prometheus · Grafana · Datadog · Python · Bash · Linux

Domains I've worked in: FinTech, IoT, Healthcare, Regulated Enterprise, Early-stage Startups

Currently looking at Senior DevOps, SRE, and Platform Engineering roles at tech companies, banks, and consulting firms where reliability actually matters.

Reach me at [email protected] | +91-7019261553/9457296121

Languages

English

Employment History

Senior DevOps Engineer/Site Reliability Engineer at The Scalers: Expian Technologies Pvt. Ltd. 2023 - 2025

• Owned end-to-end AWS platform delivery across 5+ concurrent client projects, standardizing infrastructure provisioning, scaling, DR runbooks, and deployment repeatability using Terraform, Kubernetes, and Helm.
• Implemented GitOps workflows with ArgoCD and Flux, enabling declarative continuous deployment with automated drift detection, environment-specific promotion controls, and full audit traceability for production releases.
• Deployed Istio service mesh for east-west traffic control, enforcing mutual TLS across all inter-service communication and implementing circuit-breaking and retry policies for distributed microservice workloads.
• Defined and embedded SLIs/SLOs for availability and latency as release gate criteria, driving a 35% reduction in MTTR through structured on-call rotation, formalized incident runbooks, and blameless post-incident reviews with documented RCAs.
• Governed CI/CD pipelines (GitHub Actions, GitLab CI) across multiple teams, enforcing automated security scanning, change traceability, and compliance gates at build and deploy stages to shift quality checks left.
• Built enterprise observability stacks integrating Prometheus, Grafana, Datadog, ELK Stack, and OpenTelemetry, enabling distributed tracing, log correlation, and proactive anomaly detection across client environments.
• Architected SOC 2 and PCI-DSS-aligned environments with HashiCorp Vault for dynamic secrets management, zero-trust network segmentation, audit-ready access controls, centralized logging, and documented DR readiness.
• Delivered MLOps infrastructure covering model deployment pipelines, inference endpoint autoscaling, performance tracking dashboards, and drift alerting for production ML workloads.
• Reduced infrastructure spend by ~20% through Karpenter-based node right-sizing, Reserved Instance procurement, and regular cost optimization reviews across multi-tenant AWS accounts.
• Mentored 4 junior engineers on production readiness standards, on-call practices, and SRE fundamentals, enabling independent incident ownership and reducing escalation frequency.

DevOps Consultant at Freelance 2021 - 2023

• Delivered end-to-end DevOps engagements for early-stage startups, architecting AWS infrastructure, containerized deployments (Docker, Kubernetes), and Terraform-managed environments built for independent operation.
• Built CI/CD pipelines using GitHub Actions and GitLab CI for automated build, test, and deployment workflows, cutting manual release overhead and rollback times for small engineering teams.
• Implemented observability stacks (CloudWatch, Prometheus, Grafana, Datadog) with structured alerting, providing teams with operational visibility before and after production launches.
• Introduced HashiCorp Vault for secrets management with automated vulnerability scanning and secrets rotation, reducing credential exposure risk across client environments.
• Optimized cloud spend through right-sizing, autoscaling policies, and resource lifecycle management, delivering consistent cost reductions across multiple client accounts.

DevOps Engineer at Cheerz Sporting and Entertainment Pvt. Ltd. 2020 - 2021

• Designed AWS cloud infrastructure from scratch for a fantasy sports platform, implementing environment separation, deployment automation, and production-grade configurations ahead of a live product launch.
• Established CI workflows using Bitbucket Pipelines, stabilized release pipelines, and resolved recurring build and deployment failures that were blocking engineering velocity.
• Implemented CloudWatch monitoring and alerting for core platform metrics; maintained infrastructure documentation used by product and backend teams for cross-functional coordination.

Salesforce Developer (Intern) at Cloud Paradigm Inc. 2019 - 2020

• Built a Bill of Materials and invoicing module on Salesforce for a healthcare procurement platform, supporting B2B vendor management across Sales Cloud, Service Cloud, and Experience Cloud.
• Integrated CRM data flows for end-to-end procurement lifecycle management and implemented Einstein Bot for automated query handling.
• Created technical documentation covering module behavior, data models, and implementation logic.

Education

B.E. (CSE) at Visvesvaraya Technological University 2015 - 2020
