This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
The company is seeking a Senior Staff Engineer specializing in Site Reliability to work remotely in Colombia.
The role calls for an experienced L3 site reliability engineer responsible for a business-critical SaaS application.
The position requires full-stack expertise across infrastructure, backend, and front-end in order to handle issues before escalating them to the engineering business unit.
The candidate should be able to automate SRE tools for proactive L3 support in alignment with the tech monitoring strategy.
The candidate must be able to work under business pressure on business-critical applications and communicate effectively with various stakeholders during troubleshooting.
Requirements:
Must have expertise in Kubernetes, GitHub Actions, Terraform, and AWS.
Strong communication skills are essential.
Experience in incident and problem management, as well as working with multitenant applications.
Solid understanding of networking concepts such as TCP/IP, DNS, routing, VPCs, subnets, firewalls, load balancing, and TLS/SSL.
Proficiency in CI/CD pipelines (e.g., Jenkins, GitHub Actions) and version control.
Familiarity with Python, React/Next.js, and monitoring and logging tools such as Grafana, Prometheus, Loki, or ELK for resource utilization analysis and issue identification.
Experience with AWS services, especially EKS, serverless, queues, and various databases.
Solid knowledge of Kubernetes is required.
Benefits:
Full-time remote position in Colombia.
Opportunity to work for a Digital Product Engineering company that is rapidly scaling.
Dynamic and non-hierarchical work culture.
Chance to collaborate with a global team of 19,000+ experts across 33 countries.