Please let Rackspace know you found this job on RemoteYeah. This helps us grow 🌱.
Description:
Rackspace is building its Professional Services Center of Excellence on Application Performance Monitoring Suites.
The role involves solving complex business problems and contributing to the development of modern applications for customers.
The position focuses on helping customers understand the connections between application performance, user experience, and business outcomes.
Responsibilities include implementing Observability solutions, building and maintaining scalable systems, and developing monitoring tools.
The engineer will proactively gather and analyze metric and log data for anomaly detection, performance tuning, capacity planning, and fault isolation.
Collaboration with development teams is essential to implement and deploy new features while ensuring reliability, security, and performance standards.
The role requires maintaining a deep understanding of the customer’s business and technical environment.
Identifying performance bottlenecks and resolving root causes of service issues is a key responsibility.
Requirements:
Candidates must have 3+ years of experience designing, building, and maintaining AWS EKS and Azure AKS infrastructure with Terraform.
A minimum of 3 years' experience with Kafka in large-scale environments handling hundreds of terabytes to petabytes of data is required.
At least 3 years of experience in designing, building, and maintaining SaaS environments is necessary.
Candidates should have 3+ years as a Site Reliability Engineer (SRE) with solid experience in Prometheus, Grafana, Datadog, ELK, etc.
At least 3 years of experience building and running Kubernetes clusters is required, with expertise in scaling, operators, Istio, and troubleshooting.
A minimum of 3 years' experience with observability, including monitoring, logging, tracing, and metrics is essential.
Candidates must have 3 years' experience with GitOps CI/CD processes.
At least 3 years of proficiency in scripting with Python, Go (Golang), Bash, and AWS CLI tools is required.
At least 3 years of experience with security operations, including security policies, infrastructure, key management, and encryption setup, is necessary.
Candidates should have 3 years of experience implementing and maintaining disaster recovery strategies, including MySQL and Zookeeper.
Benefits:
Rackspace offers a collaborative work environment that values diverse perspectives and innovation.
The company is recognized as a best place to work by Fortune, Forbes, and Glassdoor, attracting world-class talent.
Employees are encouraged to bring their whole selves to work and are supported in their unique perspectives.
Rackspace is committed to equal employment opportunities and provides accommodations for individuals with disabilities or special needs.