Remote Staff Software Engineer - Compute Reliability and Efficiency
This job is closed
This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
The Staff Software Engineer position at Reddit focuses on lower-level (Linux and Kubernetes) systems engineering within the Compute Reliability and Efficiency team.
The role involves working on intra-cluster engineering problems related to performance, efficiency, and stability.
Responsibilities include detecting node-level performance characteristics, developing schedulers for resource packing, and integrating lower-level Kubernetes components.
The position requires collaborating with a team to create and maintain the foundational platform for Reddit's infrastructure.
Daily tasks involve executing performance and reliability analysis on the Linux-based Kubernetes fleet, designing and delivering software to enhance Reddit's Compute Platform, and contributing to the technical and strategic direction of the platform.
The role also includes automating critical aspects of the development process and sharing on-call responsibilities with the Compute team.
Requirements:
7+ years of experience in the infrastructure domain, with a focus on lower-level systems like Linux.
Proficiency in Go (preferred), Rust, or Python.
Understanding of kernel primitives, CPU scheduling, userspace concerns, and packet processing.
Experience developing on top of Kubernetes or similar distributed systems.
Strong troubleshooting skills, ranging from higher-level orchestration concerns to lower-level runtime issues.
Experience in designing large systems, scoping work, and building consensus with other engineers.
Excellent communication skills to collaborate effectively with a service-oriented team and company.