Remote Senior Site Reliability Engineer (Remote, US)
This job is closed
This job post is closed and the position is probably filled. Please do not apply.
Automatically closed by a robot after the apply link was detected as broken.
Description:
Collective[i] is seeking a skilled and motivated professional to join their team as a Senior Site Reliability Engineer.
The position is fully remote and allows employees to work from anywhere in the United States.
The role involves managing AWS infrastructure across multiple accounts using Terraform, with a focus on deployment and automation.
Candidates should have proficiency in Linux and open-source tooling, including various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools, preferably Ansible.
The engineer will develop and implement containerization strategies, building container images in-house rather than relying on third-party images.
Knowledge of Kubernetes is required, with an understanding of when and why to use it, although the environment is not Kubernetes-focused.
Collaboration with development teams is essential to support the building and optimization of distributed systems.
The role requires maintaining expertise in Git workflows and proficiency in CI/CD automation tools such as GitHub Actions.
The engineer will implement and manage monitoring and logging solutions using tools such as Datadog and OpenTelemetry.
Proactive management of system stability and reliability is crucial to reduce the need for reactive work such as log diving, incident response, root cause analysis, and late-night pages.
Requirements:
Candidates must have proficiency with AWS, Terraform, Packer, Ansible, and container technologies.
Expertise in AWS services is required, and experience with other cloud providers is a plus.
Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, Podman, Docker, and auditd is necessary.
Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot is expected.
Experience with monitoring and logging tools such as Datadog, OpenTelemetry, and Graylog is essential.
Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and Elasticsearch is required.
Familiarity with security tools like Snyk, Tenable.io, and 1Password is important.
Experience with SOC 2 or other compliance standards is highly desirable.
Benefits:
The salary for this position ranges from $150,000 to $185,000 per year.
Collective[i] offers a fully remote work environment, allowing employees to work from anywhere in the United States.
The company values diversity of experience and provides a platform for individuals to contribute their exceptional talents.
Employees have the opportunity to learn and grow alongside a talented and tenacious team.
Collective[i] is committed to building a company and community focused on helping people and companies prosper.