Remote Senior Site Reliability Engineer (Remote, Canada)
This job is closed
This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
Collective[i] is seeking a skilled and motivated professional to join their team as a Senior Site Reliability Engineer.
The role involves managing AWS infrastructure across multiple accounts using Terraform, with a strong focus on deployment and automation.
Candidates should have proficiency in Linux and open-source tooling, with experience in various Linux distributions, scripting languages, clustering technologies, database engines, and configuration management tools, preferably Ansible.
The position requires developing and implementing containerization strategies, building original container images in-house rather than relying on third-party images.
Candidates should have a working knowledge of Kubernetes and understand when its use is appropriate in an environment that is not Kubernetes-focused.
The role includes collaborating closely with development teams to support the building and optimization of distributed systems.
Expertise in Git workflows and CI/CD automation tools, such as GitHub Actions, is essential.
The Senior Site Reliability Engineer will implement and manage monitoring and logging solutions, with hands-on experience in tools like Datadog and OpenTelemetry.
Proactive management of system stability and reliability is crucial to reduce the need for log diving, incident response, root cause analysis, and late-night pages.
Requirements:
Candidates must have proficiency with AWS, Terraform, Packer, Ansible, and container technologies.
Expertise in AWS services is required, and experience with other cloud providers is a plus.
Strong knowledge of Ubuntu 24.04, Bash, Python, systemd, Podman, Docker, and auditd is necessary.
Familiarity with GitHub, GitHub Actions, GitHub Container Registry, and Copilot is expected.
Experience with monitoring and logging tools like Datadog, OpenTelemetry, and Graylog is required.
Proficiency in working with databases and platforms such as Snowflake, Okta, Postgres, MongoDB, and Elasticsearch is essential.
Familiarity with security tools like Snyk, Tenable.io, and 1Password is important.
Experience with SOC 2 or other compliance standards is highly desirable.
Benefits:
The position offers a competitive salary range of $150,000 to $185,000 per year.
Collective[i] is a private, 100% remote company, providing flexibility in work location.
The company values diversity of experience and knowledge, fostering a culture of learning and growth alongside a talented team.
Employees are encouraged to thrive in an innovative environment and work with cutting-edge technology.
Collective[i] promotes core values such as curiosity, direct communication, delivering results, succeeding together, and striving for the extraordinary.