This job post is closed and the position is probably filled. Please do not apply.
🤖 Automatically closed by a robot after the apply link was detected as broken.
Description:
As a Site Reliability Engineer (SRE) at Chalice.AI, you will be responsible for managing reliability and quality assurance activities.
You will play a key role in building software and systems that improve the reliability, quality, and time-to-market of Chalice's software solutions.
Ensure compliance of Chalice's solutions, data, and models with industry standard frameworks and regulations.
Collaborate with development teams to enhance services through rigorous testing and release procedures.
Maintain and extend the monitoring and notification platform.
Manage cloud infrastructure for repeatability and fault tolerance using Terraform and CloudFormation.
Requirements:
Bachelor’s degree or equivalent with 5-7 years of relevant experience.
Proficiency in Python, AWS, third-party API usage, large-scale databases, and working with large data sets.
Experience with Airflow, Databricks, and digital advertising technology.
Knowledge of monitoring and logging tools like CloudWatch, Datadog, and PagerDuty.
Hands-on experience with CI/CD pipelines and configuration management tools.
Strong analytical, problem-solving, collaboration, and communication skills.
Benefits:
Work with cutting-edge technologies in a fast-paced, innovative environment.
Opportunities for professional development and career advancement.
Join a diverse and inclusive team that values collaboration and creativity.
Competitive salary and benefits packages.
Inclusive workplace culture that celebrates diversity.
Medical, dental/vision, 401k options, unlimited PTO, and 11 company holidays.
Office-wide closure between Christmas Eve and New Year's Day.