ObviouslyAI is seeking a creative and curious Backend & AI Agent Testing Engineer to join their team.
The role involves hands-on work with AI agents and backend services, including writing code, debugging, scripting tests, and evaluating LLM prompts.
This position is not a traditional backend or QA role; it requires quick learning of new tools and designing tests from a user's perspective.
Responsibilities include writing, debugging, and maintaining backend code in Python or JavaScript, implementing APIs, and ensuring authentication workflows function correctly.
The engineer will design and execute test strategies for AI agent behavior, particularly for LLM-based agents, and evaluate AI outputs using tools like Deepchecks.
The role also involves scripting and automation to test AI agents and backend workflows, building automation frameworks, and developing test infrastructure.
The engineer will explore new apps and services, anticipate edge cases, and creatively manage AI systems that may behave inconsistently.
Collaboration with cross-functional teams in a startup environment is essential, along with continuous learning of new frameworks and tools.
Requirements:
Candidates should have approximately 0–3 years of experience in backend development or testing, ideally in a startup or experimental role.
A basic understanding of backend engineering is required, including the ability to write backend code, debug, and create test cases in Python or JavaScript.
The ideal candidate brings creative testing skills and approaches testing with innovation and experimentation.
Exposure to or experience with testing and prompting AI agents, especially GenAI/LLM-based systems, is necessary.
Comfort with writing backend scripts, automating tests, and creating testing frameworks is essential.
Familiarity with integrations, particularly with HubSpot and Salesforce, and an understanding of API interactions and authentication are required.
Knowledge of modern automation frameworks, such as Selenium, is preferred.
Experience in B2B product environments is beneficial.
Bonus points for experience with LLM evaluation frameworks, metrics, or MCPs.
A curious and fast-learning mindset is crucial, along with a willingness to experiment and tackle ambiguous problems.
Benefits:
The position offers a fast-moving, supportive startup culture that values experimentation and creativity.
Employees will have the opportunity to work with cutting-edge AI systems, tools, and frameworks.
There is a strong emphasis on learning and rapid growth alongside a talented and collaborative team.
The company promotes a healthy work-life balance with flexible working hours.