This fully remote position is for a Python Software Engineer specializing in AI, with a focus on Code Evaluation & Training.
The role involves training large language models (LLMs) to write production-grade code across a range of programming languages.
Responsibilities include comparing and ranking multiple code snippets and explaining which is best and why; repairing and refactoring AI-generated code for correctness, efficiency, and style; and feeding that feedback directly into the reinforcement learning from human feedback (RLHF) pipeline to keep it running smoothly.
The end goal is for the model to learn to propose, critique, and improve code the way the engineer would.
The RLHF process involves generating code, having expert engineers rank, edit, and justify it, converting that feedback into reward signals, and using reinforcement learning to tune the model toward deployable code.
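As a rough illustration of the "converting that feedback into reward signals" step, the sketch below turns one engineer's ranking of snippets into pairwise preferences and scores a pair with a Bradley-Terry style loss, which is one common approach in reward modeling. The posting does not describe the actual pipeline, so `PreferencePair`, `ranking_to_pairs`, and `pairwise_loss` are hypothetical names chosen for this example only.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class PreferencePair:
    """One human judgment: `chosen` was ranked above `rejected`."""
    chosen: str
    rejected: str
    rationale: str


def ranking_to_pairs(ranked_snippets, rationale=""):
    """Expand a best-first ranking into every implied (chosen, rejected) pair.

    combinations() preserves input order, so the first element of each
    pair always sits higher in the ranking than the second.
    """
    return [
        PreferencePair(chosen=better, rejected=worse, rationale=rationale)
        for better, worse in combinations(ranked_snippets, 2)
    ]


def pairwise_loss(reward_chosen, reward_rejected):
    """Bradley-Terry style loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as a (hypothetical) reward model scores the
    human-preferred snippet higher than the rejected one.
    """
    diff = reward_chosen - reward_rejected
    return math.log1p(math.exp(-diff))


if __name__ == "__main__":
    # Snippets listed best first, as an engineer might rank them.
    ranked = [
        "return sum(xs)",
        "total = 0\nfor x in xs: total += x\nreturn total",
        "return eval('+'.join(map(str, xs)))",
    ]
    pairs = ranking_to_pairs(ranked, rationale="Built-in is clearest; eval is unsafe.")
    for pair in pairs:
        print(repr(pair.chosen), ">", repr(pair.rejected))
    # Placeholder reward values, for illustration only.
    print(pairwise_loss(1.5, 0.2))
```

A ranking of three snippets yields three pairs here. In a real system the reward values would come from a trained reward model rather than hand-picked numbers.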
Requirements:
Candidates must have 4+ years of professional software engineering experience in one or more of the following languages: Python, Java, JavaScript, TypeScript, Go, C++, PHP, COBOL, C, Ruby, or Rust. Constraint programming experience is a bonus but is not required.
Strong code-review instincts are essential, with the ability to quickly identify logic errors, performance traps, and security issues.
Candidates must possess extreme attention to detail and excellent written communication skills, as much of the role involves explaining the rationale behind code choices.
A passion for reading documentation and language specifications is required, along with the ability to thrive in an asynchronous, low-oversight environment.
No prior experience in RLHF or AI training is necessary, nor is deep machine learning knowledge; the ability to review and critique code clearly is sufficient.
Benefits:
The position is fully remote, allowing candidates to work from anywhere.
Compensation is up to $30 per hour.
The role offers flexible hours, with a minimum of 15 hours per week and the possibility of up to 40 hours per week.
The engagement is structured as a 1099 (independent contractor) contract, offering direct impact without unnecessary complexity.