Job Overview
The role involves collaborating with a leading AI research organisation to enhance the reasoning, actions, and communication of coding assistants within development workflows. The organisation is looking for technically skilled professionals—particularly those with experience in code review, testing, or documentation—to evaluate complete transcripts of user–AI coding interactions. This short-term project contributes directly to shaping the next generation of AI tools that assist developers.
Core Responsibilities
- Examine detailed transcripts of interactions between users and AI coding assistants.
- Evaluate the AI’s reasoning, execution, and declared actions with precision.
- Assign scores to each transcript using a 10-point rubric across multiple evaluation criteria.
- Provide concise justifications when needed, referencing specific examples from the conversations.
- Identify inconsistencies between the AI’s stated intentions and actual behaviour (e.g., promising to run tests but failing to do so).
Skills & Requirements
Preferred Backgrounds:
- Senior or Staff Engineers with extensive code review experience and hands-on insight into how code actually executes.
- QA Engineers with a strong focus on verification and consistency analysis.
- Technical Writers or Documentation Specialists adept at comparing instructions with actual implementation.
Also Well-Suited:
- Backend or Full-Stack Developers experienced with function calls, APIs, and testing workflows.
- DevOps or SRE professionals knowledgeable in tool orchestration and system behaviour analysis.
Languages and Tools:
- Proficiency in Python is advantageous, as most transcripts are Python-based.
- Familiarity with additional languages such as JavaScript, TypeScript, Java, C++, Go, Ruby, Rust, or Bash is beneficial.
- Experience with Git workflows, testing frameworks, and debugging tools is highly valuable.
More About the Opportunity
- Each transcript batch should be completed within five hours of starting it. There is no limit on the number of batches available.
- Engagement is task-based and flexible, with the possibility of recurring assignments.
Application Process
- Candidates begin by submitting a resume.
- Selected applicants receive rubric guidelines and access to the evaluation platform.
- Most candidates are notified within a few business days.
- The organisation welcomes all qualified applicants, irrespective of legally protected characteristics, and provides reasonable accommodations upon request.
Contract and Payment Terms
- Engagement is on an independent contractor basis.
- The role is fully remote, allowing work to be completed on a flexible schedule.
- Projects may be extended, shortened, or ended early depending on organisational needs and individual performance.
- The work does not require access to any confidential or proprietary information from current or past employers, clients, or institutions.
- Payments are issued weekly via Stripe or Wise, based on services rendered.
