Location
Toronto
Job Type
Full-time
Posted
June 06, 2026
Job Description
Join the forefront of AI as a Senior Research Scientist focused on model evaluation methods. The position centers on developing prototypes that accurately measure LLM capabilities.
This role is vital for advancing the evaluation techniques needed as AI models approach superhuman performance. You will be responsible for setting ambitious benchmarks and creating infrastructure that accurately assesses LLM performance. Your strong software engineering skills will be key as you engage with cross-functional teams to deliver reliable, repeatable evaluations.
Key Responsibilities:
• Innovate evaluation methods for large language models
• Establish benchmarks that challenge model capabilities
• Collaborate closely with teams on evaluation metrics
• Conduct research to refine evaluation efficiency
• Build tools for in-depth analysis of model outputs
Requirements:
• Proficiency in software engineering
• Experience analyzing complex LLM data
• Str...
Ready to Apply?
Submit your application for AI Evaluation Scientist – LLM Research at Cohere