AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking

Authors

Alexandru Marcoci

Research Associate in AI Risk and Foresight

Abstract

The ai@cam OpRaise project tested whether AI can reliably mark students' written exam responses, comparing AI scores to human assessor marks across three UK universities.


Talmi, D., Benn, Y., Tibon, R., Corsi, G. & Marcoci, A. (2026). AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking. University of Cambridge. Available from https://www.ai.cam.ac.uk/reports/ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking/


@TechReport{ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking,
  title = {AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking},
  author = {Talmi, Deborah and Benn, Yael and Tibon, Roni and Corsi, Giulio and Marcoci, Alexandru},
  year = {2026},
  publisher = {University of Cambridge},
  pdf = {https://www.ai.cam.ac.uk/assets/images/uploads/opraise-report-2026.pdf},
  url = {https://www.ai.cam.ac.uk/reports/ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking/},
  abstract = {The ai@cam OpRaise project tested whether AI can reliably mark students' written exam responses, comparing AI scores to human assessor marks across three UK universities. }}


%0 Report
%T AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking
%A Deborah Talmi
%A Yael Benn
%A Roni Tibon
%A Giulio Corsi
%A Alexandru Marcoci
%D 2026
%F ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking
%I University of Cambridge
%U https://www.ai.cam.ac.uk/reports/ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking/
%X The ai@cam OpRaise project tested whether AI can reliably mark students' written exam responses, comparing AI scores to human assessor marks across three UK universities.


TY  - RPRT
TI  - AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking
AU  - Deborah Talmi
AU  - Yael Benn
AU  - Roni Tibon
AU  - Giulio Corsi
AU  - Alexandru Marcoci
PY  - 2026/05/29
ID  - ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking
PB  - University of Cambridge
L1  - https://www.ai.cam.ac.uk/assets/images/uploads/opraise-report-2026.pdf
UR  - https://www.ai.cam.ac.uk/reports/ai-in-university-assessment-evaluating-the-opportunities-and-risks-of-automated-marking/
AB  - The ai@cam OpRaise project tested whether AI can reliably mark students' written exam responses, comparing AI scores to human assessor marks across three UK universities. 
ER  -

AI in University Assessment: Evaluating the Opportunities and Risks of Automated Marking

Generative Artificial Intelligence (AI) has disrupted the entire education sector. The ai@cam OpRaise project focuses on the ability of AI, particularly Large Language Models (LLMs), to evaluate students’ work, specifically their long-form responses to open-ended questions. Many students report using LLMs to seek feedback on essays, including in high-stakes situations. The research question was whether AI-generated numerical feedback is sufficiently robust to support students and educators. The project contextualised this evidence by considering stakeholders’ views on the broader opportunities and risks of integrating AI systems into university assessment practices.

Key Findings

AI marking accuracy was moderate at best, with degree band agreement ranging from 35% to 63% across universities, below the threshold needed for confident deployment.
AI systems showed a central tendency bias, compressing marks toward the middle and performing worst at grade boundaries and for the highest- and lowest-performing students.
AI marks were oversensitive to surface features such as essay length and vocabulary range, rather than tracking the quality of academic reasoning that human markers prioritise.
Performance varied significantly across institutions, meaning results from one context cannot be used as evidence of readiness elsewhere.
Students and staff view human contact and judgement as fundamental to the social contract of higher education — many students said they would feel “cheated” if AI marked their work.
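
To make two of the quantitative findings above concrete, the following is a minimal sketch of how degree band agreement and mark compression might be measured on a local sample. It is not the report's method: the band boundaries (40/50/60/70 on a 0–100 scale), the function names, and the sample marks are all assumptions invented for illustration.

```python
from bisect import bisect_right
from statistics import pstdev

# ASSUMPTION: conventional UK degree band floors on a 0-100 scale.
# The report's actual banding scheme may differ.
BAND_FLOORS = [40, 50, 60, 70]
BAND_NAMES = ["Fail", "Third", "2:2", "2:1", "First"]


def degree_band(mark):
    """Map a numerical mark to its (assumed) degree band."""
    return BAND_NAMES[bisect_right(BAND_FLOORS, mark)]


def band_agreement(human_marks, ai_marks):
    """Fraction of scripts where AI and human marks fall in the same band."""
    if not human_marks or len(human_marks) != len(ai_marks):
        raise ValueError("mark lists must be non-empty and of equal length")
    matches = sum(
        degree_band(h) == degree_band(a) for h, a in zip(human_marks, ai_marks)
    )
    return matches / len(human_marks)


def spread_ratio(human_marks, ai_marks):
    """Ratio of AI to human standard deviation; below 1 suggests compression."""
    return pstdev(ai_marks) / pstdev(human_marks)


# Hypothetical marks, invented for illustration; not data from the report.
human = [38, 52, 58, 65, 72, 81]
ai = [48, 55, 61, 63, 66, 68]

print(f"band agreement: {band_agreement(human, ai):.0%}")
print(f"spread ratio:   {spread_ratio(human, ai):.2f}")
```

In this invented sample the AI marks cluster in the middle bands, so agreement is lowest for the weakest and strongest scripts. A local evaluation would substitute an institution's own marks and banding rules.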

Recommendations

Proceed with caution — current AI systems are not accurate or valid enough for formal assessment use.
Evaluate AI locally before any deployment, using the institution's own materials, cohort, and marking practices.
Preserve human authority over final marks in all scenarios — AI should support human judgement, not replace it.
Engage staff and students openly before any adoption, building trust through transparency and concrete discussion of specific use cases.
Monitor continuously — AI model performance is unstable over time and institution-specific, making ongoing review essential.

