A recent study suggests that artificial intelligence can outperform university students in exams. Researchers at the University of Reading created 33 fictitious student identities and used the AI tool ChatGPT to generate answers to undergraduate psychology exams.
The results were surprising: on average, the AI-generated answers scored half a grade higher than those of real students. Additionally, 94% of the AI-generated essays did not raise concerns among the markers, making them nearly undetectable.
The study, published in the journal PLOS One, highlights a significant issue: the detection rate was only 6%, and the researchers suggest even that figure may be an overestimate. This raises concerns about the integrity of educational assessments, as students who use AI could cheat and secure better grades than those who do not.
Associate Prof Peter Scarfe and Prof Etienne Roesch, who led the study, said their findings should serve as a warning to educators worldwide. Scarfe stressed the importance of understanding how AI affects assessment integrity and suggested that, while traditional exams are unlikely to make a full comeback, the global education sector will need to adapt to the challenges AI poses.
In the study, AI-generated answers were submitted for first-, second-, and third-year modules without the markers’ knowledge. The AI outperformed real students on the first- and second-year modules, but human students did better in the third year, suggesting that AI struggles with more abstract reasoning.
The researchers describe it as the largest and most robust blind study of its kind, and it aligns with broader concerns among academics about AI’s influence on education. Glasgow University, for example, recently reintroduced in-person exams for one course, and a study reported by the Guardian earlier this year found that most undergraduates used AI programs to assist with their essays, although only 5% admitted to submitting unedited AI-generated text.