New report recommends better assessment design to tackle potential AI threat to academic integrity
A pioneering new report examining the impact of generative artificial intelligence (gen AI) on different types of assessment has recommended that the performance of tools such as ChatGPT should be used to inform and strengthen assessment design, rather than trying to detect misuse.
The large-scale investigation was conducted by researchers from The Open University (OU), in partnership with NCFE, an awarding organisation and leader in technical and vocational education.
Researchers analysed more than 900 scripts across 17 different question types and found that gen AI is capable of achieving a pass grade, and sometimes higher, on almost all types of assessment. It performed strongly on lower-level (Level 3) answers, with performance declining as the assessments increased in difficulty through Levels 4, 5 and 6.
The study also found that methods previously used to detect plagiarism do not work for AI. Although training markers did improve their ability to spot AI use, it also increased the number of false positives – students’ answers incorrectly identified as AI-generated.
Jonquil Lowe, Senior Lecturer in Economics and Personal Finance at The Open University, and one of the project researchers, said:
“The research suggests that trying to detect misuse of generative AI is not effective. Instead, what we learn about gen AI can help us design more robust questions and marking guidance that focus on assessing the added value that humans bring to a task. This shifts us away from merely testing knowledge, towards what is often called ‘authentic assessment’ that requires explicit application of what has been learned in order to derive specific conclusions and solutions.”
Examples of these more robust ‘authentic assessment’ questions are tailored to their audience. They include role-play questions, in which students must think critically and apply what they have learned to realistic scenarios, and reflections on work practice, provided these require evidence of specific examples.
The report concludes that, rather than focusing on misuse detection, which is not feasible and demands considerable institutional capacity, institutions should instead use their understanding of how gen AI tackles questions to inform question design and marking guidance, so that students must complete tasks that are harder for gen AI tools to replicate well. It also highlights the importance of upskilling teaching staff in the use of gen AI to help strengthen students’ study skills.
Gray Mytton, Assessment Innovation Manager at NCFE, said:
“This report highlights the challenges in detecting genAI misuse in assessments, showing that training markers to spot AI-generated content can lead to an increase in the rate of false positives.
“To address this, educators could help students develop study skills, including genAI use where appropriate, while awarding bodies like NCFE can focus on creating more authentic assessments, which will also benefit learners as they enter the workforce.”
Liz Hardie, lead researcher on the project and Senior Lecturer in Law at The Open University, said:
“We are grateful to NCFE for supporting this research and are excited to share our findings. This research will support higher education institutions in thinking about how to adopt AI-informed approaches to learning, teaching, and assessment.”
The report was funded by NCFE’s Assessment Innovation Fund. Launched in 2021 to provide evidence-based, alternative assessment solutions whose impact can be tested in real life, the fund has already invested over £1 million across a range of innovative projects.