Where is test security going in the age of AI?
Where is security for tests and exams going in the age of AI? John Kleeman, EVP of Learnosity, shares five takeaways from attending the recent Conference on Test Security in Tempe, Arizona.
With the advent of generative AI, the world of test security is feverish. It seemed appropriate, then, that the recent Conference on Test Security was held in Tempe, Arizona at a time when the weather was a scorching 100°F (38°C).
The conference is an annual gathering of some of the world’s experts in test and exam security, who come together to share and learn about the field. It’s very friendly and collegial: a group of people seeking to do the best they can in this important area. I was fortunate to attend this year, and here is some of what I learned.
1. AI as an aid to test developers
Unsurprisingly, the biggest discussion point at the conference was AI, often focused on its use in test development.
A big challenge to test security is pre-knowledge. If test takers see questions in advance of the exam, they gain an advantage; some may pass without having the necessary competence. Creating good questions is time-consuming and expensive, so test sponsors are concerned about the risk that their questions are copied and shared with or sold to test takers.
Increasing the number of questions in an item bank is very helpful for test security. With a large pool of questions available, the risk of cheating falls, because leaked questions are less likely to be the exact ones a test taker is asked in the exam. And with a large enough bank, memorising the answers to every question can take more effort than simply gaining the competence being tested. Unfortunately, building a large item bank is expensive, though with the technological leaps we’ve seen recently, it may not remain so.
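To make the effect of bank size concrete, here is a rough back-of-envelope sketch in Python (my own illustration, not something presented at the conference). It assumes exam forms are drawn uniformly at random from the bank, which real form-assembly rules would complicate:

```python
from math import comb

def leak_exposure(bank: int, exam: int, leaked: int) -> tuple[float, float]:
    """For an exam form drawn uniformly at random from the bank, return
    (probability that any leaked item appears on the form, expected number
    of leaked items on the form). Simple hypergeometric reasoning."""
    p_none = comb(bank - leaked, exam) / comb(bank, exam)
    expected = exam * leaked / bank
    return 1 - p_none, expected

# Illustrative numbers: 50 leaked items, 60-item exam.
# Growing the bank dilutes the value of the leak.
for bank in (200, 1000, 5000):
    p_any, expected = leak_exposure(bank, 60, 50)
    print(f"bank={bank}: P(any leaked item)={p_any:.2f}, "
          f"expected leaked items on form={expected:.1f}")
```

With a 200-item bank, a quarter of the exam is compromised on average; with 5,000 items, a leak of the same size barely moves the needle.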
One of the most promising current use cases for AI is helping to generate such questions. Generative AI can draft questions that are then reviewed and adjusted as necessary by humans. A few organisations like Duolingo are already doing this, and many more are planning to.
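A minimal sketch of that draft-then-review workflow might look like the following. This is my own illustration using the OpenAI Python SDK; the model name and prompt are assumptions, and nothing generated should enter a live item bank without human review:

```python
# Illustrative sketch of AI-assisted item drafting (assumes the OpenAI
# Python SDK is installed and OPENAI_API_KEY is set in the environment).
from openai import OpenAI

client = OpenAI()

def draft_items(topic: str, n: int = 5) -> str:
    """Ask the model to draft multiple-choice items for human review."""
    prompt = (
        f"Draft {n} multiple-choice questions on '{topic}'. "
        "For each: a stem, four options, the correct answer, and a rationale."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Drafts go into a review queue, not straight into the live item bank.
print(draft_items("network security fundamentals"))
```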
2. Alignment on vocabulary to discuss test security
Less glamorous but also important: the community seems aligned on the vocabulary needed to talk about test security accurately. There is consensus that test security needs to be risk-based: you consider the threats to your tests and exams, evaluate the risk each threat poses, and then focus your fraud prevention strategies on the most concerning risks.
The 2022 ITC/ATP Technology-based Assessment Guidelines suggest that strategies should be grouped into these three categories:
- Deter. Deterrence strategies discourage test takers from committing test fraud, e.g., by encouraging them to consider the risks of getting caught.
- Prevent. Prevention strategies make it harder for test takers to cheat, e.g., by proctoring or using secure browsers.
- Detect and Respond. These strategies detect test fraud, e.g., by statistical analysis or tiplines, and then respond to it.
Several speakers aligned on these categories, and it’s usually sensible to adopt a mix from all three. This seems a useful way of describing test security, helping to communicate priorities and allocate effort effectively.
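To illustrate the risk-based approach in miniature (a toy example of my own, not taken from the guidelines), a threat register might score each threat by likelihood and impact and map the highest-scoring ones to a mix of deter, prevent, and detect strategies:

```python
# Toy risk register: score threats by likelihood x impact, then prioritise.
# The threats, scores, and strategy labels here are illustrative assumptions.
threats = [
    {"threat": "item harvesting and resale", "likelihood": 4, "impact": 5,
     "strategies": ["deter: honour code", "detect: web monitoring"]},
    {"threat": "proxy test taking", "likelihood": 2, "impact": 5,
     "strategies": ["prevent: ID checks", "detect: keystroke analysis"]},
    {"threat": "unauthorised aids in room", "likelihood": 3, "impact": 3,
     "strategies": ["prevent: proctoring", "deter: clear rules"]},
]

# Highest-risk threats first, with their chosen strategy mix
for t in sorted(threats, key=lambda t: t["likelihood"] * t["impact"], reverse=True):
    score = t["likelihood"] * t["impact"]
    print(f'{score:>2}  {t["threat"]}: {", ".join(t["strategies"])}')
```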
3. Documentation and preparedness are really important
It’s also not glamorous, but one of the keys to test security is to document your processes and procedures and to prepare in advance for all eventualities.
You need to plan for security issues in advance, as you don’t want to rely on ad hoc decision-making. If you suspect someone of cheating or test fraud, it’s best to follow pre-defined procedures for investigating and responding to the issue. This makes your response both more effective and more likely to be fair and defensible.
You also need to communicate the rules of the test very clearly to test takers. They need to know what counts as cheating and what is permissible, including the rules on using AI to help prepare for the exam and which tools (if any) are allowed during the exam itself.
Documentation is a key part of good test security.
4. Increasing focus on inclusivity when considering test security
There is increasing understanding that some test security measures negatively impact inclusivity. For example, remotely proctored tests have commonly required you to stay in front of the computer for the whole test; if you need a break or have to attend to a child, you fail. This is a challenge for some test takers. However, two speakers shared that they permitted short breaks during remotely proctored tests, and that this didn’t seem to impact validity.
I was pleased to present with Liberty Munson of Microsoft and Ben Hunter of Caveon on this important issue.
Liberty Munson of Microsoft, Ben Hunter of Caveon and John Kleeman of Learnosity at the Conference on Test Security
If someone passes a test because of cheating, the validity of the test is impacted negatively.
But if someone fails a test, or cannot take it, because it is inaccessible or biased against a cultural group, that also negatively impacts validity. For example, if a test requires a government ID to sit the exam, some disadvantaged groups without easy access to a passport or driving licence may be unable to take it.
We suggested that when selecting test security measures, you need to consider whether they have any diversity or inclusivity impact and make an appropriate balancing decision.
5. What else might AI be able to help with?
As well as using AI in test development, there are many other opportunities for using AI in testing and test security.
There are exciting possibilities in using AI to identify test fraud as a supplement to more mechanistic analytic measures. AI is good at classifying and identifying information, but it is often neither transparent nor explainable, and it can make mistakes. So if AI flags possible test fraud, you probably cannot act on the flag directly, but it can surface issues for further investigation.
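As a hedged example of the kind of mechanistic screen such AI might supplement, the sketch below flags candidates whose response times are implausibly fast relative to the cohort. The data, threshold, and method are my own illustrative assumptions, and a flag is only ever a prompt for human investigation:

```python
from statistics import mean, stdev

def flag_fast_responders(times_by_candidate: dict[str, list[float]],
                         z_cutoff: float = -2.0) -> list[str]:
    """Flag candidates whose mean per-item response time is unusually fast
    relative to the cohort. A flag is a prompt for human review,
    never proof of fraud."""
    means = {c: mean(t) for c, t in times_by_candidate.items()}
    cohort_mean = mean(means.values())
    cohort_sd = stdev(means.values())
    return [c for c, m in means.items()
            if (m - cohort_mean) / cohort_sd < z_cutoff]

# Illustrative data: seconds per item for eight candidates;
# candidate C answers far too quickly and gets flagged for review.
times = {
    "A": [48, 50, 52], "B": [50, 52, 54], "C": [7, 8, 9], "D": [46, 48, 50],
    "E": [53, 55, 57], "F": [47, 49, 51], "G": [51, 53, 55], "H": [49, 51, 53],
}
print(flag_fast_responders(times))  # ['C']
```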
There is also a lot of interest, and some practical use, in AI assisting with proctoring/invigilation, including flagging possible issues for human review. AI may also help monitor human proctors themselves, identifying those who need intervention or training to proctor better.
Other use cases for AI include automated scoring of short answers and other questions and automated blueprinting.
One of the presenters shared this quote from the Harvard Business Review: “AI won’t replace humans – but humans with AI will replace humans without AI.” In my view, this will hold true in testing for the next few years.
I hope this report is helpful for creating your own test security strategy.
I’m also pleased to report that it wasn’t all hard work. As well as meeting lots of great people, I managed to visit the Desert Botanical Garden in Phoenix, which is stunning and well worth a visit if you are in the area.
By John Kleeman, EVP of Learnosity