From education to employment

Will AI make tests and exams more inclusive?

John Kleeman

Tests and exams are sometimes criticised for being written by the majority and biased against some minorities. With assessments a gateway to life chances in education and work, it’s crucial they are designed for everyone. This article explores whether AI will improve diversity and inclusivity or worsen it.

Tests and exams need to work for everyone

In an ideal world, every learner should be able to use their exam results to showcase their skills and help them reach life goals in education and the workplace. 

In the real world, many learners do leverage exam results to obtain education or work opportunities. The difficulty is that some learners struggle to demonstrate their competence in tests and exams. 

There is increasing understanding that in the past tests were often created by the “majority” or from a mainstream view, without taking into account minority needs. For example, making the exam fair and inclusive to people such as non-native speakers, those with accessibility challenges and those coming from different cultures and demographics was not always a priority. 

According to the 2021 census, around 91% of people in England have English as their main language, but this means that around 9% speak another language. Despite this, almost all tests and exams in England (across different sectors and areas) are presented in English with little accommodation or support for non-English speakers. So, unless speaking and writing good English is part of the construct being tested, non-native speakers are likely disadvantaged. 

There can also be cultural or demographic bias within exams. A widely reported recent example was when a UK GCSE German exam asked students to describe the advantages and disadvantages of a skiing holiday. Of course, the issue with this question is that less affluent students will be less familiar with skiing than those from families who can afford holidays abroad. 

A global challenge

The need for tests to be more inclusive is not just a UK issue; it is a challenge globally. In the US, for example, there is controversy around college and other admissions exams favouring the children of more affluent families. US universities and colleges are increasingly becoming “test-optional”, where you can take an admissions test but do not need to.

Internationally and in the UK, there is a lot of movement in this area. In 2022, Ofqual, the English exam regulator, published some excellent guidance on making assessments accessible for students. When I am talking to international colleagues about good practice in making assessments inclusive, I often share Ofqual’s world-leading guidance.

However, despite the wealth of good information on inclusivity, there are still significant challenges for many test takers. Simply put, tests and exams need to be fairer and more inclusive. And until they are, some people are going to miss out on life opportunities.

What about AI?

As we all know, AI is rapidly changing the landscape of learning and assessment. We are at the very beginning of the AI revolution and already there are huge impacts. Could this help with inclusivity and diversity efforts? 

AI is already being used to create questions. For example, my company Learnosity has produced a tool called Author Aide, which allows generative AI to be used to increase the productivity of item writers. And many other assessment organisations are starting to use AI to write questions or help increase the size of item banks. 

Of course, AI can also be used by students to subvert tests and exams. Generative AI like ChatGPT can be used to write take-home essays or even to answer multiple choice and other objective questions. 

That said, there are two sides to the AI coin. For example, AI is also being used to try to detect cheating via programs like automated remote proctoring systems. It’s also being used to attempt (not always successfully) to identify whether essay text written for an exam has been written by a human or by AI.

Other uses of AI will come soon.

A good question to ask is whether AI is going to help assessments become more inclusive and respect diversity. Of course, we don’t know how AI will develop. My fear is that in the short term, it may not. My hope is that in the long run, it will significantly help.

The case for AI making assessments less inclusive

Much AI needs training, and often it is trained on material that includes human bias. For example, ChatGPT has been trained on a lot of material in the open Internet, and so reflects the bias that exists there.

As an illustration, I just asked ChatGPT the following question:

A doctor and a nurse have a drink at the bar. She paid because she is more senior. Who paid?

ChatGPT answered:

In this scenario, the nurse paid for the drink. She paid because she is more senior than the doctor.

Because ChatGPT has been trained on material that suggests that nurses are more likely to be female, the program has assumed that the nurse is female and the doctor is male, so therefore the nurse paid. Of course, it could be otherwise.

There are lots of other examples of generative and other AI being biased. Credit cards that use AI to help with credit scoring have been accused of giving female applicants lower credit limits than equivalent males. Facial recognition that uses AI has been criticised for working less well on deeper skin tones and for making assumptions based on stereotypes. 

A particularly concerning accusation of bias is in using AI to try to detect AI-generated text in essays. Due to the concern that students might get AI to write essays, many universities and colleges are using computer programs which flag possible use of AI. However, initial evidence suggests that non-native speakers may be unfairly penalised by such detectors. Someone who is not a native English speaker but writes in English is more likely to be unfairly accused of having used AI than a native English speaker.

This illustrates some risks in using AI to improve assessments and how it might even actively make tests and exams less inclusive.

How AI could help make assessments inclusive

However, we are at the very beginning of the AI revolution. Today’s AI is the worst it will ever be: it can only improve from here.

So my hope is that as we understand AI more and develop it further, it will start to help assessments be more inclusive and take more account of diversity. For example, it may be possible to use AI to scan through a large bank of questions and identify those that may have bias and flag them for improvement. And it may—very probably will—allow for scaling of high-quality, personalised learning, which will give more of a level playing field for people to learn and prepare for tests.

Added to this, we will also learn to train AI better. With careful data curation and bias mitigation techniques during AI model training, AI-aided assessment can avoid promoting stereotypes or cultural biases in exams. AI also offers the promise to make assessments more accessible and more adaptive. AI can also offer real-time feedback to learners to help them improve.

As a ray of hope, I asked my earlier question to ChatGPT in a slightly different way:

A doctor and a nurse have a drink at the bar. Who should pay?

The AI gave a much more neutral response and closed its answer with:

Ultimately, the doctor and the nurse should communicate and come to a mutual agreement on how they want to handle the payment. The key is to be considerate and respectful of each other’s preferences and circumstances.

I think there is a good chance of future AI acting considerately and respectfully and significantly improving how we assess.

By John Kleeman, founder of Questionmark and EVP of Learnosity
