Scientists have been left shocked after discovering that an artificial intelligence (AI) chatbot can write research-paper abstracts so convincing that experts are often unable to spot them. The chatbot, called ChatGPT, is a free-to-use tool created by the software company OpenAI, based in San Francisco. It produces realistic, intelligent-sounding text in response to user prompts, and is based on neural networks that learn to perform a task by digesting huge amounts of existing human-generated text.
Researchers at Northwestern University in Chicago used ChatGPT to generate artificial research-paper abstracts to test whether scientists could spot them. They asked the chatbot to write 50 medical research abstracts based on a selection published in reputable journals such as JAMA, The New England Journal of Medicine, The BMJ, The Lancet, and Nature Medicine. The researchers then compared these with the original abstracts by running them through a plagiarism detector and an AI-output detector, and they asked a group of medical researchers to spot the fabricated abstracts.
The results were worrying: the ChatGPT-generated abstracts sailed through the plagiarism checker, receiving a median originality score of 100%, which indicates that no plagiarism was detected. The AI-output detector spotted 66% of the generated abstracts, and the human reviewers did not do much better: they correctly identified only 68% of the generated abstracts and 86% of the genuine ones. In other words, they mistook 32% of the generated abstracts for real ones and wrongly flagged 14% of the genuine abstracts as generated.
Sandra Wachter, who studies technology and regulation at the University of Oxford, said: “I am very worried. If we’re now in a situation where the experts are not able to determine what’s true or not, we lose the middleman that we desperately need to guide us through complicated topics.” She added that if scientists can’t determine whether research is true, there could be “dire consequences” not only for researchers but also for society at large, as scientific research plays such a huge role in society and incorrect research-informed policy decisions could be made.
However, other experts were less concerned. Arvind Narayanan, a computer scientist at Princeton University, said: “It is unlikely that any serious scientist will use ChatGPT to generate abstracts.” He added that whether generated abstracts can be detected is “irrelevant”, because the tool cannot produce an abstract that is both accurate and compelling; the upside of using ChatGPT is therefore minuscule, while the downside is significant.
The authors of the study suggest that those evaluating scientific communications, such as research papers and conference proceedings, should put policies in place to stamp out the use of AI-generated abstracts. They also call for more research to be done to understand the ethical and acceptable use of large language models in scientific writing.