The world of literature is grappling with a crisis of identity: can we really tell whether a novel was written by a human or a machine? Allegations of Large Language Model (LLM) use are sparking heated debates and raising questions about how we distinguish between the two. But new research suggests that our ability to spot AI-generated text is not as sharp as we think.
Professor Claire Hardaker, a forensic linguist at Lancaster University, has been studying our ability to identify AI writing, with surprising results. Her online test, 'Bot or Not', shows that the average person correctly identifies AI-generated text only about 60% of the time. This is a far cry from the confidence many express in their ability to sniff out AI writing – as seen in swift social media condemnations following recent controversies.
But what's behind our lack of accuracy? Hardaker says it's down to simplistic rules of thumb, such as looking for clichés or the use of dashes. However, these stylistic traits are also deeply embedded in human writing, which LLMs are trained on. "You could go back to Charles Dickens and say he had AI, because he used the em dash too," Hardaker notes wryly, highlighting that rhetorical devices like the 'rule of three' have been used by humans for centuries.
This uncertainty has created a climate of suspicion within the literary world. Accusations of AI use are now affecting authors and publishers, with some books being withdrawn from circulation or made the subject of printed apologies. Media organisations are also receiving complaints from readers suspicious of AI-generated content – often citing specific phrases or grammatical errors as indicators.
The problem is compounded by a 'linguistic hall of mirrors' effect: not only does AI learn from human writing, but human writers are increasingly influenced by AI. This interplay makes definitive identification extremely difficult without an author's admission. Hardaker also warns against relying on commercial AI screening tools, which she describes as unreliable. Some human writing styles might naturally be flagged as AI-like, while AI output can be modified to appear more human – leading to 'wacky results' from detectors.