A peculiar digital mystery is unfolding across the internet, with a shadowy figure named Elias Thorne repeatedly appearing in stories generated by popular Artificial Intelligence (AI) chatbots. This recurring character, often depicted as a lighthouse keeper, clockmaker, or detective, has caught the attention of researchers and tech developers, sparking discussions about the fundamental workings and potential vulnerabilities of advanced AI models.
The prevalence of Elias Thorne is not merely anecdotal. Researchers from Cornell University sampled 20,000 stories from four different Large Language Models (LLMs) using simple prompts like "Tell me a story." They found that the name Elias appeared in 26.5% of the generated narratives. Furthermore, 88.3% of the stories drew on a limited pool of 11 names, locations, and professions, including 'Elias', 'lighthouse', 'keeper', and 'clockmaker'. This pattern suggests a narrow range of creative output despite the vast datasets these AI models are trained on.
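To make a figure like that 26.5% concrete, here is a minimal Python sketch of how such a share could be computed from a batch of generated stories. It is not the Cornell researchers' method, which the article does not describe in detail. The function name `share_containing` and the four sample stories are hypothetical, made up purely for illustration.

```python
import re

def share_containing(stories, term):
    """Return the fraction of stories that mention `term` as a whole word."""
    if not stories:
        return 0.0
    # Whole-word, case-insensitive match, so "Eliasson" does not count as "Elias".
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    hits = sum(1 for story in stories if pattern.search(story))
    return hits / len(stories)

# Hypothetical stand-ins for model output; a real study would use thousands.
sample_stories = [
    "Elias Thorne climbed the lighthouse stairs.",
    "Mara opened the bakery before dawn.",
    "The clockmaker, Elias, wound the last spring.",
    "A fox crossed the frozen river.",
]

print(f"Elias: {share_containing(sample_stories, 'Elias'):.1%}")
print(f"lighthouse: {share_containing(sample_stories, 'lighthouse'):.1%}")
```

Running the same count over each recurring name, location, and profession would give a rough picture of how concentrated a model's story output is.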
Experts are exploring the reasons behind this fixation. One leading theory, proposed in the Cornell paper, is that AI models may be instructed to avoid copyrighted characters and adult content when creating stories, a directive that could inadvertently funnel their creative choices into a smaller, safer pool of inspiration. In addition, AI models are known to learn from each other, so a quirk like the Elias Thorne fixation could be rapidly replicated and amplified across different platforms, much like a digital 'virus'.
The impact of Elias Thorne is now extending beyond the initial realm of AI-generated fiction. Software developers have observed the character appearing as a byline on numerous dubious-looking self-published books across various genres on Amazon. He is also increasingly featured in AI-generated YouTube videos. This spread indicates a broader phenomenon that experts are calling "model collapse" or "AI inbreeding."
Model collapse describes a scenario in which future AI models are increasingly trained on data that was itself generated by AI. As more of the internet fills with AI-created content, subsequent generations of AI will learn from this potentially lower-quality, repetitive material. The cycle risks steadily degrading the quality, originality, and usefulness of AI outputs over time. For UK businesses and consumers, understanding this mechanism is crucial as AI integration becomes more widespread.
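The feedback loop can be illustrated with a toy simulation. The sketch below is a deliberately simplified assumption, not a model of how real LLMs are trained: each "generation" is just a frequency table of character names, and the next generation is built only from a finite sample drawn from the previous one. The function `simulate_collapse`, the placeholder names, and all parameter values are hypothetical.

```python
import random
from collections import Counter

def simulate_collapse(names, generations=20, sample_size=30, seed=0):
    """Track how many distinct names survive when each generation
    learns only from samples produced by the previous generation."""
    rng = random.Random(seed)
    # Generation 0: every name is equally likely (stand-in for human-written data).
    weights = {name: 1.0 for name in names}
    distinct_per_generation = [len(weights)]
    for _ in range(generations):
        population = list(weights)
        # The current "model" generates a finite batch of content...
        samples = rng.choices(
            population, weights=[weights[n] for n in population], k=sample_size
        )
        # ...and the next "model" is fitted to that batch alone.
        weights = {name: float(count) for name, count in Counter(samples).items()}
        distinct_per_generation.append(len(weights))
    return distinct_per_generation

# Hypothetical pool of 50 placeholder names.
history = simulate_collapse([f"name_{i}" for i in range(50)])
print(history)
```

In this toy setup a name that fails to appear in one generation's sample can never return, so the number of distinct names can only stay level or shrink. That mirrors, in a very reduced form, the loss of variety the article describes.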