Authors who don’t exist – Bigmouth Strikes Again: Carrie Marshall's blog

Meet Jason N. Martin, the author of the exciting and dynamic Amazon bestseller “How to Talk to Anyone: Master Small Talks, Elevate Your Social Skills, Build Genuine Connections (Make Real Friends; Boost Confidence & Charisma)”.

Except you can’t meet him, because he doesn’t exist. He’s an AI-generated character with an AI-generated face credited with writing an AI-generated ebook with an AI-generated cover. Both the cover and the content are likely based on content that’s been plagiarised: most of the large language and content models used for AI generation have been fed with real humans’ work in order for them to emulate it without credit or, of course, payment.

Once you’ve found Jason, Amazon will recommend another 11 just like him.

Between the synthetic faces, the use of repetitive and potentially AI-generated text, and art copied from other sources, the 57 books that these authors have published over the last two years may well contain almost no original human-generated content, and Amazon’s algorithms in their current state have the unfortunate effect of worsening the problem by recommending additional inauthentic books or authors once a customer stumbles upon one of them.

Amazon isn’t the only place this is happening, and books aren’t the only sector it’s happening in: there’s a flood of computer-generated content in everything from music to furniture listings. Just the other day Amazon’s listings were full of products called “I’m sorry but I cannot fulfill this request it goes against OpenAI use policy”. X/Twitter is already full of ChatGPT bots posting, and your search engine results are starting to fill up with AI-generated content too. I’ve been trying to research some products recently and it’s been like swimming through treacle: so much content returned by search engines is completely useless now.

The odd listings are most likely the result of dropship sellers using ChatGPT to write everything from product descriptions to product names in huge volumes, but they’re a good example of the pernicious creep of AI into almost everything online – partly due to tech platforms’ lack of interest in removing useless content. Sometimes it’s funny – ChatGPT confidently informed me that I died a few years ago – but it’s increasingly replacing actual information in your search results. And then that bad information becomes the source data for the next generation of AI articles.

That could mean AI is an ouroboros, a snake eating its own tail: the more AI-generated content there is, the more AI will use that content as its source – and that means the very many errors AI systems are currently making will cascade. AI researchers have a name for the potential outcome: model collapse. It means that language models trained on too much AI-generated data drift so far from the human-made originals that their results are useless at best and absolute gibberish at worst.

There’s a famous saying in tech: garbage in, garbage out. Thanks to AI, we’re currently seeing that happen on an epic scale.