From cloning romance authors to YouTube piracy, AI is transforming audiobooks

News on AI and audiobooks is coming thick and fast. Australia-based audiobook producer Bolinda recently announced it will create a “bespoke” AI clone of romance bestseller Barbara Cartland’s voice, in partnership with her estate. (She died in 2000.)

Two days later, Spotify announced a tool (created by synthetic voice company ElevenLabs) that will allow self-published authors to create audiobooks voiced by AI on its platform, and publish them anywhere.

Meanwhile, a recent New York Times exposé revealed AI-enabled audiobook piracy on a massive scale on YouTube, with versions appearing of everything from literary fiction to Harry Potter, business bestsellers to John Grisham. A pirated version of his latest legal thriller, The Widow, accompanied by an “AI slop” video, has over 80,000 views. Listeners called the voice “boring” and “awful”.

“If you look up any best seller, you find a free audiobook on YouTube,” said the chief executive of the United States Authors Guild. A 2025 survey found that 35% of audiobook consumers had listened to a YouTube audiobook – and that AI-narrated audiobooks now account for 23% of new releases.

Around 17% of Australian audiobook listeners have (knowingly) listened to an AI audiobook, according to my own recent survey of over 500 Australian audiobook listeners. This rate is higher among listeners with vision impairments and other disabilities, who have long used AI for accessibility reasons – and should be centred in these discussions.

How have AI voices in audiobooks evolved? And where are they heading?

A pirated YouTube audiobook of John Grisham’s latest legal thriller has over 80,000 views.
Bruce Newman/The Oxford Eagle/AAP

The evolution of AI voices

The large language models behind ChatGPT and Claude map the relationship between words across billions of pieces of text. Similar models map sound patterns across recorded speech to produce contemporary “AI voices”.

AI voices were originally used for accessibility. The first automated text-to-speech system was created in 1968 by a Japanese research laboratory. IBM developed the first screen reader technology in the early 1980s and, in 1986, introduced its first screen reader for general use on personal computers.

This text-to-speech technology was originally for vision-impaired readers, who were the first to embrace it.

But as AI voices became more convincing, concern about their impact on human-narrated audiobooks grew. In 2009, the US Authors Guild blocked implementation of the Kindle 2’s text-to-speech function, claiming it infringed authors’ audiobook rights.

Many high profile authors argued against the decision and its impact on accessibility. “The day that artificial intelligence gives us perfect Kindle readings, we’ll have bigger fish to fry than audiobook rights,” science fiction and tech author Cory Doctorow wrote in the Guardian. He called the idea that computer narration might ever seriously rival human narration “nonsensical”.

Voice clones and pirates

Swedish company Storytel, the largest streaming platform in Nordic markets, reported in 2024 that nine out of ten listeners “could not tell which narration was human” when it tested the AI-generated voices in its Voice Switcher program.

Like Spotify, Storytel uses ElevenLabs AI technology. With Voice Switcher, listeners can choose between the original human narrator, three different AI-generated voices, or an AI version of popular Swedish actor and narrator Stefan Sauk, who has licensed his voice to Storytel.

Only a handful of Barbara Cartland’s 723 novels were available as audiobooks before her estate signed an exclusive agreement with Bolinda, the leading producer of Australian audiobooks. Bolinda started by distributing accessibility materials, such as large print and talking books, in 1986, and moved to audiobooks in 1995.

Cartland’s voice clone will be used to frame the beginning and end of her audiobooks, while human narrators will continue to narrate the books themselves. Even for this limited use, Cartland fans have described the announcement as “creepy”, “haunting”, “gross” and “disappointing” on social media.

Barbara Cartland’s voice clone will frame her audiobooks.
Barbara Cartland/AAP

Voice clones are being put to worrying uses. Along with other “deepfakes” (artificially generated imitations of real people), they prompted the UN to publish a “wake-up call” on organised fraud in March. Audiobook publishing is not immune.

Recordings of Stephen Fry reading the Harry Potter series were used to generate an illegal clone of his voice in 2023. And this year, author Shaun Rein discovered deepfakes of himself on YouTube, reading chapters of his book. “The voice clone was probably created from the author’s publicly available interviews,” wrote publishing commentator Jane Friedman.

Piracy is a problem for digital content in general – including audiobooks. YouTube addresses piracy by automatically scanning uploads to see whether they match material in its massive database of copyrighted content. Pirates alter recordings or add bracketing material to try to evade the scan. Publishers told the New York Times that the program, built for music, is “less effective” with audiobooks, where “even slight changes – like shifts in speed, pitch or voice, or added background noise or music – can prevent a match”.
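For readers curious why small alterations can defeat automated matching, here is a toy illustration in Python. It is not YouTube’s system, whose details are not described in this article: real matching tools use fingerprints designed to tolerate some distortion, whereas this sketch assumes the crudest possible approach, exact hashes of fixed-size chunks. The function names (`fingerprints`, `match_ratio`) and the randomly generated “recording” are hypothetical and exist only for this example.

```python
import hashlib
import random


def fingerprints(samples, chunk=1000):
    """Hash fixed-size chunks of audio samples.

    Assumption: an exact hash stands in for a real audio fingerprint,
    which would be far more tolerant of small changes.
    """
    return {
        hashlib.sha256(bytes(s & 0xFF for s in samples[i:i + chunk])).hexdigest()
        for i in range(0, len(samples), chunk)
    }


def match_ratio(reference, upload):
    """Share of the reference's chunk fingerprints also found in the upload."""
    ref = fingerprints(reference)
    return len(ref & fingerprints(upload)) / len(ref)


# A stand-in "recording": 8,000 pseudo-random samples (seeded, so repeatable).
rng = random.Random(0)
original = [rng.randint(-100, 100) for _ in range(8000)]

# An untouched copy of the recording.
exact_copy = list(original)

# A lightly altered copy: every 50th sample nudged by 1,
# a crude stand-in for faint added background noise.
altered = [s + 1 if i % 50 == 0 else s for i, s in enumerate(original)]

print(match_ratio(original, exact_copy))
print(match_ratio(original, altered))
```

In this toy model the untouched copy matches on every chunk, while the lightly altered copy matches on none, because a single changed sample changes a chunk’s hash. Real systems are much more forgiving than this, but the sketch shows the general tension the publishers describe: the more an upload is altered, the harder it is to match against the reference recording.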

Recordings of Stephen Fry reading Harry Potter audiobooks were used to generate an illegal clone of his voice.
Andy Rain/AAP

Audible, Spotify and Project Gutenberg

Audible, owned by Amazon, began implementing AI-voiced audiobooks in late 2023. A year later, it added a service that lets select narrators create and monetise replicas of their own voices.

The other major global player in audiobooks, Spotify, first offered AI-narrated audiobooks in 2023, the year it launched its audiobook business.

Last year, it began accepting audiobooks narrated using ElevenLabs’ AI voice technology, which lets self-publishers create an audiobook with a voice from a catalogue, or create their own voice clone. The catalogue includes trademarked clones of actors like Michael Caine. And now, self-publishers can create AI-voiced audiobooks on Spotify itself.

Commercial and pirate audiobooks sit alongside projects like public domain repository Project Gutenberg’s free catalogue of 5,000 AI-narrated audiobooks of out-of-copyright books, created by Microsoft and MIT. It was named one of the best inventions of 2023 by TIME magazine.

The future of audiobooks

Voice actors are concerned about the erosion of skilled jobs and the use of cloning technologies to infringe on their vocal rights. Unions and advocacy groups are actively campaigning for tighter regulatory controls. And authors and publishers want action on YouTube piracy.

These issues are intensified by the important ethical and environmental questions raised by AI use. Legislators, technology companies and major commercial players have a responsibility to ensure AI narration technologies are made and used transparently and ethically.

But there is no one way to read a book. Only a fraction of books published will ever be available as human-narrated audiobooks, due to the significant time and expense of making them. And for many readers – those with vision impairments or some forms of neurodivergence, for instance – audiobooks are an essential resource.

Human performance offers a gold standard listening experience: expressive, immersive and authentic. But AI narration has a growing role in the audiobook’s future.