The Most Lifelike AI Voices with Victoria Weller of Eleven Labs

The Deload Podcast

A discussion about enabling "Her"-like AI experiences

Doug Clinton

Aug 21, 2023

The Deload Podcast interviews CEOs, founders, and builders of frontier technology companies that are transforming how we live. The mission of the podcast expands that of The Deload: to educate you about where the world is going so you can make smarter growth investments.

Disclaimer. The Deload is a collection of my personal thoughts and ideas. My views here do not constitute investment advice. Content on the site is for educational purposes. The site does not represent the views of Deepwater Asset Management. I may reference companies in which Deepwater has an investment. See Deepwater’s full disclosures here.

Additionally, any Intelligent Indices strategies referred to in writings on The Deload represent strategies tracked as indexes that are not investable. References to these strategies are for educational purposes as I explore how AI acts as an investor.

Eleven Labs and Lifelike AI Voices

Victoria Weller is the Chief of Staff at Eleven Labs. Eleven Labs is a generative voice AI platform that creates some of the most lifelike text-to-speech content on the market. You can even use Eleven Labs to create an AI clone of your own voice, which we talk about in the episode.

Here are my key notes from the conversation with Victoria:

LLMs were a breakthrough not only for better AI chat but also for creating AI voice. LLMs give AI voice products the ability to understand more of the context of a text. Prior AI voices would just stitch together the sounds of prerecorded words, whereas current AI voice tools can predict what's coming and adjust things like tone accordingly. The result is a more lifelike output.
An underrated use case for AI voice is accessibility. Eleven Labs has seen strong adoption from the visually impaired community, who find it a better way to engage with information. The same AI voice tech can give a voice to people who have impaired speech, maybe even one modeled on their own.
Companionship is another use case. Giving a voice to a computer, like in the movie Her, makes the computer seem more human. As people integrate AI more into homes and as assistants, voice gives those tools a natural layer that should make them more engaging and friendly.
Other current use cases for voice AI include content creators who use it for editing and creation and gamers who use it to give voices to avatars. There's also potential in media, entertainment, and education. Call centers will be harder because there are many other elements involved, including decision making, that voice AI alone can't address. Real-time responses are a challenge for use cases where people expect immediate interaction.
Eleven Labs offers a service to clone your own voice. You can do an instant clone with just a short recording of your voice or a professional clone that requires at least 30 minutes of audio.
Deepfakes will happen with AI voice tools. Eleven Labs is working on guardrails to prevent the production of deepfakes as well as ways to identify deepfakes that are created. One way is by policing its terms of service and banning users who violate them. They also released an AI speech classifier where users can upload a piece of audio to assess whether it was created by Eleven Labs.
Deepfakes are still so new that the public doesn't default to the idea that "this might not be real." Part of the solution will be educating the public, and people will need to approach certain content with skepticism until they can be sure it's real.

Visit the Eleven Labs website to create your own AI voice: https://elevenlabs.io

Victoria’s Twitter: https://twitter.com/vic_weller

Victoria’s LinkedIn: https://www.linkedin.com/in/victoria-w-418b3449/

Episode Time Stamps

1:10 How Eleven Labs works
3:00 Why LLMs cracked the code for better AI voice
4:55 The unique value of accessibility in audio
8:08 Using voice as part of Personal AI products
11:00 Other use cases for voice AI: Creators, gamers, education
15:05 Cloning your own voice
19:25 Addressing deepfakes
24:41 How to get ahead in learning about AI


Intelligent Indices Update: ChatGPT is Ahead of the S&P 500

Speaking of Victoria’s advice to experiment with AI as the best way to learn about it, I started experimenting with ChatGPT, Bard, and Claude about a month ago to see if the world’s best AI tools can outperform the market. The results so far are promising.

The flagship Intelligent Select is ahead of the S&P 500 by 50 bps since inception, and the Intelligent Tech Select has outperformed the Nasdaq 100 by 90 bps. Most impressive so far is the Intelligent Select Equal, which has outperformed the S&P 500 Equal Weight index (RSP ETF) by 170 bps.
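
For readers less familiar with basis points, here is a minimal sketch of how a gap like the ones above is computed. One basis point (bp) is 0.01 percentage points, so 50 bps is 0.50 percentage points. The function names and the return figures below are hypothetical and chosen only to illustrate the arithmetic; they are not the actual returns of any Intelligent Indices strategy or benchmark.

```python
def excess_return_bps(strategy_return: float, benchmark_return: float) -> float:
    """Outperformance of a strategy over its benchmark, in basis points.

    Both inputs are decimal returns over the same period (0.025 means 2.5%).
    One basis point is 0.0001 in decimal terms, hence the factor of 10,000.
    """
    return (strategy_return - benchmark_return) * 10_000


def bps_to_percentage_points(bps: float) -> float:
    """Convert basis points to percentage points (100 bps = 1 point)."""
    return bps / 100


# Hypothetical one-month returns, for illustration only.
strategy = 0.025   # assumed 2.5% return for a strategy
benchmark = 0.020  # assumed 2.0% return for its benchmark

gap = excess_return_bps(strategy, benchmark)
print(f"Excess return: {gap:.0f} bps "
      f"({bps_to_percentage_points(gap):.2f} percentage points)")
```

The same calculation applies to each of the three comparisons above; only the pair of returns changes.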

Yes, it’s one month of data, but the experience with Intelligent Indices has led me to a few conclusions:

AI is already capable of creating usable stock indices with intelligently engineered prompts, and it’s only going to improve over time.
As AI influences investment processes more, we need intelligent benchmarks as barometers for AI-enhanced investment strategies.
As in many other industries with entrenched dominant players, AI will create the potential for something new and dynamic to disrupt legacy indices like the S&P 500.

It’s hard to imagine a future where there isn’t an AI-powered index scrolling on CNBC all day next to the S&P.

The future of indexing is intelligent.

Follow Intelligent Indices updates on Thematic and LinkedIn.