Meta reveals AI tool that lets you edit audio, speak in six languages, and more

Meta believes the tool could be used to give virtual assistants natural-sounding voices in the future.
Published on June 16, 2023

TL;DR
  • Meta has debuted a new generative AI audio tool called Voicebox.
  • The tool can handle a variety of tasks, including speech editing, noise reduction, text-to-speech synthesis, and cross-lingual style transfer.
  • Meta says Voicebox is part of its generative AI research.

Although Microsoft and Google tend to dominate AI-related headlines, plenty of other companies are also rushing to develop AI products, including Meta. To that end, the social media giant has just introduced its first entry into the space.

Today, Meta revealed in a blog post that it has been working on a generative AI tool for speech. The firm says the tool, called Voicebox, can perform a variety of speech-generation tasks “that it wasn’t specifically trained to do through in-context learning.”

According to Meta, some of these tasks include in-context text-to-speech synthesis, speech editing, noise reduction, cross-lingual style transfer, and diverse speech sampling. Here’s how the company describes these features:

  • In-context text-to-speech: Uses an audio sample as short as two seconds to match its audio style, then applies that style to text-to-speech generation.
  • Speech editing and noise reduction: The tool can recreate a portion of speech that was interrupted by noise or replace misspoken words without having to re-record.
  • Cross-lingual style transfer: The tool can take a sample of speech and a passage of text to produce a reading of the text in English, French, German, Spanish, Polish, or Portuguese.
  • Diverse speech sampling: Uses diverse data to generate speech more representative of how people talk in the six languages mentioned previously.

The organization says Voicebox is a part of its research on generative AI. As for its utility, Meta states:

In the future, multipurpose generative AI models like Voicebox could give natural-sounding voices to virtual assistants and non-player-characters in the metaverse. They could allow visually impaired people to hear written messages from friends read by AI in their voices, give creators new tools to easily create and edit audio tracks for videos, and much more.

If you want to see an example of Voicebox, you can head over to Meta’s blog and watch the video posted there.