Brain Signals Are Converted Into Speech Using Artificial Intelligence

In an effort to help people who cannot speak, neuroscientists have developed a device that can convert brain signals into speech. The technology is not yet mature enough for use outside the laboratory, but it can already synthesize whole sentences that listeners mostly understand. The creators of the speech decoder described it in an article published in the journal Nature on April 24.

Scientists have used artificial intelligence in the past to convert brain signals into single words, mostly of one syllable, says Chethan Pandarinath, a neuroengineer at Emory University in Atlanta, Georgia, who wrote a commentary on the article. “Leaping from one syllable to sentences is technically challenging and, in part, that's why the work is so impressive,” he says.

Converting movement into sound

Many people who have lost the ability to speak communicate using devices that require them to make small movements to steer a cursor and select letters or words on a screen. One famous example was the British physicist Stephen Hawking, who had motor neuron disease. He used a speech device activated by a cheek muscle, says study leader Edward Chang, a neurosurgeon at the University of California, San Francisco.

Because people using such devices must type words letter by letter, the devices can be very slow, "speaking" at most ten words per minute, Chang says, whereas natural speech averages about 150 words per minute. "This is due to the efficiency of the vocal tract," he says. So Chang and his team decided to model the vocal tract when building their speech decoder.

The scientists worked with five people who had electrodes implanted on the surface of the brain as part of their treatment for epilepsy. First, the researchers recorded brain activity while the participants read hundreds of sentences aloud. Chang and colleagues then combined these recordings with data from previous experiments on how movements of the tongue, lips, jaw, and larynx produce sound.

Using these data, the scientists trained a deep-learning algorithm and then built it into their decoder. The device translates brain signals into estimated movements of the vocal tract and converts those movements into synthetic speech. People who listened to 101 synthesized sentences understood an average of 70% of the words, Chang says.
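
That two-stage design can be pictured in code. The sketch below is illustrative only, not the authors' published model: the electrode count, the feature counts, the network sizes, and the use of PyTorch are all assumptions. Stage one maps neural recordings to vocal-tract movement features; stage two maps those movements to acoustic features that a vocoder could render as audio.

import torch
import torch.nn as nn

N_ELECTRODES = 256   # assumed number of recording channels
N_ARTIC = 33         # assumed vocal-tract (articulatory) features
N_ACOUSTIC = 32      # assumed acoustic features for a vocoder

class BrainToArticulation(nn.Module):
    """Stage 1: neural activity -> vocal-tract movement features."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_ELECTRODES, 128, batch_first=True, bidirectional=True)
        self.out = nn.Linear(256, N_ARTIC)  # 256 = 2 directions x 128 units

    def forward(self, x):            # x: (batch, time, electrodes)
        h, _ = self.rnn(x)
        return self.out(h)           # (batch, time, articulatory features)

class ArticulationToAcoustics(nn.Module):
    """Stage 2: vocal-tract movements -> acoustic features."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.LSTM(N_ARTIC, 128, batch_first=True, bidirectional=True)
        self.out = nn.Linear(256, N_ACOUSTIC)

    def forward(self, a):
        h, _ = self.rnn(a)
        return self.out(h)

# Chaining the stages: neural recording -> kinematics -> acoustics.
stage1, stage2 = BrainToArticulation(), ArticulationToAcoustics()
ecog = torch.randn(1, 200, N_ELECTRODES)   # 200 time steps of fake data
acoustics = stage2(stage1(ecog))
print(acoustics.shape)                     # torch.Size([1, 200, 32])

The intermediate articulatory step is the key design choice here: as Riès notes below, speech decoded through vocal-tract movements appears easier to understand than speech decoded directly from brain signals into sound.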

In another experiment, the scientists asked one of the participants to read sentences aloud and then to silently mouth the same sentences. The sentences synthesized from the mouthed speech were of poorer quality than those synthesized from spoken speech, Chang says, but the results are still encouraging.

Easily understood synthesized speech is a matter for the future

Speech synthesized by converting brain signals into vocal-tract movements and then translating those movements into sound is easier to understand than speech synthesized by converting brain signals directly into sound, says Stephanie Riès, a neuroscientist at San Diego State University in California.
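
For contrast with the two-stage sketch above, the direct approach Riès mentions would map neural activity straight to acoustics in a single model. Again, this is an illustrative sketch with assumed sizes, not any group's actual architecture.

import torch
import torch.nn as nn

class BrainToAcoustics(nn.Module):
    """One-stage "direct" decoder: neural activity -> acoustic features,
    with no intermediate articulatory representation."""
    def __init__(self, n_electrodes=256, n_acoustic=32):
        super().__init__()
        self.rnn = nn.LSTM(n_electrodes, 128, batch_first=True, bidirectional=True)
        self.out = nn.Linear(256, n_acoustic)

    def forward(self, x):              # x: (batch, time, electrodes)
        h, _ = self.rnn(x)
        return self.out(h)

direct = BrainToAcoustics()
print(direct(torch.randn(1, 200, 256)).shape)  # torch.Size([1, 200, 32])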

But it is unclear whether the new speech decoder will work with words that people only "speak" in their minds, says Amy Orsborn, a neuroengineer at the University of Washington in Seattle. "The article shows really well that the device works with mimed speech," she says. "But how does it work if the person does not move their lips?"

Marc Slutzky, a neuroscientist at Northwestern University in Chicago, Illinois, agrees, and says the speech decoder could be made more effective. He notes that listeners identified synthesized words by choosing from a set of options, and that as the number of options grew, the words became harder to identify.
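
Slutzky's point can be illustrated with a toy simulation. This is in no way the study's actual listening test, just a sketch of why bigger option pools are harder: a listener hears an imperfectly synthesized word and picks the closest candidate, and accuracy falls as the pool grows because it contains more confusable neighbors.

import random

# Candidate words chosen to be confusable with one another.
WORDS = ["ship", "shop", "sheep", "chip", "cheap", "sip", "shape",
         "shore", "chore", "score", "store", "stork", "spore", "snore",
         "share", "stare", "scare", "spare", "square", "swore"]

def edit_distance(a, b):
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[-1] + 1,                 # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def garble(word, rng):
    """Corrupt one character to mimic imperfect synthesis."""
    i = rng.randrange(len(word))
    return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i + 1:]

def identify(pool_size, trials=2000, seed=0):
    """Fraction of trials in which the listener picks the right word."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        pool = rng.sample(WORDS, pool_size)
        target = pool[0]          # sample order is random, so this is a random target
        heard = garble(target, rng)
        guess = min(pool, key=lambda w: edit_distance(heard, w))
        hits += guess == target
    return hits / trials

for size in (5, 10, 20):
    print(f"{size:>2} options -> identification accuracy ~ {identify(size):.0%}")

The exact numbers are arbitrary, but accuracy reliably drops as the pool grows, which matches Slutzky's observation about the listening test.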

This research "is a really important step, but there is still a lot to be done before synthesized speech can be easily understood," Slutzky says.

Giorgia Guglielmi