The Neural Network Heard The Voices Of People And Drew Their Portraits - Alternative View

Neural networks keep surprising us with their abilities: ten years ago, would you have believed that a computer could "animate" portraits of Dostoevsky and Marilyn Monroe? Prepare to be amazed again, because researchers at MIT have created a neural network called Speech2Face that can draw a portrait of a person simply by listening to their voice. The technology is far from perfect, but its ability to infer a person's gender, ethnicity and age is impressive.

To train the neural network, the researchers used the AVSpeech dataset, which contains a million short videos of thousands of people speaking. The video and audio tracks were separated so that the system could study each type of material in as much detail as possible. In the first stage, the VGG-Face algorithm analyzed the video frames and produced a portrait of each speaker facing the camera with a neutral expression. Another part of the system analyzed the spectrogram of the voice and adjusted the resulting portraits accordingly, which yielded an approximate portrait of each speaker.
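
As a rough illustration of what "the spectrogram of the voice" means, here is a minimal sketch in Python with NumPy. It is not the researchers' code: the function name `magnitude_spectrogram`, the frame length, the hop size and the synthetic test tone are all assumptions chosen for this example. It cuts a signal into short overlapping frames and measures how much energy each frame has at each frequency.

```python
import numpy as np

def magnitude_spectrogram(signal, frame_len=512, hop=256):
    """Return a (frames, frequency bins) array of short-time FFT magnitudes.

    Illustrative helper only; frame_len and hop are assumed values.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1))

# Synthetic stand-in for a voice recording: one second of a 220 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 220 * t)

spec = magnitude_spectrogram(tone)

# Find the frequency bin with the most energy on average and convert it to Hz.
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 512
print(spec.shape, peak_hz)
```

For a real recording, each row of the array describes a fraction of a second of speech. A table of this kind, rather than the raw sound wave, is what a voice-analyzing network would typically receive as input.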

A neural network for creating voice-based portraits is already a reality

If you compare a person's face in the video with the version produced by the algorithm, you will find many differences. However, the researchers say that they never set out to create an exact likeness: many factors affect the tone and intonation of the human voice, so a perfect result would not have been achievable anyway. The neural network does, however, do a good job at what matters to the researchers, namely determining gender, ethnicity and age.

The authors note that the algorithm is currently weak at determining age, but say that they can improve its accuracy. They also found that the algorithm recreates European and Asian faces better, but only because the training videos contained unequal numbers of faces from different ethnic groups.

Why is this neural network needed?

How might this technology be useful in the future? For example, it could someday power a service that automatically generates a user's virtual avatar from their voice. The study also has scientific value: by studying the data, scientists can look for relationships between a person's appearance and their voice. You can listen to the voices and view the portraits reconstructed from them on the project's website.

Ramis Ganiev
