Artificial Intelligence Has Learned To Correctly Recognize Speech Among Noise - Alternative View


Virtual assistants and voice recognition systems have learned to recognize what a person says and to follow spoken commands. But for assistants such as Siri and Cortana, extraneous noise can be a serious problem. Experts from Mitsubishi Electric may help overcome this limitation: they have presented a new technology for separating one person's speech from the surrounding noise.

The Japanese company's technology is called Deep Clustering, and it is built on the principles of machine learning. To begin with, the artificial intelligence learned to separate one person's speech from a general stream of sounds and noises on its own. The neural network splits the incoming audio into separate elements and analyzes each one individually, after which it can process the human voice. It works in a similar way when two or more people are speaking at once.
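The article does not describe the internals of the system, so the following is only a toy sketch of the general idea behind clustering-based separation, not Mitsubishi Electric's implementation. It assumes that each small element of the audio (a time-frequency bin) has already been turned into an embedding vector, groups those vectors with a simple k-means loop, and keeps only the bins assigned to one speaker. The names `cluster_bins` and `apply_mask` and all numbers are hypothetical; in a real system a trained neural network would produce the embeddings.

```python
import math


def cluster_bins(embeddings, init_centers, iterations=10):
    """Group embedding vectors with plain k-means.

    embeddings: one vector per time-frequency bin (hand-made here;
                assumed to come from a trained network in practice).
    init_centers: starting cluster centres, one per expected speaker.
    Returns a list of cluster labels, one per bin.
    """
    centers = [list(c) for c in init_centers]
    labels = [0] * len(embeddings)
    for _ in range(iterations):
        # Assignment step: each bin goes to its nearest centre.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(e, centers[k]))
            for e in embeddings
        ]
        # Update step: move each centre to the mean of its members.
        for k in range(len(centers)):
            members = [e for e, lab in zip(embeddings, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def apply_mask(magnitudes, labels, speaker):
    """Keep the bins assigned to one speaker and zero out the rest."""
    return [m if lab == speaker else 0.0 for m, lab in zip(magnitudes, labels)]


# Five toy bins: three resemble speaker 0, two resemble speaker 1.
embeddings = [(0.9, 0.1), (1.0, 0.0), (0.1, 0.9), (0.0, 1.0), (0.8, 0.2)]
magnitudes = [3.0, 2.0, 5.0, 1.0, 4.0]

labels = cluster_bins(embeddings, init_centers=[embeddings[0], embeddings[2]])
speaker_0 = apply_mask(magnitudes, labels, speaker=0)
speaker_1 = apply_mask(magnitudes, labels, speaker=1)
```

With these toy values, the bins split cleanly into two groups, and each masked list contains the sound energy of one speaker only. Adding a third speaker would mean passing a third starting centre.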

During the company's demonstration, the system successfully separated the speech of two people speaking the same sentence in different languages into a single microphone. All processing was carried out in real time, with a delay of no more than three seconds. Recognition accuracy was 90 percent; when three people spoke into the microphone, it dropped to 80 percent, which is still a good result. According to the project's authors, Anthony Vetro and Yohei Okato:

“Unlike separating speech from background noise, separating one person's speech from the ‘voice’ noise of other people speaking at the same time is a very difficult task, since different people's voices have many distinctive features. Most systems solve the voice separation problem by installing two or more microphones, but when only one microphone is used, only artificial intelligence can handle the task. This technology can be used wherever highly accurate recognition of voice messages is required, for example in voice control systems for cars, elevators, household appliances and other electronic devices.”

VLADIMIR KUZNETSOV