Now Artificial Intelligence Recognizes Your Voice Even In A Noisy Crowd - Alternative View

2024 Author: Keith Bush. Last modified: 2023-12-16 14:01

Devices like Amazon Echo or Google Home generally respond to commands only when there is a single voice source. In a room full of people, they are useless. Now scientists have set out to fix that.

AI can now separate, in real time, the voices of several people speaking at once. This should give automatic speech recognition a significant boost, and such systems may soon appear in the elevator at your workplace.

The technology, developed by researchers at Mitsubishi Electric Research Laboratories in Cambridge, Massachusetts, was first demonstrated this month in Tokyo.

It uses a machine learning technique called "deep clustering" to identify unique features in the "voiceprint" of different people. It then groups the features belonging to each speaker together, which lets it tell the individual voices apart and accurately reconstruct what each person is saying. The system was trained on 100 English speakers, but it kept voices apart even when a speaker spoke Japanese.
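The grouping idea described above can be illustrated with a toy sketch. This is not the researchers' code. In the real system a trained neural network would turn each small time-frequency patch of the recording into a feature vector. Here those vectors are simply faked with random numbers around two hypothetical speaker "prototypes". The names `make_toy_embeddings`, `kmeans` and `masks_from_labels` are invented for this illustration, and plain k-means stands in for whatever grouping method the actual system uses.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def make_toy_embeddings(owners, seed=0):
    """Fake per-patch feature vectors (assumption: a real system learns these)."""
    rng = random.Random(seed)
    prototypes = {0: (1.0, 0.0), 1: (0.0, 1.0)}  # hypothetical speaker directions
    return [[c + rng.gauss(0, 0.05) for c in prototypes[o]] for o in owners]

def kmeans(points, k, iters=20):
    """Plain k-means; returns one cluster label per point."""
    # Deterministic start: take the first point, then keep adding the point
    # that is farthest from all centres chosen so far.
    centres = [list(points[0])]
    while len(centres) < k:
        centres.append(list(max(points, key=lambda p: min(dist2(p, c) for c in centres))))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centre.
        labels = [min(range(k), key=lambda j: dist2(p, centres[j])) for p in points]
        # Move each centre to the mean of the points assigned to it.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centres[j] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels

def masks_from_labels(labels, k):
    """One binary mask per cluster: 1 where that cluster owns the patch."""
    return [[1 if l == j else 0 for l in labels] for j in range(k)]

# Which (made-up) speaker dominates each of eight time-frequency patches.
owners = [0, 1, 1, 0, 0, 1, 0, 1]
embeddings = make_toy_embeddings(owners)
labels = kmeans(embeddings, k=2)
masks = masks_from_labels(labels, 2)
```

Each mask marks the patches attributed to one speaker. Multiplying the mixed recording's spectrogram by a mask would keep only that speaker's share, which is the reconstruction step in rough outline. The cluster numbers themselves are arbitrary, so only the grouping matters, not which group is called 0 or 1.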

The system can separate and reconstruct the speech of two people speaking into one microphone with 90% accuracy. With three speakers, accuracy drops to 80%. In either case, the system had never heard the people it analyzed before.

In preliminary tests, the AI distinguished up to five voices speaking simultaneously. The technology could be used both in home devices and in automatic speech recognition systems.

Nikolay Kudryavtsev