Speech Separation with Microphone Arrays

Microphone array techniques fall into two broad classes: beamforming and blind signal separation (BSS). Both filter and combine the microphone signals to best extract the signal of interest. Traditional beamforming methods require prior information, such as the array geometry and the location of the source, to form a beam toward the source of interest. BSS, in contrast, separates all of the sources from their mixtures without a priori knowledge of the sources or the array.
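To make the beamformer's dependence on array geometry and source direction concrete, here is a minimal delay-and-sum sketch in Python with NumPy. It is an illustration under stated assumptions, not a description of any particular product: it assumes a far-field plane wave, a 2-D array layout, a speed of sound of 343 m/s, and the function names steering_delays and delay_and_sum are hypothetical.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_positions, azimuth_rad):
    """Relative arrival delays (seconds) of a far-field plane wave at each mic.

    mic_positions: (M, 2) array of x/y coordinates in metres (assumed known).
    azimuth_rad: angle of the direction pointing from the array to the source.
    """
    direction = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    # A mic further along the source direction hears the wavefront earlier.
    return -(mic_positions @ direction) / SPEED_OF_SOUND


def delay_and_sum(signals, mic_positions, azimuth_rad, fs):
    """Steer an (M, N) block of microphone signals toward azimuth_rad."""
    num_samples = signals.shape[1]
    delays = steering_delays(mic_positions, azimuth_rad)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / fs)
    spectra = np.fft.rfft(signals, axis=1)
    # Undo each mic's delay with a linear phase shift, then average the mics.
    aligned = spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=num_samples)


# Toy scenario: a 2 kHz tone arriving along the axis of a 4-mic linear array.
fs = 16000
n = 1600
t = np.arange(n) / fs
source = np.sin(2 * np.pi * 2000.0 * t)
mics = np.array([[0.00, 0.0], [0.05, 0.0], [0.10, 0.0], [0.15, 0.0]])
true_azimuth = 0.0

# Simulate what each mic receives by applying its arrival delay (circularly).
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
arrival = steering_delays(mics, true_azimuth)
received = np.fft.irfft(
    np.fft.rfft(source)[None, :]
    * np.exp(-2j * np.pi * freqs[None, :] * arrival[:, None]),
    n=n,
    axis=1,
)

on_target = delay_and_sum(received, mics, true_azimuth, fs)
off_target = delay_and_sum(received, mics, np.pi / 2, fs)
print("power steered at source:", np.mean(on_target ** 2))
print("power steered away:     ", np.mean(off_target ** 2))
```

Steering at the true direction realigns the microphone signals so they add coherently, while steering elsewhere leaves them misaligned and attenuates the tone. Both steps need the microphone positions and the source direction, which is exactly the information BSS does without.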

BSS is an emerging field of interest because the only assumption it requires is that the sources are statistically independent. Thus, unlike beamforming, BSS can separate sources from the observed mixtures without source localization or array geometry information. This flexibility has made BSS a popular technique in many applications. Most potential audio applications of BSS involve separating simultaneous sources in reverberant or echoic environments, such as a room or the inside of a vehicle. These applications deal with convolutive mixtures, which often contain long impulse responses that are difficult to estimate or invert.
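The difference between an instantaneous and a convolutive mixture can be written out explicitly. The notation below is an assumption introduced for illustration: s_j is the j-th source, x_i the i-th microphone signal, a_ij a scalar mixing gain, h_ij(k) the room impulse response from source j to microphone i with length L, and w_ji(k) a separating filter of length Q.

```latex
% Instantaneous mixture: each microphone observes a weighted sum of the sources.
x_i(t) = \sum_{j=1}^{N} a_{ij}\, s_j(t)

% Convolutive mixture: each source is filtered by a room impulse response
% of length L before the contributions are summed at the microphone.
x_i(t) = \sum_{j=1}^{N} \sum_{k=0}^{L-1} h_{ij}(k)\, s_j(t-k)

% Separation: estimate filters w_{ji}(k) so that each output recovers one source.
y_j(t) = \sum_{i=1}^{M} \sum_{k=0}^{Q-1} w_{ji}(k)\, x_i(t-k)
```

In a reverberant room L can span thousands of samples, which is why the separating filters are hard to estimate.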

Most multiple-acoustic-source separation algorithms are based on independent component analysis (ICA). A common assumption in ICA-based methods is that the sources are random, stationary, statistically independent signals. Under this assumption, ICA attempts to linearly recombine the measured signals so that the output signals are as independent as possible. More recently, several works have proposed separating multiple signals using a short-time Fourier transform (STFT) scheme. The benefit of the STFT methods is that they directly exploit the nonstationarity of the signals to detect and separate the individual sources. Recently reported BSS results using various STFT methods show excellent performance for instantaneous mixtures.
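The idea of linearly recombining the measured signals until the outputs are as independent as possible can be sketched with a small FastICA-style iteration in Python with NumPy. This is one common ICA algorithm chosen for illustration, not necessarily the one used in the works referred to above. It handles only an instantaneous two-source mixture, and the test signals, mixing matrix, and function names are all hypothetical.

```python
import numpy as np


def whiten(x):
    """Center and whiten (M, N) observations so their covariance is identity."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ x


def sym_decorrelate(w):
    """Make the rows of w orthonormal: w <- (w w^T)^(-1/2) w."""
    eigvals, eigvecs = np.linalg.eigh(w @ w.T)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T @ w


def fast_ica(x, iterations=200, tol=1e-8, seed=0):
    """Estimate independent components from an instantaneous mixture x (M, N)."""
    z = whiten(x)
    num_sources, num_samples = z.shape
    rng = np.random.default_rng(seed)
    w = sym_decorrelate(rng.standard_normal((num_sources, num_sources)))
    for _ in range(iterations):
        y = w @ z
        g = np.tanh(y)            # contrast nonlinearity
        g_prime = 1.0 - g ** 2    # its derivative
        # Fixed-point update for every row, followed by re-orthogonalization.
        w_new = (g @ z.T) / num_samples - np.diag(g_prime.mean(axis=1)) @ w
        w_new = sym_decorrelate(w_new)
        # Converged when each row is unchanged up to sign.
        converged = np.max(np.abs(np.abs(np.diag(w_new @ w.T)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ z


# Two independent, non-Gaussian toy sources (assumed stand-ins for real audio).
rng = np.random.default_rng(1)
num_samples = 8000
t = np.arange(num_samples) / 8000.0
sources = np.vstack([
    rng.laplace(size=num_samples),     # heavy-tailed, loosely speech-like
    np.sin(2 * np.pi * 5.0 * t),       # a pure tone
])
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])   # hypothetical mixing matrix
observed = mixing @ sources

recovered = fast_ica(observed)

# ICA cannot fix the order or scale of its outputs, so compare by correlation.
match = np.abs(np.corrcoef(np.vstack([sources, recovered]))[:2, 2:])
print(np.round(match, 3))
```

The algorithm is given only the mixed signals, never the mixing matrix or any geometry. Each recovered output should correlate strongly with exactly one source, up to permutation and sign. Extending this to convolutive room mixtures is considerably harder than this toy case suggests.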

VOCAL Technologies, Ltd.
520 Lee Entrance, Suite 202
Amherst, New York 14228
Phone: +1-716-688-4675
Fax: +1-716-639-0713