
Fast beamforming with four microphones for ASR engines

The use of automated speech recognition (ASR) engines in mobile applications is on the rise. In theory, increasing the number of microphones increases the achievable signal-to-noise ratio (SNR), but practical constraints such as computational complexity and latency limit the number of microphones that can realistically be used. Algorithms that rely on a single pre-computed filter are a gold standard. We present the use of four microphones, which is the minimum number required to disambiguate the spatial direction of audio sources in three dimensions. Without loss of generality, we consider the relaxed case of two-dimensional audio beamforming with four microphones in a square topology.

Consider an acoustic signal impinging on a microphone array, as shown in Figure 1 below:



Figure 1: Four microphones in square topology
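The geometry in Figure 1 can be made concrete with a short sketch. Everything specific here is an assumption for illustration rather than part of the original design: the array is centred on the origin, $\theta$ is measured from the x-axis, the source is in the far field (plane wave), the speed of sound is taken as 343 m/s, and the 5 cm side length is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def square_array(side):
    """Return (x, y) positions of four microphones at the corners of a
    square of the given side length (metres), centred on the origin."""
    h = side / 2.0
    return [(h, h), (-h, h), (-h, -h), (h, -h)]


def arrival_delays(mics, theta):
    """Relative arrival time (seconds) of a far-field plane wave at each
    microphone, measured against the array centre. theta is the direction
    of arrival in radians from the x-axis; a microphone closer to the
    source gets a negative delay."""
    ux, uy = math.cos(theta), math.sin(theta)  # unit vector towards the source
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for (x, y) in mics]


mics = square_array(0.05)            # hypothetical 5 cm square
delays = arrival_delays(mics, 0.0)   # source on the positive x-axis
```

For a source on the x-axis, the two microphones with positive x receive the wavefront first and by the same amount, which is the kind of structure a fixed filter can exploit.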

Using the geometric arrangement of the microphones, a filter that corresponds to the aggregate of the signals impinging on the array can easily be generated. It is also easy to see that every acoustic source impinges at an angle $\theta$ that obeys $-\frac{\pi}{4} \le \theta \le \frac{\pi}{4}$. Using this information, a single distortionless response can be generated for all angles within this range, as shown in Figure 2 below:


Figure 2: Frequency response of the single filter for all $-\frac{\pi}{4} \le \theta \le \frac{\pi}{4}$
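As a rough illustration of what a single pre-computed filter for the whole sector might look like, the sketch below builds one fixed weight vector at a single frequency. This is not the filter design used above, which the text does not specify. It simply assumes the weights are the average far-field steering vector over $-\frac{\pi}{4} \le \theta \le \frac{\pi}{4}$, scaled so that the response at $\theta = 0$ is exactly 1. The function names, the 5 cm spacing, the 1 kHz test frequency and the 343 m/s speed of sound are all hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def steering_vector(mics, theta, freq):
    """Far-field steering vector at one frequency: the phase each microphone
    sees, relative to the array centre, for a plane wave from theta."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND  # wavenumber in rad/m
    ux, uy = math.cos(theta), math.sin(theta)
    return [cmath.exp(1j * k * (x * ux + y * uy)) for (x, y) in mics]


def response(weights, mics, theta, freq):
    """Beamformer output w^H a(theta) for a unit plane wave from theta."""
    a = steering_vector(mics, theta, freq)
    return sum(w.conjugate() * x for w, x in zip(weights, a))


def sector_weights(mics, freq, n_angles=91):
    """One fixed weight vector for the sector -pi/4..pi/4: the average
    steering vector over the sector, scaled for unit response at theta = 0."""
    step = (math.pi / 2) / (n_angles - 1)
    mean = [0j] * len(mics)
    for i in range(n_angles):
        theta = -math.pi / 4 + i * step
        for m, a in enumerate(steering_vector(mics, theta, freq)):
            mean[m] += a / n_angles
    centre = response(mean, mics, 0.0, freq)
    # Dividing by the conjugate makes w^H a(0) exactly 1.
    return [w / centre.conjugate() for w in mean]


h = 0.025
mics = [(h, h), (-h, h), (-h, -h), (h, -h)]   # hypothetical 5 cm square
w = sector_weights(mics, 1000.0)
gain_at_centre = abs(response(w, mics, 0.0, 1000.0))
gain_at_edge = abs(response(w, mics, math.pi / 4, 1000.0))
```

With these assumed values the response stays close to unity across the sector. At higher frequencies or larger spacings an averaged filter of this kind would be expected to droop more towards the sector edges, so a practical design would likely need a more careful optimisation.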

The single filter can give gains of up to 12 dB using only four microphones. A sample of the results on real data is shown in Figure 3 below, with an SNR improvement of 12 dB.



Figure 3: Four microphones showing a 12 dB improvement
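One way to quantify an improvement like this is to compare the SNR at a reference microphone with the SNR at the beamformer output. The sketch below assumes that speech-only and noise-only segments are available separately for both input and output. The text does not state that the result above was measured this way, and the helper names are hypothetical.

```python
import math


def power(samples):
    """Mean squared value of a sequence of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(speech, noise):
    """SNR in dB from a speech-only segment and a noise-only segment."""
    return 10.0 * math.log10(power(speech) / power(noise))


def snr_improvement_db(speech_in, noise_in, speech_out, noise_out):
    """Output SNR minus input SNR, in dB."""
    return snr_db(speech_out, noise_out) - snr_db(speech_in, noise_in)


# Toy data: speech level unchanged, noise amplitude reduced by a factor of 4.
speech = [1.0, -1.0, 1.0, -1.0]
noise_in = [1.0, -1.0, 1.0, -1.0]
noise_out = [0.25, -0.25, 0.25, -0.25]
improvement = snr_improvement_db(speech, noise_in, speech, noise_out)
```

In this toy case the noise power drops by a factor of 16 while the speech power is unchanged, which corresponds to an improvement of about 12 dB.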

The SNR improvement can be increased further by applying single-channel noise suppression and automatic gain control (AGC) to the beamformer output.
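As a minimal sketch of the AGC stage only, the example below applies a simple feed-forward gain control to a block of beamformer output samples. It is an illustrative assumption, not the AGC used in any particular product: the one-pole level tracker, the parameter names and the default values are all hypothetical.

```python
def agc(samples, target=0.1, smoothing=0.01, max_gain=10.0):
    """Feed-forward AGC: smooth the signal magnitude into a level estimate
    and scale each sample so the estimated level approaches the target."""
    level = target  # start at the target so the initial gain is 1
    out = []
    for s in samples:
        level += smoothing * (abs(s) - level)            # one-pole level tracker
        gain = min(max_gain, target / max(level, 1e-9))  # cap gain on quiet input
        out.append(gain * s)
    return out


loud = agc([0.5] * 5000)     # constant loud input is pulled down to the target
quiet = agc([0.001] * 5000)  # very quiet input is limited by max_gain
```

The gain cap keeps the AGC from amplifying residual noise without bound during pauses in speech, which matters when the stage follows a noise suppressor.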

VOCAL Technologies offers custom-designed solutions for beamforming with a robust voice activity detector, acoustic echo cancellation and noise suppression. Our custom implementations of such systems are designed to deliver optimum performance for your specific beamforming task. Contact us today to discuss your solution!


VOCAL Technologies, Ltd.
520 Lee Entrance, Suite 202
Amherst New York 14228
Phone: +1-716-688-4675
Fax: +1-716-639-0713