Statistical optimal beamforming is used for super-directive beamforming, where the maximum energy of the steering vector is concentrated at fixed points. Because of noise and reverberation, the known or estimated steering vector may contain errors, which in most cases attenuates the desired signal. To deal with this problem, diagonal loading is employed to broaden the beamwidth of the principal beam.

Consider a far-field source impinging on a uniform linear array (ULA) of N microphones, as shown in Figure 1 below:


Figure 1: N ULA microphones

Suppose the signal at each microphone i \in \{1, \cdots, N\} is given as:

x_i(w) = s(w) W_{i}(w) +\sum\limits_{l=1}^L v_l(w) e^{\left(-jw (i-1)\frac{d}{c} \sin{\left(\psi_l \right)} \right)} +n_i(w)

where s(w) is the desired speech signal, \theta is the direction of arrival (DOA) of the speech signal with respect to the broadside, d is the spacing between adjacent microphones, c is the speed of sound, and n_i(w) is the additive noise at microphone i. v_l(w) is the l^{th} correlated noise source, with \mathbb{E}[v_l(w) v_l^*(w)] = \sigma_{v_l}^2, and it is uncorrelated with the desired signal at every microphone: \mathbb{E}[ s(w) W_{i}(w) v_l^*(w)] = 0, \forall i \in \{1, \cdots, N\}, \forall l \in \{1,\cdots,L\}. \psi_l is the DOA of the l^{th} correlated noise with respect to the broadside. Here,

W_{i}(w) = e^{\left(-jw (i-1)\frac{d}{c} \sin{\theta} \right)}


W(w) = [W_1(w), \cdots, W_N(w)]^T
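The steering vector defined above can be sketched in a few lines of NumPy. This is only an illustration: the function name `steering_vector`, the 343 m/s speed of sound, and the example frequency, spacing, and angle are assumptions, not values from the text.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed value for c


def steering_vector(theta, omega, n_mics, spacing, c=SPEED_OF_SOUND):
    """Return W(w) for a ULA.

    Element i (1-based) is exp(-j * w * (i - 1) * (d / c) * sin(theta)),
    with theta measured from the broadside, in radians.
    """
    idx = np.arange(n_mics)  # holds (i - 1) for i = 1..N
    return np.exp(-1j * omega * idx * (spacing / c) * np.sin(theta))


# Illustrative values only: 4 microphones, 5 cm apart, 1 kHz, DOA of 30 degrees.
omega = 2 * np.pi * 1000.0
W = steering_vector(np.deg2rad(30.0), omega, n_mics=4, spacing=0.05)
```

Each entry has unit magnitude and the phase advances by the same amount from one microphone to the next, so a broadside source (\theta = 0) yields a vector of ones.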

The statistical optimal robust beamformer is posed as:

\underset{W(w)}{\mathrm{argmin}} ~~{W^H(w) R_{x}(w) W(w)} ~ ~ ~ \mathrm{s.t.} ~ ~ |s(\theta,w)^H W(w)|^2 \ge 1 , ~ ~ \theta \in [\theta_1, \theta_2], where R_{x}(w) is the signal-plus-noise covariance matrix, W(w) now denotes the beamformer weight vector being optimized, and s(\theta,w) is the steering vector for direction \theta.
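The covariance matrix R_{x}(w) is rarely known exactly and is commonly estimated from snapshots of the microphone signals. The following is a minimal sketch under stated assumptions: it simulates the signal model with a single correlated noise source (L = 1), circular complex Gaussian signals, and made-up powers and angles, then forms the sample covariance. The helper names `complex_noise` and `sample_covariance` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def steering_vector(theta, omega, n_mics, spacing, c=343.0):
    """ULA steering vector; c = 343 m/s is an assumed speed of sound."""
    idx = np.arange(n_mics)
    return np.exp(-1j * omega * idx * (spacing / c) * np.sin(theta))


def complex_noise(shape, power, rng):
    """Circular complex Gaussian samples with the given average power."""
    scale = np.sqrt(power / 2.0)
    return scale * (rng.standard_normal(shape) + 1j * rng.standard_normal(shape))


def sample_covariance(X):
    """Estimate R_x from snapshots X of shape (n_mics, n_snapshots)."""
    return X @ X.conj().T / X.shape[1]


# Illustrative scenario: 4 microphones, 20000 snapshots at one frequency bin.
N, K = 4, 20000
omega, d = 2 * np.pi * 1000.0, 0.05
a = steering_vector(np.deg2rad(20.0), omega, N, d)   # desired source direction
b = steering_vector(np.deg2rad(-50.0), omega, N, d)  # correlated noise direction

s = complex_noise(K, 1.0, rng)        # desired signal s(w), power 1
v = complex_noise(K, 4.0, rng)        # correlated noise v_1(w), power 4
n = complex_noise((N, K), 0.1, rng)   # sensor noise n_i(w), power 0.1
X = np.outer(a, s) + np.outer(b, v) + n

R_hat = sample_covariance(X)
# Covariance implied by the model, for comparison with the estimate.
R_true = np.outer(a, a.conj()) + 4.0 * np.outer(b, b.conj()) + 0.1 * np.eye(N)
```

With enough snapshots the estimate approaches the model covariance. With few snapshots it does not, and that estimation error is one more source of mismatch for the beamformer.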

To compensate for the mismatch, a diagonal penalty is imposed, which transforms the problem into:

\underset{W(w)}{\mathrm{argmin}} ~~{W^H(w) R_{x}(w) W(w)} + \gamma \|W(w)\|^2~ ~ ~ \mathrm{s.t.} ~ ~ |s(\theta,w)^H W(w)|^2 \ge 1 , ~ ~ \theta \in [\theta_1, \theta_2], where \gamma \ge 0 is the loading factor.

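Because W^H(w) R_{x}(w) W(w) + \gamma \|W(w)\|^2 = W^H(w) (R_{x}(w) + \gamma I) W(w), the penalty amounts to adding \gamma to the diagonal of the covariance matrix, which is why the technique is called diagonal loading. This can be interpreted as the covariance matrix subsuming the error in the steering vector. Choosing \gamma too large causes the algorithm to put all of its effort into suppressing white noise, which results in performance akin to that of a delay-and-sum beamformer.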
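The effect of the loading factor can be illustrated numerically. The sketch below does not solve the interval-constrained problem stated above. As a simplifying assumption, it replaces the inequality over [\theta_1, \theta_2] with a single equality constraint s^H W = 1 at one look direction, which gives the closed-form diagonally loaded MVDR (Capon) weights. The function names, covariance model, and numeric values are all illustrative.

```python
import numpy as np


def steering_vector(theta, omega, n_mics, spacing, c=343.0):
    """ULA steering vector; c = 343 m/s is an assumed speed of sound."""
    idx = np.arange(n_mics)
    return np.exp(-1j * omega * idx * (spacing / c) * np.sin(theta))


def loaded_mvdr_weights(R, a, gamma):
    """Minimize w^H (R + gamma * I) w subject to a^H w = 1.

    Simplified single-constraint variant, not the interval-constrained
    robust problem from the text.
    """
    R_loaded = R + gamma * np.eye(R.shape[0])
    Ri_a = np.linalg.solve(R_loaded, a)   # (R + gamma * I)^{-1} a
    return Ri_a / (a.conj() @ Ri_a)       # normalize so that a^H w = 1


# Illustrative scenario: 6 microphones, 4 cm spacing, 2 kHz.
N, d, omega = 6, 0.04, 2 * np.pi * 2000.0
a = steering_vector(np.deg2rad(10.0), omega, N, d)       # look direction
a_int = steering_vector(np.deg2rad(-40.0), omega, N, d)  # interferer direction

# Model covariance: desired signal + strong interferer + weak white noise.
R = (np.outer(a, a.conj())
     + 10.0 * np.outer(a_int, a_int.conj())
     + 0.01 * np.eye(N))

w_small = loaded_mvdr_weights(R, a, gamma=1e-3)  # light loading
w_large = loaded_mvdr_weights(R, a, gamma=1e6)   # very heavy loading
w_das = a / N                                    # delay-and-sum weights

# Magnitude of each beamformer's response toward the interferer.
resp_small = abs(a_int.conj() @ w_small)
resp_large = abs(a_int.conj() @ w_large)
```

In this setup the lightly loaded weights place a deep null on the interferer, while the heavily loaded weights are numerically indistinguishable from `w_das` and pass noticeably more of the interferer. This matches the trade-off described above: a larger \gamma gives more robustness at the cost of interference suppression.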

VOCAL Technologies offers custom designed solutions for beamforming with a robust voice activity detector, acoustic echo cancellation and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific beamforming task. Contact us today to discuss your solution!
