Beamforming solutions for two-microphone arrays are highly desirable in applications that require blind source separation with some a priori knowledge of the signal and interferer directions. Examples include mobile phones and automobiles, where the driver's speech must be picked up while the front passenger's speech is suppressed. Using only two microphones limits the gains achievable with conventional approaches. Further, the chosen algorithm must respect the computational constraints of real-time operating systems. The use of the minimum variance distortionless response (MVDR) beamformer is curtailed in most applications by the computational burden of inverting the noise covariance matrix. We present an approach that is based on MVDR but pre-computes the portions of the weights that require inversion, leading to very high signal-to-noise ratio (SNR) improvements at minimal computational burden.

Consider a two-microphone array as shown in Figure 1:

Figure 1 Two microphone array

Consider the speech signals received at the two individual microphones:

$y_i(t) = s(t-\tau_i) + n(t), i \in \{1,2\}$

The frequency domain representation becomes:

$Y_i(w) = S(w)e^{-j w \tau_i} + N(w), \quad N(w) \sim \mathcal{N}(0,\sigma_n^2)$
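The model above says that, in a single frequency bin, each microphone sees the source spectrum multiplied by a phase term set by its delay. The following is a minimal sketch of that relationship, not the authors' implementation. The function name `mic_spectrum`, the 1 kHz test frequency, the 5 cm spacing and the 343 m/s speed of sound are all illustrative assumptions that do not appear in the text.

```python
import cmath
import math

def mic_spectrum(S, w, tau, N=0j):
    """One frequency bin of the model: Y_i(w) = S(w) * exp(-j*w*tau_i) + N(w)."""
    return S * cmath.exp(-1j * w * tau) + N

# Illustrative values only (assumptions, not taken from the text).
w = 2.0 * math.pi * 1000.0        # angular frequency of a 1 kHz bin, rad/s
tau_1 = 0.0                       # delay at microphone 1, seconds
tau_2 = 0.05 / 343.0              # delay at microphone 2: 5 cm spacing, 343 m/s
S = 1.0 + 0.5j                    # arbitrary source spectrum value in this bin

# Noise-free observations at the two microphones.
Y1 = mic_spectrum(S, w, tau_1)
Y2 = mic_spectrum(S, w, tau_2)

# Without noise, the two channels differ only by the inter-microphone phase.
ratio = Y2 / Y1
expected = cmath.exp(-1j * w * (tau_2 - tau_1))
```

In the noise-free case `ratio` equals `expected`, which is the term $e^{-jw(\tau_2-\tau_1)}$ that the beamformer weights are built from.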

Now, suppose the desired signal arrives from $0^{\circ}$ whilst the interferer arrives from $180^{\circ}$, as is the case when picking up driver speech and rejecting front passenger speech. An MVDR-based filter can be derived such that the optimal filter $w_{opt}$ can be approximated as:

$\hat{w}_{opt} = \frac{G(\Gamma(w)) }{1+\cos{(2w(\tau_2-\tau_1))}} [1+e^{-jw(\tau_2-\tau_1)}, 2 \cos{(w(\tau_2-\tau_1))}]^{T}$

where $G(\Gamma(w)) = 1 - \mathbb{R}_e \{ \Gamma(w) \} \cos{(w(\tau_2-\tau_1))} -\mathbb{I}_m \{ \Gamma(w) \} \sin{(w(\tau_2-\tau_1))}$.
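As a rough illustration, the two expressions above can be evaluated per frequency bin as follows. This is a sketch under stated assumptions rather than the authors' code. The names `g_factor` and `approx_weights` are hypothetical, $\Gamma(w)$ is treated as a complex value supplied by the caller because the text does not say how it is obtained, and the guard against a vanishing denominator is an added safeguard, not part of the stated formula.

```python
import cmath
import math

def g_factor(gamma, w, delta):
    """G(Gamma(w)) as written in the text, with delta = tau_2 - tau_1."""
    return (1.0
            - gamma.real * math.cos(w * delta)
            - gamma.imag * math.sin(w * delta))

def approx_weights(gamma, w, delta, eps=1e-9):
    """Return the two entries of the approximate weight vector for one bin.

    gamma : complex value of Gamma(w), supplied by the caller (assumption)
    w     : angular frequency in rad/s
    delta : tau_2 - tau_1 in seconds
    eps   : hypothetical threshold guarding the division
    """
    denom = 1.0 + math.cos(2.0 * w * delta)
    if abs(denom) < eps:
        # The formula is undefined where cos(2*w*delta) = -1.
        raise ValueError("denominator vanishes at this frequency")
    scale = g_factor(gamma, w, delta) / denom
    w1 = scale * (1.0 + cmath.exp(-1j * w * delta))
    w2 = scale * 2.0 * math.cos(w * delta)
    return w1, w2

# Example: with Gamma = 0 and zero inter-microphone delay, both entries equal 1.
w1, w2 = approx_weights(0j, w=2.0 * math.pi * 1000.0, delta=0.0)
```

Only trigonometric evaluations and a scalar division are needed per bin, with no matrix inversion at run time, which is in line with the low computational burden claimed for the approach.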

The performance of such a filter is illustrated in Figure 2, which shows an SNR improvement of $18\,\mathrm{dB}$. It can also be seen that speech from the undesired direction is attenuated by more than $20\,\mathrm{dB}$. Note that no additional nonlinear attenuation is applied to the processed speech.

Figure 2

