Circular array topologies are increasingly used for beamforming, mainly because they afford a full $2\pi$ range of look directions. Conventional derivations of the signal-to-noise ratio (SNR) improvement of delay-and-sum beamformers indicate a gain of $3$ dB for every doubling of the number of microphones deployed. This, however, holds only for uncorrelated noise. In certain situations, a null is desired in one direction while a beam is simultaneously desired in another. We derive the expected SNR gains for correlated noise on uniform circular array (UCA) microphones.

Consider a far-field source impinging on $N$ UCA microphones, as shown in Figure 1:

Figure 1: N UCA microphones

Suppose the signal at each microphone $i \in \{1, \cdots, N\}$ is given as

$x_i(w) = s(w) e^{\left(-jw \frac{d}{c} \sin{\left((i-1)\psi - \theta\right)} \right)} + v(w) e^{\left(-jw \frac{d}{c} \sin{\left((i-1)\psi - \beta\right)} \right)}$

where $s(w)$ is the desired speech signal, $\theta$ is the direction of arrival (DOA) of the speech signal with respect to the normal to the axis joining all the microphones, and $v(w)$ is the correlated noise such that $\mathbb{E}[v(w) v^*(w)] = \sigma_v^2$ and $\mathbb{E}[ s(w) e^{\left(-jw \frac{d}{c} \sin{\left((i-1)\psi - \theta\right)} \right)} v^*(w)] = 0, \forall i \in \{1, \cdots, N\}$. Further, $\beta$ is the DOA of the directional noise with respect to the same normal.
The input SNR per frequency bin $w$, denoted $iSNR(w)$, is given as:

$iSNR = \frac{\mathbb{E}\left[|s(w)|^2 \right]}{\mathbb{E}\left[\left |v(w)\right|^2 \right]} =\frac{|s(w)|^2}{\sigma_v^2}$

where $\mathbb{E}[.]$ is the expectation operator.
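The signal model and input SNR above can be sketched numerically. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and the angular spacing $\psi = 2\pi/N$ is an assumption for a uniform circular array, since the text does not state its value.

```python
import cmath
import math


def mic_observations(s, v, w, d, c, theta, beta, num_mics):
    """Return x_i(w) for i = 1..N under the far-field model above.

    Assumption: psi = 2*pi/N (uniform angular spacing between microphones).
    """
    psi = 2.0 * math.pi / num_mics
    tau = w * d / c  # common factor w*d/c in both exponents
    obs = []
    for i in range(1, num_mics + 1):
        speech_phase = cmath.exp(-1j * tau * math.sin((i - 1) * psi - theta))
        noise_phase = cmath.exp(-1j * tau * math.sin((i - 1) * psi - beta))
        obs.append(s * speech_phase + v * noise_phase)
    return obs


def input_snr(s, sigma_v2):
    """iSNR(w) = |s(w)|^2 / sigma_v^2."""
    return abs(s) ** 2 / sigma_v2
```

With the noise set to zero, every microphone sees a unit-magnitude phase rotation of $s(w)$, which is a quick sanity check on the model.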
After a delay-and-sum beamformer steered toward $\theta$, the output becomes:

$x(w) = s(w) + v(w) \frac{1}{N} \sum\limits_{n =0}^{N-1} e^{\left(2jw \frac{d}{c} \cos{\left(n\psi - \frac{1}{2}(\theta+\beta)\right)} \sin{(\frac{\beta - \theta}{2})} \right)}$

As a representative case, suppose the noise arrives from the direction opposite the source, $\beta = \theta + \pi$. Then

$x(w) = s(w) + v(w) \frac{1}{N} \sum\limits_{n =0}^{N-1} e^{\left(2jw \frac{d}{c} \sin{\left(n\psi - \theta\right)} \right)}$
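The two beamformer output expressions can be checked against a direct simulation. The sketch below assumes the beamformer compensates the speech phase at each microphone with $e^{jw\frac{d}{c}\sin(n\psi - \theta)}$ and averages, and again assumes $\psi = 2\pi/N$; all function names are hypothetical.

```python
import cmath
import math


def residual_noise_factor(w, d, c, theta, beta, num_mics):
    """(1/N) * sum_n exp(2j*w*(d/c) * cos(n*psi - (theta+beta)/2) * sin((beta-theta)/2))."""
    psi = 2.0 * math.pi / num_mics  # assumed uniform spacing
    scale = 2.0 * w * d / c * math.sin(0.5 * (beta - theta))
    total = sum(
        cmath.exp(1j * scale * math.cos(n * psi - 0.5 * (theta + beta)))
        for n in range(num_mics)
    )
    return total / num_mics


def delay_and_sum(s, v, w, d, c, theta, beta, num_mics):
    """Simulate the microphone signals, steer toward theta, and average."""
    psi = 2.0 * math.pi / num_mics
    tau = w * d / c
    out = 0j
    for n in range(num_mics):
        x_n = (s * cmath.exp(-1j * tau * math.sin(n * psi - theta))
               + v * cmath.exp(-1j * tau * math.sin(n * psi - beta)))
        # Undo the speech phase at microphone n (assumed steering weight).
        out += x_n * cmath.exp(1j * tau * math.sin(n * psi - theta))
    return out / num_mics
```

For any choice of angles, `delay_and_sum(...)` should equal `s + v * residual_noise_factor(...)` up to rounding, and setting `beta = theta + math.pi` reproduces the simplified sum with $\sin(n\psi - \theta)$ in the exponent.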

The output SNR per frequency bin $w$, denoted $oSNR(w)$, is given as:

$oSNR(w) = \frac{|s(w)|^2}{\mathbb{E}\left[\left | v(w) \frac{1}{N} \sum\limits_{n =0}^{N-1} e^{\left(2jw \frac{d}{c} \sin{\left(n\psi - \theta\right)} \right)} \right|^2\right]}$

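But

$\mathbb{E}\left[\left | v(w) \frac{1}{N} \sum\limits_{n =0}^{N-1} e^{\left(2jw \frac{d}{c} \sin{\left(n\psi - \theta\right)} \right)} \right|^2\right] = \frac{\sigma_v^2}{N} + \frac{\sigma_v^2}{N^2} \sum\limits_{n =0}^{N-1} \sum\limits_{m = 0,\, m \neq n}^{N-1}e^{\left(4jw \frac{d}{c} \cos{\left(\frac{n+m}{2}\psi - \theta \right)} \sin{\left(\frac{n-m}{2}\psi\right)}\right)}$

where the exponent follows from applying $\sin A - \sin B = 2\cos{\frac{A+B}{2}}\sin{\frac{A-B}{2}}$ to the phase difference between microphones $n$ and $m$. This leads to an oSNR of:

$oSNR(w) = N \frac{|s(w)|^2 }{\sigma_v^2} \frac{1}{1 + \frac{1}{N} \sum\limits_{n =0}^{N-1} \sum\limits_{m = 0,\, m \neq n}^{N-1}e^{\left(4jw \frac{d}{c} \cos{\left(\frac{n+m}{2}\psi - \theta \right)} \sin{\left(\frac{n-m}{2}\psi\right)}\right)}}$

The SNR improvement, $SNRI$, then becomes

$SNRI(w) = \frac{oSNR(w)}{iSNR(w)} = \frac{N}{1 + \frac{1}{N} \sum\limits_{n =0}^{N-1} \sum\limits_{m = 0,\, m \neq n}^{N-1}e^{\left(4jw \frac{d}{c} \cos{\left(\frac{n+m}{2}\psi - \theta \right)} \sin{\left(\frac{n-m}{2}\psi\right)}\right)}}$

The double sum equals $\left|\sum\limits_{n=0}^{N-1} e^{\left(2jw \frac{d}{c} \sin{\left(n\psi - \theta\right)}\right)}\right|^2 - N$, so it is real and lies between $-N$ and $N(N-1)$. The gain of $N$ (about $3$ dB per doubling) is recovered only when the cross terms cancel. The gain falls toward $1$ when the noise adds coherently across the microphones, and it exceeds $N$ when the cross terms sum to a negative value.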
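The SNR improvement for the $\beta = \theta + \pi$ case can be evaluated in two equivalent ways: directly from the squared magnitude of the residual noise sum, and by expanding that magnitude into its diagonal and cross terms. The sketch below does both so that one can be checked against the other. It assumes $\psi = 2\pi/N$, and the function names are hypothetical.

```python
import cmath
import math


def _noise_phases(w, d, c, theta, num_mics):
    """Per-microphone residual noise phase 2*w*(d/c)*sin(n*psi - theta)."""
    psi = 2.0 * math.pi / num_mics  # assumed uniform spacing
    return [2.0 * w * d / c * math.sin(n * psi - theta) for n in range(num_mics)]


def snr_improvement(w, d, c, theta, num_mics):
    """SNRI = N^2 / |sum_n exp(j*a_n)|^2 (undefined at an exact noise null)."""
    total = sum(cmath.exp(1j * a) for a in _noise_phases(w, d, c, theta, num_mics))
    return num_mics ** 2 / abs(total) ** 2


def snr_improvement_cross_terms(w, d, c, theta, num_mics):
    """SNRI = N / (1 + (1/N) * sum over n != m of exp(j*(a_n - a_m)))."""
    a = _noise_phases(w, d, c, theta, num_mics)
    cross = sum(
        cmath.exp(1j * (a[n] - a[m]))
        for n in range(num_mics)
        for m in range(num_mics)
        if m != n
    )
    # The (n, m) and (m, n) terms are conjugates, so the sum is real.
    return num_mics / (1.0 + cross.real / num_mics)
```

Sweeping `w` or `theta` with either function is a quick way to see how far the gain departs from $N$ for a given geometry. At `w = 0` the noise is identical at every microphone and the improvement is exactly 1.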

VOCAL Technologies offers custom designed solutions for beamforming with a robust voice activity detector, acoustic echo cancellation and noise suppression. Our custom implementations of such systems are meant to deliver optimum performance for your specific beamforming task. Contact us today to discuss your solution!