The Combined Reduction of Echo and Noise

Hands-free systems must address room reflections of the loudspeaker signal and the low SNR of the near-end speaker

The combined reduction of echo and noise allows more efficient algorithms to be used in hands-free communication devices. The two main acoustic signal processing challenges with hands-free devices are feedback of the loudspeaker signal, reflected through the room to the microphone, and the low signal-to-noise ratio (SNR) of the near-end speaker. The first requires an echo control system to ensure that the echoes do not disturb the far-end user. The second results from the increased distance between the speaker and the microphone. A combined approach to these challenges yields system performance equivalent to that of applying a separate algorithm to each problem.

It is more beneficial to the overall performance of the system if the echo canceller comes before the combined echo and noise reduction. The main disadvantage of this ordering is that the adaptive filter of the acoustic echo cancellation (AEC) algorithm has to process noisy signals, which puts a theoretical bound on the achievable attenuation of the echo canceller. In most scenarios, this disadvantage is outweighed by the fact that placing a noise reduction filter before the echo canceller adds variability to the echo path, which significantly limits the ability of the adaptive filter to train to the echo path. An additional advantage of placing the echo canceller before the combined echo and noise reduction is that the level of the echo (i.e., a non-stationary noise source) is greatly reduced.
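The bound that near-end noise places on the echo canceller can be illustrated with a toy simulation. The sketch below is not taken from the original text: it assumes a time-domain NLMS adaptive filter, a hypothetical three-tap echo path, white far-end and noise signals, and no near-end speech. It measures how much of the echo the filter removes once it has had time to converge.

```python
import math
import random


def nlms_residual_echo_db(noise_std, n_samples=4000, mu=0.5, seed=0):
    """Return the echo attenuation (dB) reached by a toy NLMS echo canceller.

    noise_std sets the level of the near-end noise b(n) at the microphone.
    All signals and the echo path are illustrative assumptions.
    """
    rng = random.Random(seed)
    echo_path = [0.5, -0.3, 0.1]           # hypothetical room response
    taps = len(echo_path)
    w = [0.0] * taps                       # adaptive estimate of the echo path
    x_hist = [0.0] * taps                  # most recent far-end samples
    echo_pow = 0.0
    resid_pow = 0.0
    for n in range(n_samples):
        # New far-end (loudspeaker) sample enters the delay line.
        x_hist = [rng.gauss(0.0, 1.0)] + x_hist[:-1]
        echo = sum(h * x for h, x in zip(echo_path, x_hist))
        mic = echo + rng.gauss(0.0, noise_std)      # microphone = echo + noise
        echo_hat = sum(wi * x for wi, x in zip(w, x_hist))
        err = mic - echo_hat                        # canceller output
        norm = sum(x * x for x in x_hist) + 1e-8    # regularised input power
        w = [wi + mu * err * x / norm for wi, x in zip(w, x_hist)]
        if n >= n_samples // 2:                     # measure after convergence
            echo_pow += echo ** 2
            resid_pow += (echo - echo_hat) ** 2
    return 10.0 * math.log10(echo_pow / (resid_pow + 1e-30))


clean_db = nlms_residual_echo_db(noise_std=0.0)
noisy_db = nlms_residual_echo_db(noise_std=0.1)
print(f"noise-free: {clean_db:.1f} dB, noisy: {noisy_db:.1f} dB")
```

With noise present, each filter update is perturbed by the noise in the error signal, so the echo path estimate keeps fluctuating around the true path and some residual echo always remains. That residual is what the combined echo and noise reduction stage then has to handle.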

In the combined echo and noise reduction approach, the signal model is y(n) = s(n) + b(n) + d(n), where y(n) is the microphone signal after the echo canceller, s(n) is the desired near-end speech, b(n) is the additive noise signal, and d(n) is the residual echo signal at time index n. The goal is to design an adaptive filter H_c(ω,n) such that Ŝ(ω,n) = H_c(ω,n) Y(ω,n), where Y(ω,n) is the spectrum of y(n) and Ŝ(ω,n) is the estimate of the near-end speech spectrum. In Post Filtering for Residual Echo Control, it was shown that the attenuation factor for residual echo was

H_res(ω,n) = 1 - Ŝ_d(ω,n) / S_y(ω,n)    (1)

where Ŝ_d(ω,n) is the estimate of the spectral energy of the residual echo and S_y(ω,n) is the spectral energy of the microphone signal y(n), both at frequency ω and time n. Similarly,

H_b(ω,n) = 1 - Ŝ_b(ω,n) / S_y(ω,n)    (2)

can be used as the attenuation factor for the ambient noise components, where Ŝ_b(ω,n), the estimate of the spectral energy of the noise b(n), can be obtained using techniques described in Noise Reduction of Non-stationary Noise Sources. Therefore, (1) and (2) can be combined to produce

H_c(ω,n) = max{H_min, H_b(ω,n) ⋅ H_res(ω,n)}    (3)

where H_min is the lower limit on the gain, which sets the maximum allowable attenuation.

The method above describes a single-channel estimation approach to combined echo and noise control. Other approaches can also adapt the combined filter to psychoacoustic models of the human ear and/or use multiple channels to take advantage of the spatial coherence of the noise sources to further improve the perceptual quality of the system.
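As a rough illustration of the single-channel method, the per-bin gains in (1) to (3) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a reference implementation: the function names are hypothetical, the spectral estimates are taken as given, the value H_min = 0.1 (about 20 dB of attenuation) is only an example, and clipping each factor at zero is an added safeguard that the text does not specify.

```python
def combined_gain(s_y, s_d_hat, s_b_hat, h_min=0.1):
    """Per-bin combined gain H_c for one frame, following (1) to (3).

    s_y      : spectral energy of the microphone signal after the AEC
    s_d_hat  : estimated spectral energy of the residual echo
    s_b_hat  : estimated spectral energy of the ambient noise
    h_min    : gain floor (illustrative value, not from the text)
    """
    gains = []
    for sy, sd, sb in zip(s_y, s_d_hat, s_b_hat):
        sy = max(sy, 1e-12)                  # guard against division by zero
        # Clipping at zero is an assumption: it stops two negative factors
        # (both estimates exceeding s_y) from multiplying to a positive gain.
        h_res = max(0.0, 1.0 - sd / sy)      # residual echo factor, eq. (1)
        h_b = max(0.0, 1.0 - sb / sy)        # ambient noise factor, eq. (2)
        gains.append(max(h_min, h_b * h_res))  # combined gain, eq. (3)
    return gains


def apply_gain(gains, y_spectrum):
    """Speech estimate per bin: S_hat(w, n) = H_c(w, n) * Y(w, n)."""
    return [g * y for g, y in zip(gains, y_spectrum)]


# Three example bins: mild interference, strong echo, over-estimated echo.
s_y = [1.0, 1.0, 0.5]
s_d_hat = [0.2, 0.9, 0.6]
s_b_hat = [0.3, 0.5, 0.1]
gains = combined_gain(s_y, s_d_hat, s_b_hat)
s_hat = apply_gain(gains, [1 + 1j, 2 + 0j, 0 + 1j])
```

In the first bin the two factors are 0.8 and 0.7, so the combined gain is about 0.56. In the other two bins the product falls below the floor, so H_min takes over and limits the attenuation.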
