Echo Control on Codec Parameters

Due to the limited processing power of most mobile devices, the advanced signal processing required for voice quality enhancement (VQE), such as echo cancellation, has to be performed at a centralized network location. Performing echo cancellation further back in the transmission channel puts low bit-rate coded speech (e.g. GSM-AMR) in the echo path. Typically in this scenario, the coded speech has to be decoded first, and then echo cancellation and other voice quality enhancement techniques are applied. Finally, the enhanced speech is re-encoded to be sent to the far-end user. Another approach is to perform VQE directly on the coded speech parameters, as seen in the figure below. The potential benefits of this method are lower complexity, shorter delay, and less quantization noise, because the additional decode and re-encode steps needed to work on uncoded speech are avoided.

Figure 1: AEC in Tandem Free Operation Network

In order to understand how echo cancellation can be performed on the speech codec parameters, let’s look at the codec parameters of GSM-AMR. The two main components are the adaptive codebook and the fixed codebook. The adaptive codebook represents the long-term synthesis filter, which predicts the long-term periodicity of the speech signal. The adaptive codebook parameters are the pitch gain and the pitch period. The fixed codebook uses a 10th order linear predictor filter to model the spectral envelope of the speech signal. Since even small amounts of quantization of the filter coefficients can drastically affect the spectral shape of the filter, the coefficients are transformed to Line Spectral Frequencies (LSF). The fixed codebook parameters are the 10 LSF values and the fixed codebook gain.
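As a rough illustration, the parameters named above could be grouped per frame as in the sketch below. The class name FrameParams and its field names are hypothetical and are not taken from any codec library. The sketch also simplifies: a real GSM-AMR bitstream carries quantized indices rather than floating-point values, and the codebook parameters are sent per subframe rather than once per frame.

```python
from dataclasses import dataclass
from typing import List

LPC_ORDER = 10  # number of LSF values per frame, as stated in the text


@dataclass
class FrameParams:
    """Hypothetical container for the per-frame parameters named in the text."""

    pitch_period: int   # adaptive codebook: pitch period (lag), in samples
    pitch_gain: float   # adaptive codebook: pitch gain
    fixed_gain: float   # fixed codebook gain
    lsf: List[float]    # 10 line spectral frequencies (spectral envelope)

    def __post_init__(self) -> None:
        # Reject frames that do not carry exactly LPC_ORDER LSF values.
        if len(self.lsf) != LPC_ORDER:
            raise ValueError(
                f"expected {LPC_ORDER} LSF values, got {len(self.lsf)}"
            )
```

Only the two gain fields influence the signal energy directly, which is why they are the natural targets for the attenuation discussed next.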

The main difficulty of performing echo cancellation in a centralized network and on codec parameters is the estimation of the bulk delay of the echo path. Due to possible transmission errors and packet losses, the bulk delay is more variable than in applications in which acoustic echo cancellation is performed at the loudspeaker-microphone enclosure. The delay that maximizes the cross-correlation of the codec parameters from the far-end and near-end signals represents an estimate of the bulk delay of the echo path. Since the gains of the fixed and adaptive codebooks directly affect the signal energy in the decoder, these gains can be lowered according to the probability of echo being present. Conveniently, the cross-correlation used for the delay estimate can also be used as a probability measure: the higher the maximum correlation, the more likely it is that the codec parameters represent echo. This direct modification of the codec parameters provides a low-complexity solution to voice quality enhancement in cellular networks.
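The delay search and gain reduction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular product: it correlates a single scalar parameter track per frame (for example the fixed codebook gain), measures delay in frames, clamps the peak correlation to [0, 1] to use it as an echo probability, and scales the gains by (1 - probability). All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def normalized_xcorr(far: Sequence[float], near: Sequence[float], lag: int) -> float:
    """Normalized cross-correlation between far[n] and near[n + lag]."""
    pairs = list(zip(far, near[lag:]))
    if len(pairs) < 2:
        return 0.0
    x = [a for a, _ in pairs]
    y = [b for _, b in pairs]
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in pairs)
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den > 0.0 else 0.0


def estimate_bulk_delay(far: Sequence[float], near: Sequence[float],
                        max_lag: int) -> Tuple[int, float]:
    """Return (lag in frames, peak correlation) over lags 0..max_lag."""
    scores = [(normalized_xcorr(far, near, lag), lag) for lag in range(max_lag + 1)]
    best_corr, best_lag = max(scores, key=lambda s: s[0])
    return best_lag, best_corr


def echo_probability(peak_corr: float) -> float:
    """Assumed mapping: clamp the peak correlation to [0, 1]."""
    return max(0.0, min(1.0, peak_corr))


def attenuate_gains(pitch_gain: float, fixed_gain: float,
                    p_echo: float) -> Tuple[float, float]:
    """Assumed rule: scale both codebook gains by (1 - p_echo)."""
    scale = 1.0 - p_echo
    return pitch_gain * scale, fixed_gain * scale


# Usage: the near-end track is the far-end track delayed by 3 frames and halved.
far_track = [1, 3, 2, 5, 4, 7, 1, 0, 6, 2, 8, 3, 1, 9, 4, 2]
near_track = [0.0] * 3 + [0.5 * v for v in far_track[:-3]]
lag, corr = estimate_bulk_delay(far_track, near_track, max_lag=6)
new_pitch_gain, new_fixed_gain = attenuate_gains(0.8, 2.0, echo_probability(corr))
```

In practice the search would be repeated over a sliding window of recent frames so that the estimate can follow changes in the bulk delay caused by transmission errors and packet losses.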

More Information

Echo Cancellation Design

Complete Communications Engineering
