
In double talk detection, we wish to find a metric that separates single talk speech from double talk speech. To do so, we need a measure of similarity between the near end and far end speech. When this measure indicates that the two signals are dissimilar, double talk is detected. One method of doing so is to compute speech features for the far end signal and compare them with the same features computed for the near end signal [1].

Figure 1: Generic Echo Cancellation System [1]

In [1], the authors used simple time domain criteria to create vectors as input to the similarity metric. As the metric, they used the standard Euclidean distance. The vector of features for the incoming far end speech frame x is given by:

[Equation: double_talk_feature_vector]

Where En and Es are the frame energy and frame log-energy, respectively, and the {ai} are optimized constants given in [1]. A similar vector Vd is computed for the near end signal, and the decision variable is given by:

[Equation: double_talk_decision_variable]

Where σ_b is the estimated variance of the noise, and β is a constant also given in [1]. For voiced speech, this decision variable was compared with one threshold, and for unvoiced speech, it was compared with a different threshold. The double talk decision was made from the result of that comparison. The authors show that using speech features for double talk detection is a useful method, as their double talk detector outperformed both the normalized cross-correlation and Geigel algorithms.
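As a rough illustration of the feature-distance idea, the following Python sketch compares a far end frame with a near end frame using only the two features named in the text (frame energy and frame log-energy) and the Euclidean distance. It is not the detector from [1]: the weights {ai}, the noise variance term, and β are left out, and the function names, the voicing flag, and both threshold values are assumptions made for this example.

```python
import math

def frame_features(frame):
    """Hypothetical feature vector: [frame energy, frame log-energy]."""
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log10(energy + 1e-12)  # small floor avoids log10(0)
    return [energy, log_energy]

def euclidean_distance(u, v):
    """Standard Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def is_double_talk(far_frame, near_frame, voiced,
                   thr_voiced=1.0, thr_unvoiced=0.5):
    """Flag double talk when the feature distance exceeds a threshold.

    The voicing flag and both threshold values are placeholders, not
    values taken from [1].
    """
    distance = euclidean_distance(frame_features(far_frame),
                                  frame_features(near_frame))
    threshold = thr_voiced if voiced else thr_unvoiced
    return distance > threshold

# Toy frames: the near end either repeats the far end (echo only, unity
# echo gain assumed) or carries clearly more energy (a second talker).
far = [1.0, -1.0, 1.0, -1.0]
echo_only = list(far)
with_talker = [2.0, -2.0, 2.0, -2.0]
print(is_double_talk(far, echo_only, voiced=True))
print(is_double_talk(far, with_talker, voiced=True))
```

In a real system the echo path attenuates and filters the far end signal, so the thresholds would have to be tuned on recorded data rather than chosen by hand as they are here.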

VOCAL Improvements

The speech features given in [1] depend largely on the energy of the signal. Because additive noise changes the frame energy directly, the performance of this double talk detector is likely to degrade severely under noise. In addition, the reliance on a decision threshold that differs between voiced and unvoiced speech introduces another source of error that may degrade performance further. Using VOCAL Technologies' understanding of speech features, we can offer superior performance in the presence of noise, reverberation, and non-linearities.
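The sensitivity of energy-based features to noise can be seen in a small experiment. The sketch below adds Gaussian noise to a quiet tone and measures how far the frame log-energy moves. The frame length, tone frequency, sample rate, amplitudes, and noise level are all assumed values chosen for illustration, not figures from [1].

```python
import math
import random

def log_energy(frame):
    """Frame log-energy (base 10) with a small floor to avoid log10(0)."""
    energy = sum(s * s for s in frame) / len(frame)
    return math.log10(energy + 1e-12)

rng = random.Random(0)   # fixed seed so the run is repeatable
frame_len = 160          # assumed: 20 ms at an 8 kHz sample rate
clean = [0.01 * math.sin(2 * math.pi * 200 * k / 8000)
         for k in range(frame_len)]
noisy = [s + rng.gauss(0.0, 0.1) for s in clean]  # noise much louder than the tone

shift = log_energy(noisy) - log_energy(clean)
print(f"log-energy shift caused by noise: {shift:.2f}")
```

When the noise is louder than the speech, the log-energy feature is dominated by the noise, so a distance built mostly from energy terms no longer reflects whether a second talker is present.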

Product Offerings

VOCAL Technologies offers custom-designed, data-adaptive echo cancellation solutions with a robust double talk detector. Our custom implementations of such systems are meant to deliver optimum performance for your specific signal processing task. Contact us today to discuss your solution!

References

[1] M. Hamidia and A. Amrouche, "Double-talk detector based on speech feature extraction for acoustic echo cancellation," in 2014 22nd International Conference on Software, Telecommunications and Computer Networks (SoftCOM), Split, 2014, pp. 393-397.