Double Talk Detection Using Cross Correlation

The general idea of double talk detection is to compute a statistic that measures the similarity between the far-end speech and the near-end (microphone) signal. When near-end speech interferes in the form of double talk, the statistic indicates the resulting divergence; when there is no double talk, it indicates a high level of similarity. Many double talk detection algorithms have been proposed, but they share this basic idea. Here, we investigate the use of the normalized cross-correlation (NCC) for double talk detection.

Figure 1: Generic Echo Cancellation System [1]

The NCC Statistic

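For the system in Figure 1, which is reproduced from [1], the NCC statistic discussed in [1] is given by:

$$\xi = \sqrt{\mathbf{r}_{xy}^T \left(\sigma_y^2 \mathbf{R}_{xx}\right)^{-1} \mathbf{r}_{xy}} \tag{1}$$

where $\mathbf{x}$ is the vector of recent far-end samples, $y$ is the near-end (microphone) signal, $\mathbf{r}_{xy} = E[\mathbf{x}y]$ is their cross-correlation vector, $\mathbf{R}_{xx} = E[\mathbf{x}\mathbf{x}^T]$ is the far-end autocorrelation matrix, and $\sigma_y^2 = E[y^2]$ is the variance of $y$.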

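This equation illustrates the concept well. The cross-correlation vector $\mathbf{r}_{xy}$ between the far-end and near-end signals is multiplied by $\mathbf{R}_{xx}^{-1}$, which removes the contribution of the far-end autocorrelation from the cross-correlation. The inner product of the result is then taken with $\mathbf{r}_{xy}$. If there is no interference, the near-end signal is just the echo $y = \mathbf{h}^T\mathbf{x}$, where $\mathbf{h}$ is the echo path, so $\mathbf{r}_{xy} = \mathbf{R}_{xx}\mathbf{h}$ and the vector $\mathbf{R}_{xx}^{-1}\mathbf{r}_{xy}$ reduces to $\mathbf{h}$. Performing the inner product then gives the variance of $y$:

$$\mathbf{r}_{xy}^T \mathbf{R}_{xx}^{-1} \mathbf{r}_{xy} = \mathbf{h}^T \mathbf{R}_{xx} \mathbf{h} = \sigma_y^2$$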

Therefore, (1) would be identically one when there is no interference, and would drop away from one when interference is present. This is how the authors of [1] measured the similarity of the signals.
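The behavior described above can be checked numerically. The following is a minimal sketch, not the implementation from [1]: it assumes (1) has the form $\xi = \sqrt{\mathbf{r}_{xy}^T(\sigma_y^2\mathbf{R}_{xx})^{-1}\mathbf{r}_{xy}}$, replaces expectations with sample averages, and uses white noise as a stand-in for speech. The function names, the filter length `L`, and the decision threshold of 0.9 are illustrative assumptions.

```python
import numpy as np

def ncc_statistic(x, y, L):
    """Sample estimate of xi = sqrt(r_xy^T (sigma_y^2 R_xx)^-1 r_xy)."""
    # Each row of X is a far-end vector [x(n), x(n-1), ..., x(n-L+1)].
    X = np.lib.stride_tricks.sliding_window_view(x, L)[:, ::-1]
    yv = y[L - 1:]                   # y(n) aligned with the rows of X
    R_xx = X.T @ X / len(yv)         # far-end autocorrelation matrix
    r_xy = X.T @ yv / len(yv)        # cross-correlation vector
    sigma_y2 = np.mean(yv ** 2)      # power of the microphone signal
    return np.sqrt(r_xy @ np.linalg.solve(R_xx, r_xy) / sigma_y2)

THRESHOLD = 0.9  # hypothetical decision threshold, chosen for this demo only

def is_double_talk(xi, threshold=THRESHOLD):
    """Declare double talk when the statistic falls below the threshold."""
    return xi < threshold

# Synthetic test signals (assumptions: white noise instead of real speech).
rng = np.random.default_rng(0)
N, L = 20000, 8
x = rng.standard_normal(N)               # far-end signal
h = rng.standard_normal(L)
h /= np.linalg.norm(h)                   # echo path scaled to unit energy
echo = np.convolve(x, h)[:N]             # echo(n) = sum_k h[k] x(n-k)
near = rng.standard_normal(N)            # near-end talker stand-in

xi_single = ncc_statistic(x, echo, L)         # far-end only: close to 1
xi_double = ncc_statistic(x, echo + near, L)  # double talk: well below 1

print(f"single talk: xi = {xi_single:.3f}, double talk? {is_double_talk(xi_single)}")
print(f"double talk: xi = {xi_double:.3f}, double talk? {is_double_talk(xi_double)}")
```

Solving the linear system with `np.linalg.solve` avoids forming $\mathbf{R}_{xx}^{-1}$ explicitly. A practical detector would update these estimates recursively over short frames rather than over one long block.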

VOCAL Improvements

While this approach is intuitively appealing, it runs into problems in real-world scenarios. Consider a far-end signal x that passes through an echo path h and is received as y in additive noise n. Assuming the noise is uncorrelated with x, the power spectrum of y is:

$$S_{yy}(\omega) = |H(\omega)|^2 S_{xx}(\omega) + S_{nn}(\omega)$$

In this case, it is clear that even in the absence of double talk, (1) will not be identically one. As the noise worsens, (1) moves farther from the ideal value. In addition, when h changes, (1) changes as well. Hence this solution is not particularly well suited to noisy and reverberant environments. Reducing these effects by combining the detector with the VOCAL Technologies speech enhancement stack significantly improves the performance of this criterion.
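The effect of noise can be made concrete with a short worked derivation. This is a sketch under assumptions not stated in the original: (1) is taken to have the form $\xi = \sqrt{\mathbf{r}_{xy}^T(\sigma_y^2\mathbf{R}_{xx})^{-1}\mathbf{r}_{xy}}$ from [1], the noise n is zero mean and uncorrelated with x, and $\sigma_e^2$ and $\sigma_n^2$ are symbols introduced here for the echo power and the noise power.

```latex
\begin{align*}
y &= \mathbf{h}^T\mathbf{x} + n, \qquad E[\mathbf{x}\,n] = \mathbf{0} \\
\mathbf{r}_{xy} &= E[\mathbf{x}\,y] = \mathbf{R}_{xx}\mathbf{h} \\
\sigma_y^2 &= \mathbf{h}^T\mathbf{R}_{xx}\mathbf{h} + \sigma_n^2
            = \sigma_e^2 + \sigma_n^2 \\
\xi^2 &= \frac{\mathbf{r}_{xy}^T\mathbf{R}_{xx}^{-1}\mathbf{r}_{xy}}{\sigma_y^2}
       = \frac{\sigma_e^2}{\sigma_e^2 + \sigma_n^2}
       = \frac{1}{1 + \sigma_n^2/\sigma_e^2}
\end{align*}
```

Under these assumptions, an echo-to-noise ratio of 10 dB gives a statistic of about 0.95, and 0 dB gives about 0.71, even though no near-end talker is active. A fixed detection threshold therefore has to trade off missed double talk against false alarms caused by noise.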

Product Offerings

VOCAL Technologies offers custom designed data adaptive echo cancellation solutions with a robust double talk detector. Our custom implementations of such systems are meant to deliver optimum performance for your specific signal processing task. Contact us today to discuss your solution!

References

[1] J. Benesty et al., "A New Class of Doubletalk Detectors Based on Cross-Correlation," IEEE Transactions on Speech and Audio Processing, vol. 8, no. 2, pp. 168-172, March 2000.

VOCAL Technologies, Ltd.
520 Lee Entrance, Suite 202
Amherst New York 14228
Phone: +1-716-688-4675
Fax: +1-716-639-0713
Email: sales@vocal.com