Modern wake-up word detection (WWD) algorithms are fairly reliable when the SNR of the signal is just a few dB above 0 dB. Thus, for a user to “barge in” on a voice assistant device during audio playback, the near-end speech to playback echo ratio (NER) must be greater than 0 dB. Simply due to the proximity factor (the loudspeaker is closer to the microphone than the user is), the NER of a voice assistant device is much less than 0 dB, and often less than -10 dB. Therefore, acoustic echo cancellation (AEC) software is required to allow the user to barge in with voice commands. Once the wake-up word has been successfully detected, the audio playback can be attenuated or muted to allow for successful recognition of the voice command.
Figure 1. User Attempting Barge-In
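To make the proximity factor concrete, the NER can be estimated from the two distances alone. The sketch below is illustrative only: it assumes free-field 1/r sound propagation and equal source levels for the user and the loudspeaker, and the function names and the example geometry (loudspeaker 10 cm from the microphone, user 1 m away) are hypothetical, not taken from any measurement.

```python
import math


def ner_db(speech_power, echo_power):
    """Near-end speech to echo ratio in dB from two mean-square powers."""
    return 10.0 * math.log10(speech_power / echo_power)


def proximity_ner_db(d_speaker_m, d_user_m, level_diff_db=0.0):
    """NER at the microphone under an assumed free-field 1/r law.

    level_diff_db is the user's level minus the loudspeaker's level,
    both taken at the same reference distance (an assumption of this sketch).
    """
    # Pressure falls as 1/distance, so speech/echo = d_speaker / d_user.
    return level_diff_db + 20.0 * math.log10(d_speaker_m / d_user_m)


# Hypothetical geometry: loudspeaker 0.10 m from the mic, user 1 m away.
print(round(proximity_ner_db(0.10, 1.0), 1))  # -20.0
```

Under these assumptions the geometry alone puts the NER at about -20 dB, which is consistent with the figure of less than -10 dB quoted above. Real rooms add reflections and the playback level is rarely equal to the talker's, so the value is only a rough guide.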
The application of AEC for barge-in differs from standard full-duplex communication applications. WWD does not tolerate non-linear distortion of the wake-up word. This rules out the residual echo suppressor of an AEC algorithm, forcing the AEC to rely solely on the linear adaptive filter to improve the NER of the microphone signal. An AEC solution for a barge-in application must also be robust to double-talk: the adaptive filter must not diverge, and must hold its convergence, when the user utters the wake-up word. Therefore, the step-size control logic and double-talk detection must react swiftly and effectively to prevent missed detections of the wake-up word.
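The idea of a linear-only canceller that holds its convergence during double-talk can be sketched with a textbook NLMS filter. This is a minimal illustration, not VOCAL's algorithm: the echo path `h`, the white-noise playback signal, the toy near-end signal, and the `adapt` flag (which stands in for whatever double-talk detector or step-size controller is actually used) are all assumptions made for the example.

```python
import random


def nlms_cancel(far_end, mic, adapt, taps=8, mu=0.5, eps=1e-8):
    """Linear NLMS echo canceller; returns (residual signal, final weights).

    adapt[n] is False when a (hypothetical) double-talk detector or
    step-size controller decides the filter must be frozen.
    """
    w = [0.0] * taps       # echo-path estimate
    x_buf = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for x, d, ok in zip(far_end, mic, adapt):
        x_buf = [x] + x_buf[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x_buf))
        e = d - echo_est   # linear subtraction only, no residual suppressor
        residual.append(e)
        if ok:
            # Normalized step: scale the update by the far-end energy.
            g = mu * e / (sum(v * v for v in x_buf) + eps)
            w = [wi + g * xi for wi, xi in zip(w, x_buf)]
    return residual, w


# Synthetic demo: made-up echo path, white-noise playback, toy "speech".
random.seed(0)
h = [0.8, -0.3, 0.1]
n_total, n_talk = 4000, 3000
far = [random.gauss(0.0, 1.0) for _ in range(n_total)]
echo = [sum(h[k] * far[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(n_total)]
near = [0.0] * n_talk + [0.1 * random.gauss(0.0, 1.0)
                         for _ in range(n_total - n_talk)]
mic = [e + s for e, s in zip(echo, near)]
adapt = [n < n_talk for n in range(n_total)]  # freeze once the user talks
residual, w = nlms_cancel(far, mic, adapt)
```

In this idealized, noise-free setup the filter converges to `h` before the near-end signal starts, and because adaptation is frozen afterwards, the residual during double-talk is essentially the near-end signal itself, with no non-linear processing applied. In practice the freeze decision has to be made from the signals in real time, which is where the speed of the step-size control and double-talk detection matters.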
VOCAL’s AEC software has been implemented as a pre-processor to the WWD algorithm. The AEC software was evaluated across a range of NER values. Table 1 shows the wake-up word detection rate (WWDR) at different NER values.
Table 1. WWDR with AEC enabled.
VOCAL’s AEC algorithm is available with Barge-In mode and is optimized for leading DSPs and microprocessors from TI, ADI, Intel, ARM and other vendors.
Please contact us to discuss your acoustic echo canceller application requirements.