There are many approaches to digital audio watermarking, as discussed in Categories of the Digital Audio Watermarking Embeddings and Detection. This article focuses on echo-based solutions, which are informed time-domain solutions. This approach embeds watermark data in the host signal by adding echoes, and the watermark can be detected robustly through cepstral analysis. Imperceptibility is relatively high because listeners generally cannot perceive echoes with a delay under 10 ms.
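The general equation for echo-based watermark embedding is commonly written as:

$$y(n) = x(n) * h(n), \qquad h(n) = \delta(n) + \alpha\, w(n - d)$$

where x(n) is the host signal, y(n) is the watermarked signal, * denotes convolution, α is the echo amplitude, d is the sample delay parameter, h(n) is called the echo kernel, and w(n) is the echo filter (the symbols x, y, h, and α follow the usual convention in the echo-hiding literature). Multiple delay values are used to encode the watermark. The echo delays introduced can be forward or backward, and positive or negative. For example, a delay of -10 samples can represent a 0 bit and a delay of 25 samples can represent a 1 bit. In the most basic form, the echo filter is a Dirac delta function, so the echo is a single delayed, scaled copy of the signal; for added robustness, modified pseudo-noise (MPN) sequences are used instead.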
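The basic case, in which the echo is a single Dirac delta, amounts to adding a delayed and scaled copy of each frame to itself. The following is a minimal sketch of that idea, not a reference implementation. The function name `embed_echo_bits`, the frame length, the delays `d0` and `d1`, and the echo amplitude `alpha` are all illustrative assumptions, and the sketch uses forward (positive) delays only.

```python
import numpy as np

def embed_echo_bits(host, bits, frame_len=4096, d0=50, d1=80, alpha=0.3):
    """Embed one bit per frame by adding a single delayed echo.

    All parameter values are illustrative assumptions:
    d0 / d1 are the sample delays that encode a 0 / 1 bit,
    alpha is the echo amplitude.
    """
    host = np.asarray(host, dtype=float)
    if len(host) < len(bits) * frame_len:
        raise ValueError("host signal is too short for the requested bits")

    out = host.copy()
    for i, bit in enumerate(bits):
        start = i * frame_len
        frame = host[start:start + frame_len]
        d = d1 if bit else d0
        # y(n) = x(n) + alpha * x(n - d), restricted to the current frame
        out[start + d:start + frame_len] += alpha * frame[:frame_len - d]
    return out
```

With an impulse as the host frame, the output shows the original impulse followed by a copy scaled by `alpha` at the chosen delay, which makes the encoding easy to inspect. A practical system would typically also smooth the transitions between frames that carry different bits; that is omitted here.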
As mentioned earlier, cepstral analysis is used to detect and extract the watermark. The cepstrum is the inverse discrete Fourier transform (IDFT) of the logarithm of the power spectrum, and it is useful for finding echoes in a signal: an echo at a delay of d samples produces a peak in the cepstrum at index d. Therefore, comparing the values of the cepstral coefficients at the delays that correspond to the watermark encoding indicates which bit was set. For systems using MPN sequences, an additional autocorrelation step is needed to determine whether the correct sequence was received.
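The comparison of cepstral coefficients can be sketched as follows for the simple single-echo case. This is a hedged illustration rather than a complete detector: the names `power_cepstrum`, `detect_bit`, and `extract_bits`, the candidate delays `d0` and `d1`, and the frame length are assumptions, and the autocorrelation step needed for MPN sequences is not shown.

```python
import numpy as np

def power_cepstrum(frame):
    """IDFT of the log power spectrum of one frame."""
    spectrum = np.fft.fft(np.asarray(frame, dtype=float))
    # Small constant avoids log(0) on silent or perfectly periodic frames.
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    return np.real(np.fft.ifft(log_power))

def detect_bit(frame, d0=50, d1=80):
    """Return 1 if the cepstrum is larger at delay d1 than at d0, else 0.

    d0 and d1 are assumed to be the delays used when embedding.
    """
    c = power_cepstrum(frame)
    return int(c[d1] > c[d0])

def extract_bits(signal, n_bits, frame_len=4096, d0=50, d1=80):
    """Decode one bit per consecutive frame of length frame_len."""
    signal = np.asarray(signal, dtype=float)
    return [
        detect_bit(signal[i * frame_len:(i + 1) * frame_len], d0, d1)
        for i in range(n_bits)
    ]
```

The two delays are deliberately chosen so that neither is a multiple of the other, because an echo at delay d also contributes smaller cepstral components at 2d, 3d, and so on.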