The Kalman Filter was originally created in 1960 by Rudolf E. Kalman as a re-examination of the filtering and prediction problem using the Bode-Shannon formalism and state-space representation of dynamic systems [1]. This means that the random signals to be worked with are represented as the output of linear systems excited by white noise, and such linear systems are themselves described by first order difference equations. Kalman filtering is of particular importance to Acoustic Echo Cancellation (AEC) because the Kalman filter can be used to obtain a dynamic solution to the Wiener-Hopf equation [1,2]. To construct the solution, we consider the linear system shown in Figure 1:

Figure 1: Kalman’s Discrete Time Linear System [1]

## Wiener-Hopf Equation

The solution to the Wiener problem can be found by considering the special case of Figure 1, described by:

Where Φ is the transition matrix, is the Kalman state vector, is the input speech signal, *M* is the output matrix, and is the output speech. Specifically, the Wiener solution is the estimate where n_{1} ≥ n which minimizes the Mean-Squared-Error (MSE) As Kalman shows in [1], the minimizer is the orthogonal projection of onto the space Such a solution is generated by:

The transition matrix can be computed as:

Where Δ^{Δ} is the optimal input matrix, computed as:

Where *Cov*_{ε}^{Δ} is the optimal error covariance matrix, computed as:

Where *Q* is the covariance matrix of the speech signal input . As you can imagine, the Kalman filter in its native form is prohibitively expensive to implement in any real-time application. To apply the Kalman filter for application, we can turn to [2].

## Adaptive Kalman Filter

In an Acoustic Echo Cancellation application, we think of the Kalman state vector as the estimate of the echo path , and the output of the Kalman system as the microphone signal keeping the same notation there. Using Kalman’s state space model as input, we quite literally invert the system in Figure 1 and transform it into the frequency domain to create an adaptive estimate of the echo path as shown in Figure 2:

Figure 2: The Kalman Filter [2]

Where the uppercase denotes a frequency domain matrix. Here the transition matrix Φ is replaced by a constant matrix *A*, and we let *D* denote the frequency domain representation of the far-end speech convolved with the echo path, previously denoted in the time domain as Δ. Thus for clarity, we can rewrite the Kalman state-space equations in the frequency domain as:

Where *S* is the near end speech, and *C* is the frequency domain representation of *M* in Figure 1. In the Bode-Shannon formalism, *D* is fully characterized by its covariance matrix _{DD} and similarly for *S* we have _{SS}. Thus the echo path estimate is given by:

Where *W _{1}^{+}* denotes a correction to the current frame’s prediction computed as:

Where *K* is the Kalman gain, computed as:

Where ^{+}_{εε }is the covariance matrix for the frequency domain approximation error *ε* computed as:

Where εε is another correction computed as:

And *C* is computed as:

Where *F* is the DFT, *L *is the frame length, and *Z* is a projection matrix meant to linearize cyclic convolution in the frequency domain.

## Real Time Implementation

Now that we have an adaptive version of the Kalman filter, the last step for real time implementation is to reduce the computational complexity. To do so, [2] creates an entirely diagonal Kalman filter. First, we simplify the near end speech as:

Where SS is the easily computed power specrtral density of the near end speech. Then the microphone covariance can be described similarly as:

*C* can be approximated as:

And thus we can simplify the Kalman gain via:

## References

[1] R. E. Kalman, “A New Approach to Linear Filtering and Prediction Problems”, Journal of Basic Engineering, vol. 82, 1960, pp.35-45

[2] G. Enzner, P. Vary, “Frequency-domain adaptive Kalman filter for acoustic echo control in hands-free telephones”, Signal Processing, vol. 86, 2006, pp. 1140-1156