The room impulse response characterizes the acoustic path between the sound source and the microphone; its Fourier transform is the room transfer function. To recover the original source signal, the received microphone signal can be convolved with the inverse of the room impulse response. In general, this inverse can only be approximated because the room response is rarely minimum-phase, i.e. it rarely has an inverse that is both causal and stable.
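As a rough illustration of inverse filtering, the following Python sketch divides out a known response in the frequency domain, with a small regularization term standing in for the approximation that real rooms require. The function name `regularized_inverse_filter`, the parameter `eps`, and the three-tap toy response are assumptions made for this example; the toy response is deliberately well conditioned, which a measured room response usually is not.

```python
import numpy as np

def regularized_inverse_filter(y, h, eps=1e-8):
    """Approximately recover x from y = x * h (linear convolution).

    Hypothetical helper: uses conj(H) / (|H|^2 + eps) rather than 1 / H,
    so that frequencies where H is close to zero are not amplified without bound.
    """
    n = len(y)
    H = np.fft.rfft(h, n)
    Y = np.fft.rfft(y, n)
    G = np.conj(H) / (np.abs(H) ** 2 + eps)  # regularized inverse response
    return np.fft.irfft(Y * G, n)

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)          # stand-in for the source signal

# Toy "room" response (assumed for illustration): direct path plus two echoes.
h = np.zeros(64)
h[0], h[20], h[45] = 1.0, 0.5, 0.25

y = np.convolve(x, h)                  # simulated microphone signal
x_hat = regularized_inverse_filter(y, h)[: len(x)]

print("max reconstruction error:", np.max(np.abs(x_hat - x)))
```

Raising `eps` trades reconstruction accuracy for robustness when the response has deep spectral nulls or is only approximately known.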
There are several approaches to estimating the room impulse response. One approach is cepstral analysis. The cepstrum is commonly defined as the inverse Fourier transform of the logarithm of the magnitude spectrum, IDFT(log|X(ω)|), and measures how rapidly the log spectrum varies with frequency. Because convolution in the time domain becomes addition in the log spectrum, the speech and room components add in the cepstrum. Speech is considered slowly varying in the log spectrum relative to the reverberant components, so the speech and transfer function components can be separated.
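A minimal sketch of the idea, assuming the real cepstrum (inverse DFT of the log magnitude spectrum) and a single synthetic echo in place of a full room response. The names `real_cepstrum`, `delay`, and `gain` are hypothetical, and white noise stands in for speech. The echo appears as a peak in the cepstrum at the index equal to its delay in samples.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + 1e-12)  # small offset avoids log(0)
    return np.fft.ifft(log_mag).real

rng = np.random.default_rng(1)
x = rng.standard_normal(4096)      # stand-in for the source signal

# Assumed toy reverberation: one echo, 200 samples late, at 0.6 of the amplitude.
delay, gain = 200, 0.6
echo = np.zeros(delay + 1)
echo[0], echo[delay] = 1.0, gain
y = np.convolve(x, echo)

c = real_cepstrum(y)
# Skip index 0 (overall level) and search the first half of the cepstrum.
peak = 1 + int(np.argmax(c[1 : len(c) // 2]))
print("strongest cepstral peak at index:", peak)
```

In this toy case the echo contributes an isolated peak that can be located and removed; separating real speech from a dense room response is considerably harder.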
Another approach to estimating the room impulse response is to use the linear prediction (LP) residual. For clean speech, the LP residual stays close to zero apart from sparse peaks, whereas reverberation spreads the residual out over time. Thus, reverberation lowers the kurtosis of the probability distribution of the LP residual relative to clean speech. For dereverberation, an adaptive filter is therefore trained to maximize the kurtosis of the LP residual of its output, which suppresses the reverberant speech components.
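The effect of reverberation on residual kurtosis can be sketched numerically. The example below is an illustration under stated assumptions, not a dereverberation algorithm: it uses a synthetic signal (sparse random excitation through a second-order all-pole filter) in place of speech, an exponentially decaying noise sequence in place of a measured room response, and the autocorrelation method for the LP coefficients. All function names and parameter values are hypothetical. It only evaluates the kurtosis measure; the adaptive filter that would maximize it is not shown.

```python
import numpy as np

def ar_filter(e, a):
    """All-pole filter: s[n] = e[n] + a[0]*s[n-1] + a[1]*s[n-2] + ..."""
    s = np.zeros(len(e))
    for n in range(len(e)):
        s[n] = e[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                s[n] += ak * s[n - k]
    return s

def lp_residual(s, order=10):
    """LP residual using the autocorrelation method (normal equations)."""
    r = np.array([s[: len(s) - k] @ s[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])
    # e[n] = s[n] - sum_k a[k] * s[n-k]
    return np.convolve(s, np.concatenate(([1.0], -a)))[: len(s)]

def excess_kurtosis(e):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    e = e - e.mean()
    return np.mean(e ** 4) / np.mean(e ** 2) ** 2 - 3.0

rng = np.random.default_rng(2)
n = 8000

# Assumed speech-like signal: sparse excitation through a stable all-pole filter.
excitation = rng.standard_normal(n) * (rng.random(n) < 0.02)
clean = ar_filter(excitation, [1.3, -0.8])

# Assumed toy room response: direct path plus a decaying noise tail.
rir = 0.5 * rng.standard_normal(2000) * np.exp(-np.arange(2000) / 400)
rir[0] = 1.0
reverberant = np.convolve(clean, rir)[:n]

k_clean = excess_kurtosis(lp_residual(clean))
k_reverb = excess_kurtosis(lp_residual(reverberant))
print("clean residual kurtosis:      ", k_clean)
print("reverberant residual kurtosis:", k_reverb)
```

The clean residual is dominated by a few large peaks and so has a heavy-tailed distribution, while the reverberant residual is a sum of many overlapping contributions and is closer to Gaussian.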