Correlation Noise in Wyner-Ziv Video Coding

Distributed Video Coding (DVC) reduces complexity and power usage at the transmitter by shifting most of the coding complexity to the receiver. DVC principles are based on the Slepian-Wolf theorem (lossless case) and the Wyner-Ziv theorem (lossy case).

DVC encodes correlated video frames independently of one another (intra-frame encoding). The computationally expensive task of exploiting the correlation between frames is left to the decoder. The decoder extracts motion information to build an estimate, called side information (SI), of some frames of the sequence. SI quality has a significant impact on the coding performance of the system.

At the encoder, an input video sequence is divided into key frames and Wyner-Ziv frames. Each key frame is encoded using a conventional intra-frame encoder (for example, H.264 or JPEG2000), while each Wyner-Ziv frame is encoded using a distributed encoder to generate Wyner-Ziv bits.
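The split into key frames and Wyner-Ziv frames can be sketched as follows. This is only an illustration: the fixed group-of-pictures size `gop_size` and the function name `split_frames` are assumptions, since the text does not say how the two frame types are interleaved.

```python
def split_frames(num_frames, gop_size=2):
    """Return (key_frame_indices, wyner_ziv_frame_indices).

    Assumption: the first frame of every group of `gop_size` frames is a
    key frame and the remaining frames in the group are Wyner-Ziv frames.
    """
    if gop_size < 2:
        raise ValueError("gop_size must be at least 2")
    key_frames = [i for i in range(num_frames) if i % gop_size == 0]
    wz_frames = [i for i in range(num_frames) if i % gop_size != 0]
    return key_frames, wz_frames


key_frames, wz_frames = split_frames(5)
print(key_frames, wz_frames)
```

With `gop_size=2` every other frame is a key frame, so each Wyner-Ziv frame has a key frame on either side from which the decoder can interpolate.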

Wyner-Ziv decoders are based on a model of the statistical dependence between the source and the SI. Accurate modeling of this correlation lets the decoder exploit the statistics shared by the source and the side information, and has a strong impact on performance. The dependence between the side information S_SI and the source information S_Source is modeled as:

S_SI = S_Source + N,

where N is the correlation noise. In many cases, the correlation noise between the frame to be encoded and the motion-compensated side information available at the decoder is modeled as a Laplacian distribution.
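As a minimal sketch, a zero-mean Laplacian density f(n) = (alpha / 2) * exp(-alpha * |n|) has variance 2 / alpha^2, so the Laplacian parameter can be obtained from a variance estimate. The zero-mean assumption and the function names below are illustrative choices, not something stated above.

```python
import math


def laplacian_alpha(variance):
    """Laplacian parameter alpha of a zero-mean Laplacian with this variance."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    # The variance of the density (alpha / 2) * exp(-alpha * |n|) is 2 / alpha**2.
    return math.sqrt(2.0 / variance)


def laplacian_pdf(n, alpha):
    """Density of the correlation noise N evaluated at n."""
    return 0.5 * alpha * math.exp(-alpha * abs(n))


alpha = laplacian_alpha(2.0)
print(alpha, laplacian_pdf(0.0, alpha))
```

A larger alpha means a more peaked distribution, that is, side information that is closer to the source.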

The decoder constructs the SI using motion-compensated interpolation of the key frames. The noise variance is estimated from the residue obtained by motion compensating the key frames.
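The variance estimate can be sketched as below. The text does not define the residue exactly; this example assumes one common choice, half the difference between the forward and backward motion-compensated key frames, and computes a single variance for the whole frame. Frames are plain lists of rows, and all names are hypothetical.

```python
def residual_frame(forward_mc, backward_mc):
    """Residue between two motion-compensated key frames.

    Assumption: residue = (forward - backward) / 2, sample by sample.
    """
    return [
        [(f - b) / 2.0 for f, b in zip(f_row, b_row)]
        for f_row, b_row in zip(forward_mc, backward_mc)
    ]


def noise_variance(residual):
    """Frame-level variance of the residue samples."""
    samples = [value for row in residual for value in row]
    mean = sum(samples) / len(samples)
    return sum((value - mean) ** 2 for value in samples) / len(samples)


forward = [[10, 12], [14, 16]]
backward = [[8, 12], [14, 18]]
print(noise_variance(residual_frame(forward, backward)))
```

The same functions could be applied to a single block instead of a whole frame if a finer-grained estimate is wanted.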

For Transform-Domain Wyner-Ziv coding, the estimated Laplacian parameter may be the same for all DCT blocks within a frame; in some implementations it is adjusted individually for each DCT block. The DCT coefficients of the entire Wyner-Ziv frame are grouped together according to the position each coefficient occupies within its block, forming DCT coefficient bands. The coefficients are quantized, and the quantized values are coded bit plane by bit plane.
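The grouping into bands and the extraction of bit planes can be illustrated with a short sketch. It assumes each block is given as a flat list of already quantized, non-negative integer indices in a fixed scan order; the scan order, the number of bits per band, and the function names are assumptions made for the example.

```python
def form_bands(blocks):
    """Group coefficients by position: band k holds coefficient k of every block."""
    coefficients_per_block = len(blocks[0])
    return [[block[k] for block in blocks] for k in range(coefficients_per_block)]


def bit_planes(band, num_bits):
    """Split the quantized indices of one band into bit planes.

    The most significant bit plane comes first. Indices are assumed to be
    non-negative and smaller than 2 ** num_bits.
    """
    return [
        [(index >> bit) & 1 for index in band]
        for bit in range(num_bits - 1, -1, -1)
    ]


blocks = [[5, 1], [2, 3]]          # two tiny blocks of two coefficients each
bands = form_bands(blocks)         # [[5, 2], [1, 3]]
print(bit_planes(bands[0], 3))     # planes of the first band
```

Each bit plane of each band is then a binary sequence that can be passed to the channel encoder on its own.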

To reduce the number of parity bit requests made by the decoder, the encoder determines the minimum number of parity bits to be sent per bit plane and per band. The number of parity bits has a strong impact on decoding complexity. The minimum bitrate is defined by the conditional entropy H(S_Source | S_SI), which is a function of the crossover probability Pr(S_Source ≠ S_SI).
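To make the dependence on the crossover probability concrete, the sketch below assumes that each bit plane is related to its side information through a binary symmetric channel with crossover probability p. Under that assumption the conditional entropy is the binary entropy H(p) = -p log2(p) - (1 - p) log2(1 - p) bits per source bit. The function names are hypothetical, and a practical code needs more parity bits than this lower bound.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits; H(0) and H(1) are defined as 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def min_parity_bits(bitplane_length, crossover_probability):
    """Lower bound on the bits needed for one bit plane (binary symmetric model)."""
    return math.ceil(bitplane_length * binary_entropy(crossover_probability))


print(binary_entropy(0.11))
print(min_parity_bits(1584, 0.11))
```

The bound shrinks quickly as the side information improves, which is why SI quality matters so much for the rate.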

The Laplacian correlation model between the Wyner-Ziv encoded subband samples and the corresponding SI samples is assumed to be known to the encoder. The crossover probability is estimated from this model. The error rate is estimated at the output of the turbo decoder. If the estimated error rate exceeds a given threshold, the turbo decoder requests more parity bits from the encoder’s buffer over a feedback channel.
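The request loop can be sketched as follows. The turbo decoder itself is not implemented here: `try_decode` is a hypothetical callable that takes the parity bits received so far and returns the decoded bits together with an estimated error rate. The step size, the threshold value, and the list-based buffer are likewise assumptions.

```python
def decode_with_feedback(parity_buffer, try_decode, initial_bits, step,
                         error_threshold=1e-3):
    """Request parity bits in increments until the error estimate is acceptable.

    Returns (decoded_bits, number_of_parity_bits_used). Decoding also stops
    when the encoder's buffer has been sent in full.
    """
    sent = initial_bits
    while True:
        sent = min(sent, len(parity_buffer))
        decoded, estimated_error = try_decode(parity_buffer[:sent])
        if estimated_error <= error_threshold or sent == len(parity_buffer):
            return decoded, sent
        sent += step  # models one request over the feedback channel


def toy_decoder(parity):
    # Stand-in for a turbo decoder: pretends decoding succeeds with 6 bits.
    return "decoded", 0.0 if len(parity) >= 6 else 0.5


print(decode_with_feedback(list(range(10)), toy_decoder, initial_bits=2, step=2))
```

Starting from the encoder's estimate of the minimum number of parity bits, rather than from a small fixed amount, reduces the number of passes through this loop.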

Complete Communications Engineering
