G.729AB Speech Coder

The G.729AB speech coder is a reduced-complexity Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-ACELP) speech compression algorithm that uses discontinuous transmission to reduce bandwidth. It is described in ITU-T G.729 Annex A and Annex B. G.729AB is especially suitable for VoIP applications, where the silent periods of normal conversation can be exploited to reduce bandwidth usage.

The Algorithm

G.729AB operates on 10 ms input frames and generates an 80-bit output frame for each one. Since G.729AB is based on the Code-Excited Linear Prediction (CELP) model, each 80-bit frame contains the linear prediction coefficients, excitation codebook indices, and gain parameters that the decoder uses to reproduce speech. The encoder input and decoder output are 16-bit linear PCM samples; the compressed data stream between them runs at 8 kbit/s. G.729AB has the same total algorithmic delay as the G.729 speech coder, 15 ms.
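The figures above follow from simple arithmetic, sketched below for illustration. The 8 kHz sample rate is taken from the Configurations list, and the split of the 15 ms delay into one 10 ms frame plus 5 ms of look-ahead comes from the G.729 recommendation rather than from this page. All constant and function names are hypothetical.

```python
# Frame arithmetic for an 8 kHz, 10 ms, 80-bit-per-frame coder.
# Names are illustrative only; they do not belong to any real codec API.

SAMPLE_RATE_HZ = 8000        # narrowband telephony sample rate
FRAME_MS = 10                # input frame duration
FRAME_BITS = 80              # compressed bits per active speech frame
PCM_BITS_PER_SAMPLE = 16     # linear PCM word size
LOOKAHEAD_MS = 5             # look-ahead per G.729 (assumption, not from this page)


def samples_per_frame() -> int:
    """Number of PCM samples in one input frame."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def coded_bitrate_bps() -> int:
    """Bit rate of the compressed stream during active speech."""
    return FRAME_BITS * 1000 // FRAME_MS


def pcm_bitrate_bps() -> int:
    """Bit rate of the uncompressed 16-bit linear PCM stream."""
    return SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE


def algorithmic_delay_ms() -> int:
    """One full frame must be buffered, plus the look-ahead."""
    return FRAME_MS + LOOKAHEAD_MS


if __name__ == "__main__":
    print("samples per frame:", samples_per_frame())
    print("coded bit rate (bit/s):", coded_bitrate_bps())
    print("compression ratio:", pcm_bitrate_bps() // coded_bitrate_bps())
    print("algorithmic delay (ms):", algorithmic_delay_ms())
```

The ratio computed here is relative to 16-bit linear PCM; relative to 8-bit A-law or μ-law PCM at 64 kbit/s it would be half as large.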

The G.729AB speech coder contains the same reduced-complexity modifications as the G.729A speech coder. In addition, it uses the G.729 Annex B specification to reduce transmission during periods of silence. Normal conversation contains frequent pauses during which no voice is present in the signal. During these pauses, the coder can calculate the characteristics of the background noise and transmit those instead, updating them only when the background noise changes significantly.

The G.729AB coder uses Voice Activity Detection (VAD) to determine whether voice is present in the input signal. If so, the frame is constructed in accordance with G.729A. Otherwise, the noise characteristics are calculated and a Silence Insertion Descriptor (SID) frame is sent instead; a SID frame contains only 15 bits. The decoder uses this information for Comfort Noise Generation (CNG), since listeners expect some level of background noise during periods of silence. Discontinuous Transmission (DTX) reduces bandwidth usage by transmitting only voice frames or SID frames; during silence, a new SID frame is sent only when the background noise has changed enough to need a fresh description, and nothing is sent otherwise.
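The per-frame decision described above can be sketched as follows. This is a deliberately simplified illustration, not the Annex B algorithm: the real VAD combines several signal features, whereas this sketch assumes a single energy threshold, and the threshold values, function names, and frame-type labels are all hypothetical.

```python
import math
from typing import List, Optional, Sequence

# Frame-type labels used only in this sketch.
VOICE, SID, NO_TX = "voice", "sid", "no_tx"


def frame_level_db(frame: Sequence[int]) -> float:
    """Mean-square level of one frame in dB (the +1 avoids log of zero)."""
    mean_sq = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_sq + 1.0)


def dtx_decisions(
    frames: Sequence[Sequence[int]],
    vad_threshold_db: float = 40.0,   # assumed toy VAD threshold
    sid_update_db: float = 3.0,       # assumed "significant change" in noise
) -> List[str]:
    """Classify each frame as a voice frame, a SID frame, or no transmission."""
    decisions: List[str] = []
    last_sid_db: Optional[float] = None  # level described by the last SID sent

    for frame in frames:
        level = frame_level_db(frame)
        if level > vad_threshold_db:
            # Toy VAD says speech: send a normal coded frame.
            decisions.append(VOICE)
            last_sid_db = None  # next silence starts with a fresh SID
        elif last_sid_db is None or abs(level - last_sid_db) > sid_update_db:
            # First silent frame, or the noise changed noticeably: send a SID.
            decisions.append(SID)
            last_sid_db = level
        else:
            # Noise unchanged: the decoder keeps generating comfort noise.
            decisions.append(NO_TX)
    return decisions


if __name__ == "__main__":
    loud, quiet, noisier = [1000] * 80, [10] * 80, [30] * 80
    print(dtx_decisions([loud, quiet, quiet, noisier, loud, quiet]))
```

In this toy run, the second quiet frame produces no transmission because its level matches the level already described by the preceding SID frame.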

Features

  • Compliant with the G.729 Annex A and Annex B specifications
  • MIPS/memory requirements for various platforms are available
  • PSQM/PSQM+ values under different network conditions are also available
  • Full and half duplex modes of operation
  • Passes ITU test vectors
  • Optimized for high performance on leading edge DSP architectures
  • Multichannel implementation
  • Multi-tasking environment compatible

Configurations

  • DAA interface using linear codec at 8.0 kHz sample rate
  • Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
  • North American/International Telephony (including caller ID) support available
  • Simultaneous DTMF detector operation available (typically fewer than 150 hits on the Bellcore test tape)
  • MF tone detectors, general purpose programmable tone detectors/generators available
  • Data/Facsimile/Voice Distinction available
  • Common compressed speech frame stream interface to support systems with multiple speech coders
  • Dynamic speech coder selection when multiple speech codecs are available
  • Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules
  • Multiple ports can be executed on a single DSP

Data Sheet

ITU Recommendation G.729

ITU Recommendation G.729 Annex A

ITU Recommendation G.729 Annex B