Speex

Speex is an open-source voice codec developed specifically for Voice over IP (VoIP) and file-based compression applications, and it may be freely distributed under the revised BSD license. It yields good-quality speech using Code Excited Linear Prediction (CELP) encoding techniques, is available at multiple bit-rates, and is robust to lost packets.

Overview

  • 3 different sampling rates: 8 kHz, 16 kHz, and 32 kHz (narrowband, wideband, and ultra-wideband)
  • Speex encoding is controlled most of the time by a quality parameter that ranges from 0 to 10
  • Encoder complexity can be varied dynamically by adjusting how the search is performed
  • Variable bit-rate (VBR) operation dynamically adjusts the bit-rate to adapt to the complexity of the audio being encoded
  • Voice activity detection (VAD) with comfort noise generation (CNG) to reproduce background noise
  • Discontinuous transmission (DTX) mode to transmit only when speech is present or comfort noise parameters must be updated
  • Perceptual enhancement in the decoder, which improves sound quality subjectively (although not objectively)
  • Algorithmic delay is 30 ms in narrowband mode and 34 ms in wideband mode
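
The sampling rates and delay figures above can be related to frame size with a little arithmetic. The sketch below assumes a 20 ms frame duration, which is not stated in this overview, and treats the remainder of each quoted algorithmic delay as look-ahead. All names are illustrative only.

```python
# Illustrative arithmetic only. FRAME_MS = 20 is an assumption, not a figure
# taken from the overview above.
FRAME_MS = 20

# Sampling rates and algorithmic delays as quoted in the overview.
SAMPLE_RATE_HZ = {"narrowband": 8000, "wideband": 16000, "ultra-wideband": 32000}
ALGORITHMIC_DELAY_MS = {"narrowband": 30, "wideband": 34}


def samples_per_frame(mode):
    """Number of PCM samples in one frame for the given mode."""
    return SAMPLE_RATE_HZ[mode] * FRAME_MS // 1000


def lookahead_ms(mode):
    """Part of the algorithmic delay not accounted for by the frame itself."""
    return ALGORITHMIC_DELAY_MS[mode] - FRAME_MS


for mode in SAMPLE_RATE_HZ:
    line = f"{mode}: {samples_per_frame(mode)} samples per frame"
    if mode in ALGORITHMIC_DELAY_MS:
        line += f", {lookahead_ms(mode)} ms look-ahead"
    print(line)
```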

Features

  • Free software/open-source, patent and royalty-free
  • MIPS/memory requirements for various platforms are available
  • PSQM/PSQM+ values under different network conditions are also available
  • Integration of narrowband and wideband using an embedded bit-stream
  • Wide range of bit-rates available (from 2 kbps to 44 kbps)
  • Dynamic bit-rate switching and Variable Bit-Rate (VBR)
  • VAD integrated with VBR
  • Variable complexity
  • Ultra-wideband mode at 32 kHz (up to 48 kHz)
  • Intensity stereo encoding option
  • Code Excited Linear Prediction (CELP) based
  • Optimized for high performance on leading edge DSP architectures
  • Multichannel implementation
  • Multi-tasking environment compatible
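
To give a feel for what the quoted bit-rate range means on the wire, the sketch below estimates the payload carried by one frame at a few nominal rates. It assumes a 20 ms frame duration, which is not stated in the list above, and ignores any container or RTP overhead. The function name is illustrative only.

```python
import math

# Assumed frame duration; not stated in the feature list above.
FRAME_MS = 20


def payload_bytes_per_frame(bit_rate_kbps, frame_ms=FRAME_MS):
    """Bytes needed to carry one frame, rounded up to a whole byte."""
    bits = bit_rate_kbps * frame_ms  # kbit/s multiplied by ms gives bits
    return math.ceil(bits / 8)


# Low end, a mid-range rate, and high end of the quoted 2 to 44 kbps range.
for rate_kbps in (2, 8, 44):
    print(f"{rate_kbps} kbps: {payload_bytes_per_frame(rate_kbps)} bytes per frame")
```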

Configurations

  • DAA interface using linear codec at 8.0 kHz sample rate
  • Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
  • North American/International Telephony (including caller ID) support available
  • Simultaneous DTMF detector operation available (typically fewer than 150 hits on the Bellcore test tape)
  • MF tone detectors, general purpose programmable tone detectors/generators available
  • Data/Facsimile/Voice Distinction available
  • Common compressed speech frame stream interface to support systems with multiple speech coders
  • Dynamic speech coder selection when multiple speech codecs are available
  • Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules
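
A companded PCM stream such as the one listed above generally has to be expanded to linear samples before it reaches a speech encoder. The sketch below shows one way μ-law bytes might be expanded to 16-bit linear values, following the usual G.711 expansion. It is an illustration under that assumption, not the interface code of this product, and the function names are hypothetical.

```python
# Bias added during mu-law compression and removed again on expansion.
ULAW_BIAS = 0x84


def ulaw_to_linear(code):
    """Expand one 8-bit mu-law code to a signed 16-bit linear sample."""
    code = ~code & 0xFF            # mu-law bytes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07  # segment number
    mantissa = code & 0x0F         # step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data):
    """Expand a bytes object of mu-law codes to a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```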