
Speex

Speex is an open source wideband voice codec developed specifically for high definition voice over IP (HD VoIP) and file-based compression applications, and it may be freely distributed. It yields good-quality speech using Code Excited Linear Prediction (CELP) encoding techniques, is available at multiple bit-rates, and is robust to lost packets.

Overview

  • 3 different sampling rates: 8 kHz, 16 kHz, and 32 kHz (narrowband, wideband, and ultra-wideband)
  • Encoding is usually controlled by a single quality parameter that ranges from 0 to 10
  • Encoder complexity can be varied dynamically by adjusting how lookup is performed
  • Bit-rate can be adjusted dynamically to adapt to the complexity of the audio being encoded
  • Voice activity detection (VAD) with comfort noise generation (CNG) to reproduce background noise
  • Discontinuous transmission (DTX) mode to transmit only when speech is present or comfort noise parameters must be updated
  • Perceptual enhancement in the decoder, which improves subjective sound quality (although not objective measures)
  • Algorithmic delay is 30 ms in narrowband mode and 34 ms in wideband mode
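
The sampling rates and delay figures above translate directly into per-frame sample counts. The following is a minimal Python sketch, assuming Speex's 20 ms frame duration (a figure taken from the Speex manual, not stated on this page). The names `samples_per_frame` and `lookahead_ms` are illustrative only, and the look-ahead is simply derived as the listed algorithmic delay minus one frame.

```python
# Assumption: Speex codes speech in 20 ms frames (per the Speex manual).
FRAME_MS = 20

# Sampling rates listed in the overview, in Hz.
SAMPLE_RATES_HZ = {
    "narrowband": 8000,
    "wideband": 16000,
    "ultra-wideband": 32000,
}

# Algorithmic delay listed in the overview, in milliseconds.
ALGORITHMIC_DELAY_MS = {"narrowband": 30, "wideband": 34}


def samples_per_frame(mode: str) -> int:
    """Number of PCM samples in one 20 ms frame for the given mode."""
    return SAMPLE_RATES_HZ[mode] * FRAME_MS // 1000


def lookahead_ms(mode: str) -> int:
    """Delay beyond one frame: algorithmic delay minus the frame length."""
    return ALGORITHMIC_DELAY_MS[mode] - FRAME_MS


if __name__ == "__main__":
    for mode in SAMPLE_RATES_HZ:
        print(mode, samples_per_frame(mode), "samples per frame")
```

Under these assumptions, an application feeding the encoder would buffer 160 samples per frame at 8 kHz, 320 at 16 kHz, and 640 at 32 kHz.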

Features

  • Free software/open-source, patent and royalty-free
  • MIPS/memory requirements for various platforms are available
  • PSQM/PSQM+ values under different network conditions are available
  • Integration of narrowband and wideband using an embedded bit-stream
  • Wide range of bit-rates available (from 2 kbps to 44 kbps)
  • Standard bit rates of 2.15, 3.95, 5.95, 8, 11, 15, 18.2 and 24.6 kbps
  • Dynamic bit-rate switching and Variable Bit-Rate (VBR)
  • VAD integrated with VBR
  • Variable complexity
  • Ultra-wideband mode at 32 kHz (up to 48 kHz)
  • Intensity stereo encoding option
  • Code Excited Linear Prediction (CELP) based
  • Optimized for high performance on leading edge DSP architectures
  • Multichannel implementation
  • Multi-tasking environment compatible
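
The standard bit rates listed above can be converted into a per-frame payload size, which is useful when sizing packet buffers. This is a rough sketch rather than a definitive implementation: it assumes 20 ms frames (from the Speex manual, not from this page), and the helper names `bits_per_frame` and `bytes_per_frame` are hypothetical.

```python
# Assumption: one encoded frame covers 20 ms of speech.
FRAME_MS = 20

# Standard bit rates from the feature list, in kbps.
STANDARD_BIT_RATES_KBPS = [2.15, 3.95, 5.95, 8, 11, 15, 18.2, 24.6]


def bits_per_frame(kbps: float) -> int:
    """Encoded bits in one frame: (kbit/s) * (ms) = bits."""
    return round(kbps * FRAME_MS)


def bytes_per_frame(kbps: float) -> int:
    """Whole bytes needed to carry one frame (rounded up)."""
    return (bits_per_frame(kbps) + 7) // 8


if __name__ == "__main__":
    for rate in STANDARD_BIT_RATES_KBPS:
        print(f"{rate} kbps: {bits_per_frame(rate)} bits, "
              f"{bytes_per_frame(rate)} bytes per frame")
```

For example, under the 20 ms assumption the 8 kbps rate works out to 160 bits, or 20 bytes, per frame before any RTP, UDP, or IP headers are added.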

Configurations

  • DAA interface using linear codec at 8.0 kHz sample rate
  • Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
  • North American/International Telephony (including caller ID) support available
  • Simultaneous DTMF detector operation available (fewer than 10 talk-off hits on the Bellcore test tape set)
  • MF tone detectors, general purpose programmable tone detectors/generators available
  • Data/Facsimile/Voice Distinction available
  • Common compressed speech frame stream interface to support systems with multiple speech coders
  • Dynamic speech coder selection when multiple speech codecs are available
  • Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules
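
When the input is an 8.0 kHz μ-law PCM stream, each byte has to be expanded to a linear sample before it reaches a speech encoder. The sketch below shows the widely used G.711 μ-law expansion in Python; it is an illustration of that general technique, not this product's implementation, and the function names are assumptions made for the example.

```python
BIAS = 0x84  # 132, the bias used by the common G.711 mu-law reference code


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit mu-law code to a signed linear sample (16-bit range)."""
    code = ~code & 0xFF             # mu-law bytes are stored bit-inverted
    sign = code & 0x80              # top bit marks a negative sample
    exponent = (code >> 4) & 0x07   # 3-bit segment number
    mantissa = code & 0x0F          # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list:
    """Expand a buffer of mu-law bytes into a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


if __name__ == "__main__":
    # 160 bytes correspond to 20 ms of audio at 8 kHz.
    frame = bytes([0xFF] * 160)
    print(len(decode_ulaw(frame)), "linear samples")
```

A-law input would need its own expansion table or routine; only the μ-law case is sketched here.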

Links

Audio Examples

PSQM/PSQM+ values

IETF Speex Draft 7

IETF Speex Draft 6

Speex Manual

RFC 3551 - RTP Packetization

RTP Parameters