Speex
Speex is an open-source wideband voice codec developed specifically for high-definition voice over IP (HD VoIP) and file-based
compression applications, and it may be freely distributed. It yields good-quality speech using Code Excited
Linear Prediction (CELP) encoding techniques, is available at multiple bit-rates, and is robust to lost packets.
Overview
- 3 different sampling rates: 8 kHz, 16 kHz, and 32 kHz (narrowband, wideband, and ultra-wideband)
- Encoding is usually controlled by a quality parameter that ranges from 0 to 10
- Encoder complexity can be varied dynamically by adjusting how lookup is performed
- Bit-rate adjusts dynamically to adapt to the complexity of the audio being encoded
- Voice activity detection (VAD) with comfort noise generation (CNG) to reproduce background noise
- Discontinuous transmission (DTX) mode to transmit only when speech is present or comfort noise parameters must be updated
- Perceptual enhancement in the decoder, which improves sound quality subjectively (although not objectively)
- Algorithmic delay is 30 ms in narrowband mode and 34 ms in wideband mode
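The sampling rates and delay figures above can be turned into buffer sizes with a little arithmetic. The following is a minimal sketch, not part of any Speex API: it assumes a 20 ms frame duration (a value not stated in the list above), and the names `FRAME_MS`, `frame_samples`, `delay_samples`, and `lookahead_ms` are hypothetical helpers introduced only for illustration.

```python
# Assumption: each frame covers 20 ms of audio (not stated in the list above).
FRAME_MS = 20

# Sampling rates and algorithmic delays taken from the overview list.
SAMPLE_RATES_HZ = {"narrowband": 8000, "wideband": 16000, "ultra-wideband": 32000}
ALGORITHMIC_DELAY_MS = {"narrowband": 30, "wideband": 34}


def samples_for(duration_ms, sample_rate_hz):
    """Number of PCM samples covering duration_ms at the given sampling rate."""
    return duration_ms * sample_rate_hz // 1000


def frame_samples(mode):
    """Samples per frame for a mode, under the 20 ms frame assumption."""
    return samples_for(FRAME_MS, SAMPLE_RATES_HZ[mode])


def delay_samples(mode):
    """Algorithmic delay for a mode, expressed in samples."""
    return samples_for(ALGORITHMIC_DELAY_MS[mode], SAMPLE_RATES_HZ[mode])


def lookahead_ms(mode):
    """Delay beyond one frame, under the 20 ms frame assumption."""
    return ALGORITHMIC_DELAY_MS[mode] - FRAME_MS


if __name__ == "__main__":
    for mode in ALGORITHMIC_DELAY_MS:
        print(mode, frame_samples(mode), delay_samples(mode), lookahead_ms(mode))
```

Under these assumptions, a narrowband application would buffer 160 samples per frame, and the 30 ms delay corresponds to 240 samples at 8 kHz.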
Features
- Free software/open-source, patent and royalty-free
- MIPS/memory requirements for various platforms are available
- PSQM/PSQM+ values under different network conditions are also available
- Integration of narrowband and wideband using an embedded bit-stream
- Wide range of bit-rates available (from 2 kbps to 44 kbps)
- Standard bit rates of 2.15, 3.95, 5.95, 8, 11, 15, 18.2 and 24.6 kbps
- Dynamic bit-rate switching and Variable Bit-Rate (VBR)
- VAD integrated with VBR
- Variable complexity
- Ultra-wideband mode at 32 kHz (up to 48 kHz)
- Intensity stereo encoding option
- Code Excited Linear Prediction (CELP) based
- Optimized for high performance on leading edge DSP architectures
- Multichannel implementation
- Multi-tasking environment compatible
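To relate the standard bit rates listed above to packet sizes, the sketch below converts each rate into bits and whole bytes per frame. It assumes a 20 ms frame duration, which is not stated in the list, and the helper names `bits_per_frame` and `bytes_per_frame` are hypothetical. Header overhead (RTP, UDP, IP) is deliberately left out.

```python
# Assumption: each encoded frame covers 20 ms of audio.
FRAME_MS = 20

# Standard bit rates from the feature list, expressed in bits per second
# (2.15, 3.95, 5.95, 8, 11, 15, 18.2 and 24.6 kbps).
STANDARD_BIT_RATES_BPS = [2150, 3950, 5950, 8000, 11000, 15000, 18200, 24600]


def bits_per_frame(bit_rate_bps, frame_ms=FRAME_MS):
    """Encoded bits produced for one frame at a constant bit rate."""
    return bit_rate_bps * frame_ms // 1000


def bytes_per_frame(bit_rate_bps, frame_ms=FRAME_MS):
    """Whole bytes needed to carry one frame (rounded up to a byte boundary)."""
    return (bits_per_frame(bit_rate_bps, frame_ms) + 7) // 8


if __name__ == "__main__":
    for rate in STANDARD_BIT_RATES_BPS:
        print(f"{rate / 1000:5.2f} kbps: "
              f"{bits_per_frame(rate):3d} bits, {bytes_per_frame(rate):2d} bytes per frame")
```

Integer bit rates are used instead of floating-point kbps values so the per-frame figures come out exact.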
Configurations
- DAA interface using linear codec at 8.0 kHz sample rate
- Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
- North American/International Telephony (including caller ID) support available
- Simultaneous DTMF detector operation available (less than 10 talk-off hits on Bellcore test tape set)
- MF tone detectors, general purpose programmable tone detectors/generators available
- Data/Facsimile/Voice Distinction available
- Common compressed speech frame stream interface to support systems with multiple speech coders
- Dynamic speech coder selection if multiple speech codecs are available
- Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules
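When the codec is fed from an 8.0 kHz μ-law PCM stream, the companded bytes generally need to be expanded to linear samples before encoding. The sketch below shows the conventional G.711 μ-law expansion as one way this might be done. It is an illustration only and is not taken from the product's interface; `ulaw_to_linear` and `decode_ulaw` are hypothetical names.

```python
ULAW_BIAS = 0x84  # bias of 132 used by the conventional mu-law expansion


def ulaw_to_linear(code):
    """Expand one 8-bit mu-law code word to a signed linear sample."""
    code = ~code & 0xFF              # mu-law bytes are stored bit-inverted
    sign = code & 0x80               # top bit carries the sign
    exponent = (code >> 4) & 0x07    # 3-bit segment number
    mantissa = code & 0x0F           # 4-bit step within the segment
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data):
    """Expand a bytes object of mu-law code words into a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


if __name__ == "__main__":
    # Silence, then full-scale positive and negative code words.
    print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))
```

A-law streams use a different expansion and would need their own routine.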
Links
Audio Examples
PSQM/PSQM+ values
IETF Speex Draft 7
IETF Speex Draft 6
Speex Manual
RFC 3551 - RTP Packetization
RTP Parameters