Evaluating Speech Voice Coders

Several factors need to be considered when evaluating a voice codec. The characteristics of each codec affect its suitability for a particular application or platform. These factors influence either the round-trip delay or the system overhead, and include frame size, frame length, bit rate, processing delay, look-ahead delay, MIPS, and memory requirements.

Frame Size

A frame consists of a coder-dependent, fixed number of bits that the coder generates from a corresponding fixed number of speech samples. Frame size (or frame delay) is therefore the duration of speech, measured in time, that one frame represents. Frame-based codecs process one frame at a time, in contrast to stream-based codecs, which process voice traffic as a continuous stream.
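As a rough illustration of how a sample count maps to frame duration, the sketch below converts samples per frame into milliseconds. The function name and the 8 kHz sampling rate are assumptions chosen for the example, not values taken from any particular codec library.

```python
def frame_duration_ms(samples_per_frame: int, sample_rate_hz: int) -> float:
    """Return the span of speech covered by one frame, in milliseconds."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return 1000.0 * samples_per_frame / sample_rate_hz


# Assumed example: 80 samples at 8 kHz narrowband sampling
print(frame_duration_ms(80, 8000))  # 10.0
```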

Frame Length

Frame length is the number of bits generated by a coder when producing one frame. Together with the frame size, it determines the codec’s bit rate.
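The relationship between frame length and bit rate can be sketched as bits per frame divided by frame duration. The helper below is hypothetical, and the 80-bit, 10 ms figures are the commonly cited G.729 values, used here only as an example input.

```python
def bit_rate_bps(bits_per_frame: int, frame_ms: float) -> float:
    """Return the coded bit rate in bits per second."""
    if frame_ms <= 0:
        raise ValueError("frame_ms must be positive")
    return bits_per_frame * 1000.0 / frame_ms


# Assumed example: 80 bits produced for every 10 ms frame
print(bit_rate_bps(80, 10.0))  # 8000.0
```

This counts only the coded speech bits; any transport or packet overhead would come on top of this figure.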

Bit Rate

The codec’s bit rate determines how much bandwidth is necessary to transmit voice traffic. In general, a higher bit rate corresponds with better voice quality and higher MIPS and memory requirements.

Processing Delay

Processing delay is the inherent algorithmic delay determined by the properties of the codec, as measured by the time necessary for the codec to process one frame.

Look-ahead Delay

Look-ahead delay occurs when a coder’s algorithm needs to examine a portion of the next frame to help determine how to code the current frame. For example, G.729A uses a 5 ms look-ahead, which adds 5 ms to its algorithmic delay (a 10 ms frame plus the 5 ms look-ahead, for 15 ms in total). Look-ahead lets the coder exploit the close correlation between adjacent frames to reduce the number of bits needed to closely reproduce a speech frame.
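A coder has to collect a full frame plus its look-ahead before it can encode that frame, so the buffering delay is the sum of the two. The sketch below is a minimal illustration of that arithmetic; the function name and the `frames_per_packet` parameter are assumptions for the example, and the calculation ignores the time spent actually running the codec.

```python
def buffering_delay_ms(frame_ms: float, look_ahead_ms: float,
                       frames_per_packet: int = 1) -> float:
    """Speech that must be collected before a packet can be encoded."""
    if frames_per_packet < 1:
        raise ValueError("frames_per_packet must be at least 1")
    # Look-ahead is needed only once, after the last frame in the packet
    return frames_per_packet * frame_ms + look_ahead_ms


# G.729A-style figures: 10 ms frame, 5 ms look-ahead
print(buffering_delay_ms(10.0, 5.0))     # 15.0
# Assumed packing of two frames per packet
print(buffering_delay_ms(10.0, 5.0, 2))  # 25.0
```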

MIPS

This value reflects the amount of processing power required by the codec. A more complex codec will usually require more processing power. As a general rule, the MIPS required by a voice codec are roughly inversely proportional to the amount of bandwidth consumed, for approximately the same speech quality (MOS value).

Memory

This value reflects the amount of RAM required by the codec. A more complex codec will usually require more memory. Please note that there may be platform-dependent considerations, where limited amounts of high-speed memory may be valuable in reducing codec processing bottlenecks. Proper memory layout can yield a significant reduction in MIPS.
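One way to use the MIPS and memory figures together is to estimate how many codec channels a processor can carry, limited by whichever resource runs out first. The sketch below assumes every channel costs the same MIPS and RAM and that nothing is shared between channels, which is a simplification. All names and numbers are hypothetical and do not describe any real codec or DSP.

```python
def max_channels(dsp_mips: float, codec_mips: float,
                 ram_bytes: int, codec_ram_bytes: int) -> int:
    """Estimate channel capacity from per-channel MIPS and RAM costs."""
    if codec_mips <= 0 or codec_ram_bytes <= 0:
        raise ValueError("per-channel costs must be positive")
    by_mips = int(dsp_mips // codec_mips)   # channels the CPU can sustain
    by_ram = ram_bytes // codec_ram_bytes   # channels that fit in memory
    return min(by_mips, by_ram)


# Hypothetical budget: 100 MIPS and 64 KiB of RAM available,
# 12 MIPS and 4 KiB needed per channel
print(max_channels(100.0, 12.0, 65536, 4096))  # 8
```

In this made-up case the processor, not the memory, is the limiting resource.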

Performance

Resource requirements for each platform are being updated. Please contact us directly for specific information.

MIPS/memory requirements
PSQM/PSQM+ values

VOCAL’s embedded software libraries include a complete range of ETSI / ITU / IEEE compliant algorithms, in addition to many other standard and proprietary algorithms. Our software is optimized for execution on ANSI C and leading DSP architectures (TI, ADI, AMD, ARM, MIPS, CEVA, LSI Logic ZSP, etc.). These libraries are modular and can be executed as a single task under a variety of operating systems or standalone with their own microkernel.

