GSM 06.20 Half Rate (HR) Vocoder
The GSM 06.20 half rate codec uses the VSELP (Vector-Sum Excited Linear Prediction) algorithm, an
analysis-by-synthesis coding technique that belongs to the class of speech coding algorithms known as
CELP (Code Excited Linear Prediction).
The GSM 06.20 half rate encoder processes one 20 ms speech frame at a time. It reads a frame of the
sampled speech waveform and, based on the current frame and the past history of the waveform, derives
18 parameters that describe it. The extracted parameters fall into the following three general
classes:
- Energy parameters (R0 and GSP0)
- Spectral parameters (LPC and INT_LPC)
- Excitation parameters (LAG and CODE)
These parameters are quantised into 112 bits for transmission.
Because the GSM 06.20 half rate codec is an analysis-by-synthesis codec, the speech decoder is primarily a
subset of the speech encoder. The quantised parameters are decoded and a synthetic excitation is generated
using the energy and excitation parameters. The synthetic excitation is then filtered to impose the
spectral information, which produces the synthesised speech.
The GSM 06.20 speech encoder takes as its input a 13-bit uniform Pulse Code Modulated (PCM) signal, either
from the audio part of the MS or, on the network side, from the PSTN via an 8-bit A-law or μ-law (PCS 1900)
to 13-bit uniform PCM conversion. The encoded speech at the output of the speech encoder is delivered to the
channel coding function as defined in GSM 05.03 [3] to produce an encoded block of 228 bits, which gives
a gross bit rate of 11.4 kbit/s. In the receive direction, the inverse operations take place.
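The network-side conversion mentioned above can be illustrated with a minimal sketch of A-law expansion. This is the widely used table-free G.711 A-law expansion, scaled down to 13 bits; it is not taken from the GSM 06.20 text, and the function name `alaw_to_linear13` is hypothetical.

```python
def alaw_to_linear13(code: int) -> int:
    """Expand one 8-bit A-law code word to a signed 13-bit linear sample.

    Illustrative sketch only: the name and the 13-bit scaling step are
    assumptions made for this example.
    """
    code ^= 0x55                     # undo the even-bit inversion used by A-law
    segment = (code & 0x70) >> 4     # 3-bit segment (chord) number
    mantissa = (code & 0x0F) << 4    # 4-bit step inside the segment, 16-bit scale
    if segment == 0:
        magnitude = mantissa + 8
    else:
        magnitude = (mantissa + 0x108) << (segment - 1)
    magnitude >>= 3                  # 16-bit scale -> 13-bit scale
    # In A-law the sign bit is set for positive samples.
    return magnitude if code & 0x80 else -magnitude
```

With this scaling, every one of the 256 code words maps into the signed 13-bit range, with a largest magnitude of 4032.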
GSM 06.20 describes the detailed mapping from input blocks of 160 speech samples in 13-bit uniform PCM
form to encoded blocks of 112 bits, and from encoded blocks of 112 bits to output blocks of 160
reconstructed speech samples. The sampling rate is 8000 samples/s, which gives an average bit rate for the
encoded bit stream of 5.6 kbit/s. The coding scheme is called Vector Sum Excited Linear Prediction (VSELP) coding.
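The rates quoted above follow directly from the frame sizes. A short arithmetic check, using only figures stated in this description (the constant names are chosen for the example):

```python
FRAME_MS = 20             # frame duration in milliseconds
SAMPLE_RATE = 8000        # samples per second
PCM_BITS = 13             # bits per uniform PCM sample
SPEECH_FRAME_BITS = 112   # encoded speech bits per frame
CHANNEL_FRAME_BITS = 228  # bits per frame after channel coding

samples_per_frame = SAMPLE_RATE * FRAME_MS // 1000                   # 160 samples
pcm_rate = samples_per_frame * PCM_BITS * 1000 // FRAME_MS           # 104000 bit/s in
speech_rate = SPEECH_FRAME_BITS * 1000 // FRAME_MS                   # 5600 bit/s
gross_rate = CHANNEL_FRAME_BITS * 1000 // FRAME_MS                   # 11400 bit/s
compression_ratio = pcm_rate / speech_rate                           # about 18.6 : 1
```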
GSM 06.20 Half-Rate Encoder
- The GSM half rate speech encoder uses an analysis-by-synthesis approach to determine which code best represents the excitation for each subframe.
- The codebook search procedure consists of trying each codevector as a possible excitation for the Code Excited Linear Predictive (CELP) synthesizer.
- The synthesized speech is compared against the input speech and a difference signal is generated.
- This difference signal is then filtered by a spectral weighting filter to generate a weighted error signal.
- The codevector which generates the minimum weighted error power is chosen as the codevector for that subframe.
- The spectral weighting filter serves to weight the error spectrum based on perceptual considerations.
- This weighting filter is a function of the speech spectrum and can be expressed in terms of the α parameters (coefficients) of the short term (spectral) filter.
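The search loop described in the list above can be sketched in a few lines. This is a toy illustration, not the GSM 06.20 procedure: the function names, the filter sign convention, the generic CELP weighting form A(z)/A(z/γ), the value γ = 0.8, the non-negative gain and the zero initial filter state are all assumptions. Each codevector is built as a ±1 combination of basis vectors, which is the idea the name "vector sum" refers to.

```python
def vselp_codevector(basis, code):
    """Sum the basis vectors with signs taken from the bits of `code`
    (bit m set -> +1, bit m clear -> -1)."""
    vec = [0.0] * len(basis[0])
    for m, b in enumerate(basis):
        sign = 1.0 if (code >> m) & 1 else -1.0
        for i, x in enumerate(b):
            vec[i] += sign * x
    return vec

def synthesize(excitation, a):
    """All-pole filter s[n] = e[n] + sum_k a[k] * s[n-1-k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * out[n - 1 - k]
        out.append(acc)
    return out

def weight_error(err, a, gamma=0.8):
    """Generic CELP-style weighting W(z) = A(z) / A(z/gamma) (assumed form)."""
    fir = []
    for n, e in enumerate(err):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * err[n - 1 - k]
        fir.append(acc)
    return synthesize(fir, [ak * gamma ** (k + 1) for k, ak in enumerate(a)])

def search_codebook(target, basis, a):
    """Try every codevector; return (weighted error power, code, gain) of the best."""
    best = None
    for code in range(1 << len(basis)):
        synth = synthesize(vselp_codevector(basis, code), a)
        den = sum(s * s for s in synth)
        num = sum(t * s for t, s in zip(target, synth))
        gain = max(0.0, num / den) if den > 0 else 0.0  # simplification: gain >= 0
        err = [t - gain * s for t, s in zip(target, synth)]
        power = sum(w * w for w in weight_error(err, a))
        if best is None or power < best[0]:
            best = (power, code, gain)
    return best
```

For example, if the target is exactly twice the synthesized output of code 5, `search_codebook` returns code 5 with a gain of 2.0 and zero weighted error power. A practical encoder would avoid the exhaustive loop; the sketch only mirrors the description.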
GSM 06.20 Half-Rate Decoder
- In the voiced modes, the speech decoder creates the combined excitation signal from the long term filter state and the VSELP codevector.
- In the unvoiced mode, the long term filter state is replaced by a second VSELP codebook and the pitch prefilter is not used.
- The combined excitation is then processed by an adaptive pitch prefilter (voiced modes only) and gain.
- The prefiltered excitation is applied to the LPC synthesis filter.
- After the synthesis filter reconstructs the speech signal, an adaptive spectral postfilter is applied, followed by automatic gain control, which is the final processing step in the speech decoder.
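The decoder chain above can be sketched roughly as follows, under strong simplifications: integer lags only, no pitch prefilter or spectral postfilter, zero initial synthesis filter state, and an energy-matching gain control. All function and parameter names are hypothetical and chosen for the example; none of this reproduces the bit-exact GSM 06.20 decoder.

```python
import math

def combine_excitation(ltp_state, lag, codevector, pitch_gain, code_gain):
    """Mix the long term (pitch) contribution with the scaled codevector.

    `ltp_state` holds past excitation samples (oldest first) and must contain
    at least `lag` samples. Integer lags only (assumption).
    """
    history = list(ltp_state)
    out = []
    for c in codevector:
        past = history[len(history) - lag]   # excitation `lag` samples ago
        e = pitch_gain * past + code_gain * c
        out.append(e)
        history.append(e)                    # new samples feed later predictions
    return out

def synthesize(excitation, a):
    """LPC synthesis: s[n] = e[n] + sum_k a[k] * s[n-1-k], zero initial state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * out[n - 1 - k]
        out.append(acc)
    return out

def automatic_gain_control(reference, processed):
    """Rescale `processed` so its energy matches that of `reference`."""
    e_ref = sum(x * x for x in reference)
    e_proc = sum(x * x for x in processed)
    if e_proc == 0.0:
        return list(processed)
    scale = math.sqrt(e_ref / e_proc)
    return [scale * x for x in processed]
```

In a fuller decoder the postfiltered speech would be passed as `processed` and the synthesis filter output as `reference`, so that postfiltering does not change the signal level.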
Features
- Full and half duplex modes of operation
- Passes ETSI test vectors
- Compliant with GSM 06.20 Recommendation
- MIPS/memory requirements for various platforms are available
- PSQM/PSQM+ values under different network conditions are also available
- Optimized for high performance on leading edge DSP architectures
- Multichannel implementation
- Multi-tasking environment compatible
Configurations
- DAA interface using linear codec at 8.0 kHz sample rate
- Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
- North American/International Telephony (including caller ID) support available
- Simultaneous DTMF detector operation available (typically fewer than 150 hits on the Bellcore test tape)
- MF tone detectors, general purpose programmable tone detectors/generators available
- Data/Facsimile/Voice Distinction available
- Common compressed speech frame stream interface to support systems with multiple speech coders
- Dynamic speech coder selection when multiple speech codecs are available
- Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules
Datasheet
ETSI Recommendation GSM 06.20