GSM 06.10 Full Rate Vocoder

GSM 06.10 FR Vocoder defines a reference configuration for the speech transmission chain of the digital cellular telecommunications system. The speech encoder takes its input as a 13 bit uniform PCM signal either from the audio part of the mobile station or on the network side, from the PSTN via an 8 bit/A-law to 13 bit uniform PCM conversion. The encoded speech at the output of the speech encoder is delivered to a channel encoder unit which is specified in GSM 05.03. In the receive direction, the inverse operations take place.

GSM 06.10 describes the detailed mapping from input blocks of 160 speech samples in 13 bit uniform PCM form to encoded blocks of 260 bits, and from encoded blocks of 260 bits to output blocks of 160 reconstructed speech samples. The sampling rate is 8000 samples/s, leading to an average bit rate for the encoded bit stream of 13 kbit/s. The coding scheme is the so-called Regular Pulse Excitation - Long Term Prediction - Linear Predictive Coder, hereafter referred to as RPE-LTP.
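The frame arithmetic above can be checked with a few lines of Python. This is a worked sketch, not part of the product: the per-parameter bit widths are an assumption taken from the parameter table of the standard as commonly cited (eight log area ratios per frame, plus lag, gain, grid position, block amplitude and 13 three-bit pulses per sub-frame), and all names are illustrative.

```python
# Frame timing and bit budget for the full rate codec (illustrative names).
SAMPLE_RATE_HZ = 8000
SAMPLES_PER_FRAME = 160
SUBFRAMES_PER_FRAME = 4

# Assumed bit widths, per the standard's parameter table.
LAR_BITS = [6, 6, 5, 5, 4, 4, 3, 3]   # eight log area ratios, once per frame
SUBFRAME_BITS = {
    "ltp_lag": 7,
    "ltp_gain": 2,
    "rpe_grid_position": 2,
    "block_amplitude": 6,
    "rpe_pulses": 13 * 3,             # 13 pulses, 3 bits each
}

# Integer arithmetic keeps the results exact.
frame_duration_ms = 1000 * SAMPLES_PER_FRAME // SAMPLE_RATE_HZ
bits_per_frame = sum(LAR_BITS) + SUBFRAMES_PER_FRAME * sum(SUBFRAME_BITS.values())
bit_rate_bps = bits_per_frame * SAMPLE_RATE_HZ // SAMPLES_PER_FRAME

print(frame_duration_ms, bits_per_frame, bit_rate_bps)
```

Under these assumptions the budget works out to 36 bits of filter parameters plus 4 x 56 bits of sub-frame parameters, that is 260 bits every 20 ms, or 13 kbit/s.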

GSM 06.10 also specifies the conversion between A-law PCM and 13 bit uniform PCM. Performance requirements for the audio input and output parts are included only to the extent that they affect the transcoder performance. GSM 06.10 also describes the codec down to the bit level, so that compliance with the recommendation can be verified to a high degree of confidence using a set of digital test sequences.
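As an illustration of the A-law to 13 bit uniform PCM conversion, the sketch below follows the widely used G.711-style A-law expansion and scales the result to a 13 bit range. It is a minimal example under that assumption, not the normative procedure of the recommendation; the function name is hypothetical.

```python
def alaw_to_linear13(code: int) -> int:
    """Expand one 8-bit A-law code word to a signed 13-bit uniform PCM value."""
    code ^= 0x55                       # undo the alternate-bit inversion of A-law
    mantissa = code & 0x0F             # 4-bit step within the segment
    segment = (code & 0x70) >> 4       # 3-bit segment (chord) number
    magnitude = (mantissa << 4) + 8    # mid-step value on a 16-bit scale
    if segment >= 1:
        magnitude += 0x100             # segments 1..7 carry an implicit leading bit
    if segment > 1:
        magnitude <<= segment - 1
    magnitude >>= 3                    # 16-bit scale down to 13-bit scale (exact)
    # After the XOR, a set sign bit marks a positive sample.
    return magnitude if code & 0x80 else -magnitude


# Every code word lands inside the 13-bit signed range.
values = [alaw_to_linear13(c) for c in range(256)]
print(min(values), max(values))
```

The smallest and largest code words map to magnitudes of 1 and 4032, so the full A-law range fits in 13 bits with headroom.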

GSM 06.10 Full Rate Encoder

  • The input speech frame, consisting of 160 signal samples (uniform 13 bit PCM samples), is first pre-processed to produce an offset-free signal, which is then subjected to a first order pre-emphasis filter. The 160 samples obtained are then analyzed to determine the coefficients for the short term analysis filter (LPC analysis). These parameters are then used to filter the same 160 samples. The result is 160 samples of the short term residual signal. The filter parameters, termed reflection coefficients, are transformed to log area ratios (LARs) before transmission. The speech frame is divided into 4 sub-frames with 40 samples of the short term residual signal in each. Each sub-frame is processed blockwise by the subsequent functional elements.
  • Before each sub-block of 40 short term residual samples is processed, the parameters of the long term analysis filter, the LTP lag and the LTP gain, are estimated and updated in the LTP analysis block, on the basis of the current sub-block and a stored sequence of the 120 previous reconstructed short term residual samples.
  • A block of 40 long term residual signal samples is obtained by subtracting 40 estimates of the short term residual signal from the short term residual signal itself. The resulting block of 40 long term residual samples is fed to the Regular Pulse Excitation analysis which performs the basic compression function of the algorithm.
  • As a result of the RPE analysis, the block of 40 input long term residual samples is represented by one of 4 candidate sub-sequences of 13 pulses each. The sub-sequence selected is identified by the RPE grid position (M). The 13 RPE pulses are encoded using Adaptive Pulse Code Modulation (APCM) with estimation of the sub-block amplitude, which is transmitted to the decoder as side information. The RPE parameters are also fed to a local RPE decoding and reconstruction module which produces a block of 40 samples of the quantized version of the long term residual signal.
  • By adding these 40 quantized samples of the long term residual to the previous block of short term residual signal estimates, a reconstructed version of the current short term residual signal is obtained. The block of reconstructed short term residual signal samples is then fed to the long term analysis filter which produces the new block of 40 short term residual signal estimates to be used for the next sub-block thereby completing the feedback loop.
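The per-sub-block steps described above (LTP analysis, long term residual, RPE grid selection, APCM scaling) can be sketched in floating point as follows. This is a simplified illustration, not the bit-exact fixed point procedure of the recommendation: the lag search range of 40 to 120 is assumed from the standard, the weighting filter that precedes grid selection and all quantization steps are omitted, and every function name is hypothetical.

```python
SUB_BLOCK = 40  # samples per sub-frame


def ltp_parameters(sub_block, history, min_lag=40, max_lag=120):
    """Return (lag, gain, past) maximizing cross-correlation with the history.

    history holds the 120 previous reconstructed short term residual
    samples, oldest first.
    """
    best_lag, best_corr = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        start = len(history) - lag
        past = history[start:start + SUB_BLOCK]
        corr = sum(d * p for d, p in zip(sub_block, past))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    start = len(history) - best_lag
    past = history[start:start + SUB_BLOCK]
    energy = sum(p * p for p in past)
    gain = best_corr / energy if energy > 0.0 else 0.0  # unquantized gain
    return best_lag, gain, past


def long_term_residual(sub_block, past, gain):
    """Subtract the short term residual estimates from the residual itself."""
    return [d - gain * p for d, p in zip(sub_block, past)]


def rpe_grid_select(residual):
    """Pick the grid position M (0..3) whose 13-pulse sub-sequence has most energy."""
    candidates = [residual[m:m + 37:3] for m in range(4)]  # samples m, m+3, ..., m+36
    energies = [sum(v * v for v in c) for c in candidates]
    grid = max(range(4), key=lambda m: energies[m])
    return grid, candidates[grid]


def apcm_normalize(pulses):
    """Return the block amplitude and the pulses scaled by it (no quantization)."""
    xmax = max(abs(v) for v in pulses)
    if xmax == 0.0:
        return 0.0, [0.0] * len(pulses)
    return xmax, [v / xmax for v in pulses]


# Synthetic input: one impulse every 70 samples, so the expected lag is 70.
signal = [1.0 if i % 70 == 5 else 0.0 for i in range(160)]
history, current = signal[:120], signal[120:]
lag, gain, past = ltp_parameters(current, history)
residual = long_term_residual(current, past, gain)

# Synthetic long term residual with all of its energy on grid position 2.
shaped = [0.0] * SUB_BLOCK
for i in range(13):
    shaped[2 + 3 * i] = float(i - 6)
grid, pulses = rpe_grid_select(shaped)
xmax, normalized = apcm_normalize(pulses)
print(lag, gain, grid, xmax)
```

In a real encoder the gain, block amplitude and pulses would each be quantized before transmission, and the locally decoded values (not the unquantized ones) would update the 120-sample history so that encoder and decoder stay in step.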

GSM 06.10 Full Rate Decoder

  • The decoder includes the same structure as the feed-back loop of the encoder. In error-free transmission, the output of this stage will be the reconstructed short term residual samples. These samples are then applied to the short term synthesis filter followed by the de-emphasis filter resulting in the reconstructed speech signal samples.
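Two of the decoder stages lend themselves to a short floating point sketch: the long term synthesis loop that rebuilds the short term residual, and the de-emphasis filter that undoes the encoder's pre-emphasis. The coefficient 28180/32768 is assumed from the standard, the short term synthesis filter between the two stages is omitted, and the function names are hypothetical. The pre-emphasis filter is included only to show that the pair is inverse.

```python
BETA = 28180 / 32768  # pre-/de-emphasis coefficient (assumed from the standard)


def long_term_synthesis(excitation, lag, gain, history):
    """Rebuild the short term residual: out[k] = excitation[k] + gain * out[k - lag].

    history holds earlier reconstructed samples, oldest first, and must be
    at least `lag` samples long.
    """
    buf = list(history)
    out = []
    for e in excitation:
        sample = e + gain * buf[-lag]  # sample located `lag` positions back
        buf.append(sample)
        out.append(sample)
    return out


def pre_emphasis(samples, prev=0.0):
    """Encoder side first order filter: y[k] = x[k] - BETA * x[k - 1]."""
    out = []
    for s in samples:
        out.append(s - BETA * prev)
        prev = s
    return out


def de_emphasis(samples, state=0.0):
    """Decoder side inverse filter: y[k] = x[k] + BETA * y[k - 1]."""
    out = []
    for s in samples:
        state = s + BETA * state
        out.append(state)
    return out


# A single past sample of 2.0, 50 positions back, is echoed with gain 0.5.
past = [0.0] * 120
past[-50] = 2.0
rebuilt = long_term_synthesis([0.0] * 40, lag=50, gain=0.5, history=past)

# De-emphasis recovers the input of pre-emphasis up to rounding error.
speech = [float((i * 7) % 11 - 5) for i in range(160)]
round_trip = de_emphasis(pre_emphasis(speech))
max_error = max(abs(a - b) for a, b in zip(speech, round_trip))
print(rebuilt[0], max_error < 1e-9)
```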

Features

  • Full and half duplex modes of operation
  • Passes ETSI test vectors
  • Compliant with GSM 06.10 Recommendation
  • MIPS/memory requirements for various platforms are available
  • PSQM/PSQM+ values under different network conditions are also available
  • Optimized for high performance on leading edge DSP architectures
  • Multichannel implementation
  • Multi-tasking environment compatible

Configurations

  • DAA interface using linear codec at 8.0 kHz sample rate
  • Direct interface to 8.0 kHz PCM data stream (A-law or μ-law)
  • North American/International Telephony (including caller ID) support available
  • Simultaneous DTMF detector operation available (typically fewer than 150 hits on the Bellcore test tape)
  • MF tone detectors, general purpose programmable tone detectors/generators available
  • Data/Facsimile/Voice Distinction available
  • Common compressed speech frame stream interface to support systems with multiple speech coders
  • Dynamic speech coder selection when multiple speech codecs are available
  • Can be integrated with G.168 Echo Canceller and Tone Detection/Regeneration modules

Datasheet

ETSI Recommendation GSM 06.10