
Many speech coders have been standardized under the auspices of the International Telecommunication Union (ITU). Most were first designed to reduce bandwidth further within the telephone network. As such, these speech coders often took G.711 μ-law or A-law TDM signals as input and used multiplexed digital channels for the compressed speech signals. For the same reason, many of these early speech coders did not use silence suppression techniques, as the benefit could only be exploited on statistically multiplexed digital carrier systems. (A number of other proprietary speech coders were also developed for satellite links, which relied very heavily on statistical multiplexing.)
With the advent of Voice over IP, many of these legacy speech coders were retrofitted with silence detection and comfort noise generation. Both G.723.1 and G.729 treat these functions as an add-on, via Annex A and Annex B respectively. G.711 was retrofitted with similar functionality through G.711 Appendix II. Packet loss concealment was also missing from many of these legacy speech coders. The G.711 packet loss concealment (Appendix I) and silence suppression techniques are also commonly applied to other speech coders that lack such functionality, such as the wideband G.722 and the narrowband G.726 and G.728 speech coders.
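The idea behind silence detection can be illustrated with a minimal energy-threshold sketch. This is not the algorithm of G.729 Annex B, G.723.1 Annex A, or G.711 Appendix II, which are considerably more elaborate. The function names, the -50 dB threshold, and the two-frame hangover are illustrative assumptions only.

```python
import math

FULL_SCALE = 32768.0  # assumes 16-bit linear PCM samples


def frame_energy_db(samples):
    """Mean power of one frame in dB relative to full scale."""
    if not samples:
        return -math.inf
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return -math.inf
    return 10 * math.log10(mean_square / (FULL_SCALE ** 2))


def classify_frames(frames, threshold_db=-50.0, hangover=2):
    """Label each frame 'speech' or 'silence'.

    threshold_db and hangover are hypothetical tuning values. The
    hangover keeps a few frames after speech ends marked as speech
    so that word endings are not clipped.
    """
    decisions = []
    remaining = 0
    for frame in frames:
        if frame_energy_db(frame) > threshold_db:
            remaining = hangover
            decisions.append("speech")
        elif remaining > 0:
            remaining -= 1
            decisions.append("speech")
        else:
            decisions.append("silence")
    return decisions
```

Frames labeled silence would not be transmitted as coded speech; the receiver would play comfort noise in their place.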
Certain speech coders were designed for specific applications. Early video conferencing systems used G.723.1, which has a frame duration similar to that of video (30 msec versus 33.333 msec for video at 30 frames per second). Since video and audio needed to be synchronized, relatively compatible frame durations were desirable. For Voice over IP applications, however, large frame sizes are strongly undesirable because they add quickly to round-trip conversation latency. As a modern guideline, one-way delay should not exceed 150 msec (100 msec preferred) for acceptable Voice over IP deployments and applications.
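The effect of frame size on the delay budget can be sketched with simple arithmetic. The frame and lookahead durations below are the published figures for G.723.1 (30 msec frame, 7.5 msec lookahead) and G.729 (10 msec frame, 5 msec lookahead). The 40 msec network delay and 40 msec jitter buffer are assumed values chosen only for illustration, and processing time is ignored.

```python
from dataclasses import dataclass

BUDGET_MS = 150.0  # one-way limit quoted in the text


@dataclass
class Codec:
    name: str
    frame_ms: float      # duration of one coded frame
    lookahead_ms: float  # extra audio the encoder needs before coding


def one_way_delay_ms(codec, frames_per_packet=1,
                     network_ms=40.0, jitter_buffer_ms=40.0):
    """Rough one-way delay estimate.

    network_ms and jitter_buffer_ms are assumptions for this sketch;
    real values depend on the deployment.
    """
    packetization_ms = codec.frame_ms * frames_per_packet
    return (packetization_ms + codec.lookahead_ms
            + network_ms + jitter_buffer_ms)


G7231 = Codec("G.723.1", frame_ms=30.0, lookahead_ms=7.5)
G729 = Codec("G.729", frame_ms=10.0, lookahead_ms=5.0)

for codec, frames in ((G7231, 1), (G7231, 2), (G729, 2)):
    delay = one_way_delay_ms(codec, frames_per_packet=frames)
    verdict = "within" if delay <= BUDGET_MS else "over"
    print(f"{codec.name} x{frames}: {delay} msec ({verdict} budget)")
```

Under these assumptions, a single G.723.1 frame per packet already consumes 37.5 msec before the packet leaves the sender, while two G.729 frames per packet consume 25 msec.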
With the greater availability of network bandwidth, such as found in high-speed cable modem networks and fiber deployments like Verizon FiOS and AT&T U-verse (SM), less and less speech coding needs to be performed. In fact, these systems benefit from the simplicity of ordinary G.711 μ-law or A-law. This is how the clarity of hearing "a pin drop" was claimed as a selling point by Sprint for their long distance services. Unlike other carriers, they had enough excess capacity on their fiber networks to carry voice in the native telephone network format. VoIP systems will not only meet this clarity standard but exceed it by offering wideband voice capability using G.722 or Speex. Wideband high-definition audio (HD audio) is further discussed here.
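To put the bandwidth trade-off in numbers, the following sketch estimates the IP-level bit rate of one voice stream in one direction. It assumes 20 msec packets and 40 bytes of RTP, UDP and IPv4 headers per packet (12 + 8 + 20); both values are assumptions for illustration, and link-layer overhead is ignored. The function name is hypothetical.

```python
def ip_bandwidth_bps(codec_bps, packet_ms=20, header_bytes=40):
    """Approximate one-direction IP bandwidth for a voice stream.

    packet_ms and header_bytes are assumed defaults, not values
    taken from any particular deployment.
    """
    payload_bytes = codec_bps * packet_ms / 1000 / 8
    packets_per_second = 1000 / packet_ms
    return (payload_bytes + header_bytes) * 8 * packets_per_second


# G.711 and G.722 both run at 64 kbit/s; G.729 runs at 8 kbit/s.
for name, rate in (("G.711", 64000), ("G.722", 64000), ("G.729", 8000)):
    print(f"{name}: {ip_bandwidth_bps(rate) / 1000:.0f} kbit/s on the wire")
```

With these assumptions, uncompressed G.711 needs 80 kbit/s per direction and G.729 needs 24 kbit/s, a small difference on a multi-megabit access link.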
Using VOCAL's proprietary techniques, extensive code optimization is performed in such a manner that virtually all modern processors (DSP, RISC and CISC) are well supported. Benchmarks have shown that VOCAL's highly optimized C with limited assembly code compares well against other vendors' implementations, which typically require significantly more assembly language. Typically the difference in performance is within a couple of MIPS. However, the portability and maintainability of our code benefits our customers by lowering initial costs, easing integration, reducing maintenance costs (fewer code changes when upgrading compilers), and increasing the availability of optimized code for different modern processors.