
A VoIP system is built from four major components: voice quality enhancement, voice compression (speech coders), the Real-time Transport Protocol (RTP), and the Session Initiation Protocol (SIP).
Acoustic echo, background noise, and reverberation are common causes of voice signal degradation. VOCAL's Voice Quality Enhancement System (VVQES) is designed to be incorporated into a VoIP stack. It can significantly improve the quality of conversations by removing echo and background noise. In hands-free applications especially, acoustic echo can be particularly disruptive to a conversation. Acoustic echo cancellation uses an adaptive filter to model the echo path between the loudspeaker and the microphone. Noise reduction estimates the noise spectrum and uses that estimate to improve the overall signal-to-noise ratio (SNR).
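The adaptive-filter idea behind acoustic echo cancellation can be illustrated with a normalized LMS (NLMS) filter. This is a minimal sketch, not VOCAL's implementation: the function name, tap count, step size, and the simulated four-tap echo path are all assumptions chosen for illustration, and the simulation contains no near-end speech or noise.

```python
import random

def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Cancel echo with an NLMS adaptive filter (illustrative sketch).

    far_end: samples sent to the loudspeaker (reference signal)
    mic:     samples captured by the microphone (contains the echo)
    Returns the residual signal and the final filter weights.
    """
    w = [0.0] * taps   # adaptive estimate of the echo path
    x = [0.0] * taps   # most recent far-end samples, newest first
    residual = []
    for ref, d in zip(far_end, mic):
        x = [ref] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))   # predicted echo
        e = d - y                                  # what is left after cancellation
        norm = sum(xi * xi for xi in x) + eps      # input power normalization
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, x)]
        residual.append(e)
    return residual, w

# Simulated scenario (assumed values): white-noise far-end signal and a
# short loudspeaker-to-microphone echo path.
random.seed(0)
far_end = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]

residual, weights = nlms_echo_cancel(far_end, mic)
```

In this idealized setup the first four weights converge toward the simulated echo path and the residual decays toward zero. A practical canceller also needs double-talk detection and a much longer filter, which are omitted here.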
Voice compression coders reduce the bandwidth of a VoIP call. Speech coders can also be equipped with silence detection and comfort noise generation, which allow silence insertion descriptor (SID) packets to be transmitted in place of speech frames during silence. Both G.723.1 and G.729 treat these functions as add-ons, via Annex A and Annex B respectively. G.711 was retrofitted with similar functionality through G.711 Appendix II. Packet loss concealment was also missing from many of these legacy speech coders. The G.711 packet loss concealment (Appendix I) and silence suppression techniques are also commonly used with other speech coders that lack such functionality, such as the wideband G.722 coder and the narrowband G.726 and G.728 coders.
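The effect of silence detection on what gets transmitted can be sketched with a simple energy threshold. This is an illustration only: it is not the voice activity detector specified in G.729 Annex B, G.723.1 Annex A, or G.711 Appendix II, and the function names, the -40 dB threshold, and the frame labels are assumptions.

```python
import math

def frame_energy_db(frame):
    """Mean-square energy of a frame in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)

def dtx_decisions(frames, threshold_db=-40.0):
    """Label each frame for transmission.

    VOICE: active speech, send the coded frame
    SID:   first silent frame of a run, send a silence descriptor
    NO_TX: silence continues, send nothing (receiver plays comfort noise)
    """
    decisions = []
    in_silence = False
    for frame in frames:
        if frame_energy_db(frame) > threshold_db:
            decisions.append("VOICE")
            in_silence = False
        elif not in_silence:
            decisions.append("SID")
            in_silence = True
        else:
            decisions.append("NO_TX")
    return decisions

# 10 ms frames at 8 kHz (80 samples): a loud tone and a very quiet tone.
speech = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
quiet = [0.001 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]

decisions = dtx_decisions([speech, speech, quiet, quiet, quiet, speech])
```

Here `decisions` is `["VOICE", "VOICE", "SID", "NO_TX", "NO_TX", "VOICE"]`, so only one small descriptor is sent for the three silent frames. Standardized detectors use more robust features than raw energy and may refresh the SID when the background noise changes.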
The Real-time Transport Protocol (RTP) was designed to provide real-time transmission of data such as audio or video over a network. In VoIP, the compressed voice samples from the speech coders are packaged into a payload and framed as an RTP/User Datagram Protocol (UDP)/Internet Protocol (IP) packet. Its companion protocol, the RTP Control Protocol (RTCP), carries packet transmission parameters and can be used for quality of service monitoring, statistics collection, and minimal control of a related RTP stream. RTP typically runs over UDP and provides services such as payload type identification, sequence numbering, and time stamping of packets. Because UDP is an unreliable transport, there is no guarantee that a packet will be delivered, that packets will arrive in the order in which they were sent, or that they will arrive at a constant rate. The sequence numbers and time stamps allow an application receiving RTP packets to reconstruct the sender's packet sequence, detect changes in network jitter, and adjust accordingly. VOCAL's software fully supports RTP as defined by RFC 3550. VOCAL also offers an adaptive jitter buffer to ensure proper playout of out-of-order RTP packets and to detect changes in network jitter.
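The role of the payload type, sequence number, and time stamp can be made concrete with the 12-byte fixed RTP header from RFC 3550. The sketch below is not VOCAL's API: the function names and the example SSRC are hypothetical, header extensions and padding are ignored, and sequence-number wraparound is not handled when reordering. The jitter update follows the interarrival jitter formula in RFC 3550, J = J + (|D| - J)/16.

```python
import struct

RTP_VERSION = 2
PT_PCMU = 0  # static payload type for G.711 mu-law (RFC 3551)

def pack_rtp(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Build an RTP packet with the fixed header only (no CSRC list, no extension)."""
    byte0 = RTP_VERSION << 6                          # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload

def unpack_rtp(packet):
    """Parse the fixed header; skips any CSRC entries, ignores extension and padding."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    csrc_count = byte0 & 0x0F
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12 + 4 * csrc_count:],
    }

def update_jitter(jitter, prev_transit, transit):
    """One step of the RFC 3550 interarrival jitter estimate."""
    d = abs(transit - prev_transit)
    return jitter + (d - jitter) / 16.0

# Three 20 ms G.711 packets (160 samples each at 8 kHz), received out of order.
sent = [pack_rtp(PT_PCMU, seq, 160 * seq, 0x1234ABCD, bytes(160)) for seq in (1, 2, 3)]
received = [sent[0], sent[2], sent[1]]
ordered = sorted((unpack_rtp(p) for p in received), key=lambda h: h["seq"])

# Transit times (arrival time minus RTP timestamp, in timestamp units).
jitter = 0.0
transits = [100, 100, 116]
for prev, now in zip(transits, transits[1:]):
    jitter = update_jitter(jitter, prev, now)
```

Sorting by sequence number restores the sender's order, and the running jitter value is the kind of measurement an adaptive jitter buffer could use to size its playout delay.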
The Session Initiation Protocol (SIP) is an application-layer control protocol that can establish, modify, and terminate multimedia sessions (conferences) such as Internet telephony calls. In VoIP, SIP is used for the setup, handshaking, and teardown of sessions.
SIP can also invite participants to existing sessions, such as multicast conferences. Media can be added to (and removed from) an existing session. SIP transparently supports name mapping and redirection services, which support personal mobility.
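SIP messages are plain text, so session setup and teardown can be illustrated by building an INVITE and a BYE request. This is a simplified sketch rather than a working user agent: the helper names are hypothetical, the addresses, branch, and tag are example values in the style of RFC 3261, the INVITE carries no SDP body, and the BYE omits the To tag that a real dialog would include.

```python
def build_request(method, caller, callee, call_id, cseq,
                  host="client.atlanta.example.com",
                  branch="z9hG4bK776asdhds", tag="1928301774"):
    """Build a minimal SIP request with the commonly required headers.

    All default values are illustrative. A real stack generates a unique
    branch per transaction and a unique tag and Call-ID per dialog.
    """
    user = caller.split("@")[0]
    lines = [
        f"{method} sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} {method}",
        f"Contact: <sip:{user}@{host}>",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_request_line(message):
    """Split the first line of a request into (method, request URI, version)."""
    method, uri, version = message.split("\r\n", 1)[0].split(" ")
    return method, uri, version

call_id = "a84b4c76e66710@client.atlanta.example.com"
invite = build_request("INVITE", "alice@atlanta.example.com",
                       "bob@biloxi.example.com", call_id, cseq=1)
bye = build_request("BYE", "alice@atlanta.example.com",
                    "bob@biloxi.example.com", call_id, cseq=2)
```

Both requests share the same Call-ID, which ties the teardown to the session that the INVITE set up, while the CSeq number increases with each new request.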
Like all of VOCAL's software libraries, the VoIP stack is available in a variety of forms, including optimized ANSI C and assembly-language implementations for leading DSP architectures (including but not limited to processors from TI, ADI, AMD, ARM, MIPS, CEVA, and LSI Logic ZSP). These libraries are modular and can run as a single task under a variety of operating systems or standalone with their own microkernel. To find out whether your desired platform and processor are supported, please contact us.