VoIP, or Voice over IP, enables mobile and fixed telephones, fax machines, and other communications devices to initiate and receive calls over an IP-based packet network. It works with Internet and mobile services to send and receive voice calls as digital data over the Internet. VoIP is often thought of as voice over wired networks, such as Ethernet-based networks, but modern mobile networks use VoIP for all voice calls, since voice is just data in such a cell network. Voice over IP in 4G/5G networks is sometimes called VoLTE, or Voice over LTE. VoLTE is not an alternative to VoIP, but rather a set of rules for how standard VoIP protocols are handled in an LTE network. Other frameworks fall under the VoIP umbrella as well. For example, WebRTC is based on the same VoIP media protocols, but its sessions are generally managed using web protocols rather than SIP.

Figure: Voice over IP (VoIP). An example of a VoIP network with different end points, including telephones, fax machines, and mobile VoIP terminals.

A variety of software and hardware solutions and applications can adapt mobile phones, landline phones, and other devices to access VoIP services. The figure shows an example of a hybrid network, depicting different ways of accessing an IP network in order to use voice services.

Although there are a number of protocol ecosystems that could be, and have been, called VoIP, such as the earlier ones based on H.323, Megaco, and MGCP, the primary protocol stack that VoIP systems are built around today is the Session Initiation Protocol (SIP). SIP provides a flexible standard for initiating multimedia sessions between endpoints, including video, chat, interactive games, and virtual reality. A SIP-based VoIP system is built on a variety of IETF protocols beyond RFC 3261 itself, some of the major pieces being:

RTP (Real-time Transport Protocol) is the protocol used to take the media stream, most often audio and possibly video in a VoIP system, and packetize it, adding sequencing and timing information as well as information on the type of the payload. The receiving end uses this information to reassemble the pieces of the stream and recreate a constant real-time stream.
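The packetization step can be sketched in a few lines of Python. This is a minimal illustration, assuming the 12-byte fixed header layout of RFC 3550 with no CSRC entries and no header extension; the function names `build_rtp_packet`, `parse_rtp_packet`, and `reorder` are hypothetical and not part of any library.

```python
import struct

RTP_VERSION = 2  # version number defined by RFC 3550


def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=0, marker=False):
    """Prepend a 12-byte fixed RTP header to one chunk of media."""
    byte0 = RTP_VERSION << 6  # no padding, no extension, CSRC count 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order: 1+1+2+4+4 = 12 bytes
        byte0,
        byte1,
        seq & 0xFFFF,             # sequence number (16 bits)
        timestamp & 0xFFFFFFFF,   # media timestamp (32 bits)
        ssrc & 0xFFFFFFFF,        # stream source identifier (32 bits)
    )
    return header + payload


def parse_rtp_packet(packet):
    """Split a packet into header fields and payload.

    Assumption: no header extension is present (the X bit is ignored).
    """
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    header_len = 12 + 4 * (byte0 & 0x0F)  # skip any CSRC entries
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[header_len:],
    }


def reorder(packets):
    """Sort received packets by sequence number.

    Simplification: ignores 16-bit sequence wraparound and packet loss.
    """
    return sorted((parse_rtp_packet(p) for p in packets), key=lambda f: f["sequence"])
```

For example, 20 ms frames of 8 kHz audio would each advance the timestamp by 160 samples while the sequence number advances by 1; a receiver can then call `reorder` on whatever arrives and use the timestamps to schedule playback.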

SDP (Session Description Protocol) is carried in the SIP messages themselves, conveying the parameters used to negotiate a media session. The negotiation uses a process called Offer/Answer, defined in RFC 3264.
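As a rough illustration of the Offer/Answer idea, the sketch below reads the audio codecs listed in an SDP offer and keeps only those the answering side supports, preserving the offerer's preference order. It is a deliberate simplification: real RFC 3264 negotiation also covers ports, stream direction, multiple media lines, and more. The sample SDP text, addresses, and the function names `parse_audio_offer` and `answer_codecs` are assumptions made for the example.

```python
# Hypothetical offer, using documentation-range addresses.
OFFER_SDP = """v=0
o=alice 2890844526 2890844526 IN IP4 192.0.2.10
s=-
c=IN IP4 192.0.2.10
t=0 0
m=audio 49170 RTP/AVP 0 8 97
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:97 iLBC/8000
"""


def parse_audio_offer(sdp):
    """Return [(payload_type, encoding), ...] in the order offered."""
    payload_types = []
    encodings = {}
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            # m=<media> <port> <proto> <fmt list...>
            payload_types = line.split()[3:]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            encodings[pt] = encoding
    return [(pt, encodings.get(pt, "")) for pt in payload_types]


def answer_codecs(offer_sdp, supported):
    """Keep only the offered codecs whose name is in `supported` (upper case)."""
    return [
        (pt, enc)
        for pt, enc in parse_audio_offer(offer_sdp)
        if enc.split("/")[0].upper() in supported
    ]


def answer_media_line(offer_sdp, supported, local_port):
    """Build the m= line of a simplified answer from the accepted codecs."""
    accepted = answer_codecs(offer_sdp, supported)
    return "m=audio {} RTP/AVP {}".format(
        local_port, " ".join(pt for pt, _ in accepted)
    )
```

Calling `answer_media_line(OFFER_SDP, {"PCMA", "ILBC"}, 3456)` would produce an `m=audio` line listing only payload types 8 and 97, which is the answerer's way of saying which of the offered codecs it accepts.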

TLS (Transport Layer Security) is the successor to SSL for transport layer security. SIP uses TLS to encrypt the SIP messages between endpoints and servers on a hop-by-hop basis. This is often indicated by using the sips: form of the sip: URI.
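A small sketch of how software might tell the two URI schemes apart is shown below. It assumes only the scheme names and the default ports from RFC 3261 (5060 for sip, 5061 for sips over TLS); the function name `sip_uri_security` is hypothetical, and the parsing is intentionally naive rather than a full SIP URI parser.

```python
DEFAULT_PORTS = {"sip": 5060, "sips": 5061}  # defaults per RFC 3261


def sip_uri_security(uri):
    """Return (requires_tls, default_port) for a sip: or sips: URI.

    Only the scheme is inspected; user, host, port, and parameters
    are not validated in this sketch.
    """
    scheme, sep, rest = uri.partition(":")
    scheme = scheme.lower()  # URI schemes are case-insensitive
    if not sep or not rest or scheme not in DEFAULT_PORTS:
        raise ValueError("not a SIP or SIPS URI: {!r}".format(uri))
    return scheme == "sips", DEFAULT_PORTS[scheme]
```

With this helper, `sip_uri_security("sips:alice@example.com")` reports that TLS is required, while `sip_uri_security("sip:alice@example.com")` does not.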