The purpose of this note is to provide a brief refresher on the VoIP packet structure and to point to other notes on VoIP-related topics. The information in this note applies to IPv4. Because migration to IPv6 is proceeding slowly, IPv4 and IPv6 hosts will coexist for several years to come. Part of the planning for a seamless migration from IPv4 to IPv6 is the development of real-time translators, as described in Ref. [1].

The VoIP packet structure largely reflects the layered structure of the OSI model (cf. Ref. [2]). A VoIP packet consists of the IP header, followed by the UDP header, then the RTP header, and finally the payload (see Figure 1).

Figure 1: Structure of the VoIP packet (as in IPv4)

Adding up the minimum sizes of the individual headers (20 bytes for IPv4 without options, 8 bytes for UDP, and 12 bytes for RTP without CSRC entries) gives a combined IP/UDP/RTP header of at least 40 bytes, which is a tangible overhead. For example, a 20 ms VoIP packet with G.711 PCM carries 160 bytes of payload, so the headers add 25% on top of the payload (20% of the whole packet), which is not negligible from the viewpoint of network traffic. The RTP header includes several fields that are closely related to the nature of the VoIP packet's payload, so it is worth examining more closely. Its structure is shown in Figure 2 (cf. Ref. [3]).
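The overhead arithmetic can be checked with a short sketch. The constant and function names below are illustrative assumptions, not taken from any library; the header sizes are the minimum values (IPv4 without options, RTP without CSRC entries or extension), and the G.711 bit rate of 64 kbit/s is assumed.

```python
# Minimum header sizes in bytes (assumed: no IPv4 options, no RTP CSRC/extension).
IPV4_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8
RTP_HEADER_BYTES = 12


def header_overhead(payload_bytes):
    """Return (header_bytes, ratio_to_payload, share_of_packet)."""
    header = IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES
    return header, header / payload_bytes, header / (header + payload_bytes)


# G.711 at 64 kbit/s: 20 ms of audio is 64000 * 0.020 / 8 = 160 bytes.
g711_payload = 64000 * 20 // 1000 // 8
header, vs_payload, vs_packet = header_overhead(g711_payload)
print(header, g711_payload, f"{vs_payload:.0%}", f"{vs_packet:.0%}")
# prints: 40 160 25% 20%
```

The 25% figure is the header size relative to the payload; relative to the whole 200-byte packet the same headers account for 20%.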

Figure 2: Structure of the RTP header, according to RFC 3550

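The first 4 bytes include:

  - Version (V), 2 bits: the RTP version, which is 2 for RFC 3550.
  - Padding (P), 1 bit: set when the packet ends with padding octets that are not part of the payload.
  - Extension (X), 1 bit: set when the fixed header is followed by a header extension.
  - CSRC count (CC), 4 bits: the number of CSRC identifiers that follow the fixed header.
  - Marker (M), 1 bit: its interpretation is defined by the profile; for audio it typically marks the first packet of a talkspurt.
  - Payload type (PT), 7 bits: identifies the format of the payload.
  - Sequence number, 16 bits: incremented by one for each RTP packet sent, so the receiver can detect loss and restore packet order.

Then the following fields:

  - Timestamp, 32 bits: the sampling instant of the first octet of the payload.
  - SSRC, 32 bits: the synchronization source identifier.
  - CSRC list, 0 to 15 items of 32 bits each: the contributing sources, present only when inserted by a mixer.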
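As an illustration of the RFC 3550 header layout, the following is a minimal parsing sketch using Python's standard `struct` module. The function name `parse_rtp_header` and the dictionary keys are hypothetical choices for this example; the sketch does not decode a header extension and is not a complete RTP implementation.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed RTP header and CSRC list (RFC 3550 layout)."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header needs at least 12 bytes")
    # Network byte order: two single bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count, low 4 bits of byte 0
    end = 12 + 4 * cc                   # header length without any extension
    if len(packet) < end:
        raise ValueError("packet too short for its CSRC list")
    csrc = list(struct.unpack(f"!{cc}I", packet[12:end]))
    return {
        "version": b0 >> 6,             # top 2 bits
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": cc,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
        "header_length": end,
    }


# Example: version 2, no padding/extension/CSRC, payload type 0 (PCMU).
sample = bytes([0x80, 0x00]) + struct.pack("!HII", 1, 160, 0x12345678)
fields = parse_rtp_header(sample + b"\x00" * 160)
print(fields["version"], fields["payload_type"], fields["sequence_number"])
# prints: 2 0 1
```

When the extension bit is set, a real parser would also need to read the extension header that follows the CSRC list before locating the payload.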

The last field in the VoIP packet structure is the payload, which carries the encoded voice data. The number of payload bytes follows from the pre-defined packet duration and the bit rate of the codec. Typically, VoIP packets are 10 ms, 20 ms or 40 ms packets (where the duration in milliseconds refers to the payload only). Other packet sizes are permissible, although they are not frequently used.
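The relation between packet duration and packet size can be sketched as follows for a constant-bit-rate codec. The helper names are hypothetical, G.711 at 64 kbit/s is assumed as the codec, and the 40-byte figure is the minimum IPv4/UDP/RTP header size.

```python
HEADER_BYTES = 40  # minimum IPv4 + UDP + RTP headers


def payload_bytes(bit_rate_bps, packet_ms):
    """Payload size in bytes for a constant-bit-rate codec."""
    return bit_rate_bps * packet_ms // 8000


def ip_bandwidth_bps(bit_rate_bps, packet_ms):
    """Bit rate at the IP layer, including the 40-byte headers."""
    packet = payload_bytes(bit_rate_bps, packet_ms) + HEADER_BYTES
    return packet * 8 * 1000 // packet_ms


sizes = {ms: payload_bytes(64000, ms) for ms in (10, 20, 40)}
rates = {ms: ip_bandwidth_bps(64000, ms) for ms in (10, 20, 40)}
print(sizes)  # {10: 80, 20: 160, 40: 320}
print(rates)  # {10: 96000, 20: 80000, 40: 72000}
```

Longer packets spread the fixed header cost over more payload, which lowers the bit rate on the wire at the price of added packetization delay.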


More Information


  1. IPv4-IPv6 translator for VoIP and video conferencing, Davis, A.K. et al., 2011 International Conference on Communications and Signal Processing, 10-12 Feb. 2011; pp. 367-369.
  2. Internetworking with TCP/IP, Vol.1: Principles, Protocols and Architectures, Douglas E. Comer; Prentice Hall; 3rd Edition.
  3. RFC 3550: RTP: A Transport Protocol for Real-Time Applications.
  4. Playout Buffering for Conversational Voice over IP, Gong, Q; McGill Univ. 2012.