In H.264/AVC, each inter-coded macroblock (P or B) is split into partitions. For each partition, a reference partition of the same size is located in one (for a P-macroblock) or two (for a B-macroblock) adjacent reference frames. The criterion for the optimal reference partition is that the difference between the current partition and the reference partition (for a P-macroblock), or the weighted sum of two reference partitions (for a B-macroblock), can be encoded using the minimum number of bits. The reference partition may be displaced from the current partition by a 2-D offset called a motion vector (MV). Each P partition has one MV and each B partition has two.
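The search for a reference partition can be illustrated with a minimal Python sketch of an integer-pixel full search for a single P partition. Note the substitution: the sum of absolute differences (SAD) is used here as a simple proxy for the bit cost, since estimating the true number of encoded bits requires the full transform and entropy-coding path. The names `sad`, `full_search`, and `search_range` are illustrative and do not come from the standard.

```python
def sad(cur, ref, rx, ry):
    """Sum of absolute differences between `cur` and the same-size block of `ref` at (rx, ry)."""
    return sum(
        abs(cur[j][i] - ref[ry + j][rx + i])
        for j in range(len(cur))
        for i in range(len(cur[0]))
    )


def full_search(cur, ref, x0, y0, search_range):
    """Return (mvx, mvy, cost) for the integer MV with the lowest SAD.

    (x0, y0) is the position of the current partition in its own frame;
    candidate offsets are limited to +/- search_range pixels, and candidates
    that fall outside the reference frame are skipped (an assumption of this
    sketch, not a rule of the standard).
    """
    h, w = len(cur), len(cur[0])
    best = None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            rx, ry = x0 + mvx, y0 + mvy
            if rx < 0 or ry < 0 or rx + w > len(ref[0]) or ry + h > len(ref):
                continue
            cost = sad(cur, ref, rx, ry)
            if best is None or cost < best[2]:
                best = (mvx, mvy, cost)
    return best
```

A B partition would repeat the search in a second reference frame and compare the current partition against the weighted sum of the two candidates.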
To provide a good reference, H.264/AVC uses quarter-pixel precision for the MV. If the MV points to a fractional position, the reference partition does not exist directly in the reference frame and must be interpolated from the existing pixels of the reference frame. This is equivalent to scaling up the reference frame by a factor of 4 vertically and horizontally, and searching for the best reference partition within the scaled frame using a subsampling factor of 4.
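As a small illustration of quarter-pixel precision, the sketch below splits one MV component, expressed in quarter-pixel units, into an integer pixel offset and a fractional phase. The function name `split_quarter_pel` is hypothetical; the sketch relies on Python's `>>` being a floor shift, so negative vectors are handled consistently.

```python
def split_quarter_pel(mv):
    """Split one MV component (in quarter-pixel units) into (integer_pixels, quarter_phase).

    quarter_phase is 0..3: 0 means the sample exists in the reference frame,
    2 is a half-pixel position, 1 and 3 are quarter-pixel positions.
    """
    return mv >> 2, mv & 3
```

For example, a component of 9 corresponds to 2 full pixels plus one quarter, and a component of -3 corresponds to -1 full pixel plus one quarter (that is, -0.75 pixels).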
For luma pixels, the interpolation procedure has two stages:
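In the first stage, the pixels are upscaled by 2 vertically and horizontally. For this, a 2-phase, 6-tap interpolation filter is used. The filter has the following coefficients: (1, -5, 20, 20, -5, 1)/32 for the phase that produces a half-pixel sample, while the other phase passes the existing pixel through unchanged.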
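The first stage can be sketched in one dimension as follows. This is a minimal Python sketch that assumes 8-bit samples, rounding by adding 16 before a right shift by 5, and edge replication at the row boundaries; the names `TAPS`, `clip8`, and `upscale2_row` are illustrative.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC half-sample luma taps, divisor 32


def clip8(v):
    """Clamp a value to the 8-bit sample range."""
    return max(0, min(255, v))


def upscale2_row(row):
    """Upscale a 1-D row of 8-bit samples by 2.

    Even outputs copy the input samples (phase 0); odd outputs are half-pixel
    samples between row[i] and row[i + 1], computed with the 6-tap filter.
    """
    n = len(row)
    out = []
    for i in range(n):
        out.append(row[i])
        acc = 0
        for k, tap in enumerate(TAPS):
            idx = min(max(i - 2 + k, 0), n - 1)  # replicate edges (assumption)
            acc += tap * row[idx]
        out.append(clip8((acc + 16) >> 5))
    return out
```

The negative taps can overshoot near sharp edges, which is why the result is clipped. In the standard, the half-pixel sample that lies between four integer pixels is derived from unrounded intermediate values; this one-dimensional sketch does not cover that case.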
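In the second stage, the pixels are upscaled one more time by 2 vertically and horizontally using linear interpolation, which is equivalent to the following polyphase filter: (1) for the phase that copies an existing sample, and (1/2, 1/2) for the phase that produces a new sample midway between two neighbours.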
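The second stage amounts to averaging two neighbouring samples of the half-pixel grid. The one-dimensional sketch below assumes integer samples, upward rounding via `(a + b + 1) >> 1`, and replication of the last sample at the row boundary; the name `upscale2_linear` is illustrative.

```python
def upscale2_linear(row):
    """Upscale a 1-D row by 2 using linear interpolation.

    Even outputs copy the input samples; odd outputs are the rounded average
    of a sample and its right-hand neighbour.
    """
    n = len(row)
    out = []
    for i in range(n):
        nxt = row[min(i + 1, n - 1)]  # replicate the last sample (assumption)
        out.append(row[i])
        out.append((row[i] + nxt + 1) >> 1)
    return out
```

Applying `upscale2_linear` to the output of the first stage would yield a row on the quarter-pixel grid.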
For chroma pixels, 2-D linear interpolation is used between four neighbouring pixels:
Sout = round((Dy - y)(Dx - x) SA + (Dy - y) x SB + y (Dx - x) SC + y x SD)
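The chroma formula can be sketched in integer arithmetic as follows. This sketch makes several assumptions that the text does not state: SA, SB, SC, and SD are taken to be the top-left, top-right, bottom-left, and bottom-right neighbours; x and y are the fractional offsets from SA in units of 1/Dx and 1/Dy of a pixel; Dx = Dy = 8 (eighth-pixel chroma positions, as for 4:2:0 video); and the weighted sum is normalised by Dx*Dy, because the four weights add up to Dx*Dy. The function name `chroma_bilinear` is illustrative.

```python
def chroma_bilinear(sa, sb, sc, sd, x, y, dx=8, dy=8):
    """Bilinear interpolation between four neighbouring chroma samples.

    x in 0..dx-1 and y in 0..dy-1 are the fractional offsets from `sa`.
    The result is rounded to the nearest integer (halves round up).
    """
    acc = (
        (dy - y) * (dx - x) * sa
        + (dy - y) * x * sb
        + y * (dx - x) * sc
        + y * x * sd
    )
    total = dx * dy  # sum of the four weights
    return (acc + total // 2) // total
```

With the default Dx = Dy = 8, the final line reduces to `(acc + 32) >> 6` for non-negative samples.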