Chroma Subsampling Standards

Most video standards use some form of chroma subsampling, in which the color information for a frame is sampled at a lower spatial resolution than the luma information. This is reasonable because the human eye itself has less spatial resolution in color than in luminance.

On account of this, we get code words like "4:2:2" and "4:1:1" to describe how the subsampling is done. Roughly, the numbers refer to the ratios of the luma sampling frequency to the sampling frequencies of the two chroma channels (typically CB and CR, in digital video). I say "roughly" because this formula doesn't make any sense for things like "4:2:0": read literally, it would mean that the second chroma channel is not sampled at all.

In the following diagrams, the black/gray dots indicate positions of luma samples (the luma sampling grid). Colored dots indicate the positions of chroma samples (the chroma sampling grid).


4:4:4

No subsampling here: for every luma (Y') sample, you also grab a pair of chroma samples (one for each channel).

4:2:2 (SMPTE)

Chroma is sampled at half the horizontal frequency of luma, but at the same vertical frequency. The chroma samples are horizontally aligned with luma samples.
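As a rough illustration of co-sited 4:2:2 sampling, the sketch below keeps every second chroma sample on each row, so each retained sample sits on top of a luma sample. The function name `subsample_422` is hypothetical, and plain decimation without any low-pass filtering is an assumption made to keep the example short.

```python
def subsample_422(chroma_plane):
    """Keep every second chroma sample on each row.

    The retained samples are co-sited with the even-numbered luma columns.
    No filtering is applied (a simplifying assumption).
    """
    return [row[::2] for row in chroma_plane]


cb = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
]
print(subsample_422(cb))  # [[10, 12], [20, 22]]
```

The number of rows is unchanged, which matches the full vertical frequency of this mode.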

Three Varieties of 4:2:0

Chroma is sampled at half the horizontal frequency of luma, and also at half the vertical frequency.

It turns out that there are actually three varieties of 4:2:0 subsampling in common use, differing in where the chroma samples are located relative to the luma samples. For field-based (interlaced) video, it is not obvious how to perform the 2:1 vertical subsampling, and each variety differs in how this is done.

JPEG/MPEG-1/MJPEG: Chroma samples are centered between luma samples both horizontally and vertically. JPEG and MPEG-1 have no notion of fields, and thus this applies to the whole frame. For MJPEG, interlaced fields are subsampled and compressed independently/sequentially; this pattern thus applies to the individual fields.
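One simple way to obtain a chroma sample that is centered between luma samples in both directions is to average each 2x2 block of the full-resolution chroma plane. The sketch below does this with a plain box average. The function name is hypothetical, and the box filter is an assumption for illustration, not something any of these standards prescribes.

```python
def subsample_420_centered(plane):
    """Average each 2x2 block of a chroma plane.

    Assumes the plane has even width and even height. The average of a
    2x2 block is effectively located at the center of that block, i.e.
    midway between luma samples both horizontally and vertically.
    """
    out = []
    for y in range(0, len(plane), 2):
        row = []
        for x in range(0, len(plane[y]), 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


cr = [
    [0, 4, 8, 12],
    [4, 8, 12, 16],
]
print(subsample_420_centered(cr))  # [[4.0, 12.0]]
```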

MPEG-2: Chroma samples are vertically centered between, but horizontally aligned with, luma samples in the complete frame. Unlike MPEG-1, MPEG-2 does handle interlacing. For interlaced video, each field is subsampled independently, but the vertical chroma offset differs between the two fields: +1/4 of the field line spacing in the top field and +3/4 in the bottom field.

Thus, once the two fields are interleaved, the chroma samples from each field land on the same vertical sites as in a progressive frame, and each of those sites is midway between a pair of frame lines.
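The field offsets listed for MPEG-2 in the summary table (+1/4 for the upper field, +3/4 for the lower field) can be checked with a little arithmetic. The sketch below converts chroma row positions from field coordinates to frame coordinates. It assumes that the top field holds frame lines 0, 2, 4, ... and the bottom field holds lines 1, 3, 5, ...; the function name is hypothetical.

```python
from fractions import Fraction


def chroma_rows_in_frame(field_first_line, field_offset, count):
    """Frame-line positions of the first `count` chroma rows of one field.

    field_first_line: 0 for the top field, 1 for the bottom field
                      (an assumption about field parity).
    field_offset:     vertical chroma offset, in units of the spacing
                      between adjacent lines of the same field.
    """
    field_line_spacing = 2  # adjacent field lines are two frame lines apart
    chroma_row_spacing = 2  # 2:1 vertical subsampling within the field
    return [
        field_first_line
        + field_line_spacing * (chroma_row_spacing * n + field_offset)
        for n in range(count)
    ]


top = chroma_rows_in_frame(0, Fraction(1, 4), 3)     # 1/2, 9/2, 17/2
bottom = chroma_rows_in_frame(1, Fraction(3, 4), 3)  # 5/2, 13/2, 21/2
merged = sorted(top + bottom)
print([str(p) for p in merged])
# ['1/2', '5/2', '9/2', '13/2', '17/2', '21/2']
```

Every merged position has the form 2k + 1/2, that is, midway between frame lines 2k and 2k + 1.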

[Diagrams: full frame, top field, bottom field]

SMPTE DV-PAL: Subsampling is performed on each field separately. Chroma samples are sited on top of luma samples, but CB and CR samples are sited on alternate lines. The diagram applies to a single field; red dots indicate CR samples, and blue dots indicate CB.
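The alternating-line pattern can be sketched as follows for a single field. The function name is hypothetical, plain decimation without filtering is an assumption, and because the text does not say which channel occupies the first line of the field, that choice is left as a parameter.

```python
def subsample_dv_pal_field(cb_field, cr_field, cr_on_even_lines=True):
    """Return (cb_sub, cr_sub) for one field.

    Horizontally, every second sample is kept (co-sited with luma).
    Vertically, CR and CB are taken from alternate field lines.
    Which channel lands on the even lines is an assumption, controlled
    by `cr_on_even_lines`.
    """
    cr_start = 0 if cr_on_even_lines else 1
    cb_start = 1 - cr_start
    cb_sub = [row[::2] for row in cb_field[cb_start::2]]
    cr_sub = [row[::2] for row in cr_field[cr_start::2]]
    return cb_sub, cr_sub


cb_field = [[1, 2], [3, 4], [5, 6], [7, 8]]
cr_field = [[11, 12], [13, 14], [15, 16], [17, 18]]
cb_sub, cr_sub = subsample_dv_pal_field(cb_field, cr_field)
print(cb_sub)  # [[3], [7]]
print(cr_sub)  # [[11], [15]]
```

Each channel ends up with half the rows and half the columns of the field, which is why this layout still counts as 4:2:0.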

[Diagrams: top field, bottom field]

4:1:1 (DV-NTSC)

Chroma is sampled at one-fourth the horizontal frequency of luma, but at full vertical frequency. The chroma samples are horizontally aligned with luma samples. This mode uses the same bandwidth as 4:2:0.
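The bandwidth claim is easy to verify by counting samples. The sketch below computes the total number of samples (Y' plus both chroma channels) per luma sample from the horizontal and vertical chroma ratios; the function name is hypothetical.

```python
from fractions import Fraction


def samples_per_luma_sample(h_ratio, v_ratio):
    """Total samples (Y' + CB + CR) carried per luma sample."""
    chroma_fraction = h_ratio * v_ratio  # chroma samples per luma sample, per channel
    return 1 + 2 * chroma_fraction


print(samples_per_luma_sample(Fraction(1, 4), Fraction(1, 1)))  # 4:1:1 -> 3/2
print(samples_per_luma_sample(Fraction(1, 2), Fraction(1, 2)))  # 4:2:0 -> 3/2
print(samples_per_luma_sample(Fraction(1, 2), Fraction(1, 1)))  # 4:2:2 -> 2
print(samples_per_luma_sample(Fraction(1, 1), Fraction(1, 1)))  # 4:4:4 -> 3
```

Both 4:1:1 and 4:2:0 carry one and a half samples per pixel, half of what 4:4:4 needs.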

In summary...

Chroma subsampling modes are defined by a pair of ratios and a pair of offsets. The ratios tell how the spatial frequency of the chroma sampling grid is related to the luma sampling grid, in both the horizontal and vertical directions. The offsets tell how the chroma sampling grid is positioned relative to the luma sampling grid.
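Under one reading of this definition, the position of any chroma sample in luma-grid coordinates follows directly from the two ratios and the two offsets. The sketch below assumes that chroma sample (i, j) sits at (i / h_ratio + h_offset, j / v_ratio + v_offset), measured in luma sample distances; the function name and parameter names are hypothetical.

```python
from fractions import Fraction


def chroma_sample_position(i, j, h_ratio, v_ratio, h_offset, v_offset):
    """Position of chroma sample (column i, row j) in luma-grid coordinates."""
    x = Fraction(i) / h_ratio + h_offset
    y = Fraction(j) / v_ratio + v_offset
    return x, y


half = Fraction(1, 2)

# JPEG-style 4:2:0 (offset +1/2, +1/2): second chroma sample of the first row.
jpeg = chroma_sample_position(1, 0, half, half, half, half)
# x = 5/2, y = 1/2: centered among luma columns 2-3 and luma rows 0-1.

# MPEG-2 4:2:0 frame siting (offset 0, +1/2): sample at column 1, row 1.
mpeg2 = chroma_sample_position(1, 1, half, half, Fraction(0), half)
# x = 2, y = 5/2: aligned with luma column 2, midway between luma rows 2 and 3.
```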

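Mode                         Subsampling    Grid Offset (H, V)
                             (H, V)         frame         upper field     lower field
4:4:4                        1/1, 1/1       0, 0
4:2:2   SMPTE                1/2, 1/1       0, 0
4:2:0   JPEG/MPEG-1/MJPEG    1/2, 1/2       +1/2, +1/2    -               -
4:2:0   MPEG-2               1/2, 1/2       0, +1/2       0, +1/4         0, +3/4
4:2:0   PAL-DV               1/2, 1/2       -             0, +1 / 0, 0    0, +1 / 0, 0
4:1:1   NTSC-DV              1/4, 1/1       0, 0

PAL-DV lists two offsets per field, one for each chroma plane, because CB and CR are sited on alternate lines.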

Mode: Subsampling mode.
Subsampling: Horizontal (H) and vertical (V) subsampling ratios, i.e. the ratio of the number of chroma pixels to the number of luma pixels along that direction.
Grid Offset: Horizontal (H) and vertical (V) offset of the chroma sampling grid relative to the luma grid, expressed in units of the luma sample distance. This may depend on the chroma plane (CB or CR), on the field in question, or on both.
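The subsampling ratios in the table are enough to compute the size of each chroma plane from the luma dimensions. The sketch below stores the ratios in a small lookup table; the names `MODES` and `chroma_plane_size` are hypothetical, the frame sizes in the usage lines are only illustrative, and the luma dimensions are assumed to divide evenly.

```python
from fractions import Fraction
from typing import NamedTuple


class Mode(NamedTuple):
    h_ratio: Fraction  # chroma samples per luma sample, horizontally
    v_ratio: Fraction  # chroma samples per luma sample, vertically


# Ratios taken from the summary table.
MODES = {
    "4:4:4": Mode(Fraction(1), Fraction(1)),
    "4:2:2": Mode(Fraction(1, 2), Fraction(1)),
    "4:2:0": Mode(Fraction(1, 2), Fraction(1, 2)),
    "4:1:1": Mode(Fraction(1, 4), Fraction(1)),
}


def chroma_plane_size(mode, luma_width, luma_height):
    """Width and height of one chroma plane for the given luma dimensions.

    Assumes the luma dimensions are exact multiples of the subsampling
    factors, so no rounding policy is needed.
    """
    m = MODES[mode]
    return int(luma_width * m.h_ratio), int(luma_height * m.v_ratio)


print(chroma_plane_size("4:2:0", 720, 576))  # (360, 288)
print(chroma_plane_size("4:1:1", 720, 480))  # (180, 480)
```

The grid offsets do not affect the plane size; they only matter when the chroma samples are interpolated back onto the luma grid.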



maximum impact research
Digital Media Group
Last modified: Mon Mar 31 22:30:56 EST 2003

©2003 Matthew Marjanovic.
This material may not be republished in any form without express written consent of the author.