Chroma Subsampling Standards

Most video standards use some form of chroma subsampling, in which the color information for a frame is sampled at a lower spatial resolution than the luma information. This is reasonable because the human eye itself has less spatial resolution in color than in luminance.

On account of this, we get code words like "4:2:2" and "4:1:1" to describe how the subsampling is done. Roughly, the numbers refer to the ratios of the luma sampling frequency to the sampling frequencies of the two chroma channels (typically C_B and C_R, in digital video). I say "roughly" because this reading breaks down for things like "4:2:0": taken literally, it would mean that one chroma channel is not sampled at all.
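To see where the ratio reading works and where it fails, here is a minimal Python sketch that applies it literally. The function name `naive_ratios` is made up for illustration; it just divides each chroma number by the luma number, which is the rough interpretation described above and not any standard's definition.

```python
from fractions import Fraction

def naive_ratios(mode):
    """Read 'Y:Cb:Cr' literally as relative sampling frequencies.

    Returns (C_B rate, C_R rate) as fractions of the luma rate.
    """
    luma, cb, cr = (int(n) for n in mode.split(":"))
    return Fraction(cb, luma), Fraction(cr, luma)

for mode in ("4:4:4", "4:2:2", "4:1:1", "4:2:0"):
    cb, cr = naive_ratios(mode)
    print(f"{mode}: C_B at {cb} of luma rate, C_R at {cr} of luma rate")
```

For "4:2:0" the literal reading reports a C_R rate of zero, even though both chroma channels are in fact sampled.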

In the following diagrams, the black/gray dots indicate positions of luma samples (the luma sampling grid). Colored dots indicate the positions of chroma samples (the chroma sampling grid).

4:4:4

No subsampling here: for every luma (Y') sample, you also grab a pair of chroma samples (one for each channel).

4:2:2 (SMPTE)

Chroma is sampled at half the horizontal frequency of luma, but at the same vertical frequency. The chroma samples are horizontally aligned with luma samples.

Three Varieties of 4:2:0

Chroma is sampled at half the horizontal frequency of luma, and also at half the vertical frequency.

It turns out that there are actually three varieties of 4:2:0 subsampling in common use, differing in where the chroma samples are located relative to the luma samples. For field-based (interlaced) video, it is not obvious how to perform the 2:1 vertical subsampling, and each variety differs in how this is done.

JPEG/MPEG-1/MJPEG: Chroma samples are centered between luma samples both horizontally and vertically. JPEG and MPEG-1 have no notion of fields, and thus this applies to the whole frame. For MJPEG, interlaced fields are subsampled and compressed independently/sequentially; this pattern thus applies to the individual fields.
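One simple way to get chroma samples centered like this is to average each 2x2 block of a full-resolution chroma plane, since the mean of four samples is naturally sited at their center. The sketch below assumes a chroma plane given as a list of rows with even width and height. The function name `subsample_420_centered` is hypothetical, and the box filter is only one possible choice; real encoders are free to use other filters.

```python
def subsample_420_centered(plane):
    """Average each 2x2 block of a chroma plane (list of equal-length rows).

    Assumes even width and height. Each output sample is sited at the
    center of the four input samples it was computed from.
    """
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            total = (plane[y][x] + plane[y][x + 1]
                     + plane[y + 1][x] + plane[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out

# A 4x4 chroma plane becomes a 2x2 plane.
cb = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
print(subsample_420_centered(cb))
```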

MPEG-2: Chroma samples are vertically centered between, but horizontally aligned with, luma samples in the complete frame. Unlike MPEG-1, MPEG-2 does handle interlacing. For interlaced video, each field is subsampled independently, with the chroma samples sited as follows:

top-field samples are sited 1/4 sample below the luma samples,
bottom-field samples are sited 1/4 sample above the luma samples.

Thus, the chroma samples from each field are sited in the same vertical spots relative to the complete frame, and that spot is midway between pairs of lines.

(Diagrams: full frame, top field, bottom field.)
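The claim that both fields put their chroma in the same frame-relative spots can be checked with a little arithmetic. This is a sketch under two assumptions that the text does not state outright: the 1/4 offsets are measured in units of the field's own line spacing, and the top field holds frame lines 0, 2, 4, ... while the bottom field holds lines 1, 3, 5, .... "1/4 above" a bottom-field luma line is written here as 3/4 below the previous one. The function names are made up for illustration.

```python
from fractions import Fraction

# Vertical chroma offsets within a field, in field-line units (assumed).
TOP_OFFSET = Fraction(1, 4)      # 1/4 below a top-field luma line
BOTTOM_OFFSET = Fraction(3, 4)   # 1/4 above the next bottom-field luma line

def frame_position(field_pos, is_bottom):
    """Map a vertical position in a field to frame-line coordinates.

    Assumes top field = frame lines 0, 2, 4, ... and
    bottom field = frame lines 1, 3, 5, ...
    """
    return 2 * field_pos + (1 if is_bottom else 0)

def chroma_rows_in_frame(count, is_bottom):
    """Frame-coordinate rows of the first `count` chroma lines of a field."""
    offset = BOTTOM_OFFSET if is_bottom else TOP_OFFSET
    # One chroma line per two field lines (2:1 vertical subsampling).
    return [frame_position(2 * k + offset, is_bottom) for k in range(count)]

top = chroma_rows_in_frame(3, is_bottom=False)
bottom = chroma_rows_in_frame(3, is_bottom=True)
print([str(r) for r in sorted(top + bottom)])
```

Under these assumptions every chroma row lands at an even frame line plus 1/2, that is, midway between a pair of frame lines, with the two fields alternating.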

SMPTE DV-PAL: Subsampling is performed on each field separately. Chroma samples are sited on top of luma samples, but C_B and C_R samples are sited on alternate lines. The diagram applies to a single field; red dots indicate C_R samples, and blue dots indicate C_B.

(Diagrams: top field, bottom field.)

4:1:1 (DV-NTSC)

Chroma is sampled at one-fourth the horizontal frequency of luma, but at full vertical frequency. The chroma samples are horizontally aligned with luma samples. This mode uses the same bandwidth as 4:2:0, since both keep one chroma sample pair for every four luma samples.
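The bandwidth comparison is easy to verify by counting samples. The sketch below takes "bandwidth" to mean the average number of uncompressed samples per pixel, which is an assumption about what is being compared. The ratios come from the mode descriptions above; the names `RATIOS` and `samples_per_pixel` are illustrative.

```python
from fractions import Fraction

# (horizontal ratio, vertical ratio) of chroma to luma sampling.
RATIOS = {
    "4:4:4": (Fraction(1, 1), Fraction(1, 1)),
    "4:2:2": (Fraction(1, 2), Fraction(1, 1)),
    "4:2:0": (Fraction(1, 2), Fraction(1, 2)),
    "4:1:1": (Fraction(1, 4), Fraction(1, 1)),
}

def samples_per_pixel(mode):
    """Average number of samples (Y' + C_B + C_R) stored per pixel."""
    h, v = RATIOS[mode]
    return 1 + 2 * h * v  # one luma sample plus two chroma channels

for mode in RATIOS:
    print(mode, samples_per_pixel(mode))
```

Both 4:2:0 and 4:1:1 come out at 3/2 samples per pixel, half of what 4:4:4 needs.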

In summary...

Chroma subsampling modes are defined by a pair of ratios and a pair of offsets. The ratios tell how the spatial frequency of the chroma sampling grid is related to the luma sampling grid, in both the horizontal and vertical directions. The offsets tell how the chroma sampling grid is positioned relative to the luma sampling grid.

Mode    Variant              Subsampling Ratio   Grid Offset (H,V)
                             H      V            frame         upper field              lower field
4:4:4                        1/1    1/1          0, 0
4:2:2   SMPTE                1/2    1/1          0, 0
4:2:0   JPEG/MPEG-1/MJPEG    1/2    1/2          +1/2, +1/2    -                        -
        MPEG-2                                   0, +1/2       0, +1/4                  0, +3/4
        PAL-DV                                   -             C_B: 0, +1; C_R: 0, 0    C_B: 0, +1; C_R: 0, 0
4:1:1   NTSC-DV              1/4    1/1          0, 0

(Offsets apply to both C_B and C_R unless the two are listed separately.)

Mode:: Subsampling mode.
Ratio:: Horizontal (H) and vertical (V) subsampling ratios, i.e. the ratio of the number of chroma samples to the number of luma samples along that direction.
Offset:: Horizontal (H) and vertical (V) offset of the chroma sampling grid relative to the luma grid, expressed in units of the luma sample distance. This may depend on the chroma plane (C_B or C_R), on the field in question, or on both.
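As a rough illustration of how a ratio pair and an offset pair pin down a sampling grid, the sketch below turns a few table entries into explicit chroma sample positions. The class name `ChromaGrid` and its fields are invented for this example. The PAL-DV entries assume that the 4:2:0 ratios of 1/2 apply to that row as well, and that its offsets are measured within a single field.

```python
from dataclasses import dataclass
from fractions import Fraction as F

@dataclass(frozen=True)
class ChromaGrid:
    """One chroma sampling grid: ratios plus an (H, V) offset.

    Ratios are chroma samples per luma sample along each axis;
    offsets are in units of the luma sample distance.
    """
    h_ratio: F
    v_ratio: F
    h_offset: F
    v_offset: F

    def positions(self, cols, rows):
        """Chroma sample sites as (x, y) in luma-grid coordinates."""
        h_step, v_step = 1 / self.h_ratio, 1 / self.v_ratio
        return [
            (i * h_step + self.h_offset, j * v_step + self.v_offset)
            for j in range(rows)
            for i in range(cols)
        ]

# Entries copied from the summary table (frame column unless noted).
JPEG_420 = ChromaGrid(F(1, 2), F(1, 2), F(1, 2), F(1, 2))
MPEG2_420 = ChromaGrid(F(1, 2), F(1, 2), F(0), F(1, 2))
PAL_DV_CB = ChromaGrid(F(1, 2), F(1, 2), F(0), F(1))  # per field
PAL_DV_CR = ChromaGrid(F(1, 2), F(1, 2), F(0), F(0))  # per field

for name, grid in [("JPEG", JPEG_420), ("MPEG-2", MPEG2_420),
                   ("PAL-DV C_B", PAL_DV_CB), ("PAL-DV C_R", PAL_DV_CR)]:
    sites = [(float(x), float(y)) for x, y in grid.positions(2, 2)]
    print(name, sites)
```

In the PAL-DV case the C_R sites fall on field lines 0, 2, ... and the C_B sites on lines 1, 3, ..., which is the alternate-line pattern described earlier.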

maximum impact research
Digital Media Group
<dmg at mir.com> Last modified: Mon Mar 31 22:30:56 EST 2003

©2003 Matthew Marjanovic.
This material may not be republished in any form without express written consent of the author.