Chroma Subsampling
What are 4:2:0, 4:2:2, 4:4:4:4?
These numbers refer to Chroma Subsampling. It is a method video encoders use to reduce file size by throwing away color information while keeping brightness information intact.
To understand this, you first have to know that digital video splits an image into two parts:
- Luma (Y): The brightness (the "black and white" version of the image).
- Chroma (C): The color information, stored as two color-difference channels (Cb and Cr).
Because human eyes are much more sensitive to changes in brightness than changes in color, engineers realized they could "average out" the color across multiple pixels without the viewer noticing.
The Number System Explained
The numbers represent a grid of pixels (usually 4 pixels wide and 2 pixels high).
- The 1st number: The width of the reference block in pixels (by convention 4).
- The 2nd number: How many color samples are taken in the top row.
- The 3rd number: How many new color samples are taken in the bottom row (0 means the bottom row reuses the top row's colors).
- The 4th number (if present): The Alpha channel (transparency).
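As a rough sketch of how the first three numbers map onto a sampling layout, the snippet below turns a notation string into horizontal and vertical chroma subsampling factors. The helper name `subsampling_factors` is made up for illustration, and it only handles the common layouts where the third number is either 0 or equal to the second.

```python
def subsampling_factors(notation: str) -> tuple[int, int]:
    """Return (horizontal, vertical) chroma subsampling factors.

    Hypothetical helper: covers only layouts where the 3rd number equals
    the 2nd (no vertical subsampling) or is 0 (rows share chroma in pairs).
    """
    parts = [int(p) for p in notation.split(":")]
    j, a, b = parts[:3]          # a 4th number (alpha) does not affect chroma
    if a == 0 or j % a != 0:
        raise ValueError(f"unsupported layout: {notation}")
    horizontal = j // a          # pixels per chroma sample along a row
    if b == a:
        vertical = 1             # bottom row has its own chroma samples
    elif b == 0:
        vertical = 2             # bottom row reuses the top row's chroma
    else:
        raise ValueError(f"unsupported layout: {notation}")
    return horizontal, vertical


for name in ("4:4:4", "4:2:2", "4:2:0", "4:4:4:4"):
    h, v = subsampling_factors(name)
    print(f"{name}: one chroma sample per {h}x{v} block of pixels")
```

Read this way, 4:2:2 shares color across 2x1 blocks and 4:2:0 across 2x2 blocks, while 4:4:4 keeps one color sample per pixel.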
4:4:4 (The "Perfect" Image)
Every single pixel has its own unique color data. No color information is thrown away.
- Quality: Perfect. No "bleeding" or blocky edges.
- Use Case: High-end Visual Effects (VFX), heavy color grading, or high-end animation/CGI.
- File Size: Massive.
- Example: ProRes 4444.
4:2:2 (The "Pro" Standard)
In every row of 4 pixels, only 2 color samples are taken. This means two horizontal pixels "share" the same color, but they keep their own brightness levels.
- Quality: Excellent. This is the professional standard for filming and editing.
- Use Case: Professional cameras (Sony A7S III, Canon R5), broadcast television, and green screen work. It provides enough color data to "key" a green screen without the edges looking jagged.
- File Size: Large, but manageable.
- Example: ProRes 422, DNxHR HQX.
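To make the "sharing" concrete, here is a minimal sketch of 4:2:2 on a single row of pixels. It assumes each pixel is a (Y, Cb, Cr) tuple and that the shared color is a simple average of the pair; real encoders may use other filters or sample positions, and the function name `subsample_row_422` is only illustrative.

```python
def subsample_row_422(row):
    """Average chroma over horizontal pairs; keep every luma value.

    `row` is a list of (y, cb, cr) tuples with an even number of pixels.
    Averaging is an assumption made for this sketch.
    """
    out = []
    for i in range(0, len(row), 2):
        (y0, cb0, cr0), (y1, cb1, cr1) = row[i], row[i + 1]
        cb = (cb0 + cb1) / 2      # one shared Cb for the pair
        cr = (cr0 + cr1) / 2      # one shared Cr for the pair
        out.append((y0, cb, cr))  # luma stays per-pixel
        out.append((y1, cb, cr))
    return out


row = [(16, 100, 200), (235, 120, 180), (50, 90, 90), (60, 110, 70)]
print(subsample_row_422(row))
# [(16, 110.0, 190.0), (235, 110.0, 190.0), (50, 100.0, 80.0), (60, 100.0, 80.0)]
```

Each pair ends up with identical Cb and Cr values, but the Y values (16 and 235, 50 and 60) are untouched.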
4:2:0 (The "Delivery" Standard)
In the top row, 2 color samples are taken. In the bottom row, no new samples are taken; the bottom row reuses the colors from the top row. As a result, each 2x2 square of pixels shares a single color.
- Quality: Good for viewing, but poor for heavy editing.
- Use Case: YouTube, Netflix, Blu-ray, and almost all internet video.
- File Size: Small and efficient.
- Note: If you try to do heavy color grading on 4:2:0 footage, the colors may start to look "blocky" or "noisy" because there isn't enough data.
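The 2x2 sharing can be sketched on a single chroma plane. Note that this example averages each 2x2 block rather than copying the top row; that is a simplifying assumption, since actual encoders choose their own filter and sample position. The function name `subsample_plane_420` is hypothetical.

```python
def subsample_plane_420(plane):
    """Shrink one chroma plane by averaging each 2x2 block.

    `plane` is a list of rows with even width and even height.
    """
    small = []
    for r in range(0, len(plane), 2):
        top, bottom = plane[r], plane[r + 1]
        small.append([
            (top[c] + top[c + 1] + bottom[c] + bottom[c + 1]) / 4
            for c in range(0, len(top), 2)
        ])
    return small


cb = [
    [100, 104, 200, 204],
    [108, 112, 208, 212],
]
print(subsample_plane_420(cb))  # [[106.0, 206.0]]
```

Eight stored color values become two, which is the 25% figure usually quoted for 4:2:0. The luma plane is not passed through this step at all.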
4:4:4:4 (The "Graphics" Format)
The fourth "4" stands for the Alpha Channel (transparency).
- Quality: Perfect color + transparency.
- Use Case: Lower-thirds, titles, or logo animations that need to be "see-through" when placed on top of other video.
- Example: ProRes 4444.
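For a sense of what the alpha channel is used for, the sketch below blends one foreground value over a background value. It assumes straight (non-premultiplied) alpha normalized to the range 0.0 to 1.0; the name `composite_over` is illustrative and not taken from any particular editing tool.

```python
def composite_over(fg, alpha, bg):
    """Blend one foreground channel value over a background value.

    `alpha` is normalized: 0.0 is fully transparent, 1.0 is fully opaque.
    Assumes straight (non-premultiplied) alpha.
    """
    return fg * alpha + bg * (1.0 - alpha)


print(composite_over(255, 0.0, 40))  # 40.0  (background shows through)
print(composite_over(255, 1.0, 40))  # 255.0 (foreground replaces background)
print(composite_over(255, 0.5, 40))  # 147.5 (half-transparent mix)
```

Without a stored alpha value per pixel, the editor has no way to know which parts of a logo or title should let the video underneath show through.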
Comparison Table
| Format | Color Samples | Best For... | Used By... |
|---|---|---|---|
| 4:4:4 | 100% Color | VFX & CGI | Arri Alexa, High-end VFX |
| 4:2:2 | 50% Color | Green Screen & Color Grading | Professional Mirrorless Cameras |
| 4:2:0 | 25% Color | Final Watching / Uploading | YouTube, iPhones, Netflix |
Which should you use?
- When Filming: If your camera allows it, shoot 4:2:2. It gives you more flexibility to fix colors later without the image "breaking."
- When Exporting for YouTube: Exporting 4:2:0 is perfectly fine. YouTube is going to convert your video to 4:2:0 anyway, so there is no visual benefit to uploading a 4:4:4 file unless you have a very specific high-end workflow.
- When Exporting Graphics: If you need a logo with a transparent background, you must use a codec that supports 4:4:4:4 (like ProRes 4444).
File Size Calculation
To calculate the file size of a single uncompressed frame, we first determine the total pixel count and then multiply it by the "data weight" of each pixel based on its bit depth and chroma subsampling ratio.
For 4K UHD (16:9 aspect ratio), the resolution is 3,840 × 2,160, resulting in 8,294,400 pixels.
The Math of Chroma Subsampling
Chroma subsampling reduces size by "sharing" color data across pixels while keeping the brightness (Luma) unique for every pixel.
- 4:4:4: 3 samples per pixel (1 Luma, 2 Chroma).
- 4:2:2: 2 samples per pixel (1 Luma, 0.5 + 0.5 Chroma).
- 4:2:0: 1.5 samples per pixel (1 Luma, 0.25 + 0.25 Chroma).
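These per-pixel figures can be derived by counting samples in the 4-wide, 2-high reference block. The sketch below does that count; `samples_per_pixel` is a hypothetical helper and it ignores any alpha channel.

```python
from fractions import Fraction


def samples_per_pixel(j: int, a: int, b: int) -> Fraction:
    """Average samples per pixel for a J:a:b layout (no alpha).

    The reference block is J pixels wide and 2 rows high, so it holds
    2*J luma samples and (a + b) samples for each of the two chroma channels.
    """
    pixels = 2 * j
    luma = pixels
    chroma = 2 * (a + b)  # Cb and Cr
    return Fraction(luma + chroma, pixels)


for layout in ((4, 4, 4), (4, 2, 2), (4, 2, 0)):
    print(layout, float(samples_per_pixel(*layout)))
```

For 4:2:0, the block has 8 luma samples plus 2 Cb and 2 Cr samples, so 12 samples over 8 pixels gives 1.5.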
1. Calculation for 8-bit Video
8-bit depth means 1 byte per sample.
| Subsampling | Calculation (Pixels × Samples × 1 Byte) | Frame Size (Bytes) | Frame Size (MB) |
|---|---|---|---|
| 4:4:4 | 8,294,400 × 3 × 1 | 24,883,200 | ~24.88 MB |
| 4:2:2 | 8,294,400 × 2 × 1 | 16,588,800 | ~16.59 MB |
| 4:2:0 | 8,294,400 × 1.5 × 1 | 12,441,600 | ~12.44 MB |
2. Calculation for 10-bit Video
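10-bit depth means 1.25 bytes per sample (10 bits / 8), assuming the samples are tightly packed. Formats that pad each 10-bit sample to 2 bytes produce larger frames.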
| Subsampling | Calculation (Pixels × Samples × 1.25) | Frame Size (Bytes) | Frame Size (MB) |
|---|---|---|---|
| 4:4:4 | 8,294,400 × 3 × 1.25 | 31,104,000 | ~31.10 MB |
| 4:2:2 | 8,294,400 × 2 × 1.25 | 20,736,000 | ~20.74 MB |
| 4:2:0 | 8,294,400 × 1.5 × 1.25 | 15,552,000 | ~15.55 MB |
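The figures in both tables can be reproduced with a few lines of Python. This is a sketch that assumes tightly packed samples and decimal megabytes (1 MB = 1,000,000 bytes); `frame_bytes` and `SAMPLES` are names chosen for the example.

```python
def frame_bytes(width, height, bit_depth, samples_per_pixel):
    """Uncompressed frame size in bytes, assuming tightly packed samples."""
    total_bits = width * height * samples_per_pixel * bit_depth
    return total_bits / 8


# Samples per pixel for each subsampling scheme (1 luma + shared chroma).
SAMPLES = {"4:4:4": 3, "4:2:2": 2, "4:2:0": 1.5}

for bits in (8, 10):
    for name, spp in SAMPLES.items():
        size = frame_bytes(3840, 2160, bits, spp)
        print(f"{bits}-bit {name}: {size:,.0f} bytes (~{size / 1e6:.2f} MB)")
```

Changing the width and height lets you run the same estimate for other resolutions, such as 1920 × 1080.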
Why this matters (The "Real World" View)
If you look at the 10-bit 4:2:2 frame (~20.74 MB), consider what that means for a full video:
- At 24 fps: One second of uncompressed footage is ~498 MB.
- One minute: The file would be ~29.9 GB.
This is exactly why we use codecs like ProRes or H.265. A ProRes 422 HQ file for that same 4K minute would be about 5.5 GB, and an H.265 "YouTube quality" file would be about 450 MB. The gap between those figures and the uncompressed size is the "weight" that chroma subsampling and compression remove to make video files manageable for our hard drives.
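To put rough numbers on that gap, the sketch below scales the 10-bit 4:2:2 frame size up to one minute at 24 fps and compares it with the approximate compressed sizes quoted above. The compressed sizes are the ballpark figures from the text, not measurements, and real files vary with content and encoder settings.

```python
FRAME_BYTES = 20_736_000  # 10-bit 4:2:2 UHD frame, tightly packed
FPS = 24

per_second = FRAME_BYTES * FPS  # bytes per second of footage
per_minute = per_second * 60    # bytes per minute of footage

print(f"per second: {per_second / 1e6:.1f} MB")
print(f"per minute: {per_minute / 1e9:.2f} GB")

# Approximate compressed sizes for one minute, in bytes (ballpark figures).
compressed = {"ProRes 422 HQ": 5.5e9, "H.265": 450e6}
for codec, size in compressed.items():
    print(f"{codec}: roughly {per_minute / size:.1f}:1 versus uncompressed")
```

Under these assumptions ProRes 422 HQ comes out a little over 5 times smaller than the uncompressed stream, and the H.265 file more than 60 times smaller.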