What Is H.264?
H.264, also known as AVC (Advanced Video Coding) and MPEG-4 Part 10, is the most widely deployed video compression standard in history. Developed jointly by the ITU-T Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG) under the Joint Video Team (JVT), H.264 was first standardized in 2003 and achieved mainstream adoption by 2005–2008.
H.264 substantially improved compression efficiency over its predecessor MPEG-2 (used on DVDs): it typically needs roughly half the bitrate of MPEG-2 for comparable quality. That efficiency enabled HD streaming over broadband connections, Blu-ray Disc, web video (YouTube, Netflix), video calling, and most other modern video applications.
H.264 Encoding Architecture
H.264 encodes video by predicting each block from previously decoded data (other frames, or neighboring pixels in the same frame), then encoding only the difference (residual) between the prediction and the actual block.
Frame Types
| Frame Type | Description |
|---|---|
| I-frame (Intra) | Compressed without reference to other frames; standalone. Larger than P or B frames. |
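| P-frame (Predictive) | Predicted from previously decoded reference frames; each block uses one reference picture. Uses motion vectors to describe how blocks moved. |
| B-frame (Bidirectional) | Each block may combine up to two references, typically one past and one future frame. Usually the smallest frame type, but frames must be reordered for decoding, which adds buffering and latency. |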
A Group of Pictures (GOP) is the sequence of frames between two I-frames. Common GOP structures:
- IBBBPBBBP...: typical film encoding (high compression)
- IPPPPP...: low-latency encoding (live streaming)
- Keyframe-only (all I-frames): editing-optimized codecs
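As a rough illustration of these GOP patterns, the Python sketch below builds the frame-type sequence for one GOP and derives a plausible decode order, in which each run of B-frames is decoded after the anchor frame that follows it in display order. The function names and the simple "anchor first" reordering rule are assumptions made for this example; real encoders support more elaborate structures such as B-pyramids.

```python
def gop_display_order(gop_size: int, b_frames: int) -> list[str]:
    """Frame types in display order for one GOP (hypothetical helper)."""
    types = ["I"]
    while len(types) < gop_size:
        remaining = gop_size - len(types)
        # Leave room for the P-frame that closes each run of B-frames.
        run = min(b_frames, remaining - 1)
        types.extend(["B"] * run)
        types.append("P")
    return types


def decode_order(display: list[str]) -> list[int]:
    """Display indices in decode order: each anchor (I/P) precedes the
    B-frames that are shown before it."""
    order, pending_b = [], []
    for idx, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append(idx)
        else:
            order.append(idx)
            order.extend(pending_b)
            pending_b = []
    order.extend(pending_b)  # trailing B-frames with no later anchor
    return order


film = gop_display_order(gop_size=9, b_frames=3)
print("".join(film))          # display order
print(decode_order(film))     # decode order as display indices
print("".join(gop_display_order(gop_size=6, b_frames=0)))
```

The reordering is why B-frames add latency: the decoder has to receive and hold the later anchor frame before it can reconstruct the B-frames displayed ahead of it.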
Motion Estimation and Compensation
Motion compensation is H.264's primary compression tool. For each block in a P or B frame, the encoder finds the best matching block in reference frames:
- Divides each frame into macroblocks (16×16 luma pixels), which can be split into partitions as small as 4×4
- Searches reference frames for similar blocks (motion search)
- Records the displacement vector (motion vector)
- Computes residual = actual block - predicted block
- Applies an integer approximation of the DCT to the residual
- Quantizes the transform coefficients (discarding mostly high-frequency detail)
- Entropy codes the quantized coefficients
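The motion search and residual steps above can be sketched in a few lines of Python. This is a minimal full-search example under simplifying assumptions: frames are plain lists of luma rows, the block size is 4×4, the search window is ±2 integer pixels, and the matching cost is the sum of absolute differences (SAD). Production encoders such as x264 use much faster search patterns and rate-aware costs; the function names here are hypothetical.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between an n×n block of cur and of ref."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def motion_search(cur, ref, bx, by, n=4, search=2):
    """Exhaustive search in a ±search window; returns ((dx, dy), residual)."""
    height, width = len(ref), len(ref[0])
    best = None  # (cost, dx, dy)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate falls outside the reference frame
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    _, dx, dy = best
    # Residual = actual block - predicted (motion-compensated) block.
    residual = [
        [cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i] for i in range(n)]
        for j in range(n)
    ]
    return (dx, dy), residual


# Toy frames: the current frame equals the reference shifted by (2, 1) pixels.
ref_frame = [[x + 16 * y for x in range(12)] for y in range(12)]
cur_frame = [
    [ref_frame[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)]
    for y in range(12)
]
mv, residual = motion_search(cur_frame, ref_frame, bx=4, by=4)
print(mv, residual)
```

On this toy input the search recovers the motion vector (2, 1) and an all-zero residual, which is the ideal case: the block costs almost nothing to code beyond the vector itself.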
Sub-pixel motion compensation: H.264 allows motion vectors with 1/4-pixel precision (quarter-pixel), achieved through fractional-pixel interpolation filters. This provides much better prediction accuracy than integer-pixel motion vectors.
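To make the interpolation concrete, here is a one-dimensional Python sketch. For luma, H.264 derives half-pixel samples with a 6-tap filter with weights (1, -5, 20, 20, -5, 1)/32 and quarter-pixel samples by averaging neighboring integer and half-pixel samples. The example only handles a single row and clamps indices at the edges, which is a simplification of the standard's two-dimensional process; the function names are illustrative.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # 6-tap half-sample filter, normalized by 32


def clip8(value):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, value))


def half_pel(row, x):
    """Half-sample value between row[x] and row[x + 1]."""
    last = len(row) - 1
    acc = sum(
        tap * row[min(max(x - 2 + k, 0), last)]  # clamp at the row edges
        for k, tap in enumerate(TAPS)
    )
    return clip8((acc + 16) >> 5)  # round, then divide by 32


def quarter_pel(row, x):
    """Quarter-sample value between row[x] and the following half sample."""
    return (row[x] + half_pel(row, x) + 1) >> 1


ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(half_pel(ramp, 3), quarter_pel(ramp, 3))
```

On the linear ramp the half-pixel sample between 30 and 40 comes out as 35 and the quarter-pixel sample as 33, matching what linear interpolation would suggest. The longer filter matters on real content, where it preserves edges better than a plain average.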
Intra Prediction
Even within an I-frame, H.264 uses spatial prediction: each luma block (4×4 or 16×16, plus 8×8 in High Profile) is predicted from already-decoded neighboring pixels. H.264 defines 9 intra prediction modes for 4×4 blocks:
- Mode 0: Vertical (propagate top edge downward)
- Mode 1: Horizontal (propagate left edge rightward)
- Mode 2: DC (use mean of top and left neighbors)
- Modes 3-8: Diagonal and off-diagonal directions
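The encoder chooses the mode with the lowest cost. A simple encoder minimizes residual energy alone; rate-distortion optimization (RDO) instead minimizes distortion plus a weighted estimate of the bits needed to code the mode and its residual.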
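The sketch below implements modes 0 to 2 for a 4×4 block and picks the cheapest one. It uses SAD as a stand-in cost rather than a full rate-distortion cost, assumes the top and left neighbors are both available, and omits the six directional modes; the helper names are made up for illustration.

```python
def predict_4x4(mode, top, left):
    """Build a 4×4 prediction from the 4 pixels above and 4 pixels to the left."""
    if mode == 0:  # Vertical: copy the top row downward
        return [list(top) for _ in range(4)]
    if mode == 1:  # Horizontal: copy the left column rightward
        return [[left[j]] * 4 for j in range(4)]
    if mode == 2:  # DC: rounded mean of the 8 neighbors
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0-2 are sketched here")


def best_mode(block, top, left):
    """Return (mode, costs) where costs maps each tried mode to its SAD."""
    costs = {}
    for mode in (0, 1, 2):
        pred = predict_4x4(mode, top, left)
        costs[mode] = sum(
            abs(block[j][i] - pred[j][i]) for j in range(4) for i in range(4)
        )
    return min(costs, key=costs.get), costs


# A block made of vertical stripes is predicted perfectly by mode 0.
top = [10, 50, 90, 130]
left = [10, 10, 10, 10]
stripes = [list(top) for _ in range(4)]
print(best_mode(stripes, top, left))
```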
Transform and Quantization
The prediction residual undergoes:
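- 4×4 integer transform (an exact-integer approximation of the DCT; High Profile adds an 8×8 variant) that converts spatial residuals to the frequency domain
- Quantization: divides the transform coefficients by a step size derived from the Quantization Parameter (QP, 0–51) and rounds to integers. The step size doubles for every increase of 6 in QP, so higher QP = more compression, lower quality.
- Inverse quantization + inverse transform on the decoder side to reconstruct the residual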
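The following Python sketch shows the two pieces in isolation. H.264 implements its 4×4 transform as an integer approximation of the DCT, and the quantizer step size doubles every 6 QP steps. The example treats the core transform and the quantizer separately and ignores the per-position scaling factors that the standard folds into quantization, so it illustrates the idea rather than producing bit-exact values; the function names are hypothetical.

```python
# Forward 4×4 core transform matrix (integer approximation of the DCT).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]

# Step sizes for QP 0..5; each further 6 QP steps double the step size.
QSTEP_BASE = (0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def core_transform(block):
    """W = CF · X · CFᵀ, using integer arithmetic only."""
    return matmul(matmul(CF, block), transpose(CF))


def qstep(qp):
    if not 0 <= qp <= 51:
        raise ValueError("QP must be in 0..51")
    return QSTEP_BASE[qp % 6] * (1 << (qp // 6))


def quantize(coeff, qp):
    """Plain scalar quantization: round |coeff| / step, keep the sign."""
    magnitude = int(abs(coeff) / qstep(qp) + 0.5)
    return magnitude if coeff >= 0 else -magnitude


def dequantize(level, qp):
    return level * qstep(qp)


flat = [[3] * 4 for _ in range(4)]
print(core_transform(flat))      # a flat block has only a DC coefficient
print(qstep(22), qstep(28))      # 6 QP steps apart, so the step doubles
print(quantize(100, 28), dequantize(quantize(100, 28), 28))
```

The last line shows where the loss comes from: a coefficient of 100 at QP 28 (step size 16) is stored as level 6 and reconstructed as 96.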
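CRF (Constant Rate Factor): x264 and x265 default to CRF rate control rather than a fixed QP. CRF targets a consistent perceptual quality across the video, automatically adjusting QP per frame based on complexity. In x264, CRF 18 is often treated as visually near-transparent, CRF 23 is the default, and CRF 28 gives noticeably lower quality. The same numbers do not carry over directly to x265, whose default is CRF 28.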
CABAC Entropy Coding
H.264 Main and High Profiles can use CABAC (Context-Adaptive Binary Arithmetic Coding), an entropy coder that adapts its probability estimate for each binary symbol based on context, such as previously coded neighboring values. CABAC typically reduces bitrate by 10–15% vs. CAVLC (the only entropy coder available in Baseline Profile) for the same quality.
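The benefit of adaptive probability modeling can be illustrated without a full arithmetic coder. The Python sketch below estimates the ideal code length of a binary sequence under a simple count-based adaptive model and compares it with a fixed 50/50 model. This is not CABAC's actual state machine or context selection, only an assumption-laden toy that shows why adapting to skewed statistics saves bits.

```python
import math


def fixed_cost_bits(bits):
    """Ideal cost when every symbol is assumed to have probability 0.5."""
    return float(len(bits))


def adaptive_cost_bits(bits):
    """Ideal cost when the probability estimate is updated after each symbol."""
    zeros, ones = 1, 1  # start from a uniform prior (one pseudo-count each)
    total = 0.0
    for b in bits:
        p = (ones if b else zeros) / (zeros + ones)
        total += -math.log2(p)  # ideal arithmetic-coding cost of this symbol
        if b:
            ones += 1
        else:
            zeros += 1
    return total


# A skewed source: 90% zeros, as with mostly-zero quantized coefficients.
skewed = [0] * 90 + [1] * 10
print(fixed_cost_bits(skewed), round(adaptive_cost_bits(skewed), 1))
```

For this skewed input the adaptive model needs roughly half the bits of the fixed model. CABAC goes further by keeping separate adaptive models per context, so that, for example, different syntax elements do not share one estimate.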
H.264 Profiles
| Profile | Features | Typical Use |
|---|---|---|
| Baseline | I- and P-frames only, CAVLC only (no B-frames, no CABAC), no interlaced coding | Old mobile devices, video calls |
| Main | B-frames, CABAC, interlaced coding, weighted prediction | Standard broadcasting |
| High | All Main features + 8×8 transform, 8×8 intra prediction, custom quantization matrices | High-quality video, Blu-ray, web |
| High 10 | High features + up to 10-bit sample depth | Professional production, HDR |
| High 4:4:4 Predictive | 4:4:4 chroma sampling, up to 14-bit sample depth, lossless mode | Professional production |
Most consumer content (YouTube, Netflix 1080p, Blu-ray) uses High Profile.
H.264 Levels
Levels cap decoder workload by limiting frame size, macroblock processing rate (frame size × frame rate), bitrate, and decoded picture buffer size. The bitrates below are the Baseline/Main limits; High Profile allows 1.25× these values:
| Level | Example resolution @ frame rate | Max bitrate (Baseline/Main) | Example |
|---|---|---|---|
| 3.0 | 720×480 @ 30fps | 10 Mbps | SD video |
| 3.1 | 1280×720 @ 30fps | 14 Mbps | 720p HD |
| 4.0 | 1920×1080 @ 30fps | 20 Mbps | 1080p HD |
| 4.1 | 1920×1080 @ 30fps | 50 Mbps | Blu-ray |
| 5.0 | 1920×1080 @ 60fps | 135 Mbps | High-frame-rate HD |
| 5.1 | 4096×2048 @ 30fps | 240 Mbps | 4K |
| 5.2 | 4096×2160 @ 60fps | 240 Mbps | 4K/60fps |
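As a worked example of how level limits apply, the Python sketch below finds the lowest level whose frame-size and macroblock-rate limits accommodate a given resolution and frame rate. The limits are the MaxFS (macroblocks per frame) and MaxMBPS (macroblocks per second) values from the H.264 level table; levels 3.2 and 4.2 are included in addition to those in the table above, levels below 3.0 are left out, and bitrate and buffer limits are ignored. The function name is hypothetical.

```python
import math

# (level, max macroblocks per second, max macroblocks per frame)
LEVEL_LIMITS = [
    ("3.0", 40500, 1620),
    ("3.1", 108000, 3600),
    ("3.2", 216000, 5120),
    ("4.0", 245760, 8192),
    ("4.1", 245760, 8192),
    ("4.2", 522240, 8704),
    ("5.0", 589824, 22080),
    ("5.1", 983040, 36864),
    ("5.2", 2073600, 36864),
]


def minimum_level(width, height, fps):
    """Lowest listed level that fits the frame size and macroblock rate,
    or None if even level 5.2 is exceeded."""
    # Frames are coded in whole 16×16 macroblocks, so round dimensions up.
    mbs_per_frame = math.ceil(width / 16) * math.ceil(height / 16)
    mbs_per_second = mbs_per_frame * fps
    for level, max_mbps, max_fs in LEVEL_LIMITS:
        if mbs_per_frame <= max_fs and mbs_per_second <= max_mbps:
            return level
    return None


for w, h, fps in [(1280, 720, 30), (1920, 1080, 30), (1920, 1080, 60), (3840, 2160, 60)]:
    print(f"{w}x{h} @ {fps}fps -> level {minimum_level(w, h, fps)}")
```

Note that 1080p is coded as 1920×1088 internally (68 macroblock rows), which is why the macroblock count is 8160 rather than 8100.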
H.264 Encoding with x264
x264 is the gold-standard open-source H.264 encoder. Key FFmpeg commands:
```bash
# CRF 23 (default quality): balanced quality/size
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -preset medium output.mp4

# High quality (CRF 18, slower encoding)
ffmpeg -i input.mp4 -c:v libx264 -crf 18 -preset slow output.mp4

# Specific bitrate (2-pass encoding for accurate file size)
ffmpeg -i input.mp4 -c:v libx264 -b:v 2000k -pass 1 -f null /dev/null
ffmpeg -i input.mp4 -c:v libx264 -b:v 2000k -pass 2 output.mp4

# Web-optimized (faststart moves moov atom to front)
ffmpeg -i input.mp4 -c:v libx264 -crf 23 -preset medium -movflags +faststart output.mp4

# 1080p @ 60fps for social media
ffmpeg -i input.mp4 -c:v libx264 -crf 22 -preset slow -vf scale=1920:1080 -r 60 -c:a aac -b:a 192k output.mp4
```
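For the two-pass case, the `-b:v` value is usually derived from a target file size. The Python sketch below shows the arithmetic, assuming 1 MB = 10^6 bytes, a constant audio bitrate, and no container overhead (so the real file will come out slightly larger). The function name and the default audio bitrate are illustrative choices, not FFmpeg settings.

```python
def target_video_kbps(size_mb, duration_s, audio_kbps=128.0):
    """Video bitrate (kbit/s) that fits size_mb megabytes over duration_s seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    total_kbps = size_mb * 8 * 1000 / duration_s  # megabytes -> kilobits per second
    video_kbps = total_kbps - audio_kbps
    if video_kbps <= 0:
        raise ValueError("target size is too small for the audio track alone")
    return video_kbps


# 50 MB target for a 200-second clip with 128 kbit/s audio.
kbps = target_video_kbps(50, 200, audio_kbps=128)
print(f"-b:v {int(kbps)}k")
```

Here the total budget is 2000 kbit/s, of which 128 kbit/s goes to audio, leaving 1872 kbit/s for the video stream.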
x264 Presets
The preset controls encoding speed vs. compression efficiency trade-off:
| Preset | Encoding speed | File size at same quality |
|---|---|---|
| ultrafast | Fastest | Largest |
| superfast | Very fast | Very large |
| veryfast | Fast | Large |
| faster | Moderately fast | Medium-large |
| fast | Slightly faster than medium | Slightly larger than medium |
| medium | Baseline (default) | Baseline (default) |
| slow | Slow | Smaller |
| slower | Very slow | Slightly smaller than slow |
| veryslow | Extremely slow | Smallest |
For most use cases, medium or slow is recommended. At a given CRF, veryslow takes much longer to encode for only a marginal reduction in file size.
H.264 Compatibility
H.264 High Profile is supported by:
- All modern web browsers (Chrome, Firefox, Safari, Edge)
- All smartphones (iOS, Android)
- Smart TVs, gaming consoles (PS4/5, Xbox)
- Blu-ray players
- Streaming services (Netflix, YouTube, Twitch, etc.)
- Video editing software (Premiere, DaVinci Resolve, Final Cut Pro)
This universal support makes H.264 the safest choice for video distribution even in 2024, despite H.265 and AV1 offering better compression.
H.264 Containers
H.264 video is almost always stored in one of these containers:
| Container | Extension | Common use |
|---|---|---|
| MPEG-4 Part 14 | .mp4 / .m4v | Universal — web, streaming, mobile |
| Matroska | .mkv | Open container, PC media players |
| QuickTime | .mov | Apple ecosystem, production |
| Flash Video | .flv | Legacy — Adobe Flash (obsolete) |
| AVI | .avi | Legacy — Windows Media Player |
| MPEG-2 Transport Stream | .ts | Broadcasting, DVB, HLS segments |