# Decoding: The Other Half of a Codec
When you open a JPEG in your photo app, the encoder's work happened long ago, perhaps on a camera, a web server, or your friend's phone. What matters now is the **decoder**: the algorithm that reconstructs viewable pixels from a compact bitstream.
This document explores decoding for both PNG and JPEG, explaining how each encoding step has a corresponding decoding step.
## What Is a Codec?
The word **codec** combines "coder" and "decoder" — the two halves of any compression system:
```text
Original ┌──────────┐ Compressed ┌──────────┐ Reconstructed
Data ────▶│ Encoder │──── Stream ────▶│ Decoder │────▶ Data
└──────────┘ └──────────┘
```
For **lossless** codecs (like PNG), the reconstructed data is identical to the original. For **lossy** codecs (like JPEG), it's an approximation — close enough that humans don't notice the difference, but not bit-for-bit identical.
### Symmetry of Operations
Every encoding operation has a corresponding decoding operation:
| Predict pixel, store difference | Add prediction to difference |
| Build Huffman codes, encode symbols | Rebuild Huffman tree, decode symbols |
| Quantize (divide and round) | Dequantize (multiply) |
| Forward DCT | Inverse DCT |
| Convert RGB → YCbCr | Convert YCbCr → RGB |
The decoder reverses each step, but in opposite order: what was encoded last is decoded first.
## Asymmetric Complexity
An important insight: **decoders are often simpler than encoders**.
Consider JPEG:
- The **encoder** must decide how aggressively to quantize, which Huffman tables to use, whether progressive mode helps, and which filter strategy minimizes output size
- The **decoder** just follows instructions: read the tables from the file, apply them mechanically
```text
Encoding decision: "Should I use Q=75 or Q=80? Let me try both..."
Decoding action: "The file says Q=75. Multiply by these values."
```
This asymmetry is intentional. Encoders run once (when creating the file), but decoders run millions of times (every time someone views the image). A complex encoder that produces smaller files is worth the effort; the decoder should be fast and simple.
### Implications
1. **Decoders must be robust**: They'll encounter files from buggy encoders, partially corrupted data, and edge cases the encoder never tested
2. **Decoders can't optimize**: They must faithfully reproduce what the encoder intended, even if a better reconstruction is theoretically possible
3. **Standards specify decoding**: JPEG and PNG standards precisely define decoder behavior; encoder behavior is more flexible
## PNG Decoding Pipeline
PNG decoding reverses the encoding pipeline:
```text
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ PNG File │───▶│ INFLATE │───▶│ Unfilter │───▶│ Raw Pixels │
│ (chunks) │ │ (decompress)│ │ (add back │ │ (RGB/RGBA) │
└─────────────┘ └─────────────┘ │ predictions)│ └─────────────┘
│ │ └─────────────┘
│ │
├── Parse IHDR ├── Rebuild Huffman trees
├── Find IDATs ├── Decode literals & lengths
└── Check CRCs └── Copy from sliding window
```
### Step 1: Parse Chunks
A PNG file is a sequence of chunks. The decoder reads:
```text
PNG Signature: 89 50 4E 47 0D 0A 1A 0A (validates this is a PNG)
IHDR chunk: width, height, bit depth, color type
PLTE chunk: palette (if indexed color)
IDAT chunk(s): compressed pixel data (may be split across multiple chunks)
IEND chunk: end marker
```
Each chunk includes a CRC-32 checksum. A decoder should verify these to detect corruption.
### Step 2: INFLATE (Decompress DEFLATE)
The IDAT chunks contain zlib-wrapped DEFLATE data. Decompression involves:
1. **Parse zlib header** — verify compression method is DEFLATE
2. **Process DEFLATE blocks** — each block is stored, fixed Huffman, or dynamic Huffman
3. **Decode Huffman symbols** — rebuild trees from transmitted code lengths
4. **Reconstruct data** — emit literal bytes or copy from sliding window
5. **Verify Adler-32 checksum** — ensure decompressed data is correct
#### Huffman Decoding
The encoder transmitted Huffman code lengths, not the codes themselves. The decoder rebuilds the codes using the canonical Huffman construction:
```text
Code lengths received: A=2, B=1, C=3, D=3
Rebuild codes (shorter codes get smaller values):
B: 0 (length 1)
A: 10 (length 2)
C: 110 (length 3)
D: 111 (length 3)
```
When decoding, read bits until you match a code:
```text
Input bits: 1 1 0 1 0 0 ...
─────
C A B
Decoded: C, A, B, ...
```
#### LZ77 Back-References
When the decoder sees a length/distance pair instead of a literal byte, it copies from already-decoded data:
```text
Output so far: "the quick brown "
Decode: (length=3, distance=16) ← "copy 3 bytes from 16 positions back"
Result: "the quick brown the"
^^^
copied from position 0
```
The "sliding window" is just the recent output — typically 32KB for DEFLATE.
```rust,ignore
// From src/decode/inflate.rs
let start = output.len() - distance;
for i in 0..length {
let byte = output[start + (i % distance)];
output.push(byte);
}
```
### Step 3: Unfilter (Reverse Predictions)
After INFLATE, we have filtered scanlines. Each row starts with a filter type byte (0-4), followed by filtered pixel data.
To unfilter, we **add back** the prediction:
| None (0) | X | X |
| Sub (1) | X - A | stored + A |
| Up (2) | X - B | stored + B |
| Average (3) | X - floor((A+B)/2) | stored + floor((A+B)/2) |
| Paeth (4) | X - Paeth(A,B,C) | stored + Paeth(A,B,C) |
Where A = left pixel, B = above pixel, C = upper-left pixel.
**Worked example** (Sub filter):
```text
Stored values: [1, 100, 2, 2, 2, 2, 2, 2]
↑ filter type = Sub
↑ first pixel (no left neighbor, A=0)
Decode:
Pixel 0: 100 + 0 = 100
Pixel 1: 2 + 100 = 102
Pixel 2: 2 + 102 = 104
Pixel 3: 2 + 104 = 106
...
Reconstructed: 100, 102, 104, 106, 108, 110, 112, 114
```
## JPEG Decoding Pipeline
JPEG decoding is more complex due to its lossy, block-based nature:
```text
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ JPEG File │───▶│ Huffman │───▶│ Dequantize │───▶│ Inverse DCT │
│ (markers) │ │ Decode │ │ (multiply) │ │ (IDCT) │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
│ │
│ ▼
├── Parse SOI, APP0 ┌─────────────┐ ┌─────────────┐
├── Read DQT tables │ Raw Pixels │◀───│ YCbCr → │
├── Read DHT tables │ (RGB) │ │ RGB │
└── Parse SOS scan └─────────────┘ └─────────────┘
```
### Step 1: Parse Markers
JPEG uses marker segments starting with 0xFF:
```text
FFD8 SOI (Start of Image)
FFE0 xxxx APP0 (JFIF metadata)
FFDB xxxx DQT (Quantization tables)
FFC0 xxxx SOF0 (Frame header: dimensions, components)
FFC4 xxxx DHT (Huffman tables)
FFDA xxxx SOS (Start of Scan, followed by entropy-coded data)
FFD9 EOI (End of Image)
```
The decoder extracts quantization tables, Huffman tables, and image dimensions before decoding pixel data.
### Step 2: Entropy Decoding
The scan data is Huffman-encoded. The decoder:
1. **Reads DC coefficient** — decoded differentially from previous block's DC
2. **Reads AC coefficients** — (run, value) pairs until end-of-block (EOB)
3. **Handles byte stuffing** — 0xFF 0x00 in data becomes just 0xFF
```text
Bitstream: [Huffman codes...]
DC decoding:
1. Decode category (number of bits for the difference)
2. Read that many bits for the actual difference
3. Add to previous DC value
AC decoding:
1. Decode (run, size) symbol
2. If EOB (0,0): remaining coefficients are zero
3. If ZRL (15,0): skip 16 zeros, continue
4. Otherwise: skip 'run' zeros, read 'size' bits for next coefficient
```
The result is 64 quantized DCT coefficients in zigzag order.
### Step 3: Dequantization
The encoder divided DCT coefficients by quantization values and rounded. The decoder multiplies to recover approximate coefficients:
```text
Quantized: [60, -2, 1, 0, 0, ...]
Quantization table: [16, 11, 10, ...]
Dequantized:
60 × 16 = 960
-2 × 11 = -22
1 × 10 = 10
0 × 16 = 0
...
```
**Note**: Information lost during quantization cannot be recovered. The decoder gets an approximation, not the original DCT coefficients.
### Step 4: Inverse DCT
The Inverse DCT (IDCT) transforms frequency coefficients back to spatial pixels:
```text
DCT coefficients (8×8): Spatial pixels (8×8):
┌─────────────────────┐ ┌─────────────────────┐
│ 960 -22 10 0 ... │ IDCT │ 128 130 132 134 ... │
│ -15 8 0 0 ... │ ───────▶ │ 129 131 133 135 ... │
│ ... │ │ ... │
└─────────────────────┘ └─────────────────────┘
```
The 2D IDCT is separable — apply 1D IDCT to rows, then to columns:
```text
IDCT formula: x[n] = Σ X[k] × cos(π(2n+1)k / 16) × c[k]
k=0..7
where c[0] = 1/√2, c[k] = 1 for k > 0
```
### Step 5: Color Conversion
JPEG typically stores images in YCbCr color space. The decoder converts back to RGB:
```text
Y = brightness (luma)
Cb = blue chrominance
Cr = red chrominance
R = Y + 1.402 × (Cr - 128)
G = Y - 0.344136 × (Cb - 128) - 0.714136 × (Cr - 128)
B = Y + 1.772 × (Cb - 128)
```
If chroma subsampling (4:2:0) was used, the decoder upsamples Cb and Cr to full resolution first.
### Step 6: Block Reassembly
The decoder processes the image in 8×8 blocks. The final step is assembling these blocks into the complete image, handling edge blocks that may extend beyond image boundaries.
## Bit-Level Reading
Both PNG and JPEG require reading individual bits from a byte stream, but with different conventions:
| PNG/DEFLATE | LSB-first | None |
| JPEG | MSB-first | 0xFF 0x00 → 0xFF |
```rust,ignore
// From src/decode/bit_reader.rs
// DEFLATE: LSB-first, new bytes fill upper bits
self.bit_buf |= (self.data[self.pos] as u64) << self.bits_in_buf;
// JPEG: MSB-first, new bytes shift left
## Common Pitfalls
### 1. Ignoring Checksums
Both formats include checksums (CRC-32 for PNG chunks, Adler-32 for zlib). Skipping verification seems faster, but means corrupt data produces garbage instead of errors.
**Recommendation**: Always verify checksums and report meaningful errors.
### 2. Assuming Valid Input
A robust decoder must handle:
- Truncated files (unexpected end-of-stream)
- Invalid Huffman codes (no matching symbol)
- Out-of-bounds back-references (distance > output length)
- Malformed chunk lengths
```rust,ignore
// From src/decode/inflate.rs
if distance > output.len() {
return Err(Error::InvalidDecode("distance too far back".into()));
}
```
### 3. Forgetting Filter State
PNG filtering is row-by-row, but each row depends on the previous row (for Up, Average, Paeth filters). A decoder must maintain the previous row's unfiltered data.
### 4. Mishandling JPEG Restart Markers
JPEG can include restart markers (0xFFD0-0xFFD7) within scan data to enable error recovery and parallel decoding. Decoders must reset DC prediction at each restart.
### 5. Integer Overflow in Dequantization
Dequantized DCT coefficients can exceed the range of a single byte. Use appropriate integer types during IDCT, then clamp to 0-255 at the end.
## Implementation in pixo
The library includes decoding infrastructure in `src/decode/`:
- **`bit_reader.rs`** — `BitReader` for DEFLATE (LSB-first) and `MsbBitReader` for JPEG (MSB-first with byte stuffing)
- **`inflate.rs`** — Complete DEFLATE decompression with fixed and dynamic Huffman support
```rust,ignore
// From src/decode/inflate.rs
pub fn inflate_zlib(data: &[u8]) -> Result<Vec<u8>> {
// Parse zlib header
// Inflate DEFLATE blocks
// Verify Adler-32 checksum
}
```
You can access decoding support through the CLI feature.
## Summary
Decoding completes the codec by reversing every encoding step:
| Filter (subtract predictions) | Unfilter (add predictions) |
| DEFLATE (LZ77 + Huffman) | INFLATE (reverse LZ77 + Huffman) |
| Write chunks with CRC | Parse chunks, verify CRC |
| RGB → YCbCr | YCbCr → RGB |
| Forward DCT | Inverse DCT |
| Quantize (divide) | Dequantize (multiply) |
| Huffman encode | Huffman decode |
Understanding both halves of a codec gives you:
1. **Deeper algorithm understanding** — Seeing the reverse illuminates the forward
2. **Debugging ability** — You can trace exactly where corruption or errors occur
3. **Transferable skills** — These same patterns appear in audio, video, and network protocols
## Next Steps
Return to [PNG Encoding](./png-encoding.md) or [JPEG Encoding](./jpeg-encoding.md) with fresh eyes, or explore [Performance Optimization](./performance-optimization.md) to see how both encoding and decoding can be accelerated.
---
## References
- [RFC 1951 - DEFLATE](https://www.rfc-editor.org/rfc/rfc1951) — Section 3.2.6 covers decompression
- [RFC 2083 - PNG](https://www.w3.org/TR/PNG/) — Chapter 9: Filtering, Chapter 10: Compression
- [ITU-T T.81 - JPEG](https://www.w3.org/Graphics/JPEG/itu-t81.pdf) — Annex F: Sequential DCT-based decoding
- See implementation: `src/decode/bit_reader.rs`, `src/decode/inflate.rs`