zpdf
Pure-Rust PDF parsing and rendering, with interchangeable CPU (tiny-skia) and GPU (wgpu) renderers whose output matches within <1% of pixels.
Features
- Pure Rust — zero C/C++ dependencies, fully safe.
- PDF parsing — header, traditional xref + xref/object streams + hybrid
/XRefStm, trailer chains, lazy xref repair, object model, stream filters (Flate / LZW / ASCII85 / ASCIIHex / RunLength / DCT / CCITT G3-G4 + predictors) with corrupt-stream salvage. - Malformed-input robustness — opens corrupt, headerless, or garbage-tail
files via full-file object-scan recovery (catalog inside an
/ObjStm, page-tree synthesis from/Type /Pagescan, byte-flipped/Typetolerance, lenient header/dict parsing). Never panics or hangs on adversarial input: path/raster/clip budgets plus interpret/render time backstops degrade to a partial render instead. - Encryption — RC4 (40/128-bit) and AES-128 / AES-256 (V5 R5/R6) standard
security handler with crypt filters; opens with the user, owner, or empty
password (
open_with_password, CLI--password). - Content interpretation — graphics state, paths, clipping, text (incl.
render modes and rise), inline & XObject images, Form XObjects (full
resources,
/BBoxclip), axial/radial shadings, shading patterns, all 16 blend modes, dash patterns. - Color — DeviceGray/RGB/CMYK, ICCBased (
/N), Indexed, Lab, Separation/DeviceN via a full PDF function evaluator (types 0/2/3/4). - Fonts — embedded TrueType, Type1, Type1C/CFF, CID/Type0 (Identity-H,
/W,/CIDToGIDMapstreams), Type3, the standard-14 fonts; encodings +/Differences; Quartz-subset recovery;/ToUnicodetext extraction. - Images — 1/2/4/8/16-bpc,
/Decode, soft masks, stencil & color-key masks, Indexed palettes, CMYK JPEG; bilinear sampling with box-filter minification. - Page geometry — CropBox-aware rendering, page-tree attribute
inheritance (
/Rotate,/Resources, boxes), page rotation. - Annotations & forms —
/APappearance streams (/ASstates, Hidden/NoView); an AcroForm field model (acro_form(), CLIforms) that generates text/choice field appearances when the producer left none. - CPU rendering — tiny-skia backend, PNG output at any DPI.
- GPU rendering — wgpu backend (fills, strokes, clips, text, images, blend groups); matches the CPU renderer within <1% pixels.
- Tooling — CLI (
info/render/text/forms/compare/dump/debug-stream), an interactive winit viewer example, and a native GPUI desktop reader (zpdf-viewer-gpui).
Documentation
- docs/user-guide.md — the
zpdfcommand-line tool. - docs/library.md — using zpdf as a Rust library + architecture.
- docs/CHANGELOG.md — release notes.
- ROADMAP.md — development plan.
Install
Command-line tool — cargo install builds the zpdf binary:
Library — add the zpdf facade crate to your project:
Published on crates.io · API docs on docs.rs.
Quick start
Run from a checkout with cargo run (or drop the cargo run -p zpdf-cli --
prefix once the CLI is installed):
# Inspect a document
# Render page 1 at 150 DPI (CPU)
# Render on the GPU (requires the `gpu` feature)
# Extract text, and compare two renders
# Interactive viewer (pan/zoom/page-flip)
# Native desktop reader (GPUI: page list, zoom, fit-width, keyboard nav)
Library usage
use ;
let data = read.map_err?;
let doc = open?;
let page = doc.page?; // 0-based
let mut fonts = doc.load_page_fonts;
let mut images = new;
let content = doc.page_content_bytes?;
let display_list = new // CropBox ∩ MediaBox
.with_page_rotation
.with_fonts
.with_document
.with_images
.interpret;
let mut renderer = new
.with_fonts
.with_images;
let page_img = renderer.render_display_list?; // 150 DPI
page_img.save_png?;
Switch zpdf::cpu::CpuRenderer for zpdf::gpu::WgpuRenderer (with features = ["gpu-render"])
to render on the GPU — everything upstream is identical. See docs/library.md.
Architecture
14-crate workspace with a strict one-direction dependency flow. Render backends
depend only on zpdf-display-list, never on the parser — parsing and rendering stay
fully decoupled.
zpdf-core Shared types: ObjectId, PdfObject, Matrix, Rect, Error, ParseLimits
├─ zpdf-parser Lexer, xref/trailer, object & stream decoding, filters
│ └─ zpdf-document Catalog, page tree, resource inheritance, font loading
│ └─ zpdf-content Content-stream interpreter → DisplayList
├─ zpdf-font Type1 / TrueType / CID / Type3 fonts, CMap, encodings
├─ zpdf-image JPEG / Flate / CCITT / masks / palettes → RGBA
├─ zpdf-color Device / Indexed / Lab color + PDF function evaluator
├─ zpdf-display-list Flat RenderCommand sequence (the backend contract)
├─ zpdf-render RenderBackend trait
│ ├─ zpdf-render-cpu tiny-skia backend
│ └─ zpdf-render-wgpu wgpu backend (+ winit viewer example)
├─ zpdf-cli CLI tool
├─ zpdf-viewer-gpui Native desktop reader (GPUI; depends on the facade)
└─ zpdf Facade crate (re-exports; feature-gates cpu / gpu)
Supported PDF features
| Feature | Status |
|---|---|
| PDF 1.0–2.0 header | ✅ |
Traditional xref + xref/object streams + hybrid /XRefStm |
✅ |
Incremental update (/Prev) chains, lazy xref repair |
✅ |
Corrupt/headerless recovery (object scan, catalog-in-/ObjStm, page-tree synthesis) |
✅ |
| Adversarial-input safety (no panics, no hangs; budget-bounded partial render) | ✅ |
| Flate / LZW / ASCII85 / ASCIIHex / RunLength + predictors | ✅ with corrupt-stream salvage |
| DCTDecode (JPEG, incl. CMYK) / CCITTFaxDecode (G3/G4) | ✅ |
| Encryption: RC4 40/128, AES-128, AES-256 (R5/R6), crypt filters | ✅ user / owner / empty password |
Page tree + attribute inheritance (/Rotate, /Resources, boxes) |
✅ |
| CropBox-aware rendering + page rotation | ✅ |
| Graphics state, paths, painting, clipping, dash patterns | ✅ |
DeviceGray / DeviceRGB / DeviceCMYK / ICCBased (/N) |
✅ |
| Indexed / Lab / Separation / DeviceN (tint transforms) | ✅ |
| PDF functions (sampled / exponential / stitching / PostScript) | ✅ |
Axial & radial shadings (sh + shading patterns) |
✅ |
| Mesh shadings: free-form/lattice Gouraud (type 4/5), Coons/tensor patches (type 6/7) | ✅ |
| Tiling patterns (PatternType 1, colored/uncolored cell replication) | ✅ |
16 blend modes (/BM) |
✅ both backends |
| Text + text state operators, render modes, rise | ✅ (text-as-clip approximated) |
| Type3 / TrueType / Type1 / Type1C / CID-Type0 / standard-14 fonts | ✅ |
Encodings + /Differences, /ToUnicode extraction |
✅ |
/CIDToGIDMap streams, OpenType-wrapped CID CFF |
✅ |
Inline & XObject images: 1–16 bpc, /Decode, SMask, /Mask, palettes |
✅ |
Form XObjects (full resources, /BBox clip, recursion guards) |
✅ |
| CPU rendering (PNG) | ✅ |
| GPU rendering (wgpu) | ✅ |
ExtGState soft masks (/SMask), transparency groups |
✅ (knockout + non-isolated) |
ICC color profiles (real color management, moxcms) |
✅ |
Annotation appearance streams (/AP, /AS, Hidden/NoView) |
✅ |
| Interactive forms (AcroForm): field model + generated text/choice appearances | ✅ |
| Non-embedded font fallback (incl. CJK via system fonts) | ✅ |
| JBIG2 / JPX (JPEG 2000) filters | ✅ |
Optional content groups / layers (/OCG, /OCMD, /VE) |
✅ |
Predefined + embedded CMaps, vertical writing (WMode 1) |
✅ |
Dependencies
All pure Rust:
| Crate | Purpose |
|---|---|
ttf-parser |
TrueType / OpenType / CFF font parsing |
tiny-skia |
CPU 2D rasterization |
flate2 (rust_backend) |
FlateDecode |
zune-jpeg |
JPEG (DCTDecode) |
aes + cbc + sha2 |
AES decryption (RustCrypto) |
image |
PNG I/O |
winnow |
Parsing helpers |
wgpu + lyon + pollster |
GPU rendering (gpu-render feature) |
winit |
Viewer example only (dev-dependency) |
Development
Roadmap
See ROADMAP.md.
- Phase 1 — PDF parsing — done
- Phase 2 — Content interpretation + CPU rendering — done
- Phase 3 — wgpu GPU rendering — done
- Phase 4 — Advanced features — done (encryption incl. AES + user/owner passwords, shadings + mesh shadings, blend modes, spot color, CropBox/rotation, tiling-pattern cells, soft masks & transparency groups, annotation appearance streams, interactive forms (AcroForm), optional content, ICC color management, JBIG2 + JPEG 2000, system-font fallback, composite-font CMaps + vertical writing)
- Robustness — corrupt/adversarial-corpus pass: opens 426/618 of a malformed-PDF corpus (from 166), zero render panics, zero timeouts/hangs
License
MIT