pdf-dump

A CLI tool for inspecting and debugging the internal structure of PDF files.

pdf-dump shows you what's actually inside a PDF — objects, streams, fonts, images, form fields, bookmarks, annotations, tagged structure, and more. Useful for debugging PDF generation, understanding why a PDF looks wrong, or exploring the format.

Installation

cargo install pdf-dump

Requires a Rust toolchain that supports edition 2024.

Quick Start

# Overview: metadata, validation summary, stream stats, feature indicators
pdf-dump file.pdf

# Extract text
pdf-dump file.pdf --text
pdf-dump file.pdf --text --page 3

# Search for text across pages
pdf-dump file.pdf --find-text "invoice"

# Page info: dimensions, resources, fonts, annotations, text preview
pdf-dump file.pdf --page 3

# List fonts or images
pdf-dump file.pdf --fonts
pdf-dump file.pdf --images

# Explain a specific object
pdf-dump file.pdf --inspect 5

# Find all font objects
pdf-dump file.pdf --search Type=Font

# Structural validation
pdf-dump file.pdf --validate

# One-line listing of every object
pdf-dump file.pdf --list

Modes

Document-level modes (combinable)

These can be used together — output gets section headers automatically:

Flag	Description
`--text`	Extract readable text from content streams
`--operators`	Show content stream operators
`--find-text "pattern"`	Case-insensitive text search with context
`--fonts`	List all fonts with encoding and embedding details
`--images`	List all images with dimensions, color space, filters
`--forms`	List AcroForm fields with names, types, values
`--bookmarks`	Show the document outline tree
`--annotations`	Show annotations with link targets
`--tags`	Show tagged PDF structure tree (accessibility)
`--tree`	Show the object graph as an indented reference tree
`--validate`	Structural checks: broken refs, unreachable objects, required keys
`--list`	One-line-per-object table
`--detail <view>`	Detail views: `security`, `embedded`, `labels`, `layers`

# Combine freely
pdf-dump file.pdf --fonts --images --validate

Standalone modes (one at a time)

Flag	Description
`--object N`	Print object(s) by number (`5`, `1,5,12`, `3-7`)
`--inspect N`	Full explanation of an object's role and relationships
`--search <expr>`	Find objects matching criteria (`Type=Font`, `key=MediaBox`, `stream=text`)
`--extract-stream N --output file`	Extract a decoded stream to a file

Modifiers

Flag	Effect
`--page N` or `--page N-M`	Filter to specific pages; shows page info when used alone
`--json`	Structured JSON output (works with every mode)
`--decode`	Decompress stream contents
`--deref`	Inline-expand references (with `--object`)
`--depth N`	Limit traversal depth (with `--tree`, `--tags`, `--json`)
`--hex`	Hex dump for binary streams
`--raw`	Raw undecoded stream bytes (with `--object`)
`--truncate N`	Limit binary output to N bytes
`--dot`	GraphViz DOT output (with `--tree`)

JSON Output

Every mode supports --json for structured output:

pdf-dump file.pdf --json                    # Overview as JSON
pdf-dump file.pdf --fonts --json            # Font list as JSON
pdf-dump file.pdf --fonts --images --json   # Combined modes wrapped in a JSON object
pdf-dump file.pdf --validate --json         # Validation results as JSON

Supported Stream Filters

FlateDecode, ASCII85Decode, ASCIIHexDecode, LZWDecode, RunLengthDecode — applied sequentially for multi-filter pipelines.

Acknowledgments

Built on lopdf, a pure-Rust PDF parsing library.

Related Projects

medpdf — Medium-level PDF API over lopdf (includes medpdf-image for image embedding)
pdf-maker — CLI tool for merging, watermarking, and manipulating PDF files

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

pdf-dump 0.12.6