pdf_oxide 0.3.0

The Complete PDF Toolkit: extract, create, and edit PDFs. Rust core with bindings for Python, Node, WASM, Go, and more.

PDFOxide

The Complete PDF Toolkit for Rust and Beyond

Extract, create, and edit PDFs with one library. Rust core with Python bindings available today and more languages planned.

                         ┌──────────────┐
                         │  Rust Core   │
                         └──────┬───────┘
          ┌──────────┬─────────┼─────────┬──────────┐
          ▼          ▼         ▼         ▼          ▼
      ┌───────┐  ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐
      │Python │  │ Node  │ │ WASM  │ │  Go   │ │  ...  │
      │  ✅   │  │ Soon  │ │ Soon  │ │ Soon  │ │       │
      └───────┘  └───────┘ └───────┘ └───────┘ └───────┘

📖 Documentation | 📝 Changelog | 🤝 Contributing | 🔒 Security

Quick Start

Extract text from PDF

let mut doc = PdfDocument::open("input.pdf")?;
let text = doc.extract_text(0)?;
let markdown = doc.to_markdown(0, Default::default())?;

Create a new PDF

let mut builder = DocumentBuilder::new();
builder.add_page(612.0, 792.0)
    .text("Hello, World!", 72.0, 720.0, 24.0);
builder.save("output.pdf")?;
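
The page size above reads more naturally once the units are spelled out. PDF page dimensions are conventionally given in points, at 72 points per inch, so 612.0 x 792.0 corresponds to US Letter (8.5 x 11 in). The sketch below shows only that unit conversion, assuming `add_page` takes width and height in points; `inches_to_points` and `mm_to_points` are hypothetical helpers for illustration and are not part of pdf_oxide.

```rust
// Hypothetical helpers (not part of pdf_oxide): convert physical sizes to PDF points.
const POINTS_PER_INCH: f64 = 72.0;
const MM_PER_INCH: f64 = 25.4;

/// Converts a length in inches to PDF points (72 points per inch).
fn inches_to_points(inches: f64) -> f64 {
    inches * POINTS_PER_INCH
}

/// Converts a length in millimetres to PDF points.
fn mm_to_points(mm: f64) -> f64 {
    mm / MM_PER_INCH * POINTS_PER_INCH
}

fn main() {
    // US Letter is 8.5 x 11 in, i.e. 612 x 792 pt.
    let (width, height) = (inches_to_points(8.5), inches_to_points(11.0));
    println!("Letter: {width} x {height} pt");

    // A4 is 210 x 297 mm; the point values are not whole numbers.
    println!(
        "A4: {:.1} x {:.1} pt",
        mm_to_points(210.0),
        mm_to_points(297.0)
    );
}
```

Values computed this way could then be passed wherever a size or position in points is expected, after casting to whatever numeric type the builder actually accepts.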

Edit an existing PDF

let mut editor = DocumentEditor::open("input.pdf")?;
// `rect` is the target rectangle for the annotation or field, defined elsewhere.
editor.add_highlight(0, rect, Color::yellow())?;
editor.add_text_field("name", rect)?;
editor.save("output.pdf")?;

Why pdf_oxide?

  • 📄 One library - Extract, create, and edit with a unified API
  • Fast - Rust performance, 53ms average per PDF on a 103-PDF benchmark
  • 🦀 Pure Rust - Memory-safe, no C dependencies
  • 🌍 Multi-language - Rust core with Python bindings available; Node, WASM, and Go bindings coming soon

Features

Extract          Create       Edit
Text & Layout    Documents    Annotations
Images           Tables       Form Fields
Forms            Graphics     Bookmarks
Annotations      Templates    Links
Bookmarks        Images       Content

v0.3.0 Highlights: PDF/A conversion, PDF/X & PDF/UA validation, encryption, digital signatures, barcode generation, Office document conversion. See CHANGELOG.md for details.

Installation

Rust

[dependencies]
pdf_oxide = "0.3"

Python

pip install pdf_oxide

Examples

Rust - Extraction

use pdf_oxide::PdfDocument;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut doc = PdfDocument::open("paper.pdf")?;

    // Extract text
    let text = doc.extract_text(0)?;

    // Convert to Markdown
    let markdown = doc.to_markdown(0, Default::default())?;

    // Extract images
    let images = doc.extract_images(0)?;

    // Get annotations
    let annotations = doc.get_annotations(0)?;

    Ok(())
}

Python

from pdf_oxide import PdfDocument

doc = PdfDocument("paper.pdf")
text = doc.extract_text(0)
markdown = doc.to_markdown(0, detect_headings=True)

For more examples, see the examples/ directory.

Performance

Metric             Result
Average Per PDF    53ms
Success Rate       100%
Quality Score      8.5+/10

Benchmarked on 103 diverse PDFs including forms, financial documents, and technical papers.
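
To get a comparable per-PDF average on your own corpus, a small timing harness is enough. The following is a minimal sketch, not the benchmark used for the numbers above: `time_each` and `average_ms` are hypothetical helper names, the file names are placeholders, and the closure body stands in for whichever pdf_oxide call you want to measure.

```rust
use std::time::{Duration, Instant};

/// Runs `process` once per input and records the wall-clock time of each run.
fn time_each<T, F: FnMut(&T)>(inputs: &[T], mut process: F) -> Vec<Duration> {
    inputs
        .iter()
        .map(|input| {
            let start = Instant::now();
            process(input);
            start.elapsed()
        })
        .collect()
}

/// Mean of the durations in milliseconds, or `None` for an empty slice.
fn average_ms(durations: &[Duration]) -> Option<f64> {
    if durations.is_empty() {
        return None;
    }
    let total: Duration = durations.iter().sum();
    Some(total.as_secs_f64() * 1000.0 / durations.len() as f64)
}

fn main() {
    // Placeholder file names; substitute the paths of your own PDFs.
    let paths = ["a.pdf", "b.pdf", "c.pdf"];

    let timings = time_each(&paths, |path| {
        // Stand-in workload. In a real run this is where the document
        // would be opened and processed.
        let _ = path.len();
    });

    if let Some(avg) = average_ms(&timings) {
        println!("average: {avg:.3} ms over {} files", timings.len());
    }
}
```

Build with `--release` before timing, since debug builds of Rust code are typically much slower and would not be comparable to the figures in the table.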

Building from Source

# Clone and build
git clone https://github.com/yfedoseev/pdf_oxide
cd pdf_oxide
cargo build --release

# Run tests
cargo test

# Build Python bindings
maturin develop

Documentation

# Generate local docs
cargo doc --open

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

# Development setup
cargo build
cargo test
cargo fmt
cargo clippy -- -D warnings

License

Dual-licensed under MIT or Apache-2.0 at your option.

Citation

@software{pdf_oxide,
  title = {PDF Oxide: High-Performance PDF Parsing in Rust},
  author = {Yury Fedoseev},
  year = {2025},
  url = {https://github.com/yfedoseev/pdf_oxide}
}

Built with 🦀 Rust + 🐍 Python | Status: ✅ Production Ready | v0.3.0 | 🚀 53ms per PDF