pdf_oxide 0.3.8

The Complete PDF Toolkit: extract, create, and edit PDFs. Rust core with bindings for Python, Node, WASM, Go, and more.
# PDF Oxide — The Fastest PDF Library for Python and Rust

> The fastest PDF library for Python and Rust.
> Text extraction, image extraction, PDF creation, editing, and markdown conversion.
> Mean 2.1 ms per document. 100% pass rate on 3,830 real-world PDFs.
> MIT / Apache-2.0 license. Current version: 0.3.8.

- Python: `pip install pdf-oxide`
- Rust: `cargo add pdf_oxide`
- Documentation: https://pdf.oxide.fyi
- Full documentation in one file: https://pdf.oxide.fyi/llms-full.txt

## Getting Started

- [Python Quick Start](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/getting-started-python.md): Install with pip, extract text, create PDFs, edit documents
- [Rust Quick Start](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/getting-started-rust.md): Add to Cargo.toml, PdfDocument and Pdf APIs, error handling

## API Reference

- [Rust API (docs.rs)](https://docs.rs/pdf_oxide): Complete Rust API reference with all public types and methods
- [README](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/README.md): Project overview, installation, quick start examples
- [Changelog](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/CHANGELOG.md): Version history and release notes

## Guides

- [PDF Creation Guide](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/PDF_CREATION_GUIDE.md): DocumentBuilder, fonts, images, form fields, annotations
- [Markdown Converter](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/MARKDOWN_CONVERTER_USAGE.md): PDF to Markdown conversion options

## Extraction

- extract_text(page_index): Extract plain text from a page
- extract_spans(page_index): Extract text with font, size, color, position metadata
- extract_chars(page_index): Per-character extraction with bounding boxes
- extract_images(page_index): Extract images from content streams, XObjects, and inline images
- extract_paths(page_index): Extract vector graphics and paths
- to_markdown(page_index) / to_markdown_all(): Convert pages to Markdown
- to_html(page_index) / to_html_all(): Convert pages to HTML
- FormField::extract_fields(): Extract form field values, export FDF/XFDF
- get_annotations(page_index): Get all annotation types
- get_outline(): Get document bookmarks/outline
- TextSearcher: Regex and case-insensitive text search
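
The per-page methods above all take a `page_index`, so whole-document extraction comes down to a loop over pages. Below is a minimal sketch of that pattern, not the library's own API: `extract_all_text`, `TextSource`, and `FakeDocument` are hypothetical names introduced for illustration, zero-based page indexes are an assumption, and the page count is passed in explicitly because this index does not list a page-count method. Only `extract_text(page_index)` comes from the list above.

```python
from __future__ import annotations

from typing import Protocol


class TextSource(Protocol):
    """Any object exposing extract_text(page_index), as listed above."""

    def extract_text(self, page_index: int) -> str: ...


def extract_all_text(doc: TextSource, page_count: int, separator: str = "\n\n") -> str:
    """Hypothetical helper: join the plain text of pages 0..page_count-1.

    Assumes zero-based page indexes; check the Quick Start guides for
    the real convention and for how to obtain the page count.
    """
    pages = [doc.extract_text(i) for i in range(page_count)]
    return separator.join(pages)


class FakeDocument:
    """Stand-in document so the sketch runs without a real PDF."""

    def __init__(self, pages: list[str]) -> None:
        self._pages = pages

    def extract_text(self, page_index: int) -> str:
        return self._pages[page_index]


if __name__ == "__main__":
    demo = FakeDocument(["First page", "Second page"])
    print(extract_all_text(demo, page_count=2))
```

With the real library, the `FakeDocument` instance would be replaced by an opened document object; see the Python Quick Start for how to open one.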

## Creation

- Pdf::from_markdown(text): Create PDF from Markdown
- Pdf::from_html(html): Create PDF from HTML
- Pdf::from_image(path) / from_images(paths): Create from PNG/JPEG/TIFF
- Pdf::from_qrcode(data) / from_barcode(data, type): QR codes and barcodes
- PdfBuilder: Fluent API — title, author, page_size, margins, font_size, add_text, add_image
- DocumentBuilder: Low-level — pages, text positioning, form fields, annotations, tables

## Editing

- DocumentEditor::open(path): Open PDF for editing
- PdfPage DOM: find_text_containing, set_text, replace text
- Page operations: rotate, crop, merge, extract, media_box, crop_box
- Form editing: get/set field values, add/remove fields, flatten
- Annotation editing: modify, flatten, redact
- Image manipulation: reposition, resize, set_bounds
- Encryption: save_encrypted with AES-256, password protection

## Compliance

- PdfAValidator / PdfAConverter: PDF/A validation and conversion
- PdfUaValidator: PDF/UA accessibility checks
- PdfXValidator: PDF/X print production validation

## Optional

- [Architecture](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/ARCHITECTURE.md): Internal architecture and design decisions
- [Development Guide](https://raw.githubusercontent.com/yfedoseev/pdf_oxide/main/docs/DEVELOPMENT_GUIDE.md): Contributing and development setup

## Links

- GitHub: https://github.com/yfedoseev/pdf_oxide
- PyPI: https://pypi.org/project/pdf-oxide/
- crates.io: https://crates.io/crates/pdf_oxide
- docs.rs: https://docs.rs/pdf_oxide
- Documentation: https://pdf.oxide.fyi