pdf_oxide 0.3.38

The fastest Rust PDF library with text extraction: 0.8ms mean, 100% pass rate on 3,830 PDFs. 5× faster than pdf_extract, 17× faster than oxidize_pdf. Extract, create, and edit PDFs.
//! Text processing utilities for the extraction pipeline.
//!
//! This module provides text post-processing operations that improve
//! extraction quality by normalizing whitespace, detecting citations, and
//! enhancing encoding fallback chains.

pub mod citations;
pub mod whitespace;

pub use citations::{Citation, CitationDetector, CitationType};
pub use whitespace::WhitespaceNormalizer;