Crate djvu_rs

Pure-Rust DjVu decoder written from the DjVu v3 public specification.

This crate implements the full DjVu v3 document format in safe Rust, including IFF container parsing, JB2 bilevel decoding, IW44 wavelet decoding, BZZ decompression, text layer extraction, and annotation parsing. All algorithms are written from the public DjVu spec with no GPL code.

§Key public types

DjVuError — top-level error enum (wraps IffError, etc.)
IffError — errors from the IFF container parser
PageInfo — page metadata parsed from the INFO chunk
Rotation — page rotation enum (None, Ccw90, Rot180, Cw90)
DjVuDocument — high-level document model (IFF/BZZ/IW44 based)
DjVuPage — lazy page handle
DjVuBookmark — NAVM bookmark (table of contents)
DocError — error type for the document model
djvu_render::RenderOptions — render parameters
djvu_render::RenderError — render pipeline error type
text::TextLayer — text layer from TXTz/TXTa chunks
text::TextZone — a zone node in the text layer hierarchy
annotation::Annotation — page-level annotation
annotation::MapArea — clickable area with URL and shape
Pixmap — RGBA pixel buffer returned by render methods
Bitmap — 1-bit bitmap for JB2 mask layers
Document — owned DjVu document (high-level std API, requires std feature)
Page — a page within a Document

§Quick start

use djvu_rs::Document;

let doc = Document::open("file.djvu").unwrap();
println!("{} pages", doc.page_count());

let page = doc.page(0).unwrap();
println!("{}x{} @ {} dpi", page.width(), page.height(), page.dpi());

let pixmap = page.render().unwrap();
// pixmap.data: RGBA bytes
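
The comment above says the pixmap exposes RGBA bytes, and the Structs list describes Pixmap as 4 bytes per pixel. As a minimal sketch of reading one pixel from such a buffer, assuming the bytes are tightly packed in row-major order with no row padding (the page does not state this), a helper could look like the following. `rgba_at` is a hypothetical name for illustration and is not part of djvu_rs.

```rust
/// Hypothetical helper, not part of djvu_rs.
/// Returns the RGBA value at (x, y) from a buffer that is ASSUMED to be
/// tightly packed, row-major, 4 bytes per pixel.
fn rgba_at(data: &[u8], width: usize, x: usize, y: usize) -> Option<[u8; 4]> {
    // Reject columns outside the row; rows are bounds-checked via the slice below.
    if x >= width {
        return None;
    }
    // Pixel (x, y) starts at byte offset (y * width + x) * 4.
    let start = y.checked_mul(width)?.checked_add(x)?.checked_mul(4)?;
    let end = start.checked_add(4)?;
    let px = data.get(start..end)?;
    Some([px[0], px[1], px[2], px[3]])
}
```

With a real page this might be called as `rgba_at(&pixmap.data, page.width() as usize, x, y)`, assuming `pixmap.data` is a byte slice or vector and the width is in pixels.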

§IFF parser

use djvu_rs::iff::parse_form;

let data = std::fs::read("file.djvu").unwrap();
let form = parse_form(&data).unwrap();
println!("form type: {:?}", std::str::from_utf8(&form.form_type));
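
For orientation, the outermost layer of a DjVu file is small enough to read by hand: the 4-byte magic `AT&T`, the chunk id `FORM`, a 4-byte big-endian payload length, and a 4-byte form type such as `DJVU` or `DJVM`. The sketch below decodes only that header from a byte slice. `read_form_header` is a hypothetical stand-in written for illustration; it is not how `parse_form` is implemented and it does not walk nested chunks.

```rust
/// Hypothetical helper, not the crate's `parse_form`.
/// Returns (payload length, form type) from the outer FORM header.
fn read_form_header(data: &[u8]) -> Option<(u32, [u8; 4])> {
    // DjVu files place the magic "AT&T" ahead of the IFF data.
    let rest = data.strip_prefix(b"AT&T")?;
    // The outermost chunk must be a FORM.
    let rest = rest.strip_prefix(b"FORM")?;
    // Need 4 length bytes plus 4 form-type bytes.
    if rest.len() < 8 {
        return None;
    }
    // IFF lengths are big-endian.
    let len = u32::from_be_bytes([rest[0], rest[1], rest[2], rest[3]]);
    let form_type = [rest[4], rest[5], rest[6], rest[7]];
    Some((len, form_type))
}
```

Returning `Option` keeps the sketch short; a real parser would report which check failed, which is presumably the role of IffError in this crate.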

Re-exports§

pub use error::DjVuError;
pub use djvu_document::DjVuBookmark;
pub use djvu_document::DjVuDocument;
pub use djvu_document::DjVuPage;
pub use djvu_document::DocError;
pub use text::TextLayer;
pub use text::TextZone;
pub use text::TextZoneKind;

Modules§

annotation: Annotation parser for DjVu ANTz/ANTa chunks — phase 4.
bzz_encode: BZZ compressor — encoding counterpart to bzz_new. Backwards-compatible BZZ compressor module.
bzz_new: BZZ decompressor — clean-room implementation.
djvm: DJVM document merge and split operations.
djvu_async: Async render surface for DjVuPage — phase 5 extension.
djvu_document: New document model — phase 3.
djvu_encode: High-level page encoder — composes the codec primitives into a complete FORM:DJVU page.
djvu_mut: In-place document mutation — byte-preserving rewrite primitive (PR1 of #222).
djvu_render: Rendering pipeline for DjVuPage — phase 5.
epub: DjVu to EPUB 3 exporter.
error: Typed error hierarchy for the new implementation (phase 1).
ffi: C FFI bindings for foreign language integration.
fgbz_encode: FGbz foreground-palette encoder — produces FGbz chunk payloads.
iff: IFF container parser (phase 1, written from spec). Backwards-compatible IFF module.
image_compat: image::ImageDecoder integration — allows DjVu pages to be used as first-class image sources in the image crate ecosystem.
iw44_encode: IW44 wavelet encoder — produces BG44/FG44/TH44 chunk payloads.
iw44_new: IW44 wavelet image decoder — clean-room implementation (phase 2c).
jb2: JB2 bilevel image decoder — clean-room implementation.
jb2_encode: JB2 bilevel image encoder — produces Sjbz chunk payloads.
metadata: Document metadata parser for METa/METz chunks — phase 4 extension.
navm_encode: NAVM bookmark encoder that serializes djvu_document::DjVuBookmark trees to BZZ-compressed binary.
ocr: Pluggable OCR backend trait and error types.
ocr_export: hOCR and ALTO XML export for the text layer.
ocr_neural: Neural OCR backend via Candle (requires ocr-neural feature).
ocr_onnx: ONNX OCR backend via tract (requires ocr-onnx feature).
ocr_tesseract: Tesseract OCR backend (requires ocr-tesseract feature).
pdf: DjVu to PDF converter — phase 6.
segment: Photometric foreground/background segmentation — splits an RGBA page into a bilevel ink mask and a sub-sampled background pixmap.
smmr: Smmr chunk codec — ITU-T G4 (MMR) bilevel image compression.
text: Text layer parser for DjVu TXTz/TXTa chunks — phase 4.
text_encode: TXTa/TXTz text layer encoder that writes text::TextLayer back to DjVu binary format.
tiff_export: DjVu to TIFF exporter — phase 4 format extension.
wasm: WebAssembly bindings for djvu-rs.

Structs§

Bitmap: A 1-bit-per-pixel packed bitmap image.
Document: A parsed DjVu document. Owns the parsed structure.
GrayPixmap: An 8-bit grayscale image, 1 byte per pixel.
Page: A page within a DjVu document.
PageInfo: Metadata from the INFO chunk of a DjVu page.
Pixmap: An RGBA pixel image, 4 bytes per pixel.
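
To make "1-bit-per-pixel packed" concrete, here is a minimal sketch of testing a single pixel in a packed bitmap. It assumes each row is padded to a whole number of bytes and that the most significant bit of each byte is the leftmost pixel; neither detail is stated on this page, so check the Bitmap documentation before relying on it. `bit_at` is a hypothetical helper, not a djvu_rs function.

```rust
/// Hypothetical helper, not part of djvu_rs.
/// ASSUMES byte-aligned rows and MSB-first bit order within each byte.
fn bit_at(bits: &[u8], width: usize, x: usize, y: usize) -> Option<bool> {
    if x >= width {
        return None;
    }
    // Bytes per row, rounding the width up to a multiple of 8 pixels.
    let stride = (width + 7) / 8;
    let index = y.checked_mul(stride)?.checked_add(x / 8)?;
    let byte = *bits.get(index)?;
    // Bit 7 holds pixel 0 of the byte, bit 0 holds pixel 7.
    let mask = 0x80u8 >> (x % 8);
    Some((byte & mask) != 0)
}
```

Packing eight pixels per byte is what makes a mask layer roughly 32 times smaller in memory than the equivalent RGBA pixmap.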

Enums§

BzzError: BZZ compression decoding errors.
Error: Original error type used by the legacy implementation.
IffError: Errors that can occur while parsing the IFF container.
Iw44Error: IW44 wavelet image decoding errors.
Jb2Error: JB2 bitonal image decoding errors.
Rotation: Page rotation encoded in INFO flags bits 0–1.
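
One practical consequence of the rotation value is that a quarter turn swaps the displayed width and height. The sketch below illustrates this with a local enum that mirrors the variant names listed under Key public types; it is a stand-in for illustration, not the crate's Rotation type, and `displayed_size` is a hypothetical helper. Whether `Page::width()` and `Page::height()` already account for rotation is not stated on this page.

```rust
/// Local stand-in that mirrors the documented variant names.
/// This is NOT djvu_rs::Rotation.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Rotation {
    None,
    Ccw90,
    Rot180,
    Cw90,
}

/// Hypothetical helper: size of the page as displayed after rotation.
fn displayed_size(width: u32, height: u32, rotation: Rotation) -> (u32, u32) {
    match rotation {
        // No turn or a half turn keeps the aspect ratio.
        Rotation::None | Rotation::Rot180 => (width, height),
        // A quarter turn in either direction swaps the axes.
        Rotation::Ccw90 | Rotation::Cw90 => (height, width),
    }
}
```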

Type Aliases§

Bookmark