Pure-Rust DjVu decoder written from the DjVu v3 public specification.
This crate implements the full DjVu v3 document format in safe Rust, including IFF container parsing, JB2 bilevel decoding, IW44 wavelet decoding, BZZ decompression, text layer extraction, and annotation parsing. All algorithms are written from the public DjVu spec with no GPL code.
§Key public types
- DjVuError — top-level error enum (wraps IffError, etc.)
- IffError — errors from the IFF container parser
- PageInfo — page metadata parsed from the INFO chunk
- Rotation — page rotation enum (None, Ccw90, Rot180, Cw90)
- DjVuDocument — high-level document model (IFF/BZZ/IW44 based)
- DjVuPage — lazy page handle
- DjVuBookmark — NAVM bookmark (table of contents)
- DocError — error type for the document model
- djvu_render::RenderOptions — render parameters
- djvu_render::RenderError — render pipeline error type
- text::TextLayer — text layer from TXTz/TXTa chunks
- text::TextZone — a zone node in the text layer hierarchy
- annotation::Annotation — page-level annotation
- annotation::MapArea — clickable area with URL and shape
- Pixmap — RGBA pixel buffer returned by render methods
- Bitmap — 1-bit bitmap for JB2 mask layers
- Document — owned DjVu document (high-level std API, requires std feature)
- Page — a page within a Document
§Quick start
use djvu_rs::Document;
let doc = Document::open("file.djvu").unwrap();
println!("{} pages", doc.page_count());
let page = doc.page(0).unwrap();
println!("{}x{} @ {} dpi", page.width(), page.height(), page.dpi());
let pixmap = page.render().unwrap();
// pixmap.data: RGBA bytes
§IFF parser
use djvu_rs::iff::parse_form;
let data = std::fs::read("file.djvu").unwrap();
let form = parse_form(&data).unwrap();
println!("form type: {:?}", std::str::from_utf8(&form.form_type));
Re-exports§
pub use error::DjVuError;
pub use djvu_document::DjVuBookmark;
pub use djvu_document::DjVuDocument;
pub use djvu_document::DjVuPage;
pub use djvu_document::DocError;
pub use text::TextLayer;
pub use text::TextZone;
pub use text::TextZoneKind;
Modules§
- annotation
- Annotation parser for DjVu ANTz/ANTa chunks — phase 4.
- bzz_encode
- BZZ compressor — encoding counterpart to bzz_new. Backwards-compatible BZZ compressor module.
- bzz_new
- BZZ decompressor — clean-room implementation.
- djvm
- DJVM document merge and split operations.
- djvu_async
- Async render surface for DjVuPage — phase 5 extension.
- djvu_document
- New document model — phase 3.
- djvu_encode
- High-level page encoder — composes the codec primitives into a complete FORM:DJVU page.
- djvu_mut
- In-place document mutation — byte-preserving rewrite primitive (PR1 of #222).
- djvu_render
- Rendering pipeline for DjVuPage — phase 5.
- epub
- DjVu to EPUB 3 exporter.
- error
- Typed error hierarchy for the new implementation (phase 1).
- ffi
- C FFI bindings for foreign language integration.
- fgbz_encode
- FGbz foreground-palette encoder — produces FGbz chunk payloads.
- iff
- IFF container parser (phase 1, written from spec). Backwards-compatible IFF module.
- image_compat
- image::ImageDecoder integration — allows DjVu pages to be used as first-class image sources in the image crate ecosystem.
- iw44_encode
- IW44 wavelet encoder — produces BG44/FG44/TH44 chunk payloads.
- iw44_new
- IW44 wavelet image decoder — clean-room implementation (phase 2c).
- jb2
- JB2 bilevel image decoder — clean-room implementation.
- jb2_encode
- JB2 bilevel image encoder — produces Sjbz chunk payloads.
- metadata
- Document metadata parser for METa/METz chunks — phase 4 extension.
- navm_encode
- NAVM bookmark encoder — serializes djvu_document::DjVuBookmark trees to BZZ-compressed binary.
- ocr
- Pluggable OCR backend trait and error types.
- ocr_export
- hOCR and ALTO XML export for the text layer.
- ocr_neural
- Neural OCR backend via Candle (requires ocr-neural feature).
- ocr_onnx
- ONNX OCR backend via tract (requires ocr-onnx feature).
- ocr_tesseract
- Tesseract OCR backend (requires ocr-tesseract feature).
- DjVu to PDF converter — phase 6.
- segment
- Photometric foreground/background segmentation — splits an RGBA page into a bilevel ink mask and a sub-sampled background pixmap.
- smmr
- Smmr chunk codec — ITU-T G4 (MMR) bilevel image compression.
- text
- Text layer parser for DjVu TXTz/TXTa chunks — phase 4.
- text_encode
- TXTa/TXTz text layer encoder — writes text::TextLayer back to DjVu binary format.
- tiff_export
- DjVu to TIFF exporter — phase 4 format extension.
- wasm
- WebAssembly bindings for djvu-rs.
Structs§
- Bitmap
- A 1-bit-per-pixel packed bitmap image.
- Document
- A parsed DjVu document. Owns the parsed structure.
- GrayPixmap
- An 8-bit grayscale image, 1 byte per pixel.
- Page
- A page within a DjVu document.
- PageInfo
- Metadata from the INFO chunk of a DjVu page.
- Pixmap
- An RGBA pixel image, 4 bytes per pixel.
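The three image structs above differ mainly in how many bits each pixel occupies. The sketch below illustrates the buffer sizes and pixel offsets those descriptions imply. All function names are hypothetical and not part of the crate's API. Row-major order, rows padded to a whole byte in the 1-bit case, and MSB-first bit order are assumptions made for illustration; the crate's actual layout may differ.

```rust
// Hypothetical helpers, not part of djvu-rs: they only illustrate the
// "4 bytes per pixel", "1 byte per pixel" and "1 bit per pixel" layouts.

/// Bytes needed for an RGBA image (4 bytes per pixel).
fn rgba_len(width: usize, height: usize) -> usize {
    width * height * 4
}

/// Byte offset of pixel (x, y) in an RGBA buffer, assuming row-major order.
fn rgba_offset(width: usize, x: usize, y: usize) -> usize {
    (y * width + x) * 4
}

/// Bytes needed for an 8-bit grayscale image (1 byte per pixel).
fn gray_len(width: usize, height: usize) -> usize {
    width * height
}

/// Bytes per row of a packed 1-bit image, assuming each row is padded
/// up to a whole number of bytes.
fn bitmap_stride(width: usize) -> usize {
    (width + 7) / 8
}

/// Reads pixel (x, y) from a packed 1-bit buffer, assuming the most
/// significant bit of each byte holds the leftmost pixel.
fn bitmap_get(data: &[u8], width: usize, x: usize, y: usize) -> bool {
    let byte = data[y * bitmap_stride(width) + x / 8];
    (byte >> (7 - (x % 8))) & 1 == 1
}
```

Under these assumptions, a 10x2 image takes 80 bytes as RGBA, 20 bytes as grayscale, and 4 bytes as a packed bitmap (2 bytes per row).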
Enums§
- BzzError
- BZZ compression decoding errors.
- Error
- Original error type used by the legacy implementation.
- IffError
- Errors that can occur while parsing the IFF container.
- Iw44Error
- IW44 wavelet image decoding errors.
- Jb2Error
- JB2 bitonal image decoding errors.
- Rotation
- Page rotation encoded in INFO flags bits 0–1.
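To make the container-level failures behind IffError more concrete, here is a minimal sketch of validating the fixed prefix of a DjVu file: the "AT&T" magic, the "FORM" chunk ID, a big-endian 32-bit length, and a 4-byte form type such as DJVU or DJVM. The HeaderError enum and read_form_header function are hypothetical names for this illustration only. This is not the crate's parser, and its real error variants may differ.

```rust
/// Hypothetical error type for this sketch; the crate's IffError may differ.
#[derive(Debug, PartialEq)]
enum HeaderError {
    /// Fewer than 16 bytes were supplied.
    TooShort,
    /// The file does not start with the "AT&T" magic.
    BadMagic,
    /// The first chunk is not a "FORM" chunk.
    NotForm,
}

/// Reads the 16-byte prefix of a DjVu file and returns the FORM payload
/// length together with the 4-byte form type.
fn read_form_header(data: &[u8]) -> Result<(u32, [u8; 4]), HeaderError> {
    if data.len() < 16 {
        return Err(HeaderError::TooShort);
    }
    if &data[0..4] != b"AT&T" {
        return Err(HeaderError::BadMagic);
    }
    if &data[4..8] != b"FORM" {
        return Err(HeaderError::NotForm);
    }
    // IFF chunk lengths are stored big-endian.
    let len = u32::from_be_bytes([data[8], data[9], data[10], data[11]]);
    let form_type = [data[12], data[13], data[14], data[15]];
    Ok((len, form_type))
}
```

A caller could pass the bytes returned by std::fs::read and match on the result, rather than unwrapping, to report each failure case separately.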