djvu-rs 0.2.1

Pure-Rust DjVu decoder written from the DjVu v3 public specification
Documentation

djvu-rs

Crates.io docs.rs CI License: MIT

Pure-Rust DjVu decoder. MIT licensed. Written from the DjVu v3 public specification.

Features

  • IFF container parser — zero-copy, borrowing slices from input
  • JB2 bilevel image decoder — adaptive arithmetic coding (ZP coder) with symbol dictionary
  • IW44 wavelet image decoder — planar YCbCr storage, multiple refinement chunks
  • BZZ decompressor — ZP arithmetic coding + MTF + BWT (DIRM, NAVM, ANTz chunks)
  • Text layer extraction — TXTz/TXTa chunk parsing with zone hierarchy (page/column/region/paragraph/line/word/character)
  • Annotation parsing — ANTz/ANTa chunk parsing (hyperlinks, map areas, background color)
  • Bookmarks — NAVM table-of-contents parsing
  • Multi-page documents — DJVM bundle format with DIRM directory chunk
  • Page rendering — composite foreground + background into RGBA output
  • Progressive rendering — incremental BG44 wavelet refinement
  • Thumbnails — TH44 embedded thumbnail extraction
  • no_std compatible — IFF/BZZ/JB2/IW44/ZP modules work with alloc only (no std)

Quick start

use djvu_rs::Document;

let doc = Document::open("file.djvu")?;
println!("{} pages", doc.page_count());

let page = doc.page(0)?;
println!("{}x{} @ {} dpi", page.width(), page.height(), page.dpi());

let pixmap = page.render()?;
// pixmap.data is RGBA bytes (4 bytes per pixel, row-major)
// pixmap.width, pixmap.height are dimensions in pixels

Text extraction

use djvu_rs::Document;

let doc = Document::open("scanned.djvu")?;
let page = doc.page(0)?;
if let Some(text) = page.text()? {
    println!("Page text: {}", text);
}

Low-level IFF access

use djvu_rs::iff::parse_form;

let data = std::fs::read("file.djvu")?;
let form = parse_form(&data)?;
println!("FORM type: {:?}", std::str::from_utf8(&form.form_type));
for chunk in &form.chunks {
    println!("  chunk {:?} ({} bytes)", std::str::from_utf8(&chunk.id), chunk.data.len());
}

Feature flags

Flag Default Description
std enabled Enables Document, Page, file I/O, and rendering. Disable for no_std

Without std, the crate provides IFF parsing, BZZ decompression, JB2/IW44 decoding, text/annotation parsing — all codec primitives that work on byte slices.

Minimum supported Rust version (MSRV)

Rust 1.88 (edition 2024 features: let-chains stabilized in 1.88)

Roadmap

See GitHub milestones for the full roadmap and progress tracking.

Next up: v0.2 — Structural PDF export — proper DjVu → PDF with selectable text, lossless IW44/JB2 embedding, and preserved bookmarks/hyperlinks.

License

MIT. See LICENSE.

Specification

Written from the public DjVu v3 specification:

No code derived from GPL-licensed DjVuLibre or any other GPL source. All algorithms are independent implementations from the spec.