tesseract5-rs 0.1.4

High-level Rust OCR library built on top of tesseract-rs (Tesseract 5.5 + Leptonica 1.85). Provides ergonomic access to word-level bounding boxes, hierarchy output, and OCR options.

tesseract5-rs

High-level Rust OCR library built on top of tesseract-rs (the dariofinardi fork of cafercangundogdu/tesseract-rs). It provides an ergonomic Ocr5Engine API with automatic tessdata path resolution and a word-level bounding-box hierarchy, and it re-exports the full tesseract-rs surface so callers need only one dependency.

License: MIT

On versioning: this crate starts at v0.1.0 because it is the first public release, not because it is incomplete or experimental. The OCR pipeline, word-level hierarchy, automatic tessdata resolution, and all re-exported tesseract-rs bindings are production-ready and work correctly for their intended purpose. The 0.x prefix simply follows Rust convention for a first crates.io publication.


Why this crate exists

The dependency chain

tesseract5-rs          ← you are here (high-level API)
    └── tesseract-rs   ← dariofinardi fork (semplifica branch)
            └── Tesseract 5.5.0 + Leptonica 1.85.0  (compiled from source at build time)

Using tesseract-rs directly is perfectly valid. tesseract5-rs adds a thin ergonomic layer on top:

| | tesseract-rs | tesseract5-rs |
|---|---|---|
| FFI bindings | yes | via re-export |
| Tessdata path resolution | manual | automatic |
| OcrOptions / OcrOutput structs | no | yes |
| Ocr5Engine wrapper | no | yes |
| Hierarchy in one call | manual | with_hierarchy: true |
| crates.io publication target | no (git-only) | yes |

Why the semplifica branch of the fork

The upstream cafercangundogdu/tesseract-rs crate (v0.1.20 on crates.io) targets Tesseract 5.3.x and does not expose word-level positional output. The fork (semplifica branch) ships the following changes on top of upstream:

| Commit | Change |
|---|---|
| ab37bf8 | Tesseract 5.5.0 + Leptonica 1.85.0: build script bumped; newer model support and C API fixes |
| e51751b | ARM64 / Snapdragon X Elite support: correct library names and linker flags for aarch64-pc-windows-msvc |
| bbb0b1d | Per-arch build cache: %APPDATA%/tesseract-rs/<arch>/… prevents host/cross conflicts |
| 0819eb8 | dynamic-libs feature: builds Tesseract + Leptonica as DLL/.so for desktop app bundling (Tauri, etc.) |
| c981751 | TesseractHierarchy + get_hierarchy(): walks the ResultIterator and returns a nested Rust struct, TesseractHierarchy → [Block → [Paragraph → [TextLine → [Word(text, bbox, confidence)]]]] with BoundingBox, serializable via serde |
| da080d2 | UB fix in process_pages(): TessBaseAPIProcessPages returns BOOL (c_int), not char *; the old code cast the integer 1 to a string pointer, causing undefined behaviour |

None of these changes are available in the upstream crate. Until they are merged upstream (or a compatible crates.io release is published), tesseract5-rs pins to this branch to provide a stable, tested surface.
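The last row (da080d2) describes a bug that is easy to introduce at an FFI boundary. The snippet below is not the fork's code; it is a minimal sketch, using a hypothetical helper named `check_bool`, of how a C `BOOL` return can be handled as an integer status instead of being reinterpreted as a pointer.

```rust
use std::os::raw::c_int;

/// Hypothetical helper (not part of tesseract-rs): interpret a C `BOOL`
/// return value, where zero means failure and any non-zero value means success.
fn check_bool(ret: c_int, what: &str) -> Result<(), String> {
    if ret != 0 {
        Ok(())
    } else {
        Err(format!("{what} reported failure"))
    }
}
```

A wrapper around a `BOOL`-returning C function would pass the raw return value through a check like this and never build a string from it.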


Installation

[dependencies]
tesseract5-rs = { git = "https://github.com/dariofinardi/Tesseract5-rs" }

Note: the first build compiles Tesseract 5.5.0 and Leptonica 1.85.0 from source (~2–4 min). Subsequent builds reuse the compiled libraries cached under %APPDATA%/tesseract-rs/<arch>/ (Windows), ~/.tesseract-rs/<arch>/ (Linux), or ~/Library/Application Support/tesseract-rs/<arch>/ (macOS).

Optional features

| Feature | Description |
|---|---|
| dynamic-libs | Build Tesseract + Leptonica as shared libraries (.dll/.so) instead of static libs. Useful when bundling native binaries in a desktop app. |

Quick start

use tesseract5_rs::{Ocr5Engine, OcrOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Engine initializes Tesseract and resolves tessdata automatically.
    let engine = Ocr5Engine::new(OcrOptions {
        lang: "eng".into(),
        ..Default::default()
    })?;

    // Decode the image into a tightly packed RGB buffer (uses the `image` crate).
    let img = image::open("document.png")?.to_rgb8();
    let (w, h) = img.dimensions();

    // Arguments: pixels, width, height, bytes per pixel (3 for RGB), bytes per line.
    let output = engine.recognize(img.as_raw(), w as i32, h as i32, 3, 3 * w as i32)?;
    println!("{}", output.text);

    Ok(())
}
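The last two arguments to recognize describe the buffer layout. For a tightly packed buffer (no padding at the end of each row), bytes per line is simply width times bytes per pixel. The sketch below is a hypothetical helper, not part of the crate: `PackedLayout` and `packed_layout` are illustrative names, and it assumes the buffer has no row padding.

```rust
use std::convert::TryFrom;

/// Hypothetical helper type: layout of a tightly packed pixel buffer (no row padding).
#[derive(Debug, PartialEq)]
struct PackedLayout {
    bytes_per_pixel: i32,
    bytes_per_line: i32,
}

/// Derives the layout for `width` x `height` pixels with `channels` bytes per pixel,
/// and rejects buffers whose length does not match that layout.
fn packed_layout(
    buf_len: usize,
    width: u32,
    height: u32,
    channels: u32,
) -> Result<PackedLayout, String> {
    let line = width
        .checked_mul(channels)
        .ok_or("row size overflows u32")?;
    let expected = (line as usize)
        .checked_mul(height as usize)
        .ok_or("buffer size overflows usize")?;
    if buf_len != expected {
        return Err(format!("expected {expected} bytes, got {buf_len}"));
    }
    Ok(PackedLayout {
        bytes_per_pixel: i32::try_from(channels).map_err(|e| e.to_string())?,
        bytes_per_line: i32::try_from(line).map_err(|e| e.to_string())?,
    })
}
```

For the RGB buffer in the quick start this yields 3 bytes per pixel and 3 × width bytes per line, the same values passed by hand above. An 8-bit grayscale buffer would use one channel.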

With word-level bounding boxes

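use tesseract5_rs::{Ocr5Engine, OcrOptions};

// `bytes` is the raw pixel buffer; the remaining arguments describe its layout.
fn print_words(
    bytes: &[u8],
    width: i32,
    height: i32,
    bytes_per_pixel: i32,
    bytes_per_line: i32,
) -> Result<(), Box<dyn std::error::Error>> {
    let engine = Ocr5Engine::new(OcrOptions {
        lang: "eng".into(),
        with_hierarchy: true, // ask for OcrOutput::hierarchy to be populated
        ..Default::default()
    })?;

    let output = engine.recognize(bytes, width, height, bytes_per_pixel, bytes_per_line)?;

    // Walk blocks → paragraphs → lines → words and print each word.
    if let Some(hier) = output.hierarchy {
        for block in &hier.blocks {
            for para in &block.paragraphs {
                for line in &para.lines {
                    for word in &line.words {
                        println!(
                            "{:?}  conf={:.0}%  bbox={:?}",
                            word.text, word.confidence, word.bbox
                        );
                    }
                }
            }
        }
    }
    Ok(())
}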
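The same traversal can be written as an iterator chain, which is handy for filtering out low-confidence words. The sketch below uses stand-in types that mirror only the field names seen in the loop above (blocks, paragraphs, lines, words, text, confidence). The real types come from the crate, and the field types here (such as `f32` for confidence) are assumptions. `confident_words` is a hypothetical helper.

```rust
// Stand-in types: they copy the field names used in the loop above,
// but are NOT the crate's real definitions. Field types are assumptions.
struct Word {
    text: String,
    confidence: f32,
}
struct Line {
    words: Vec<Word>,
}
struct Paragraph {
    lines: Vec<Line>,
}
struct Block {
    paragraphs: Vec<Paragraph>,
}
struct Hierarchy {
    blocks: Vec<Block>,
}

/// Collects the text of every word whose confidence is at least `min_conf`,
/// in the order blocks → paragraphs → lines → words.
fn confident_words(hier: &Hierarchy, min_conf: f32) -> Vec<&str> {
    hier.blocks
        .iter()
        .flat_map(|b| &b.paragraphs)
        .flat_map(|p| &p.lines)
        .flat_map(|l| &l.words)
        .filter(|w| w.confidence >= min_conf)
        .map(|w| w.text.as_str())
        .collect()
}
```

The threshold is on the same percentage scale that the loop prints, so a call such as `confident_words(&hier, 60.0)` keeps words reported at 60% or above.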

Custom tessdata path

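use tesseract5_rs::{Ocr5Engine, OcrOptions};
use std::path::PathBuf;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // "ita+eng" requests both the Italian and the English models.
    let engine = Ocr5Engine::new(OcrOptions {
        lang: "ita+eng".into(),
        tessdata_dir: Some(PathBuf::from("/opt/tessdata")),
        ..Default::default()
    })?;

    Ok(())
}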

The TESSDATA_PREFIX environment variable is also honoured if tessdata_dir is not set.
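When pointing the engine at a custom directory, it can help to confirm up front that every requested model is present. Tesseract looks for one `<code>.traineddata` file per language code, with codes joined by `+`. The sketch below is a hypothetical pre-flight check (`missing_languages` is not part of the crate) that only inspects the file system.

```rust
use std::path::Path;

/// Hypothetical pre-flight check: returns the language codes in `lang`
/// (for example "ita+eng") that have no `<code>.traineddata` file
/// directly inside `tessdata_dir`.
fn missing_languages(tessdata_dir: &Path, lang: &str) -> Vec<String> {
    lang.split('+')
        .filter(|code| !code.is_empty())
        .filter(|code| !tessdata_dir.join(format!("{code}.traineddata")).is_file())
        .map(str::to_owned)
        .collect()
}
```

An empty result means all models were found. Otherwise the returned codes make for a clearer error message than a failed engine initialization.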


API overview

Ocr5Engine

pub struct Ocr5Engine { /* … */ }

impl Ocr5Engine {
    /// Create and initialize. Resolves tessdata, sets PSM if provided.
    pub fn new(opts: OcrOptions) -> Result<Self>;

    /// Run OCR on raw image bytes. Returns text + optional hierarchy.
    pub fn recognize(
        &self,
        image: &[u8],
        width: i32, height: i32,
        bytes_per_pixel: i32, bytes_per_line: i32,
    ) -> Result<OcrOutput>;

    /// Access the underlying `TesseractAPI` for advanced use.
    pub fn inner(&self) -> &TesseractAPI;
}

OcrOptions

pub struct OcrOptions {
    pub lang: String,              // e.g. "eng", "ita+eng"
    pub psm: Option<u8>,           // page segmentation mode (0–13)
    pub tessdata_dir: Option<PathBuf>,
    pub with_hierarchy: bool,      // populate OcrOutput::hierarchy
}
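The comment on psm gives the valid range as 0–13. Whether the crate itself rejects out-of-range values is not stated here, so a caller that accepts the mode from user input may want a small check of its own. A minimal sketch, with `validate_psm` and `MAX_PSM` as hypothetical names:

```rust
/// Highest page segmentation mode in the documented range 0–13.
const MAX_PSM: u8 = 13;

/// Hypothetical pre-check: passes through `None` (no mode set) and any
/// mode in 0..=13, and rejects anything larger.
fn validate_psm(psm: Option<u8>) -> Result<Option<u8>, String> {
    match psm {
        Some(mode) if mode > MAX_PSM => {
            Err(format!("PSM {mode} is outside 0..={MAX_PSM}"))
        }
        other => Ok(other),
    }
}
```

The validated value can then be assigned to the psm field when building OcrOptions.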

OcrOutput

pub struct OcrOutput {
    pub text: String,
    pub hierarchy: Option<TesseractHierarchy>,
}

Re-exported types

All public types from tesseract-rs are re-exported: TesseractAPI, TesseractHierarchy, TesseractBlock, TesseractParagraph, TesseractTextLine, TesseractWord, BoundingBox, TesseractError, Result, TessPageSegMode, TessPageIteratorLevel, and the remaining enums and iterators.

Tessdata path resolution

The public function default_tessdata_dir() resolves the tessdata directory in this order:

  1. TESSDATA_PREFIX environment variable
  2. Build-cache path written by tesseract-rs's build script:
    • Windows: %APPDATA%\tesseract-rs\<arch>\static\tessdata
    • Linux: ~/.tesseract-rs/<arch>/static/tessdata
    • macOS: ~/Library/Application Support/tesseract-rs/<arch>/static/tessdata
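The lookup order above can be sketched as a pure function. This is not the crate's implementation: `resolve_tessdata` and `Os` are hypothetical names, and the environment values are passed in as arguments so the logic can be tested without touching the real environment. `base` stands for %APPDATA% on Windows or the home directory elsewhere, and `arch` stands for the `<arch>` path component, whose exact value is not specified here. Whether the real function also checks that the directory exists, or ignores an empty TESSDATA_PREFIX, is not covered by this sketch.

```rust
use std::path::PathBuf;

/// Platforms distinguished by the list above.
enum Os {
    Windows,
    Linux,
    MacOs,
}

/// Sketch of the documented order: TESSDATA_PREFIX first, then the
/// per-platform build-cache path `<root>/<arch>/static/tessdata`.
fn resolve_tessdata(tessdata_prefix: Option<&str>, os: Os, base: &str, arch: &str) -> PathBuf {
    // 1. An explicit TESSDATA_PREFIX wins.
    if let Some(prefix) = tessdata_prefix {
        return PathBuf::from(prefix);
    }
    // 2. Otherwise fall back to the build cache for the platform.
    let root = match os {
        Os::Windows => PathBuf::from(base).join("tesseract-rs"),
        Os::Linux => PathBuf::from(base).join(".tesseract-rs"),
        Os::MacOs => PathBuf::from(base)
            .join("Library")
            .join("Application Support")
            .join("tesseract-rs"),
    };
    root.join(arch).join("static").join("tessdata")
}
```

In real code the first argument would typically come from `std::env::var("TESSDATA_PREFIX").ok()`.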

System requirements

  • Rust 1.83.0+
  • C++ compiler (MSVC on Windows, GCC/Clang on Linux/macOS)
  • CMake ≥ 3.20
  • Internet connection on first build (downloads Tesseract + Leptonica source archives)

Credits

  • Tesseract OCR — Apache 2.0, Google Inc.
  • cafercangundogdu/tesseract-rs — original Rust FFI bindings, MIT, Cafer Can Gündoğdu
  • dariofinardi/tesseract-rs (semplifica branch) — fork with Tesseract 5.5, ARM64, TesseractHierarchy, dynamic-libs, UB fixes — Dario Finardi