leptess 0.4.1

Rust binding for Tesseract and Leptonica.

Leptess

High-level Rust binding for Tesseract and Leptonica.

Low-level C API bindings are auto-generated using bindgen.

Build dependencies

Make sure you have Leptonica and Tesseract installed.

For Ubuntu users:

sudo apt-get install libleptonica-dev libtesseract-dev

You will also need to install the Tesseract language data for each language you want to recognize, for example English:

sudo apt-get install tesseract-ocr-eng
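
Before initializing the OCR engine, it can help to confirm that the language data is actually where you expect it. The following is a minimal sketch, not part of leptess: the helper names `traineddata_path` and `has_language_data` are hypothetical, and the tessdata directory is left as a parameter because its location depends on your distribution and Tesseract version. It only relies on the `<lang>.traineddata` file naming seen in `tests/tessdata/eng.traineddata`.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the conventional `<lang>.traineddata`
/// file path inside the given tessdata directory.
fn traineddata_path(tessdata_dir: &Path, lang: &str) -> PathBuf {
    tessdata_dir.join(format!("{}.traineddata", lang))
}

/// Hypothetical helper: returns true if the language data file exists
/// as a regular file in the given tessdata directory.
fn has_language_data(tessdata_dir: &Path, lang: &str) -> bool {
    traineddata_path(tessdata_dir, lang).is_file()
}
```

For example, `has_language_data(Path::new("tests/tessdata"), "eng")` checks for the file used by this crate's tests when run from the repository root.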

Usage

Minimal example:

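use leptess::{leptonica, tesseract};
use std::path::Path;

// Initialize Tesseract with the default data path and the English language data.
let mut api = tesseract::TessApi::new(None, "eng");
// Load the image to recognize with Leptonica.
let mut pix = leptonica::pix_read(Path::new("path/page.bmp")).unwrap();
api.set_image(&pix);

// Run OCR and print the recognized text.
println!("{}", api.get_utf8_text().unwrap());

// Release the underlying C resources explicitly.
api.destroy();
pix.destroy();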

For more examples, see the examples directory.

Development

Regenerate the capi bindings:

make gen

To run the tests, you will need Tesseract 4.x to match the language data in tests/tessdata/eng.traineddata. See the CircleCI config for how to replicate the setup.
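
To check quickly whether the installed Tesseract is a 4.x release before running the tests, something like the following sketch may be useful. It is not part of leptess: the function names are hypothetical, and it assumes the first line of `tesseract --version` looks like `tesseract 4.1.1` (optionally with a `v` before the number), which may not hold for every build.

```rust
use std::process::Command;

/// Hypothetical helper: extracts the major version from the first line of
/// `tesseract --version` output. Assumes a first line such as
/// "tesseract 4.1.1"; returns None if the text does not match that shape.
fn parse_major_version(output: &str) -> Option<u32> {
    let first_line = output.lines().next()?;
    // The second whitespace-separated token is expected to be the version.
    let version = first_line.split_whitespace().nth(1)?;
    // Tolerate a leading 'v', which some builds may print (an assumption).
    let version = version.trim_start_matches('v');
    version.split('.').next()?.parse().ok()
}

/// Hypothetical helper: runs `tesseract --version` and returns the major
/// version, or None if the binary is missing or the output is unexpected.
fn installed_major_version() -> Option<u32> {
    let out = Command::new("tesseract").arg("--version").output().ok()?;
    // Check both streams, since which one carries the version text may
    // differ between releases (an assumption).
    let stdout = String::from_utf8_lossy(&out.stdout);
    let stderr = String::from_utf8_lossy(&out.stderr);
    parse_major_version(&stdout).or_else(|| parse_major_version(&stderr))
}
```

Comparing `installed_major_version()` against `Some(4)` then tells you whether the local setup is likely to match the bundled test data.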