Crate tesseract_ocr_static_c

§tesseract-ocr-static-c

This crate bundles the Tesseract OCR and Leptonica libraries. The two libraries are built together with musl libc and LLVM libcxx and linked statically. The build should be reproducible because the versions of all libraries are pinned. Because there are no further dependencies, images must be supplied to Tesseract in raw RGB, RGBA, or grayscale format.
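Since images have to arrive as raw pixel data, the caller is responsible for laying the bytes out the way Leptonica expects. The sketch below shows one way to pack tightly laid out 8-bit RGB bytes into 32-bit words with red in the most significant byte, which is how Leptonica documents its 32 bpp pixels. The function name `pack_rgb_words` is hypothetical and not part of this crate. It is also an assumption, not verified against this crate, that the resulting words can be copied into the buffer returned by pixGetData for an image created with pixCreate(width, height, 32), where each row is exactly `width` words long.

```rust
/// Packs tightly laid out 8-bit RGB bytes into one 32-bit word per pixel,
/// with red in the most significant byte and the lowest (alpha) byte set to 255.
/// Hypothetical helper for illustration; it is not part of this crate.
fn pack_rgb_words(rgb: &[u8], width: usize, height: usize) -> Option<Vec<u32>> {
    // Reject buffers whose length does not match width * height * 3.
    let expected = width.checked_mul(height)?.checked_mul(3)?;
    if rgb.len() != expected {
        return None;
    }
    let words = rgb
        .chunks_exact(3)
        .map(|p| {
            (u32::from(p[0]) << 24) | (u32::from(p[1]) << 16) | (u32::from(p[2]) << 8) | 0xFF
        })
        .collect();
    Some(words)
}

fn main() {
    // A 2x1 image: one red pixel followed by one blue pixel.
    let rgb = [255, 0, 0, 0, 0, 255];
    let words = pack_rgb_words(&rgb, 2, 1).expect("buffer size matches dimensions");
    for word in &words {
        println!("{word:#010x}");
    }
}
```

Returning `None` on a size mismatch keeps the length check in one place, so the caller can validate the buffer before any unsafe FFI call touches it.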

The build should work with both dynamically and statically linked C libraries, i.e. *-gnu and *-musl targets.

Required CLI tools: cmake, make, git, python3, curl, tar, zstd.

Required compiler: Clang 20+.

§Environment variables

The following environment variables affect the build process.

| Variable | Default value | Comment |
|---|---|---|
| PATH | | Executable search path |
| TESSERACT_CC | clang | C compiler |
| TESSERACT_CXX | clang++ | C++ compiler |
| TESSERACT_AR | llvm-ar | |
| TESSERACT_RANLIB | llvm-ranlib | |
| TESSERACT_CFLAGS | -O3 | C compiler flags |
| TESSERACT_CXXFLAGS | -O3 | C++ compiler flags |
| TESSERACT_LDFLAGS | | Linker flags |
| TESSERACT_BUILD_FROM_SOURCE | | If set, Tesseract OCR is built from source. Otherwise the build tries to download a pre-built binary and falls back to building from source if the download fails. |
| TESSERACT_PRE_BUILT_ARCHIVE_URL | | Overrides the URL from which the pre-built binary is downloaded. Normally you need a different URL for each Rust target. |
| TESSERACT_PRE_BUILT_ARCHIVE_HASH | | BLAKE2b hash of the pre-built binary archive. Must be set if you’ve overridden the hard-coded archive URLs. Can be computed with the b2sum CLI tool. |
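
As an illustration of how these variables might be passed to a build, the sketch below prepares a `cargo build` invocation with a few of them overridden. The helper name `cargo_build_command` and the chosen values (`1`, `-O2`) are assumptions made for the example. The text above only says that TESSERACT_BUILD_FROM_SOURCE has to be set, not which value it expects.

```rust
use std::process::Command;

/// Prepares (but does not run) a `cargo build` invocation with
/// build-related overrides. Hypothetical helper; the values are illustrative.
fn cargo_build_command() -> Command {
    let mut cmd = Command::new("cargo");
    cmd.arg("build")
        // Skip the pre-built download and always compile Tesseract locally.
        .env("TESSERACT_BUILD_FROM_SOURCE", "1")
        .env("TESSERACT_CC", "clang")
        .env("TESSERACT_CXX", "clang++")
        // Replace the default -O3 with a lower optimization level.
        .env("TESSERACT_CFLAGS", "-O2")
        .env("TESSERACT_CXXFLAGS", "-O2");
    cmd
}

fn main() {
    let cmd = cargo_build_command();
    // List the overrides that the child process would receive.
    for (key, value) in cmd.get_envs() {
        if let Some(value) = value {
            println!("{}={}", key.to_string_lossy(), value.to_string_lossy());
        }
    }
    // Calling `status()` on a mutable command would start the actual build.
}
```

The same variables can be exported in a shell before running cargo. Building the command in Rust is mainly useful for wrapper tools or CI helpers.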

§High-level interface

For an ergonomic Rust interface, use the tesseract-ocr-static crate.

Structs§

ETEXT_DESC
TessBaseAPI
TessChoiceIterator
TessPageIterator
TessResultIterator

Constants§

TessOcrEngineMode_OEM_DEFAULT
TessOcrEngineMode_OEM_LSTM_ONLY
TessOcrEngineMode_OEM_TESSERACT_LSTM_COMBINED
TessOcrEngineMode_OEM_TESSERACT_ONLY
TessOrientation_ORIENTATION_PAGE_DOWN
TessOrientation_ORIENTATION_PAGE_LEFT
TessOrientation_ORIENTATION_PAGE_RIGHT
TessOrientation_ORIENTATION_PAGE_UP
TessPageIteratorLevel_RIL_BLOCK
TessPageIteratorLevel_RIL_PARA
TessPageIteratorLevel_RIL_SYMBOL
TessPageIteratorLevel_RIL_TEXTLINE
TessPageIteratorLevel_RIL_WORD
TessPageSegMode_PSM_AUTO
TessPageSegMode_PSM_AUTO_ONLY
TessPageSegMode_PSM_AUTO_OSD
TessPageSegMode_PSM_CIRCLE_WORD
TessPageSegMode_PSM_OSD_ONLY
TessPageSegMode_PSM_RAW_LINE
TessPageSegMode_PSM_SINGLE_BLOCK
TessPageSegMode_PSM_SINGLE_BLOCK_VERT_TEXT
TessPageSegMode_PSM_SINGLE_CHAR
TessPageSegMode_PSM_SINGLE_COLUMN
TessPageSegMode_PSM_SINGLE_LINE
TessPageSegMode_PSM_SINGLE_WORD
TessPageSegMode_PSM_SPARSE_TEXT
TessPageSegMode_PSM_SPARSE_TEXT_OSD
TessParagraphJustification_JUSTIFICATION_CENTER
TessParagraphJustification_JUSTIFICATION_LEFT
TessParagraphJustification_JUSTIFICATION_RIGHT
TessParagraphJustification_JUSTIFICATION_UNKNOWN
TessPolyBlockType_PT_CAPTION_TEXT
TessPolyBlockType_PT_EQUATION
TessPolyBlockType_PT_FLOWING_IMAGE
TessPolyBlockType_PT_FLOWING_TEXT
TessPolyBlockType_PT_HEADING_IMAGE
TessPolyBlockType_PT_HEADING_TEXT
TessPolyBlockType_PT_HORZ_LINE
TessPolyBlockType_PT_INLINE_EQUATION
TessPolyBlockType_PT_NOISE
TessPolyBlockType_PT_PULLOUT_IMAGE
TessPolyBlockType_PT_PULLOUT_TEXT
TessPolyBlockType_PT_TABLE
TessPolyBlockType_PT_UNKNOWN
TessPolyBlockType_PT_VERTICAL_TEXT
TessPolyBlockType_PT_VERT_LINE
TessTextlineOrder_TEXTLINE_ORDER_LEFT_TO_RIGHT
TessTextlineOrder_TEXTLINE_ORDER_RIGHT_TO_LEFT
TessTextlineOrder_TEXTLINE_ORDER_TOP_TO_BOTTOM
TessWritingDirection_WRITING_DIRECTION_LEFT_TO_RIGHT
TessWritingDirection_WRITING_DIRECTION_RIGHT_TO_LEFT
TessWritingDirection_WRITING_DIRECTION_TOP_TO_BOTTOM

Functions§

TessBaseAPIAnalyseLayout
TessBaseAPIClear
TessBaseAPIClearAdaptiveClassifier
TessBaseAPIClearPersistentCache
TessBaseAPICreate
TessBaseAPIDelete
TessBaseAPIEnd
TessBaseAPIGetAltoText
TessBaseAPIGetBoolVariable
TessBaseAPIGetBoxText
TessBaseAPIGetDatapath
TessBaseAPIGetDoubleVariable
TessBaseAPIGetGradient
TessBaseAPIGetHOCRText
TessBaseAPIGetIntVariable
TessBaseAPIGetIterator
TessBaseAPIGetLSTMBoxText
TessBaseAPIGetPAGEText
TessBaseAPIGetPageSegMode
TessBaseAPIGetStringVariable
TessBaseAPIGetTextDirection
TessBaseAPIGetThresholdedImage
TessBaseAPIGetThresholdedImageScaleFactor
TessBaseAPIGetTsvText
TessBaseAPIGetUNLVText
TessBaseAPIGetUTF8Text
TessBaseAPIGetWordStrBoxText
TessBaseAPIInit2
TessBaseAPIInitForAnalysePage
TessBaseAPIIsValidWord
TessBaseAPIOem
TessBaseAPIPrintVariablesToFile
TessBaseAPIRecognize
TessBaseAPISetDebugVariable
TessBaseAPISetImage2
TessBaseAPISetMinOrientationMargin
TessBaseAPISetPageSegMode
TessBaseAPISetRectangle
TessBaseAPISetSourceResolution
TessBaseAPISetVariable
TessChoiceIteratorConfidence
TessChoiceIteratorDelete
TessChoiceIteratorGetUTF8Text
TessChoiceIteratorNext
TessDeleteText
TessMonitorCreate
TessMonitorDelete
TessMonitorGetProgress
TessMonitorSetCancelFunc
TessMonitorSetCancelThis
TessMonitorSetDeadlineMSecs
TessMonitorSetProgressFunc
TessPageIteratorBaseline
TessPageIteratorBegin
TessPageIteratorBlockType
TessPageIteratorBoundingBox
TessPageIteratorCopy
TessPageIteratorDelete
TessPageIteratorGetBinaryImage
TessPageIteratorGetImage
TessPageIteratorIsAtBeginningOf
TessPageIteratorIsAtFinalElement
TessPageIteratorNext
TessPageIteratorOrientation
TessPageIteratorParagraphInfo
TessResultIteratorConfidence
TessResultIteratorCopy
TessResultIteratorDelete
TessResultIteratorGetChoiceIterator
TessResultIteratorGetPageIterator
TessResultIteratorGetUTF8Text
TessResultIteratorNext
TessResultIteratorSymbolIsDropcap
TessResultIteratorSymbolIsSubscript
TessResultIteratorSymbolIsSuperscript
TessResultIteratorWordFontAttributes
TessResultIteratorWordIsFromDictionary
TessResultIteratorWordIsNumeric
TessResultIteratorWordRecognitionLanguage
TessVersion
getLeptonicaVersion
pixClone
pixCreate
pixDestroy
pixGetData
pixGetDimensions
pixGetHeight
pixGetWidth
pixGetWpl

Type Aliases§

PIX
TessProgressFunc