§OAR OCR
A Rust OCR library that extracts text from document images using ONNX models. Supports text detection, recognition, document orientation, and rectification.
§Features
- Complete OCR pipeline from image to text
- High-level builder APIs for easy pipeline configuration
- Model adapter system for easy model swapping
- Batch processing support
- ONNX Runtime integration for fast inference
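As a rough illustration of the batch processing feature, the sketch below splits a list of inputs into fixed-size groups before they would be handed to a model. It assumes that batching simply means consecutive chunks of at most `batch_size` items, which this page does not confirm; `Region` and `into_batches` are hypothetical names for illustration and are not part of the crate.

```rust
/// Hypothetical stand-in for a cropped text region awaiting recognition.
#[derive(Debug, Clone, PartialEq)]
struct Region {
    id: usize,
}

/// Split `items` into consecutive groups of at most `batch_size` elements.
/// Panics if `batch_size` is zero (inherited from `slice::chunks`).
fn into_batches<T: Clone>(items: &[T], batch_size: usize) -> Vec<Vec<T>> {
    items.chunks(batch_size).map(|chunk| chunk.to_vec()).collect()
}

fn main() {
    // 70 regions with a batch size of 32 give groups of 32, 32, and 6.
    let regions: Vec<Region> = (0..70).map(|id| Region { id }).collect();
    let batches = into_batches(&regions, 32);
    for (i, batch) in batches.iter().enumerate() {
        println!(
            "batch {i}: {} regions, first id {}",
            batch.len(),
            batch[0].id
        );
    }
}
```

A larger batch size generally trades memory for fewer inference calls, so the right value depends on the model and the hardware.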
§Components
- Text Detection: Find text regions in images
- Text Recognition: Convert text regions to readable text
- Layout Detection: Identify document structure elements (text blocks, titles, tables, figures)
- Document Orientation: Detect document rotation (0°, 90°, 180°, 270°)
- Document Rectification: Fix perspective distortion
- Text Line Classification: Detect text line orientation
- Seal Text Detection: Detect text in circular seals
- Formula Recognition: Recognize mathematical formulas
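To make the document orientation component more concrete, here is a minimal sketch of what undoing one of the four listed rotations could look like on a tiny grayscale image. The `Rotation`, `Gray`, and `correct` names are hypothetical and not taken from the crate. The sketch also assumes the detected angle says how far the page was turned clockwise from upright; the crate may define the angle differently.

```rust
/// The four rotations listed for document orientation detection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rotation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

/// A tiny row-major grayscale image (hypothetical type, not the crate's).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Gray {
    width: usize,
    height: usize,
    pixels: Vec<u8>,
}

impl Gray {
    fn get(&self, x: usize, y: usize) -> u8 {
        self.pixels[y * self.width + x]
    }

    /// Rotate the image clockwise by 90 degrees.
    fn rotate_cw(&self) -> Gray {
        // Width and height swap in the output.
        let (w, h) = (self.height, self.width);
        let mut pixels = Vec::with_capacity(w * h);
        for y in 0..h {
            for x in 0..w {
                // Output (x, y) is read from input (y, old_height - 1 - x).
                pixels.push(self.get(y, self.height - 1 - x));
            }
        }
        Gray { width: w, height: h, pixels }
    }
}

/// Undo a detected rotation by applying the remaining quarter turns.
/// Assumes `detected` is the clockwise turn away from upright.
fn correct(img: &Gray, detected: Rotation) -> Gray {
    let quarter_turns = match detected {
        Rotation::Deg0 => 0,
        Rotation::Deg90 => 3,
        Rotation::Deg180 => 2,
        Rotation::Deg270 => 1,
    };
    let mut out = img.clone();
    for _ in 0..quarter_turns {
        out = out.rotate_cw();
    }
    out
}

fn main() {
    let upright = Gray { width: 3, height: 2, pixels: vec![1, 2, 3, 4, 5, 6] };
    let turned = upright.rotate_cw(); // simulate a page scanned sideways
    let fixed = correct(&turned, Rotation::Deg90);
    println!("restored: {}", fixed == upright);
}
```

Restricting the output to four classes keeps the correction lossless, since each case is an exact quarter-turn permutation of the pixels with no interpolation.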
§Modules
- core - Core traits, error handling, and batch processing
- domain - Domain types like orientation helpers and prediction models
- models - Model adapters for different OCR tasks
- oarocr - High-level OCR pipeline builders
- processors - Image processing utilities
- utils - Utility functions for images and tensors
- predictors - Task-specific predictor interfaces
§Quick Start
§OCR Pipeline
use oar_ocr::oarocr::{OAROCRBuilder, OAROCR};
use oar_ocr::utils::load_image;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create OCR pipeline with required components
    let ocr = OAROCRBuilder::new(
        "models/text_detection.onnx",
        "models/text_recognition.onnx",
        "models/character_dict.txt",
    )
    // Optional orientation classifiers
    .with_document_image_orientation_classification("models/doc_orient.onnx")
    .with_text_line_orientation_classification("models/line_orient.onnx")
    // Batch sizes for whole images and for detected text regions
    .image_batch_size(4)
    .region_batch_size(32)
    .build()?;

    // Process images
    let image = load_image(Path::new("document.jpg"))?;
    let results = ocr.predict(vec![image])?;

    // Print the text of every region that was recognized
    for result in results {
        for region in result.text_regions {
            if let Some(text) = region.text {
                println!("Text: {}", text);
            }
        }
    }
    Ok(())
}
§Document Structure Analysis
use oar_ocr::oarocr::{OARStructureBuilder, OARStructure};
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create structure analysis pipeline
    let structure = OARStructureBuilder::new("models/layout_detection.onnx")
        .with_table_classification("models/table_classification.onnx")
        .with_table_cell_detection("models/table_cell_detection.onnx", "wired")
        .with_table_structure_recognition("models/table_structure.onnx", "wired")
        .table_structure_dict_path("models/table_structure_dict.txt")
        .with_formula_recognition(
            "models/formula_recognition.onnx",
            "models/tokenizer.json",
            "pp_formulanet",
        )
        .build()?;

    // Analyze document structure
    let result = structure.predict("document.jpg")?;
    println!("Layout elements: {}", result.layout_elements.len());
    println!("Tables: {}", result.tables.len());
    println!("Formulas: {}", result.formulas.len());
    Ok(())
}
Modules§
- core - The core module of the OCR pipeline.
- domain - Domain-level structures shared across the OCR pipeline.
- models - Model implementations.
- oarocr - The OCR pipeline module.
- predictors - Predictors module.
- prelude - Prelude module for convenient imports.
- processors - Image processing utilities for OCR systems.
- utils - Utility functions for the OCR pipeline.
- vl - Vision-Language modules (optional).
Macros§
- apply_ort_config - Macro to conditionally apply OrtSessionConfig to any builder that has with_ort_config.
- common_builder_methods - Macro to inject common builder methods into an existing impl Builder block. Use this inside impl YourBuilder { ... } and pass the field name that holds ModelInferenceConfig (e.g., common).
- impl_common_builder_methods - Macro to implement common builder methods for structs with a ModelInferenceConfig field.
- impl_complete_builder - Comprehensive builder macro for generating common builder method patterns.
- impl_config_new_and_with_common - Macro to implement new() and with_common() for config structs with per-module defaults.
- impl_config_validator - Macro to implement ConfigValidator with basic validation patterns.
- impl_dyn_task_output - Generates the DynTaskOutput enum and its methods from the task registry.
- impl_task_adapter - Generates the TaskAdapter enum and its DynModelAdapter implementation.
- impl_task_type_enum - Generates the TaskType enum from the task registry.
- metrics - Macro to create pre-populated StageMetrics with common patterns.
- validate_field - Helper macro for field validation.
- with_nested - Macro to handle optional nested config initialization in builders.
- with_task_registry - Central task registry macro that defines all tasks in a single location.
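
Several of the macros above inject shared builder methods into an impl block, given the name of the field that holds the common configuration. The following is a minimal sketch of that general pattern using `macro_rules!`. It is not the crate's implementation: `common_setters!`, `CommonConfig`, and `DetectorBuilder` are hypothetical names, with `CommonConfig` standing in for a type such as ModelInferenceConfig.

```rust
/// Hypothetical stand-in for the shared configuration type.
#[derive(Debug, Default, Clone, PartialEq)]
struct CommonConfig {
    model_name: Option<String>,
    batch_size: Option<usize>,
}

/// Injects shared setters into an existing `impl` block.
/// `$field` names the struct field that holds the `CommonConfig`.
macro_rules! common_setters {
    ($field:ident) => {
        pub fn model_name(mut self, name: impl Into<String>) -> Self {
            self.$field.model_name = Some(name.into());
            self
        }

        pub fn batch_size(mut self, size: usize) -> Self {
            self.$field.batch_size = Some(size);
            self
        }
    };
}

/// Hypothetical builder with one task-specific option of its own.
#[derive(Debug, Default)]
struct DetectorBuilder {
    common: CommonConfig,
    threshold: f32,
}

impl DetectorBuilder {
    // Expands to the two setters above, wired to `self.common`.
    common_setters!(common);

    pub fn threshold(mut self, value: f32) -> Self {
        self.threshold = value;
        self
    }
}

fn main() {
    let builder = DetectorBuilder::default()
        .model_name("detector")
        .batch_size(8)
        .threshold(0.5);
    println!("{builder:?}");
}
```

Passing the field name as a macro argument lets builders with differently named configuration fields reuse the same setters without a shared trait.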