Crate oar_ocr


§OAR OCR

A Rust OCR library that extracts text from document images using ONNX models. It supports text detection, text recognition, document orientation classification, and document rectification.

§Features

  • Complete OCR pipeline from image to text
  • High-level builder APIs for easy pipeline configuration
  • Model adapter system for easy model swapping
  • Batch processing support
  • ONNX Runtime integration for fast inference

§Components

  • Text Detection: Find text regions in images
  • Text Recognition: Convert text regions to readable text
  • Layout Detection: Identify document structure elements (text blocks, titles, tables, figures)
  • Document Orientation: Detect document rotation (0°, 90°, 180°, 270°)
  • Document Rectification: Fix perspective distortion
  • Text Line Classification: Detect text line orientation
  • Seal Text Detection: Detect text in circular seals
  • Formula Recognition: Recognize mathematical formulas
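The Document Orientation component reports one of four rotations (0°, 90°, 180°, 270°). As a rough sketch of how a caller might turn a classifier output into a correction angle, the snippet below uses a hypothetical Rotation enum. It is not the crate's own type, and both the class ordering and the correction direction are assumptions made for illustration.

```rust
/// Hypothetical rotation type for illustration; not part of oar_ocr.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Rotation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

impl Rotation {
    /// Maps a classifier output index to a rotation.
    /// Assumption: classes are ordered 0°, 90°, 180°, 270°.
    fn from_class_index(index: usize) -> Option<Self> {
        match index {
            0 => Some(Rotation::Deg0),
            1 => Some(Rotation::Deg90),
            2 => Some(Rotation::Deg180),
            3 => Some(Rotation::Deg270),
            _ => None,
        }
    }

    /// Detected rotation in degrees.
    fn degrees(self) -> u32 {
        match self {
            Rotation::Deg0 => 0,
            Rotation::Deg90 => 90,
            Rotation::Deg180 => 180,
            Rotation::Deg270 => 270,
        }
    }

    /// Additional rotation, in the same direction, that completes a full turn
    /// and so brings the page back to 0°.
    fn correction_degrees(self) -> u32 {
        (360 - self.degrees()) % 360
    }
}
```

For example, Rotation::from_class_index(1) yields Some(Rotation::Deg90), whose correction_degrees() is 270. An out-of-range index yields None rather than panicking.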

§Modules

  • core - Core traits, error handling, and batch processing
  • domain - Domain types like orientation helpers and prediction models
  • models - Model adapters for different OCR tasks
  • oarocr - High-level OCR pipeline builders
  • processors - Image processing utilities
  • utils - Utility functions for images and tensors
  • predictors - Task-specific predictor interfaces
  • prelude - Convenient imports
  • vl - Vision-Language modules (optional)

§Quick Start

§OCR Pipeline

use oar_ocr::oarocr::OAROCRBuilder;
use oar_ocr::utils::load_image;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the OCR pipeline with the required components:
    // detection model, recognition model, and character dictionary
    let ocr = OAROCRBuilder::new(
        "models/text_detection.onnx",
        "models/text_recognition.onnx",
        "models/character_dict.txt"
    )
    // Optional orientation classifiers
    .with_document_image_orientation_classification("models/doc_orient.onnx")
    .with_text_line_orientation_classification("models/line_orient.onnx")
    // Batch sizes for whole images and for detected text regions
    .image_batch_size(4)
    .region_batch_size(32)
    .build()?;

    // Process images
    let image = load_image(Path::new("document.jpg"))?;
    let results = ocr.predict(vec![image])?;

    // Print the recognized text of every region that has any
    for result in results {
        for region in result.text_regions {
            if let Some(text) = region.text {
                println!("Text: {}", text);
            }
        }
    }

    Ok(())
}
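The image_batch_size and region_batch_size settings above control how many inputs are grouped per inference call. The following is a minimal, std-only sketch of the general batching idea, not the crate's implementation: process_in_batches and its infer callback are hypothetical names introduced here for illustration.

```rust
/// Runs `infer` over `items` in consecutive batches of at most `batch_size`
/// elements and concatenates the outputs in input order.
/// Hypothetical helper for illustration; not part of oar_ocr.
fn process_in_batches<T, R>(
    items: &[T],
    batch_size: usize,
    mut infer: impl FnMut(&[T]) -> Vec<R>,
) -> Vec<R> {
    // `chunks` panics on a zero size, so fail early with a clearer message.
    assert!(batch_size > 0, "batch_size must be greater than zero");

    let mut outputs = Vec::with_capacity(items.len());
    for batch in items.chunks(batch_size) {
        // The last batch may be shorter than `batch_size`.
        outputs.extend(infer(batch));
    }
    outputs
}
```

With 70 inputs and a batch size of 32, the callback is invoked three times, on batches of 32, 32, and 6 items. Larger batches generally trade memory for fewer inference calls, which is presumably why the two sizes are configurable separately.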

§Document Structure Analysis

use oar_ocr::oarocr::OARStructureBuilder;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create the structure analysis pipeline; layout detection is required,
    // table and formula components are added on top
    let structure = OARStructureBuilder::new("models/layout_detection.onnx")
        .with_table_classification("models/table_classification.onnx")
        .with_table_cell_detection("models/table_cell_detection.onnx", "wired")
        .with_table_structure_recognition("models/table_structure.onnx", "wired")
        .table_structure_dict_path("models/table_structure_dict.txt")
        .with_formula_recognition(
            "models/formula_recognition.onnx",
            "models/tokenizer.json",
            "pp_formulanet"
        )
        .build()?;

    // Analyze document structure
    let result = structure.predict("document.jpg")?;

    println!("Layout elements: {}", result.layout_elements.len());
    println!("Tables: {}", result.tables.len());
    println!("Formulas: {}", result.formulas.len());

    Ok(())
}

Modules§

  • core - The core module of the OCR pipeline.
  • domain - Domain-level structures shared across the OCR pipeline.
  • models - Model implementations.
  • oarocr - The OCR pipeline module.
  • predictors - Task-specific predictor interfaces.
  • prelude - Prelude module for convenient imports.
  • processors - Image processing utilities for OCR systems.
  • utils - Utility functions for the OCR pipeline.
  • vl - Vision-Language modules (optional).

Macros§

  • apply_ort_config - Macro to conditionally apply OrtSessionConfig to any builder that has with_ort_config.
  • common_builder_methods - Macro to inject common builder methods into an existing impl Builder block. Use this inside impl YourBuilder { ... } and pass the field name that holds ModelInferenceConfig (e.g., common).
  • impl_common_builder_methods - Macro to implement common builder methods for structs with a ModelInferenceConfig field.
  • impl_complete_builder - Comprehensive builder macro for generating common builder method patterns.
  • impl_config_new_and_with_common - Macro to implement new() and with_common() for config structs with per-module defaults.
  • impl_config_validator - Macro to implement ConfigValidator with basic validation patterns.
  • impl_dyn_task_output - Generates the DynTaskOutput enum and its methods from the task registry.
  • impl_task_adapter - Generates the TaskAdapter enum and its DynModelAdapter implementation.
  • impl_task_type_enum - Generates the TaskType enum from the task registry.
  • metrics - Macro to create pre-populated StageMetrics with common patterns.
  • validate_field - Helper macro for field validation.
  • with_nested - Macro to handle optional nested config initialization in builders.
  • with_task_registry - Central task registry macro that defines all tasks in a single location.
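The descriptions of with_task_registry and impl_task_type_enum suggest a callback-macro pattern, in which one macro holds the task list and passes it to whichever generator macro it is given. The sketch below shows that general pattern only. The macro names, the DemoTaskType enum, and the task entries are hypothetical, and the crate's actual macros may take different arguments and generate different code.

```rust
// Hypothetical registry: the single place where the task list is written.
// It forwards the list to the callback macro named by the caller.
macro_rules! with_demo_registry {
    ($callback:ident) => {
        $callback! {
            TextDetection => "text_detection",
            TextRecognition => "text_recognition",
            LayoutDetection => "layout_detection",
        }
    };
}

// Hypothetical generator: builds an enum plus helpers from the task list.
macro_rules! impl_demo_task_type {
    ($($variant:ident => $name:literal),* $(,)?) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum DemoTaskType {
            $($variant),*
        }

        impl DemoTaskType {
            /// Every task, in registry order.
            pub const ALL: &'static [DemoTaskType] = &[$(DemoTaskType::$variant),*];

            /// Stable string name of the task.
            pub fn name(self) -> &'static str {
                match self {
                    $(DemoTaskType::$variant => $name),*
                }
            }
        }
    };
}

// Expands to the enum and impl above, driven entirely by the registry.
with_demo_registry!(impl_demo_task_type);
```

Adding a task then means touching only the registry macro. Any other generator invoked through it (for example one producing an adapter or output enum) picks up the new entry without further edits.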