Crate oar_ocr


§OAR OCR

A Rust OCR library that extracts text from document images using ONNX models. It supports text detection, text recognition, document orientation classification, and document rectification.

§Features

  • Complete OCR pipeline from image to text
  • High-level builder APIs for pipeline configuration
  • Model adapter system for swapping models
  • Batch processing support
  • ONNX Runtime integration for fast inference
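
To illustrate what batch processing means in practice, here is a minimal sketch that groups inputs into fixed-size batches, in the spirit of the `image_batch_size` and `region_batch_size` builder options shown in the Quick Start. The helper `into_batches` is hypothetical and uses only the standard library; how the crate batches inputs internally is not described on this page.

```rust
/// Hypothetical helper (not part of oar_ocr): splits `items` into
/// consecutive batches of at most `batch_size` elements.
pub fn into_batches<T: Clone>(items: &[T], batch_size: usize) -> Vec<Vec<T>> {
    // `slice::chunks` panics on a chunk size of 0, so clamp it to 1.
    items
        .chunks(batch_size.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With ten inputs and a batch size of 4, this yields two full batches and one final batch of two items.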

§Components

  • Text Detection: Find text regions in images
  • Text Recognition: Convert text regions to readable text
  • Layout Detection: Identify document structure elements (text blocks, titles, tables, figures)
  • Document Orientation: Detect document rotation (0°, 90°, 180°, 270°)
  • Document Rectification: Fix perspective distortion
  • Text Line Classification: Detect text line orientation
  • Seal Text Detection: Detect text in circular seals
  • Formula Recognition: Recognize mathematical formulas
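
As an illustration of the document orientation component, the sketch below maps classifier scores to one of the four rotations listed above and computes the rotation needed to bring the page upright. The `Rotation` type, `argmax`, and the class order (0°, 90°, 180°, 270°) are assumptions made for this example, not the crate's own types or guaranteed model output order.

```rust
/// Hypothetical rotation type for illustration; not the crate's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rotation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

impl Rotation {
    /// Maps a class index to a rotation, assuming the class order
    /// 0°, 90°, 180°, 270° (an assumption of this sketch).
    pub fn from_class_index(index: usize) -> Option<Self> {
        match index {
            0 => Some(Rotation::Deg0),
            1 => Some(Rotation::Deg90),
            2 => Some(Rotation::Deg180),
            3 => Some(Rotation::Deg270),
            _ => None,
        }
    }

    /// The detected rotation in degrees.
    pub fn degrees(self) -> u32 {
        match self {
            Rotation::Deg0 => 0,
            Rotation::Deg90 => 90,
            Rotation::Deg180 => 180,
            Rotation::Deg270 => 270,
        }
    }

    /// Additional rotation (same direction) that completes a full turn
    /// and so returns the image to upright.
    pub fn correction_degrees(self) -> u32 {
        (360 - self.degrees()) % 360
    }
}

/// Returns the index of the highest score, or `None` for an empty slice.
pub fn argmax(scores: &[f32]) -> Option<usize> {
    scores
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(index, _)| index)
}
```

A caller would take the `argmax` of the class scores, convert it with `Rotation::from_class_index`, and rotate the image by `correction_degrees()` before running detection.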

§Modules

  • core - Core traits, error handling, and batch processing
  • domain - Domain types like orientation helpers and prediction models
  • models - Model adapters for different OCR tasks
  • oarocr - High-level OCR pipeline builders
  • processors - Image processing utilities
  • utils - Utility functions for images and tensors
  • predictors - Task-specific predictor interfaces

§Quick Start

§OCR Pipeline

use oar_ocr::oarocr::OAROCRBuilder;
use oar_ocr::utils::load_image;
use std::path::Path;

// Create OCR pipeline with required components
let ocr = OAROCRBuilder::new(
    "models/text_detection.onnx",
    "models/text_recognition.onnx",
    "models/character_dict.txt"
)
.with_document_image_orientation_classification("models/doc_orient.onnx")
.with_text_line_orientation_classification("models/line_orient.onnx")
.image_batch_size(4)
.region_batch_size(32)
.build()?;

// Process images
let image = load_image(Path::new("document.jpg"))?;
let results = ocr.predict(vec![image])?;

for result in results {
    for region in result.text_regions {
        if let Some(text) = region.text {
            println!("Text: {}", text);
        }
    }
}
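
The loop above prints each region's text as it goes. If a single string per image is more convenient, the regions can be joined instead. The sketch below uses hypothetical stand-in types (`Region`, `PageResult`) that mirror only the `text_regions` and `text` field names from the example; the crate's real result types may carry more fields or differ in their exact types.

```rust
/// Hypothetical stand-in for a recognized region; only the `text`
/// field name is taken from the example above.
pub struct Region {
    pub text: Option<String>,
}

/// Hypothetical stand-in for one image's OCR result.
pub struct PageResult {
    pub text_regions: Vec<Region>,
}

/// Joins the text of all regions with newlines, skipping regions
/// for which no text was recognized.
pub fn joined_text(result: &PageResult) -> String {
    result
        .text_regions
        .iter()
        .filter_map(|region| region.text.as_deref())
        .collect::<Vec<&str>>()
        .join("\n")
}
```

Borrowing with `as_deref` avoids cloning each string and leaves the result usable afterwards.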

§Document Structure Analysis

use oar_ocr::oarocr::OARStructureBuilder;

// Create structure analysis pipeline
let structure = OARStructureBuilder::new("models/layout_detection.onnx")
    .with_table_classification("models/table_classification.onnx")
    .with_table_cell_detection("models/table_cell_detection.onnx", "wired")
    .with_table_structure_recognition("models/table_structure.onnx", "wired")
    .table_structure_dict_path("models/table_structure_dict.txt")
    .with_formula_recognition(
        "models/formula_recognition.onnx",
        "models/tokenizer.json",
        "pp_formulanet"
    )
    .build()?;

// Analyze document structure
let result = structure.predict("document.jpg")?;

println!("Layout elements: {}", result.layout_elements.len());
println!("Tables: {}", result.tables.len());
println!("Formulas: {}", result.formulas.len());

Modules§

core
domain
models
oarocr - The OCR pipeline module.
predictors
prelude - Prelude module for convenient imports.
processors
utils - Utility functions for the OCR pipeline.

Derive Macros§

ConfigValidator - Derive macro for implementing the ConfigValidator trait.
TaskPredictorBuilder - Derive macro for implementing the TaskPredictorBuilder trait.