Crate oar_ocr

§OAR OCR

A Rust OCR library that extracts text from document images using ONNX models. Supports text detection, recognition, document orientation, and rectification.

§Features

Complete OCR pipeline from image to text
Modular components (use only what you need)
Batch processing support
ONNX Runtime integration for fast inference

§Components

Text Detection: Find text regions in images
Text Recognition: Convert text regions to readable text
Document Orientation: Detect document rotation (0°, 90°, 180°, 270°)
Document Rectification: Fix perspective distortion
Text Line Classification: Detect text line orientation
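The four rotation classes listed for document orientation can be modeled as a small enum. The following is only a sketch: `Rotation`, `from_class_index`, and `correction_degrees` are hypothetical names, not types from this crate, and the class index order (0°, 90°, 180°, 270°) is an assumption about the classifier output.

```rust
/// Hypothetical stand-in for a document rotation class; not part of oar_ocr.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Rotation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

impl Rotation {
    /// Maps a classifier output index to a rotation.
    /// Assumes the classes are ordered 0°, 90°, 180°, 270°.
    pub fn from_class_index(index: usize) -> Option<Self> {
        match index {
            0 => Some(Rotation::Deg0),
            1 => Some(Rotation::Deg90),
            2 => Some(Rotation::Deg180),
            3 => Some(Rotation::Deg270),
            _ => None,
        }
    }

    /// The detected rotation in degrees.
    pub fn degrees(self) -> u32 {
        match self {
            Rotation::Deg0 => 0,
            Rotation::Deg90 => 90,
            Rotation::Deg180 => 180,
            Rotation::Deg270 => 270,
        }
    }

    /// Degrees to keep rotating in the same direction to return to upright.
    pub fn correction_degrees(self) -> u32 {
        (360 - self.degrees()) % 360
    }
}
```

With this model, a page detected at 90° needs a further 270° turn in the same direction to be upright, and a page detected at 0° needs none.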

§Modules

core - Core traits, error handling, and batch processing
domain - Domain types like orientation helpers and prediction models
predictor - OCR predictor implementations
pipeline - Complete OCR pipeline
processors - Image processing utilities
utils - Utility functions for images and tensors

§Quick Start

§Complete OCR Pipeline

use oar_ocr::prelude::*;
use oar_ocr::utils::{load_image, load_images};
use std::path::Path;

// Build OCR pipeline
let mut ocr = OAROCRBuilder::new(
    "detection_model.onnx".to_string(),
    "recognition_model.onnx".to_string(),
    "char_dict.txt".to_string(),
).build()?;

// Process single image
let image = load_image(Path::new("document.jpg"))?;
let results = ocr.predict(&[image])?;
let result = &results[0];

// Print results
for region in &result.text_regions {
    if let (Some(text), Some(confidence)) = (&region.text, region.confidence) {
        println!("Text: {} (confidence: {:.2})", text, confidence);
    }
}

// Process multiple images
let images = load_images(&[Path::new("doc1.jpg"), Path::new("doc2.jpg")])?;
let results = ocr.predict(&images)?;
for result in results {
    println!("Image {}: {} text regions found", result.index, result.text_regions.len());
}
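The printing loop above skips regions that lack text or a confidence score. The same pattern can be extended to keep only confident text. This is a minimal sketch: `Region` is a hypothetical stand-in that mirrors only the two optional fields used above (the crate's real region type may differ, including the concrete type of `text`), and `confident_text` is an illustrative helper, not a crate function.

```rust
/// Hypothetical stand-in mirroring the `text` and `confidence` fields used above.
pub struct Region {
    pub text: Option<String>,
    pub confidence: Option<f32>,
}

/// Collects the text of every region that has both a text value and a
/// confidence of at least `min_confidence`.
pub fn confident_text(regions: &[Region], min_confidence: f32) -> Vec<&str> {
    regions
        .iter()
        .filter_map(|region| match (&region.text, region.confidence) {
            (Some(text), Some(confidence)) if confidence >= min_confidence => {
                Some(text.as_str())
            }
            // Regions without text, without a score, or below the cutoff are dropped.
            _ => None,
        })
        .collect()
}
```

Filtering after prediction like this is independent of any threshold configured on the pipeline itself, so it can be tuned per use case without rebuilding the pipeline.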

§Advanced Configuration with Confidence Thresholding

use oar_ocr::prelude::*;
use oar_ocr::utils::load_image;
use std::path::Path;

// Build OCR pipeline with confidence thresholding for orientation detection
let mut ocr = OAROCRBuilder::new(
    "detection_model.onnx".to_string(),
    "recognition_model.onnx".to_string(),
    "char_dict.txt".to_string(),
)
    // Configure document orientation with confidence threshold
    .doc_orientation_classify_model_path("orientation_model.onnx")
    .doc_orientation_threshold(0.8) // Only accept predictions with 80% confidence
    .use_doc_orientation_classify(true)
    // Configure text line orientation with confidence threshold
    .textline_orientation_classify_model_path("textline_orientation_model.onnx")
    .textline_orientation_threshold(0.7) // Only accept predictions with 70% confidence
    .use_textline_orientation(true)
    // Set recognition score threshold
    .text_rec_score_threshold(0.5)
    .build()?;

// Process image - low confidence orientations will fall back to defaults
let image = load_image(Path::new("document.jpg"))?;
let results = ocr.predict(&[image])?;
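The fallback behaviour mentioned in the comment above can be pictured as a simple comparison against the configured threshold. This is a sketch of the idea only, not the crate's implementation: `accept_or_default` is a hypothetical helper, the strict `<` comparison is an assumption, and so is the choice of default angle passed in by the caller.

```rust
/// Returns the predicted angle if its confidence meets the threshold,
/// otherwise the caller-supplied default.
///
/// Hypothetical helper for illustration; not part of oar_ocr.
pub fn accept_or_default(
    predicted_degrees: u32,
    confidence: f32,
    threshold: Option<f32>,
    default_degrees: u32,
) -> u32 {
    match threshold {
        // A threshold is configured and the prediction falls below it.
        Some(min) if confidence < min => default_degrees,
        // No threshold configured, or the prediction is confident enough.
        _ => predicted_degrees,
    }
}
```

Under these assumptions, a 90° prediction with confidence 0.65 and a threshold of 0.8 resolves to the default, while the same prediction at 0.93 is accepted as is.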

§Individual Components

use oar_ocr::prelude::*;
use oar_ocr::core::traits::StandardPredictor;
use oar_ocr::predictor::{TextDetPredictorBuilder, TextRecPredictorBuilder};
use oar_ocr::utils::load_image;
use std::path::Path;

// Text detection only
let mut detector = TextDetPredictorBuilder::new()
    .build(Path::new("detection_model.onnx"))?;

let image = load_image(Path::new("image.jpg"))?;
let result = detector.predict(vec![image], None)?;
println!("Detection result: {:?}", result);

// Text recognition only
let char_dict = vec!["a".to_string(), "b".to_string()]; // Load your dictionary
let mut recognizer = TextRecPredictorBuilder::new()
    .character_dict(char_dict)
    .build(Path::new("recognition_model.onnx"))?;

let image = load_image(Path::new("text_crop.jpg"))?;
let result = recognizer.predict(vec![image], None)?;
println!("Recognition result: {:?}", result);
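The recognition example above uses a two-entry placeholder dictionary. In practice the dictionary usually comes from a file such as `char_dict.txt`. A minimal sketch of loading one with the standard library follows, assuming one entry per line; that file layout and the `load_char_dict` helper are assumptions for illustration, not part of this crate.

```rust
use std::fs::File;
use std::io::{self, BufRead, BufReader};
use std::path::Path;

/// Reads a character dictionary, assuming one entry per line.
///
/// Hypothetical helper for illustration; not part of oar_ocr.
pub fn load_char_dict(path: &Path) -> io::Result<Vec<String>> {
    let reader = BufReader::new(File::open(path)?);
    let mut entries = Vec::new();
    for line in reader.lines() {
        let line = line?;
        // Skip empty lines, but keep whitespace-only entries:
        // a dictionary may legitimately contain the space character.
        if !line.is_empty() {
            entries.push(line);
        }
    }
    Ok(entries)
}
```

The returned `Vec<String>` has the same shape as the `char_dict` value passed to `.character_dict(...)` in the example above.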

Modules§

core: The core module of the OCR pipeline.
domain: Domain-level structures shared across the OCR pipeline.
pipeline: The OCR pipeline module.
predictor: Predictor implementations for various OCR tasks.
prelude: Prelude module for convenient imports.
processors: Image processing utilities for OCR systems.
utils: Utility functions for the OCR pipeline.

Macros§

common_builder_methods: Macro to inject common builder methods into an existing impl Builder block. Use this inside impl YourBuilder { ... } and pass the field name that holds CommonBuilderConfig (e.g., common).
impl_common_builder_methods: Macro to implement common builder methods for structs with a CommonBuilderConfig field.
impl_complete_builder: Comprehensive builder macro for generating common builder method patterns.
impl_config_new_and_with_common: Macro to implement new() and with_common() for config structs with per-module defaults.
impl_config_validator: Macro to implement ConfigValidator with basic validation patterns.
metrics: Macro to create pre-populated StageMetrics with common patterns.
validate_field: Helper macro for field validation.
with_nested: Macro to handle optional nested config initialization in builders.
