Crate ruvector_scipix

§Ruvector-Scipix

A high-performance Rust implementation of Scipix OCR for mathematical expressions and equations. It is built on top of ruvector-core, which provides vector-based caching and similarity search.

§Features

  • Mathematical OCR: Extract LaTeX from images of equations
  • Vector Caching: Intelligent caching using image embeddings
  • Multiple Formats: Support for LaTeX, MathML, AsciiMath
  • High Performance: Parallel processing and efficient caching
  • Configurable: Extensive configuration options via TOML or API
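
To make the "Vector Caching" feature concrete, here is a minimal, self-contained sketch of how a cache keyed by image embeddings could reuse an earlier OCR result. It is not this crate's implementation: `SketchCache`, `cosine_similarity`, the linear scan, and the similarity threshold are all hypothetical names and choices made for illustration, and the real `cache` module (built on ruvector-core) may work quite differently.

```rust
/// Cosine similarity between two embeddings.
/// Returns 0.0 for empty, mismatched-length, or zero-norm inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Hypothetical cache (an assumption, not the crate's `CacheManager`):
/// stores (embedding, LaTeX) pairs and answers lookups by linear scan.
struct SketchCache {
    entries: Vec<(Vec<f32>, String)>,
    /// Minimum cosine similarity for a stored entry to count as a hit.
    threshold: f32,
}

impl SketchCache {
    fn new(threshold: f32) -> Self {
        Self { entries: Vec::new(), threshold }
    }

    /// Remember the LaTeX produced for an image with this embedding.
    fn insert(&mut self, embedding: Vec<f32>, latex: &str) {
        self.entries.push((embedding, latex.to_string()));
    }

    /// Return the LaTeX of the most similar stored embedding,
    /// or `None` if nothing reaches the threshold.
    fn lookup(&self, query: &[f32]) -> Option<&str> {
        self.entries
            .iter()
            .map(|(emb, latex)| (cosine_similarity(emb, query), latex.as_str()))
            .filter(|(score, _)| *score >= self.threshold)
            .max_by(|a, b| a.0.total_cmp(&b.0))
            .map(|(_, latex)| latex)
    }
}
```

With a threshold of 0.95, an entry stored under `[1.0, 0.0, 0.0]` would be returned for the near-duplicate query `[0.99, 0.05, 0.0]`, while a query such as `[1.0, 1.0, 0.0]` would miss and fall through to a full OCR pass. A production cache would typically replace the linear scan with an approximate nearest-neighbour index.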

§Quick Start

use ruvector_scipix::{Config, OcrEngine, Result};

#[tokio::main]
async fn main() -> Result<()> {
    // Load configuration
    let config = Config::from_file("scipix.toml")?;

    // Create OCR engine
    let engine = OcrEngine::new(config).await?;

    // Process image
    let result = engine.process_image("equation.png").await?;
    println!("LaTeX: {}", result.latex);

    Ok(())
}

§Architecture

  • config: Configuration management with TOML support
  • error: Comprehensive error types with context
  • math: LaTeX and mathematical format handling
  • ocr: Core OCR processing engine
  • output: Output formatting and serialization
  • preprocess: Image preprocessing pipeline
  • cache: Vector-based intelligent caching

Re-exports§

pub use config::Config;
pub use config::OcrConfig;
pub use config::ModelConfig;
pub use config::PreprocessConfig;
pub use config::OutputConfig;
pub use config::PerformanceConfig;
pub use config::CacheConfig;
pub use error::ScipixError;
pub use error::Result;
pub use cli::Cli;
pub use cli::Commands;
pub use api::ApiServer;
pub use api::state::AppState;
pub use cache::CacheManager;

Modules§

api
cache
Vector-based intelligent caching for Scipix OCR results
cli
config
Configuration system for Ruvector-Scipix
error
Error types for Ruvector-Scipix
optimize
Performance optimization utilities for Scipix OCR
output
Output formatting module for Scipix OCR results
preprocess
Image preprocessing module for OCR pipeline

Constants§

VERSION
Library version

Functions§

default_config
Default configuration preset
high_accuracy_config
High-accuracy configuration preset
high_speed_config
High-speed configuration preset