# Rust PaddleOCR
A lightweight and efficient OCR (Optical Character Recognition) Rust library based on PaddleOCR models. This library utilizes the MNN inference framework to provide high-performance text detection and recognition capabilities.
This project is a pure Rust library, focused on providing core OCR functionality. For command-line tools or HTTP services, please refer to:
- 🖥️ Command Line Tool: newbee-ocr-cli
- 🌐 HTTP Service: newbee-ocr-service
## ✨ Version 2.0 New Features
- 🎯 New Layered API Design: Provides a complete layered API ranging from low-level models to high-level Pipelines.
- 🔧 Flexible Model Loading: Supports loading models from file paths or memory bytes.
- ⚙️ Configurable Detection Parameters: Supports custom detection thresholds, resolution, precision modes, and more.
- 🎚️ Three Precision Presets: Fast, Balanced, and High-Precision modes to meet different scenario requirements.
- 🚀 GPU Acceleration Support: Supports multiple GPU backends including Metal, OpenCL, and Vulkan.
- 📦 Batch Processing Optimization: Supports batch text recognition to improve throughput.
- 🔌 Independent Engine Mode: Ability to create standalone detection engines or recognition engines.
## Features

### Core Capabilities

- Text Detection: Accurately locates text regions within images.
- Text Recognition: Recognizes text content within detected regions.
- End-to-End Recognition: Completes the full flow of detection and recognition in a single call.
- Layered API Architecture: Supports end-to-end, layered, and independent-model usage patterns.
### Model Support
- Multi-Version Support: Supports both PP-OCRv4 and PP-OCRv5 models for flexible selection.
- Multi-Language Support: PP-OCRv5 supports 11+ specific language models, covering over 100 languages.
- Complex Scenario Recognition: Enhanced capabilities for handwritten text, vertical text, and rare characters.
- Flexible Loading: Models can be loaded via file paths or directly from memory bytes.
### Performance

- High-Performance Inference: Based on the MNN inference framework for high speed and low memory usage.
- GPU Acceleration: Supports Metal, OpenCL, Vulkan, and other GPU backends.
- Batch Processing: Supports batch text recognition to boost throughput.
- Precision Presets: Provides Fast, Balanced, and High-Precision modes.
### Developer Experience
- Flexible Configuration: Parameters such as detection thresholds, resolution, and precision modes are customizable.
- Memory Safety: Automatic memory management to prevent leaks.
- Pure Rust Implementation: No external runtime required, cross-platform compatible.
- Minimal Dependencies: Lightweight and easy to integrate.
## Model Versions
This library supports three versions of PaddleOCR models:
### PP-OCRv4

- Stable Version: Thoroughly verified with excellent compatibility.
- Use Case: Standard document recognition where high accuracy is required.
- Model Files:
  - Detection: `ch_PP-OCRv4_det_infer.mnn`
  - Recognition: `ch_PP-OCRv4_rec_infer.mnn`
  - Charset: `ppocr_keys_v4.txt`
### PP-OCRv5

- Latest Version: The next-generation text recognition solution.
- Multi-Language Support: The default recognition model (`PP-OCRv5_mobile_rec.mnn`) supports Simplified Chinese, Traditional Chinese, English, Japanese, and Chinese Pinyin.
- Specific Language Models: Dedicated models for 11+ language groups, covering 100+ languages, for optimal performance.
- Shared Detection Model: All V5 language models share the same detection model (`PP-OCRv5_mobile_det.mnn`).
- Enhanced Scenario Recognition:
  - Significantly improved recognition for complex handwritten Chinese/English.
  - Optimized vertical text recognition.
  - Enhanced recognition of rare characters.
- Performance: 13% improvement in end-to-end performance compared to PP-OCRv4.
- Model Files (Default Multi-language):
  - Detection: `PP-OCRv5_mobile_det.mnn` (shared by all languages)
  - Recognition: `PP-OCRv5_mobile_rec.mnn` (default; supports CN/EN/JP)
  - Charset: `ppocr_keys_v5.txt`
- Specific Language Model Files (Optional):
  - Recognition: `{lang}_PP-OCRv5_mobile_rec_infer.mnn`
  - Charset: `ppocr_keys_{lang}.txt`
  - Available Language Codes: `arabic`, `cyrillic`, `devanagari`, `el`, `en`, `eslav`, `korean`, `latin`, `ta`, `te`, `th`
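The file-name patterns above can be captured in a small helper. This is a hypothetical function written for illustration only; it is not part of the library's API:

```rust
// Hypothetical helper illustrating the naming patterns
// `{lang}_PP-OCRv5_mobile_rec_infer.mnn` and `ppocr_keys_{lang}.txt`.
// Returns (recognition model, charset, shared detection model).
fn v5_language_files(lang: &str) -> (String, String, &'static str) {
    let rec = format!("{lang}_PP-OCRv5_mobile_rec_infer.mnn");
    let charset = format!("ppocr_keys_{lang}.txt");
    // The detection model is shared by all languages.
    (rec, charset, "PP-OCRv5_mobile_det.mnn")
}

fn main() {
    let (rec, charset, det) = v5_language_files("korean");
    println!("{rec} {charset} {det}");
}
```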
#### PP-OCRv5 Language Model Support List
| Model Name | Supported Languages |
|---|---|
| korean_PP-OCRv5_mobile_rec | Korean, English |
| latin_PP-OCRv5_mobile_rec | French, German, Afrikaans, Italian, Spanish, Bosnian, Portuguese, Czech, Welsh, Danish, Estonian, Irish, Croatian, Uzbek, Hungarian, Serbian (Latin), Indonesian, Occitan, Icelandic, Lithuanian, Maori, Malay, Dutch, Norwegian, Polish, Slovak, Slovenian, Albanian, Swedish, Swahili, Tagalog, Turkish, Latin, Azerbaijani, Kurdish, Latvian, Maltese, Pali, Romanian, Vietnamese, Finnish, Basque, Galician, Luxembourgish, Romansh, Catalan, Quechua |
| eslav_PP-OCRv5_mobile_rec | Russian, Belarusian, Ukrainian, English |
| th_PP-OCRv5_mobile_rec | Thai, English |
| el_PP-OCRv5_mobile_rec | Greek, English |
| en_PP-OCRv5_mobile_rec | English |
| cyrillic_PP-OCRv5_mobile_rec | Russian, Belarusian, Ukrainian, Serbian (Cyrillic), Bulgarian, Mongolian, Abkhaz, Adyghe, Kabardian, Avar, Dargwa, Ingush, Chechen, Lak, Lezgian, Tabasaran, Kazakh, Kyrgyz, Tajik, Macedonian, Tatar, Chuvash, Bashkir, Mari, Moldovan, Udmurt, Komi, Ossetian, Buryat, Kalmyk, Tuvan, Yakut, Karakalpak, English |
| arabic_PP-OCRv5_mobile_rec | Arabic, Persian, Uyghur, Urdu, Pashto, Kurdish, Sindhi, Balochi, English |
| devanagari_PP-OCRv5_mobile_rec | Hindi, Marathi, Nepali, Bihari, Maithili, Angika, Bhojpuri, Magahi, Santali, Newari, Konkani, Sanskrit, Haryanvi, English |
| ta_PP-OCRv5_mobile_rec | Tamil, English |
| te_PP-OCRv5_mobile_rec | Telugu, English |
### PP-OCRv5 FP16
- High Efficiency Version: Provides faster inference speeds and lower memory usage without sacrificing accuracy.
- Use Case: Scenarios requiring high performance and low memory footprint.
- Performance Gains:
- Inference speed increased by ~9% (higher on devices supporting FP16 acceleration).
- Memory usage reduced by ~8%.
- Model size halved.
- Model Files:
  - Detection: `PP-OCRv5_mobile_det_fp16.mnn`
  - Recognition: `PP-OCRv5_mobile_rec_fp16.mnn`
  - Charset: `ppocr_keys_v5.txt`
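The halved model size follows directly from storage width: FP16 stores each weight in 2 bytes instead of the 4 bytes used by FP32. A small illustrative calculation (the parameter count is made up, not the real model's):

```rust
// Estimate model file size from parameter count and bytes per weight.
// f16 is not a stable stdlib type; 2 bytes is its IEEE 754 binary16 width.
fn model_size_mb(params: u64, bytes_per_param: u64) -> f64 {
    (params * bytes_per_param) as f64 / (1024.0 * 1024.0)
}

fn main() {
    let params = 4_000_000; // illustrative parameter count
    println!("fp32: {:.1} MB", model_size_mb(params, 4)); // ~15.3 MB
    println!("fp16: {:.1} MB", model_size_mb(params, 2)); // ~7.6 MB, i.e. half
}
```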
### Model Performance Comparison
| Feature | PP-OCRv4 | PP-OCRv5 | PP-OCRv5 FP16 |
|---|---|---|---|
| Language Support | Chinese, English | Multi-language (Default CN/EN/JP, 11+ specific models) | Multi-language (Default CN/EN/JP, 11+ specific models) |
| Text Types | Chinese, English | Simplified/Traditional CN, EN, JP, Pinyin | Simplified/Traditional CN, EN, JP, Pinyin |
| Handwriting | Basic | Significantly Enhanced | Significantly Enhanced |
| Vertical Text | Basic | Optimized | Optimized |
| Rare Characters | Limited | Enhanced | Enhanced |
| Speed (FPS) | 1.1 | 1.2 | 1.2 |
| Memory (Peak) | 422.22MB | 388.41MB | 388.41MB |
| Model Size | Standard | Standard | Halved |
| Recommended | Standard Docs | Complex Scenes & Multi-lang | High Performance & Multi-lang |
## Application Scenarios
Choose the appropriate API level based on your requirements:
### Scenario 1: Quick OCR Integration

Use: End-to-End Recognition (`OcrEngine`)
Suitable for:
- Rapid prototyping.
- Simple document recognition needs.
- No need for intermediate processing.
- Only care about the final text result.
```rust
// Sketch; constructor arguments and the image variable are illustrative.
let engine = OcrEngine::new(det_model, rec_model)?;
let results = engine.recognize(&img)?;
```
### Scenario 2: Custom Post-Processing for Detection

Use: Layered Calls (`OcrEngine`'s `detect` + `recognize_batch`)
Suitable for:
- Filtering or selecting detection results.
- Adjusting text box positions.
- Processing text in a specific order.
- Sorting or grouping detection boxes.
```rust
// Sketch; argument types and the retain predicate are illustrative.
let engine = OcrEngine::new(det_model, rec_model)?;

// 1. Detect
let mut boxes = engine.detect(&img)?;

// 2. Custom processing (e.g., filter small boxes)
boxes.retain(|b| b.width() >= 10 && b.height() >= 10);

// 3. Recognize
let detections = engine.det_model.detect_and_crop(&img)?;
let results = engine.recognize_batch(&detections)?;
```
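A common post-processing need when sorting detection boxes is reading order: top-to-bottom, then left-to-right within a line. A self-contained sketch with a hypothetical `TextBox` struct (the library's actual detection output type differs):

```rust
// Hypothetical box type for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct TextBox { x: i32, y: i32, w: i32, h: i32 }

// Sort boxes into reading order by bucketing y-coordinates into lines
// (`line_tol` pixels per bucket), then sorting left-to-right within a bucket.
fn sort_reading_order(boxes: &mut [TextBox], line_tol: i32) {
    boxes.sort_by_key(|b| (b.y / line_tol, b.x));
}

fn main() {
    let mut boxes = vec![
        TextBox { x: 200, y: 12, w: 50, h: 20 },
        TextBox { x: 10,  y: 10, w: 50, h: 20 },
        TextBox { x: 10,  y: 60, w: 50, h: 20 },
    ];
    sort_reading_order(&mut boxes, 30);
    // The first box is now the top-left one.
    assert_eq!((boxes[0].x, boxes[0].y), (10, 10));
}
```

The integer bucketing is a simplification; boxes whose y-values straddle a bucket boundary can still land on different "lines", so production code usually groups by vertical overlap instead.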
### Scenario 3: Detection Only

Use: `DetOnlyEngine`
Suitable for:
- Document layout analysis.
- Text region annotation tools.
- Pre-processing workflows (only need text locations).
- Using with other recognition engines.
```rust
// Sketch; constructor and argument names are illustrative.
let det_engine = DetOnlyEngine::new(det_model)?;
let text_boxes = det_engine.detect(&img)?;
// Use detection boxes for other processing...
```
### Scenario 4: Recognition Only

Use: `RecOnlyEngine`
Suitable for:
- Text location is already known, only recognition is needed.
- Processing pre-cropped text line images.
- Handwriting recognition (input is a single line image).
- Batch recognition of fixed-format text.
```rust
// Sketch; constructor and argument names are illustrative.
let rec_engine = RecOnlyEngine::new(rec_model)?;
let text = rec_engine.recognize_text(&line_img)?;
```
### Scenario 5: Fully Custom Workflow

Use: Independent Models (`DetModel` + `RecModel`)
Suitable for:
- Custom pre-processing workflows.
- Different configurations for detection and recognition.
- Inserting complex logic between detection and recognition.
- Performance optimization (e.g., reusing detection results).
```rust
// Sketch; paths, option values, and argument lists are illustrative.
let det_model = DetModel::from_file("PP-OCRv5_mobile_det.mnn")?
    .with_options(det_options);
let rec_model = RecModel::from_file("PP-OCRv5_mobile_rec.mnn", "ppocr_keys_v5.txt")?;
// Fully custom processing flow...
```
### Scenario 6: Embedded or Encrypted Deployment

Use: Load from Bytes
Suitable for:
- Embedded devices (compiling models into the binary).
- Model encryption requirements.
- Downloading models dynamically from the network.
- Custom model storage formats.
```rust
// Sketch; the relative paths are illustrative.
let det_bytes = include_bytes!("../models/PP-OCRv5_mobile_det.mnn");
let rec_bytes = include_bytes!("../models/PP-OCRv5_mobile_rec.mnn");
let charset_bytes = include_bytes!("../models/ppocr_keys_v5.txt");
let engine = OcrEngine::from_bytes(det_bytes, rec_bytes, charset_bytes)?;
```
## Installation

Add the following to your `Cargo.toml` (the dependency key is assumed from the repository name):

```toml
[dependencies.rust-paddle-ocr]
git = "https://github.com/zibo-chen/rust-paddle-ocr.git"
```

You can also specify a specific branch or tag:

```toml
[dependencies.rust-paddle-ocr]
git = "https://github.com/zibo-chen/rust-paddle-ocr.git"
branch = "next"
```
### Prerequisites
This library requires:
- Pre-trained PaddleOCR models converted to MNN format.
- A character set file for text recognition.
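Charset files in the PaddleOCR ecosystem are commonly one character per line, with the line index corresponding to a model output class. A minimal, self-contained sketch under that format assumption (this is not the library's actual loader):

```rust
// Parse charset-file contents (one character per line) into a lookup table
// mapping class index -> character. Illustration only.
fn parse_charset(contents: &str) -> Vec<String> {
    contents.lines().map(|l| l.to_string()).collect()
}

fn main() {
    // In real use, `contents` would come from
    // std::fs::read_to_string("ppocr_keys_v5.txt").
    let contents = "的\n一\n是\n";
    let charset = parse_charset(contents);
    assert_eq!(charset.len(), 3);
    assert_eq!(charset[2], "是");
}
```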
## API Architecture
This library provides a Layered Inference API, allowing you to choose the usage pattern that best fits your scenario:
```text
┌─────────────────────────────────────────────────┐
│ OcrEngine (End-to-End Pipeline)                 │
│ Complete detection & recognition in one call    │
├─────────────────────────────────────────────────┤
│ DetOnlyEngine   │ RecOnlyEngine    │ OcrEngine  │
│ Detection Only  │ Recognition Only │ Det + Rec  │
├─────────────────────────────────────────────────┤
│ DetModel        │ RecModel                      │
│ Text Det Model  │ Text Rec Model                │
├─────────────────────────────────────────────────┤
│ InferenceEngine (MNN)                           │
│ Low-level Inference Engine                      │
└─────────────────────────────────────────────────┘
```
### Three Usage Patterns

#### 1. End-to-End Recognition (Recommended) - Simplest

Use `OcrEngine` to complete the full OCR process with a single call:

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

#### 2. Layered Calls - More Flexible

Use `OcrEngine` but call detection and recognition separately. Useful for inserting custom processing:

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

#### 3. Independent Model Calls - Most Flexible

Create detection and recognition engines separately, or create a single-function engine:

```rust
use rust_paddle_ocr::{DetModel, RecModel, DetOnlyEngine, RecOnlyEngine}; // module paths assumed
```
## Usage Examples

### Basic Configuration Options

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

### GPU Acceleration

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

### Custom Detection and Recognition Parameters

```rust
use rust_paddle_ocr::{DetModel, RecModel}; // module paths assumed
```

### Using Specific Language Models

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

### Loading Models from Memory Bytes

Suitable for embedded deployment or scenarios requiring model encryption:

```rust
use rust_paddle_ocr::OcrEngine; // module path assumed
```

### Convenience Functions

```rust
use rust_paddle_ocr::ocr_file; // module path assumed
```
For more complete examples, please refer to the examples directory.
## Related Projects
- 🖥️ newbee-ocr-cli - A command-line tool based on this library, providing a simple and easy-to-use OCR CLI interface.
- 🌐 newbee-ocr-service - An HTTP service based on this library, providing RESTful API interfaces.
## Performance Optimization Suggestions

### 1. Choose an Appropriate Precision Mode

```rust
// Preset names come from this README; the config type's name is assumed.
// Real-time processing
let config = OcrConfig::fast();
// General document recognition
let config = OcrConfig::balanced();
// High quality requirements
let config = OcrConfig::high_precision();
```
### 2. Use GPU Acceleration

```rust
// Type and method names are assumed; backend names come from the feature list.
// macOS/iOS
let config = OcrConfig::gpu(); // uses Metal
// Other platforms
let config = OcrConfig::new().with_backend(Backend::Vulkan); // or Backend::OpenCL
```
### 3. Batch Processing

```rust
// Recognizing multiple text lines in one batch is much faster than one by one.
let results = rec_model.recognize_batch(&line_images)?; // argument name illustrative
```
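Batching amortizes per-call overhead, but unbounded batches can spike memory. A generic, self-contained sketch of bounded-size batching (not tied to this library's types; the closure stands in for a real batch-recognition call):

```rust
// Process items in fixed-size batches, concatenating the per-batch results.
// `f` is a stand-in for a batch operation such as batch text recognition.
fn process_in_batches<T, R>(
    items: &[T],
    batch: usize,
    mut f: impl FnMut(&[T]) -> Vec<R>,
) -> Vec<R> {
    items.chunks(batch).flat_map(|chunk| f(chunk)).collect()
}

fn main() {
    let lines = ["a", "b", "c", "d", "e"];
    // Stand-in "recognizer": uppercases each line. A real batch call would
    // run the model once per chunk instead of once per line.
    let out = process_in_batches(&lines, 2, |chunk| {
        chunk.iter().map(|s| s.to_uppercase()).collect()
    });
    assert_eq!(out, ["A", "B", "C", "D", "E"]);
}
```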
## Contribution

Contributions are welcome! Please feel free to submit Issues or Pull Requests.

## License

This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.