UniProps-gen
UniProps is a blazing-fast, compile-time Unicode property generator for Rust. It generates highly optimized static tables to look up Unicode General Categories and Numeric Values with zero runtime allocation.
Note: This project is a complete evolution and fork of my previous project, dec-from-char.
While dec-from-char focused solely on parsing decimal digits, UniProps generalizes this approach. It allows you to generate efficient data structures for any subset of Unicode data directly via build.rs.
Todo
- Update metadata actions pipeline
🚀 Features
- Unmatched Performance:
  - Categories: Uses a Two-Level Trie (Index Table + Data Blocks) for true O(1) access.
- Zero Runtime Allocation: All data is baked into your binary as standard static arrays.
- Customizable: Generate only what you need. Filter by specific categories (e.g., only Nd digits) to drastically reduce your binary size.
- Safe API: Generated code relies on safe wrappers around bounded unsafe lookups, ensuring maximum speed without bounds-checking overhead, while remaining 100% memory safe.
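To make the two-level trie and the "safe wrapper around a bounded unsafe lookup" ideas concrete, here is a minimal sketch. It is not the code UniProps generates: the `Trie` type, the 256-entry block size, the toy `classify` function and the three-variant `Category` enum are all assumptions made for illustration, and the tables are built at runtime in a `Vec` rather than emitted as static arrays.

```rust
use std::collections::HashMap;

const SHIFT: u32 = 8; // assumed block size: 256 code points per block
const BLOCK_LEN: usize = 1 << SHIFT;
const BLOCK_COUNT: usize = (0x10FFFF >> SHIFT) + 1; // blocks covering all of Unicode

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Category {
    Other,
    Nd,
    Lu,
}

// Toy classifier standing in for real Unicode data (illustration only).
fn classify(cp: u32) -> Category {
    match cp {
        0x30..=0x39 | 0x660..=0x669 => Category::Nd, // ASCII and Arabic-Indic digits
        0x41..=0x5A => Category::Lu,                 // ASCII uppercase letters
        _ => Category::Other,
    }
}

struct Trie {
    index: Vec<u16>,     // level 1: block number -> data block id
    data: Vec<Category>, // level 2: deduplicated blocks, stored back to back
}

impl Trie {
    fn build() -> Trie {
        let mut index = Vec::with_capacity(BLOCK_COUNT);
        let mut data: Vec<Category> = Vec::new();
        let mut seen: HashMap<Vec<Category>, u16> = HashMap::new();
        for block in 0..BLOCK_COUNT as u32 {
            let base = block << SHIFT;
            let values: Vec<Category> =
                (0..BLOCK_LEN as u32).map(|i| classify(base | i)).collect();
            // Identical blocks share a single copy in `data`.
            let id = match seen.get(&values).copied() {
                Some(id) => id,
                None => {
                    let id = (data.len() / BLOCK_LEN) as u16;
                    data.extend_from_slice(&values);
                    seen.insert(values, id);
                    id
                }
            };
            index.push(id);
        }
        Trie { index, data }
    }

    /// Safe entry point: two array reads, no branches, no bounds checks.
    fn get(&self, c: char) -> Category {
        let cp = c as u32 as usize;
        let hi = cp >> SHIFT;
        let lo = cp & (BLOCK_LEN - 1);
        // SAFETY: a `char` is at most 0x10FFFF, so `hi < BLOCK_COUNT == index.len()`.
        // `build` only stores ids of blocks that are fully present in `data`,
        // so `block * BLOCK_LEN + lo < data.len()`.
        unsafe {
            let block = *self.index.get_unchecked(hi) as usize;
            *self.data.get_unchecked(block * BLOCK_LEN + lo)
        }
    }
}

fn main() {
    let trie = Trie::build();
    for c in ['7', '٣', 'Q', 'é'] {
        println!("{:?} -> {:?}", c, trie.get(c));
    }
}
```

The unchecked reads are sound only because the fields are private and `build` is the sole constructor, which is the same kind of invariant a generator can guarantee for tables it emits itself.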
⚡ Benchmarks
| Method | Operation | Time / iter | Notes |
|---|---|---|---|
| Rust Standard Library | char::is_numeric() | ~7.66 ns | Standard std implementation |
| UniProps Digits | get_digit_value(c) | ~6.15 ns | ~20% Faster on mixed text |
| UniProps Categories | Category::from_char(c) | ~5.22 ns | ~32% Faster (O(1) Trie lookup) |
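For readers who want to run a rough comparison of their own, the following is a minimal timing sketch using only the standard library. It is not the harness behind the numbers above. The helper names `count_matching` and `ns_per_char` are hypothetical, and the only predicate measured here is `char::is_numeric`; a generated lookup could be passed in the same way.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Counts the chars for which `pred` returns true.
/// `black_box` discourages the optimizer from folding the work away.
fn count_matching(text: &str, pred: impl Fn(char) -> bool) -> usize {
    text.chars().filter(|&c| pred(black_box(c))).count()
}

/// Average nanoseconds per char over `rounds` passes (`text` must be non-empty).
fn ns_per_char(text: &str, rounds: u32, pred: impl Fn(char) -> bool) -> f64 {
    let chars = text.chars().count() as f64;
    let start = Instant::now();
    let mut total = 0usize;
    for _ in 0..rounds {
        total += count_matching(black_box(text), &pred);
    }
    black_box(total);
    start.elapsed().as_nanos() as f64 / (rounds as f64 * chars)
}

fn main() {
    // Mixed ASCII and non-ASCII text, similar in spirit to a "mixed text" benchmark.
    let text = "abc123 ٤٥٦ xyz ".repeat(1_000);
    let ns = ns_per_char(&text, 100, char::is_numeric);
    println!("char::is_numeric: {:.2} ns/char", ns);
}
```

Numbers from a loop like this vary with hardware, compiler flags and input, so build with `--release` and treat the result as indicative only.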
📦 Installation
Add uniprops-gen to your [build-dependencies] in Cargo.toml.
[build-dependencies]
uniprops-gen = "0.3.0" # Use the latest version
🛠 Usage
1. Configure build.rs
Create a build.rs file in your project root. Use the builder to generate your tables into OUT_DIR.
// build.rs
use uniprops-UnipropsBuilder;
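The builder calls themselves are not reproduced here. As a generic illustration of what a build script that writes a table into OUT_DIR does, here is a sketch using only the standard library. The file name `tables.rs`, the `ASCII_DIGITS` table and the `write_tables` helper are hypothetical and are not part of the uniprops-gen API.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Writes a small Rust source file containing one static table.
/// File name and table contents are hypothetical.
fn write_tables(out_dir: &Path) -> io::Result<PathBuf> {
    let dest = out_dir.join("tables.rs");
    let mut src = String::from("pub static ASCII_DIGITS: [char; 10] = [");
    for d in '0'..='9' {
        src.push_str(&format!("'{}', ", d));
    }
    src.push_str("];\n");
    fs::write(&dest, src)?;
    Ok(dest)
}

fn main() -> io::Result<()> {
    // Cargo sets OUT_DIR when running a build script. The fallback to the
    // system temp directory only exists so this sketch also runs standalone.
    let out_dir = env::var_os("OUT_DIR")
        .map(PathBuf::from)
        .unwrap_or_else(env::temp_dir);
    let dest = write_tables(&out_dir)?;
    // Ask Cargo to rerun the script only when build.rs itself changes.
    println!("cargo:rerun-if-changed=build.rs");
    eprintln!("wrote {}", dest.display());
    Ok(())
}
```

Because the table is plain Rust source written at build time, the compiler sees it as ordinary static data and nothing is allocated at runtime.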
2. Include in lib.rs
Import the generated code using the include! macro.
// src/lib.rs
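The usual pattern for this step is a single `include!` line pointing into OUT_DIR. The sketch below shows that line as a comment and then substitutes a hand-written stand-in module so that it compiles without a build script. The file name `tables.rs`, the `DIGIT_ZEROS` table and the signature of `get_digit_value` are assumptions for illustration; the actual generated items may differ.

```rust
// In a real crate, this line pulls in the file written by build.rs
// (`tables.rs` is a hypothetical file name):
//
//     include!(concat!(env!("OUT_DIR"), "/tables.rs"));
//
// Stand-in for the included file, so the sketch compiles without a build script.
mod generated {
    /// Hypothetical table: the code point of digit zero for three decimal
    /// digit runs (ASCII, Arabic-Indic, Devanagari). Each run is 10 long.
    pub static DIGIT_ZEROS: [u32; 3] = [0x0030, 0x0660, 0x0966];

    /// Returns the decimal value of `c` if it falls in one of the runs above.
    pub fn get_digit_value(c: char) -> Option<u8> {
        let cp = c as u32;
        DIGIT_ZEROS
            .iter()
            .find(|&&zero| (zero..zero + 10).contains(&cp))
            .map(|&zero| (cp - zero) as u8)
    }
}

use generated::get_digit_value;

fn main() {
    for c in ['7', '٣', 'x'] {
        println!("{:?} -> {:?}", c, get_digit_value(c));
    }
}
```

Once included, generated items behave like any other code in the crate and can be re-exported or wrapped as needed.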
📄 License
This project is dual-licensed under either of:
at your option.