penmanship
A Rust library for Unicode character lookup via text patterns. Convert text aliases like "...", "alpha", "(c)" to their corresponding Unicode characters (…, α, ©).
Features
- no_std compatible: Works in embedded and bare-metal environments
- Zero runtime overhead: Uses compile-time perfect hash maps via phf
- No allocations: Returns static string references
- Feature-gated: Enable only the character categories you need
- Comprehensive: Supports punctuation, math, Greek letters, fractions, currency, symbols, HTML entities, emoji, and more
- Safe: Forbids unsafe code and maintains strict quality standards
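To make the "no allocations" and "static string references" points concrete, here is a minimal sketch of an alias lookup that returns `&'static str`. It is not penmanship's implementation: the function name `lookup_sketch` is hypothetical, and a plain `match` stands in for the phf-generated perfect hash maps. The three aliases are the ones named in the introduction.

```rust
/// Hypothetical stand-in for an alias table; NOT penmanship's actual code.
/// The real crate is described as using compile-time perfect hash maps (phf);
/// a `match` on string literals is used here only to keep the sketch
/// dependency-free.
///
/// Every arm yields a string literal stored in the binary, so the function
/// never allocates and only needs `core` types (`&str`, `Option`).
pub fn lookup_sketch(alias: &str) -> Option<&'static str> {
    match alias {
        "..." => Some("\u{2026}"),   // horizontal ellipsis
        "alpha" => Some("\u{03B1}"), // greek small letter alpha
        "(c)" => Some("\u{00A9}"),   // copyright sign
        _ => None,                   // unknown alias
    }
}
```

Calling `lookup_sketch("alpha")` yields `Some("α")`, and an unknown alias yields `None`, so callers can fall back to the original text when nothing matches.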
Quick Start
Add penmanship to your Cargo.toml:
[dependencies]
penmanship = "0.1"
Basic usage:
use penmanship::lookup;
Supported Categories
All categories are enabled by default via the full feature. Examples:
// Punctuation
lookup("...") // … - horizontal ellipsis
lookup // — - em dash
lookup // ‘ - left single quotation mark
// Math
lookup // ≠ - not equal to
lookup // → - rightwards arrow
lookup // ∞ - infinity
// Greek letters (case-sensitive)
lookup("alpha") // α - greek small letter alpha
lookup // Α - greek capital letter alpha
// Fractions
lookup // ½ - fraction one half
// Currency
lookup // € - euro sign
// Symbols
lookup("(c)") // © - copyright sign
lookup // ° - degree sign
// Superscripts & Subscripts
lookup // ² - superscript two
lookup // ₂ - subscript two
// HTML entities (2200+ supported)
lookup // (non-breaking space)
lookup // < - less than
// Emoji (1800+ shortcodes)
lookup // 😄 - grinning face with smiling eyes
lookup // ❤️ - red heart
For a complete list of all supported patterns, see docs/mappings.md.
Feature Flags
By default, all categories are enabled via the full feature. To use only specific categories:
[dependencies]
penmanship = { version = "0.1", default-features = false, features = ["punctuation", "math", "greek"] }
Available features:
- full (default) - All categories
- punctuation - Punctuation and typography
- math - Mathematical operators and symbols
- greek - Greek letters
- fractions - Fraction characters
- currency - Currency symbols
- symbols - Miscellaneous symbols
- superscripts - Superscript characters
- subscripts - Subscript characters
- html - HTML named character references
- emoji - Emoji shortcode lookup (requires the emojis crate)
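As a rough illustration of how a Cargo feature can switch a category on or off, the sketch below gates a tiny Greek table on a `greek` feature. The function name `greek_sketch` and the table contents are hypothetical, and the crate itself may well use `#[cfg(feature = "...")]` attributes on whole items instead of the `cfg!` macro used here.

```rust
/// Hypothetical per-category lookup; NOT penmanship's actual code.
/// `cfg!(feature = "greek")` expands to a compile-time `true` or `false`,
/// so when the feature is disabled the function always returns `None`
/// and the optimizer can drop the table.
#[allow(unexpected_cfgs)] // the feature is not declared when compiled standalone
pub fn greek_sketch(alias: &str) -> Option<&'static str> {
    if !cfg!(feature = "greek") {
        return None; // category disabled at compile time
    }
    match alias {
        "alpha" => Some("\u{03B1}"), // greek small letter alpha
        _ => None,
    }
}
```

With `default-features = false` and no `greek` feature enabled, a gate like this makes the category behave as if it did not exist, which is the "pay only for what you use" idea behind the feature list above.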
Design Philosophy
- no_std compatible: No standard library required, works in embedded environments
- Compile-time: All mappings use perfect hash functions computed at compile time
- Zero allocations: All strings are static references
- Library-only: Pure library with no binary, focused on being a building block
- Feature-gated: Pay only for what you use
Development Notes
- This project uses a whitelist approach to .gitignore
- 100% test coverage maintained
- Strict linting: no unsafe code, all items documented
Contributing
Contributions are welcome! See CONTRIBUTING.md for guidelines.
Security
For security vulnerabilities and reporting guidelines, see SECURITY.md.
Acknowledgments
- Emoji support provided by the emojis crate
- HTML entities based on the WHATWG HTML Living Standard
License
Copyright © 2025 Adam Mill
Licensed under the Apache License, Version 2.0. See LICENSE.txt for details.