ICU MessageFormat Parser (Rust)
A Rust implementation of the ICU MessageFormat parser, optimized for performance and WebAssembly compilation.
Features
- Full ICU MessageFormat syntax support
- High-performance parsing
- WebAssembly-ready with wasm-bindgen
- Zero-copy parsing where possible
- Comprehensive error handling
Project Structure
- lib.rs - Main library and WASM bindings
- parser.rs - Core parser implementation
- types.rs - AST types
- error.rs - Error types
- date_time_pattern_generator.rs - Date/time pattern support
- manipulator.rs - AST manipulation utilities
- printer.rs - AST printing utilities
Building
Native Rust Library
# Run tests
# Build library
# Run benchmarks
WebAssembly
The parser can be compiled to WebAssembly using Bazel's platform transition approach.
Build with Bazel
This uses rust_shared_library with platform = "@rules_rust//rust/platform:wasm" to cross-compile to wasm32.
What Gets Built
The WASM build includes:
- formatjs_icu_messageformat_parser_bg.wasm - The WASM binary (~1.2MB)
- formatjs_icu_messageformat_parser.js - JavaScript glue code generated by wasm-bindgen
- formatjs_icu_messageformat_parser.d.ts - TypeScript type definitions
- formatjs_icu_messageformat_parser_bg.wasm.d.ts - WASM module types
WASM Configuration
The WASM build uses:
- crate-type: cdylib for dynamic library output
- features: wasm feature flag enables wasm-bindgen dependencies
- platform: @rules_rust//rust/platform:wasm for wasm32 target
- dependencies: wasm-bindgen and serde-wasm-bindgen for JS interop
See BUILD.bazel for the full configuration.
WASM API
When compiled to WASM, the parser exports two functions:
parse(input: string): MessageFormatElement[]
Parse ICU MessageFormat with default options.
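import init, { parse } from './formatjs_icu_messageformat_parser.js';
await init();
// Illustrative input; any ICU MessageFormat string can be passed here
const ast = parse('Hello, {name}!');
console.log(ast);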
parse_ignore_tag(input: string): MessageFormatElement[]
Parse with ignore_tag option enabled (treats HTML-like tags as literals).
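import init, { parse_ignore_tag } from './formatjs_icu_messageformat_parser.js';
await init();
// Illustrative input; the <b> tag is kept as literal text
const ast = parse_ignore_tag('Click <b>here</b>');
console.log(ast);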
Both functions return the parsed AST as a JavaScript object or throw an error on parse failure.
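The difference between the two entry points can be illustrated with a toy sketch. This is not the crate's parser: the `Element` enum, `split_tag`, and `toy_parse` are hypothetical names invented for the illustration, the sketch only understands simple non-nested-by-name `<name>...</name>` pairs, and it ignores `{argument}` syntax entirely. It only shows the idea that, with ignore_tag enabled, tag-like text stays inside a literal instead of becoming a tag node.

```rust
/// Toy AST with just the two node kinds needed for this illustration.
#[derive(Debug, PartialEq)]
pub enum Element {
    Literal(String),
    Tag { name: String, children: Vec<Element> },
}

/// If `s` starts with `<name>...</name>`, returns (name, body, remainder).
fn split_tag(s: &str) -> Option<(&str, &str, &str)> {
    let inner = s.strip_prefix('<')?;
    let gt = inner.find('>')?;
    let name = &inner[..gt];
    if name.is_empty() || !name.chars().all(|c| c.is_ascii_alphanumeric()) {
        return None;
    }
    let after_open = &inner[gt + 1..];
    let closing = format!("</{}>", name);
    let close_at = after_open.find(closing.as_str())?;
    Some((
        name,
        &after_open[..close_at],
        &after_open[close_at + closing.len()..],
    ))
}

/// Toy parser: with `ignore_tag` set, tags are never recognized.
pub fn toy_parse(input: &str, ignore_tag: bool) -> Vec<Element> {
    if ignore_tag {
        // The whole input is kept as a single literal, tags included.
        return if input.is_empty() {
            Vec::new()
        } else {
            vec![Element::Literal(input.to_string())]
        };
    }
    let mut out = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        if let Some((name, body, after)) = split_tag(rest) {
            out.push(Element::Tag {
                name: name.to_string(),
                children: toy_parse(body, false),
            });
            rest = after;
            continue;
        }
        // Literal text runs up to the next '<' after the first character,
        // so a stray '<' that is not a tag is still consumed as text.
        let end = rest
            .char_indices()
            .skip(1)
            .find(|&(_, c)| c == '<')
            .map(|(i, _)| i)
            .unwrap_or(rest.len());
        out.push(Element::Literal(rest[..end].to_string()));
        rest = &rest[end..];
    }
    out
}
```

Calling `toy_parse("Hi <b>there</b>", false)` yields a literal followed by a tag node with one literal child, while passing `true` yields one literal containing the original string unchanged.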
Usage in Packages
The WASM binary is used by the @formatjs/icu-messageformat-parser-wasm npm package, which provides a convenient JavaScript wrapper:
// Assumes the wrapper exports a function named parse
import { parse } from '@formatjs/icu-messageformat-parser-wasm';
// Automatically initializes WASM on first call
const ast = await parse('Hello, {name}!');
Implementation Notes
Platform Transition
The build uses Bazel's platform transition feature to cross-compile from the host platform to wasm32:
This approach:
- ✅ Works entirely within Bazel's hermetic build system
- ✅ No external tools (like wasm-pack) required at build time
- ✅ Leverages rules_rust's native WASM support
- ✅ Automatically uses the wasm32 dummy CC toolchain
WASM Bindgen Integration
The wasm feature flag in Cargo.toml enables:
- wasm-bindgen for JS interop
- serde-wasm-bindgen for serializing complex types to JS
- Exported parse and parse_ignore_tag functions
The Rust code uses #[cfg(feature = "wasm")] to conditionally compile WASM-specific code.
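A minimal sketch of how such feature gating can look, assuming a simplified layout. `ParseError`, `parse_core`, and the `wasm_api` module are hypothetical stand-ins, not the crate's actual items, and the "parser" here only checks brace balance. The gated module is compiled only when the `wasm` feature is enabled, so the rest of the file builds with the standard library alone.

```rust
/// Hypothetical error type standing in for the crate's real error type.
#[derive(Debug, PartialEq)]
pub struct ParseError(pub String);

/// Core logic, always compiled. This toy version only checks that
/// braces are balanced and returns the input as a single "element".
pub fn parse_core(input: &str) -> Result<Vec<String>, ParseError> {
    let mut depth = 0i32;
    for c in input.chars() {
        match c {
            '{' => depth += 1,
            '}' => depth -= 1,
            _ => {}
        }
        if depth < 0 {
            return Err(ParseError("unmatched '}'".to_string()));
        }
    }
    if depth != 0 {
        return Err(ParseError("unclosed '{'".to_string()));
    }
    Ok(vec![input.to_string()])
}

/// Compiled only when the crate is built with the `wasm` feature, so
/// native builds never need wasm-bindgen or serde-wasm-bindgen.
#[cfg(feature = "wasm")]
mod wasm_api {
    use wasm_bindgen::prelude::*;

    /// wasm-bindgen turns an `Err` return value into a thrown
    /// JavaScript exception on the calling side.
    #[wasm_bindgen]
    pub fn parse(input: &str) -> Result<JsValue, JsValue> {
        let ast = super::parse_core(input).map_err(|e| JsValue::from_str(&e.0))?;
        serde_wasm_bindgen::to_value(&ast).map_err(JsValue::from)
    }
}
```

Keeping the core function free of WASM types means native tests and benchmarks can exercise it directly, while the thin gated wrapper handles conversion to JS values.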
Dependencies
- icu - Unicode/ICU functionality
- regex - Pattern matching
- serde - Serialization framework
- once_cell - Lazy static initialization
WASM-only dependencies (behind wasm feature):
- wasm-bindgen - JS interop
- serde-wasm-bindgen - Serialize to JS values
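For readers unfamiliar with lazy static initialization, the pattern that once_cell is listed for can be sketched as follows. To stay free of external crates, this sketch swaps in the standard library's `std::sync::OnceLock`, which offers similar behavior; it is not how the crate itself is written. The `plural_categories` table is a hypothetical example built from the six standard ICU plural category keywords.

```rust
use std::collections::HashSet;
use std::sync::OnceLock;

/// Returns a process-wide set that is built on first use and then reused.
pub fn plural_categories() -> &'static HashSet<&'static str> {
    static TABLE: OnceLock<HashSet<&'static str>> = OnceLock::new();
    // The closure runs at most once, even if several threads race here.
    TABLE.get_or_init(|| {
        ["zero", "one", "two", "few", "many", "other"]
            .into_iter()
            .collect()
    })
}

/// Cheap lookup against the lazily built table.
pub fn is_plural_category(keyword: &str) -> bool {
    plural_categories().contains(keyword)
}
```

The same shape applies to anything costly to construct, such as a compiled regex: the cost is paid once on first access rather than on every parse.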
Development
Regenerate Generated Files
# Regenerate time data
# Regenerate regex patterns
Testing
# Run Rust tests
# Run benchmarks