whitehole
A simple, fast, intuitive parser combinator framework for Rust.
Features
- Simple: only a handful of combinators to remember: eat, till, next, take, wrap, recur.
- Operator overloading: use +, |, ! to compose combinators, use * to repeat a combinator.
- Almost zero heap allocation: this framework only uses stack memory, except recur which uses some pointers for recursion.
- Re-usable heap memory: store accumulated values in a parser-managed heap, instead of re-allocation for each iteration.
- Stateful-able: control the parsing flow with an optional custom state.
- Safe by default, with unsafe variants for performance.
- Provide both string (&str) and bytes (&[u8]) support.
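To give a feel for the operator-overloading idea, here is a minimal std-only sketch of how + and | can compose parsers through std::ops::Add and std::ops::BitOr. It is not whitehole's implementation: the names Parse, Combinator, Eat, Seq and Alt, and this eat helper, are hypothetical and exist only for illustration. Composition here builds plain nested structs, so nothing is heap-allocated.

```rust
use std::ops::{Add, BitOr};

/// Hypothetical trait: returns the number of bytes digested, or `None` on rejection.
trait Parse {
    fn parse(&self, input: &str) -> Option<usize>;
}

/// Newtype wrapper, so the operator traits are implemented on a local type.
struct Combinator<T>(T);

impl<T: Parse> Combinator<T> {
    fn parse(&self, input: &str) -> Option<usize> {
        self.0.parse(input)
    }
}

/// Matches a fixed prefix (illustrative only).
struct Eat(&'static str);

impl Parse for Eat {
    fn parse(&self, input: &str) -> Option<usize> {
        input.starts_with(self.0).then_some(self.0.len())
    }
}

/// `a + b`: run `a`, then run `b` on the remaining input.
struct Seq<A, B>(A, B);

impl<A: Parse, B: Parse> Parse for Seq<A, B> {
    fn parse(&self, input: &str) -> Option<usize> {
        let first = self.0.parse(input)?;
        let second = self.1.parse(input.get(first..)?)?;
        Some(first + second)
    }
}

/// `a | b`: try `a`, and fall back to `b` on the same input.
struct Alt<A, B>(A, B);

impl<A: Parse, B: Parse> Parse for Alt<A, B> {
    fn parse(&self, input: &str) -> Option<usize> {
        self.0.parse(input).or_else(|| self.1.parse(input))
    }
}

impl<A, B> Add<Combinator<B>> for Combinator<A> {
    type Output = Combinator<Seq<A, B>>;
    fn add(self, rhs: Combinator<B>) -> Self::Output {
        Combinator(Seq(self.0, rhs.0))
    }
}

impl<A, B> BitOr<Combinator<B>> for Combinator<A> {
    type Output = Combinator<Alt<A, B>>;
    fn bitor(self, rhs: Combinator<B>) -> Self::Output {
        Combinator(Alt(self.0, rhs.0))
    }
}

/// Hypothetical constructor for the sketch above.
fn eat(prefix: &'static str) -> Combinator<Eat> {
    Combinator(Eat(prefix))
}
```

With these definitions, (eat("#") + eat("ff")).parse("#ff00") digests 3 bytes, and (eat("0x") | eat("#")).parse("#ff") digests 1 byte through the second branch.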
Installation
Examples
See the examples directory for more examples.
Here is a simple example to parse hexadecimal color codes:
use ;
let double_hex = ;
// Concat multiple combinators with `+`.
// Tuple values will be concatenated into a single tuple.
// Here `() + (u8,) + (u8,) + (u8,)` will be `(u8, u8, u8)`.
let entry = eat + double_hex + double_hex + double_hex;
let mut parser = builder.entry.build;
let output = parser.next.unwrap;
assert_eq!;
assert_eq!;
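For comparison, the same parse can be written by hand with only the standard library. This sketch does not use whitehole at all; parse_hex_color is a hypothetical helper that returns the number of bytes digested together with the three channel values, mirroring the (u8, u8, u8) tuple described in the comments above.

```rust
/// Parse `#RRGGBB` from the start of `input`.
/// Returns the digested byte count and the (r, g, b) channels, or `None` on rejection.
fn parse_hex_color(input: &str) -> Option<(usize, (u8, u8, u8))> {
    let rest = input.strip_prefix('#')?;

    // Read the i-th pair of hex digits after the `#`.
    let channel = |i: usize| -> Option<u8> {
        // `get` returns `None` instead of panicking on short input.
        let pair = rest.get(i * 2..i * 2 + 2)?;
        // `from_str_radix` alone would accept a leading `+`, so check the digits first.
        if !pair.bytes().all(|b| b.is_ascii_hexdigit()) {
            return None;
        }
        u8::from_str_radix(pair, 16).ok()
    };

    // One byte for `#` plus three two-digit channels.
    Some((1 + 6, (channel(0)?, channel(1)?, channel(2)?)))
}
```

For the input "#FFA500" this yields 7 digested bytes and the channels (255, 165, 0).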
How to Debug
With Logging
The easiest way is to apply .log(name) to any combinator you need to inspect.
use ;
let double_hex = ;
let entry =
.log;
let mut parser = builder.entry.build;
parser.next.unwrap;
Output:
(entry) input: "#FFA500"
| (hash) input: "#FFA500"
| (hash) output: Some("#")
| (R) input: "FFA500"
| | (double_hex) input: "FFA500"
| | | (hex) input: "FFA500"
| | | (hex) output: Some("F")
| | | (hex) input: "FA500"
| | | (hex) output: Some("F")
| | (double_hex) output: Some("FF")
| (R) output: Some("FF")
| (G) input: "A500"
| | (double_hex) input: "A500"
| | | (hex) input: "A500"
| | | (hex) output: Some("A")
| | | (hex) input: "500"
| | | (hex) output: Some("5")
| | (double_hex) output: Some("A5")
| (G) output: Some("A5")
| (B) input: "00"
| | (double_hex) input: "00"
| | | (hex) input: "00"
| | | (hex) output: Some("0")
| | | (hex) input: "0"
| | | (hex) output: Some("0")
| | (double_hex) output: Some("00")
| (B) output: Some("00")
(entry) output: Some("#FFA500")
If you need to inspect your custom state and heap, you can use combinator decorators or write your own combinator extensions to achieve this.
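As a rough illustration of how a logging decorator can produce a nested trace like the one above, here is a std-only sketch. It is independent of whitehole's API: Logger, wrap and trace_double_hex are hypothetical names, parsers are modelled as plain closures that return the number of bytes digested, and lines are collected into a Vec instead of being printed.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical trace collector: tracks nesting depth and the emitted lines.
#[derive(Default)]
struct Logger {
    depth: Cell<usize>,
    lines: RefCell<Vec<String>>,
}

impl Logger {
    /// Decorate `inner` so that its input and output are recorded with indentation.
    fn wrap<'a>(
        &'a self,
        name: &'a str,
        inner: impl Fn(&str) -> Option<usize> + 'a,
    ) -> impl Fn(&str) -> Option<usize> + 'a {
        move |input: &str| {
            let pad = "| ".repeat(self.depth.get());
            self.lines
                .borrow_mut()
                .push(format!("{pad}({name}) input: {input:?}"));
            // Nested decorators called by `inner` are indented one level deeper.
            self.depth.set(self.depth.get() + 1);
            let digested = inner(input);
            self.depth.set(self.depth.get() - 1);
            let matched = digested.and_then(|n| input.get(..n));
            self.lines
                .borrow_mut()
                .push(format!("{pad}({name}) output: {matched:?}"));
            digested
        }
    }
}

/// Build a traced "two hex digits" parser and return the recorded lines.
fn trace_double_hex(input: &str) -> Vec<String> {
    let log = Logger::default();
    let hex = log.wrap("hex", |s: &str| {
        s.starts_with(|c: char| c.is_ascii_hexdigit()).then_some(1)
    });
    let double_hex = log.wrap("double_hex", |s: &str| {
        let first = hex(s)?;
        let second = hex(s.get(first..)?)?;
        Some(first + second)
    });
    let _ = double_hex(input);
    let lines = log.lines.borrow().clone();
    lines
}
```

The same decorator shape could also record a custom state or heap snapshot next to the input, which is the kind of inspection the paragraph above refers to.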
With Breakpoints
Because of the high-level abstraction, it is hard to set breakpoints on combinators.
One workaround is to use wrap to wrap your combinator in a closure or function and manually call Action::exec.
use ;
let double_hex = ;
// wrap the original combinator
let double_hex = ;
let entry = eat + double_hex + double_hex + double_hex;
let mut parser = builder.entry.build;
parser.next.unwrap;
Documentation
Benchmarks
Related
in_str: a procedural macro to generate a closure that checks if a character is in the provided literal string.
Credits
This project is inspired by: