# Parser Compose
> ⚠️ Warning ☣️
Homemade, hand-rolled code ahead. Experimental. May not function as advertised.
[Documentation](https://docs.rs/parser-compose/latest/parser_compose/)
[Examples](https://gitlab.com/wake-sleeper/parser-compose/-/tree/main/tests)
`parser-compose` is a library for writing and composing parsers for arbitrary
file or data formats. It has a strong focus on usability and API design.
It's based on the ideas around [parser
combinators](https://en.wikipedia.org/wiki/Parser_combinator) and [parsing
expression grammars](https://en.wikipedia.org/wiki/Parsing_expression_grammar),
but don't let those terms scare you off.
I made this because many of my projects involve parsing _something_ (like a
configuration file, a binary format, an HTTP message), but the field of parsing
and language theory sounds incredibly dull to me. I have always resorted to
"string.split()"-type parsing, but it is tedious.
Turns out there is a better way! No theory required.
## Crash course in parser combinators
_(Note: Skip this and head straight to the documentation if you are already
familiar with the concepts)_
Say you want to extract the letter 'a' from a sequence of bytes. You can write a
parser function that takes the bytes as its argument and returns successfully if
it sees the byte `97` (ASCII for 'a') at the start:
> _Note: I am not handling edge cases to keep things brief_
```rust
// If we find `97` at the start of the sequence, extract it and put it in a
// tuple together with the remaining input.
fn match_a(input: &[u8]) -> Result<(u8, &[u8]), String> {
    if input[0] == 97 {
        Ok((input[0], &input[1..]))
    } else {
        Err("could not find 97".to_string())
    }
}

fn main() {
    let msg = &b"abc"[..];
    let (value, remaining) = match_a(msg).unwrap();
    println!("{value}");
    // -> 97
    println!("{remaining:?}");
    // -> [98, 99]
}
```
Ok, but what if you wanted to parse `98` now? You can write a function that
builds a parser. The argument to this parser builder will be the byte you'd
like to recognize. The parser builder will return a parser that accepts that byte.
```rust
fn match_u8(expected: u8) -> impl Fn(&[u8]) -> Result<(u8, &[u8]), String> {
    // The returned closure captures `expected` and is itself a parser.
    move |input| {
        if input[0] == expected {
            Ok((input[0], &input[1..]))
        } else {
            Err(format!("could not find {expected}"))
        }
    }
}

fn main() {
    let msg = &b"abc"[..];
    let (value, remaining) = match_u8(b'a')(msg).unwrap();
    println!("{value}");
    // -> 97
    println!("{remaining:?}");
    // -> [98, 99]

    // Note how we parse `remaining` instead of `msg` here.
    // This is how you "move" through the input
    let (value, remaining) = match_u8(b'b')(remaining).unwrap();
    println!("{value}");
    // -> 98
    println!("{remaining:?}");
    // -> [99]
}
```
For the final touch: what if you wanted to recognize `97` or `98`? We
can write a function that ... uhh ... _combines_ (hint, hint, wink, wink) two
parsers it gets as arguments. When this combiner function is called, it returns
a parser that succeeds with the value of the first succeeding inner
parser.
```rust
// `or` is what is referred to as a "parser combinator"
fn or<P1, P2>(parser1: P1, parser2: P2) -> impl Fn(&[u8]) -> Result<(u8, &[u8]), String>
where
    P1: Fn(&[u8]) -> Result<(u8, &[u8]), String>,
    P2: Fn(&[u8]) -> Result<(u8, &[u8]), String>,
{
    move |input| {
        // Try the first parser; only if it fails, try the second on the same input.
        match parser1(input) {
            Ok((value, rest)) => Ok((value, rest)),
            Err(_) => match parser2(input) {
                Ok((value, rest)) => Ok((value, rest)),
                Err(e) => Err(e),
            },
        }
    }
}

// Same parser builder as in the previous example.
fn match_u8(expected: u8) -> impl Fn(&[u8]) -> Result<(u8, &[u8]), String> {
    move |input| {
        if input[0] == expected {
            Ok((input[0], &input[1..]))
        } else {
            Err(format!("could not find {expected}"))
        }
    }
}

fn main() {
    let msg = &b"abc"[..];
    let (value, remaining) = or(
        match_u8(b'a'),
        match_u8(b'b')
    )(msg).unwrap();
    println!("{value}");
    // -> 97
    println!("{remaining:?}");
    // -> [98, 99]

    // Parse `remaining` (not `msg`) so that the second alternative matches.
    let (value, remaining) = or(
        match_u8(b'a'),
        match_u8(b'b')
    )(remaining).unwrap();
    println!("{value}");
    // -> 98
    println!("{remaining:?}");
    // -> [99]
}
```
That is the basic idea. You can now go crazy writing all sorts of combinators
like `and()` and `optional()`, and use them to combine your parsers
together ... or you could use this crate :)
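
To give an idea of what that might look like, here is a minimal sketch of hand-written `and()` and `optional()` combinators in the same style as the examples above. The signatures and the tuple/`Option` return shapes are assumptions made for illustration; they are not this crate's API. Unlike the earlier snippets, this `match_u8` uses `split_first()` so that empty input yields an error instead of a panic.

```rust
// Hypothetical helpers for illustration only, not part of parser-compose.

// Builds a parser that accepts exactly one `expected` byte.
fn match_u8(expected: u8) -> impl Fn(&[u8]) -> Result<(u8, &[u8]), String> {
    move |input| match input.split_first() {
        Some((&first, rest)) if first == expected => Ok((first, rest)),
        _ => Err(format!("could not find {expected}")),
    }
}

// Succeeds only if `parser1` and then `parser2` both succeed, in sequence.
fn and<P1, P2>(parser1: P1, parser2: P2) -> impl Fn(&[u8]) -> Result<((u8, u8), &[u8]), String>
where
    P1: Fn(&[u8]) -> Result<(u8, &[u8]), String>,
    P2: Fn(&[u8]) -> Result<(u8, &[u8]), String>,
{
    move |input| {
        let (first, rest) = parser1(input)?;
        // The second parser continues where the first one stopped.
        let (second, rest) = parser2(rest)?;
        Ok(((first, second), rest))
    }
}

// Never fails: yields `None` and leaves the input untouched if `parser` fails.
fn optional<P>(parser: P) -> impl Fn(&[u8]) -> Result<(Option<u8>, &[u8]), String>
where
    P: Fn(&[u8]) -> Result<(u8, &[u8]), String>,
{
    move |input| match parser(input) {
        Ok((value, rest)) => Ok((Some(value), rest)),
        Err(_) => Ok((None, input)),
    }
}

fn main() {
    let msg = &b"abc"[..];

    let ((first, second), remaining) = and(match_u8(b'a'), match_u8(b'b'))(msg).unwrap();
    println!("{first} {second} {remaining:?}");
    // -> 97 98 [99]

    let (value, remaining) = optional(match_u8(b'x'))(msg).unwrap();
    println!("{value:?} {remaining:?}");
    // -> None [97, 98, 99]
}
```

Because every combinator returns something with the same "bytes in, value and rest out" shape, the results can be nested freely, for example `and(match_u8(b'a'), or(...))` with the `or` from above.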
## Similar projects
- [nom](https://crates.io/crates/nom)
- [pom](https://crates.io/crates/pom)
- [combine](https://crates.io/crates/combine)
## Thanks
This crate would not have been possible without:
- This post called [You could have invented Parser
Combinators](https://www.theorangeduck.com/page/you-could-have-invented-parser-combinators),
which brought the concept of parser combinators down from "academic sounding
term, no thank you" to "wow, I can understand this"
- [This guide to writing parser combinators in
Rust](https://bodil.lol/parser-combinators/)
- [This
article](https://github.com/J-F-Liu/pom/blob/master/doc/article.md#what-is-parser-combinator)
by the author of `pom`, which lays out the various approaches to writing parser
combinators in Rust.