§A performant, low-level, lightweight and intuitive combinatoric parser library.
Manger allows for translation of the intuition developed for Rust’s primitive and standard library types into your intuition for using this library. Most of the behaviour is defined with the Consumable trait, which can be easily implemented using the consume_struct and consume_enum macros.
This library is suited for deterministic regular languages. It works best alongside a predefined syntax. For example, if you have a predefined EBNF, it is really easy to implement that syntax with this crate.
§Getting Started
To get started with implementing Consumable on your own types, I suggest taking a look at the consume_struct or consume_enum documentation. Then you can come back here and look at some common patterns.
§Common patterns
Parsing, and thus consuming, has many frequently used patterns. Of course, these are readily available here as well.
§Concatenation
Often we want to express that two patterns follow each other in a source string. For example, you might want to express that every Line is followed by a ';'. In manger there are two ways to do this.
§Macros
The first way, and the preferred way, is with the consume_struct or consume_enum macros, where you can present sequential consume instructions. You can see in the following example that we are first consuming a '(', followed by an i32, followed by a closing ')'.
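use manger::{ Consumable, consume_struct };

struct EncasedInteger(i32);
consume_struct!(
    EncasedInteger => [
        // Consume the opening parenthesis.
        > '(',
        // Consume an i32 and bind it to `value`.
        value: i32,
        // Consume the closing parenthesis.
        > ')';
        // Build the struct from the bound value.
        (value)
    ]
);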
§Tuples
Another way to represent the same concept is with the tuple type syntax. This can be done with up to 10 types. Here we are again parsing the same (i32) structure.
use manger::chars;
type EncasedInteger = (chars::OpenParenthese, i32, chars::CloseParenthese);
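To see why a tuple can stand in for a sequence, it may help to look at how such an implementation could work. The sketch below does not use manger: the Consume trait, the Open and Close marker types and parse_encased are hypothetical stand-ins written against the standard library only, and manger's real Consumable trait may well differ in signature and error handling.

```rust
// Hypothetical stand-in trait; NOT manger's actual `Consumable` definition.
trait Consume: Sized {
    /// On success, returns the parsed value and the unconsumed rest of the input.
    fn consume(source: &str) -> Option<(Self, &str)>;
}

// Illustrative marker types for the two parenthesis characters.
struct Open;
struct Close;

impl Consume for Open {
    fn consume(source: &str) -> Option<(Self, &str)> {
        source.strip_prefix('(').map(|rest| (Open, rest))
    }
}

impl Consume for Close {
    fn consume(source: &str) -> Option<(Self, &str)> {
        source.strip_prefix(')').map(|rest| (Close, rest))
    }
}

impl Consume for i32 {
    fn consume(source: &str) -> Option<(Self, &str)> {
        // Accept an optional leading '-' followed by at least one ASCII digit.
        let sign_len = if source.starts_with('-') { 1 } else { 0 };
        let digit_len = source[sign_len..]
            .chars()
            .take_while(|c| c.is_ascii_digit())
            .count();
        if digit_len == 0 {
            return None;
        }
        let end = sign_len + digit_len;
        let value = source[..end].parse::<i32>().ok()?;
        Some((value, &source[end..]))
    }
}

// A 3-tuple consumes its elements one after another, handing the
// remaining input of each step to the next one.
impl<A: Consume, B: Consume, C: Consume> Consume for (A, B, C) {
    fn consume(source: &str) -> Option<(Self, &str)> {
        let (a, rest) = A::consume(source)?;
        let (b, rest) = B::consume(rest)?;
        let (c, rest) = C::consume(rest)?;
        Some(((a, b, c), rest))
    }
}

type EncasedInteger = (Open, i32, Close);

/// Parses "(<integer>)" and returns the integer, or None if any part fails.
fn parse_encased(source: &str) -> Option<i32> {
    let ((_, value, _), _rest) = <EncasedInteger as Consume>::consume(source)?;
    Some(value)
}
```

The point of the design is that the sequence is encoded entirely in the type: no parsing logic has to be written for EncasedInteger itself.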
§Repetition
Most of the time you want to represent some kind of repetition. There are many different ways to represent repetition; here are two easy ones.
§Vec
The easiest way to do repetition is with Vec<T>. This will consume 0 or more instances of type T. Of course, the type T has to have Consumable implemented.
Since Vec<T> will consume instances of type T until it finds an error, it can never fail itself. You are therefore safe to unwrap the result.
Here you can see what that looks like:
use manger::{ Consumable, consume_struct };

struct EncasedInteger(i32);
consume_struct!(
    EncasedInteger => [
        > '[',
        value: i32,
        > ']';
        (value)
    ]
);

let source = "[3][-4][5]";
let (encased_integers, _) = <Vec<EncasedInteger>>::consume_from(source)?;

let sum: i32 = encased_integers
    .iter()
    .map(|EncasedInteger(value)| value)
    .sum();

assert_eq!(sum, 4);
§OneOrMore
The other easy way to do repetition is with OneOrMore<T>. This allows for consuming 1 or more instances of type T. And again, type T has to have Consumable implemented. Here you can see what that looks like:
use manger::{ Consumable, consume_struct };
use manger::common::OneOrMore;

struct EncasedInteger(i32);
consume_struct!(
    EncasedInteger => [
        > '[',
        value: i32,
        > ']';
        (value)
    ]
);

let source = "[3][-4][5]";
let (encased_integers, _) = <OneOrMore<EncasedInteger>>::consume_from(source)?;

let product: i32 = encased_integers
    .into_iter()
    .map(|EncasedInteger(value)| value)
    .product();

assert_eq!(product, -60);
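The practical difference between the two repetition types shows up on input that contains no items at all. The following std-only sketch illustrates that difference without using manger; encased, zero_or_more and one_or_more are hypothetical helper names, and the Option-based return type is an assumption made for brevity rather than manger's actual error type.

```rust
/// Hypothetical item parser: consumes one "[<integer>]" from the front of
/// `source` and returns the integer together with the remaining input.
fn encased(source: &str) -> Option<(i32, &str)> {
    let rest = source.strip_prefix('[')?;
    let close = rest.find(']')?;
    let value = rest[..close].parse::<i32>().ok()?;
    Some((value, &rest[close + 1..]))
}

/// Zero-or-more repetition: keeps consuming items until one fails.
/// It cannot fail itself; at worst it returns an empty Vec.
fn zero_or_more(mut source: &str) -> (Vec<i32>, &str) {
    let mut items = Vec::new();
    while let Some((value, rest)) = encased(source) {
        items.push(value);
        source = rest;
    }
    (items, source)
}

/// One-or-more repetition: the same loop, but an empty result is an error.
fn one_or_more(source: &str) -> Option<(Vec<i32>, &str)> {
    let (items, rest) = zero_or_more(source);
    if items.is_empty() {
        None
    } else {
        Some((items, rest))
    }
}
```

In this sketch zero_or_more("") succeeds with an empty list while one_or_more("") fails, which is the distinction to keep in mind when choosing between Vec<T> and OneOrMore<T>.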
§Optional value
To express optional values you can use the standard Rust type Option<T>. This will consume either 0 or 1 instances of type T.
Since Option<T> consumes an instance of type T only if doing so produces no error, and otherwise consumes nothing, it can never fail itself. You are therefore safe to unwrap the result.
use manger::consume_struct;
use manger::chars;

struct PossiblyEncasedInteger(i32);
consume_struct!(
    PossiblyEncasedInteger => [
        : Option<chars::OpenParenthese>,
        value: i32,
        : Option<chars::CloseParenthese>;
        (value)
    ]
);
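One consequence of making each parenthesis optional on its own is worth spelling out. The std-only sketch below mimics the idea without using manger; optional, open, close, digits and possibly_encased are hypothetical names, and only unsigned digits are handled for brevity. Under these assumptions the two optional parts are independent, so unbalanced input such as "(5" is accepted as well.

```rust
/// Hypothetical optional combinator: tries `parser`, and on failure
/// consumes nothing instead of reporting an error.
fn optional<'a, T>(
    source: &'a str,
    parser: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> (Option<T>, &'a str) {
    match parser(source) {
        Some((value, rest)) => (Some(value), rest),
        None => (None, source),
    }
}

fn open(source: &str) -> Option<((), &str)> {
    source.strip_prefix('(').map(|rest| ((), rest))
}

fn close(source: &str) -> Option<((), &str)> {
    source.strip_prefix(')').map(|rest| ((), rest))
}

/// Consumes one or more ASCII digits (no sign, for brevity).
fn digits(source: &str) -> Option<(i32, &str)> {
    let len = source.chars().take_while(|c| c.is_ascii_digit()).count();
    let value = source[..len].parse::<i32>().ok()?;
    Some((value, &source[len..]))
}

/// Optional '(' then an integer then optional ')'.
fn possibly_encased(source: &str) -> Option<(i32, &str)> {
    let (_, rest) = optional(source, open);
    let (value, rest) = digits(rest)?;
    let (_, rest) = optional(rest, close);
    Some((value, rest))
}
```

If balanced parentheses are required, the either pattern described further down this page is likely a better fit than two separate optional parts.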
§Recursion
Another common pattern seen within combinatoric parsers is recursion. Since Rust types need to have a predefined size, we cannot do direct type recursion and we need to do heap allocation with the Box<T> type from the standard library. We can make a prefixed math expression parser as follows:
use manger::consume_enum;
use manger::common::{OneOrMore, Whitespace};

enum Expression {
    Times(Box<Expression>, Box<Expression>),
    Plus(Box<Expression>, Box<Expression>),
    Constant(u32),
}

consume_enum!(
    Expression {
        Times => [
            > '*',
            : OneOrMore<Whitespace>,
            left: Box<Expression>,
            : OneOrMore<Whitespace>,
            right: Box<Expression>;
            (left, right)
        ],
        Plus => [
            > '+',
            : OneOrMore<Whitespace>,
            left: Box<Expression>,
            : OneOrMore<Whitespace>,
            right: Box<Expression>;
            (left, right)
        ],
        Constant => [
            value: u32;
            (value)
        ]
    }
);
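Once an expression has been consumed, the boxed tree can be walked like any other recursive Rust enum. The sketch below repeats the Expression enum so that it stands on its own and adds an evaluator; evaluate and example_tree are hypothetical helpers, and the tree is built by hand here as the shape one would expect for the prefix input "+ 1 * 2 3", not as the output of an actual manger call.

```rust
enum Expression {
    Times(Box<Expression>, Box<Expression>),
    Plus(Box<Expression>, Box<Expression>),
    Constant(u32),
}

/// Recursively evaluates an expression tree.
fn evaluate(expression: &Expression) -> u32 {
    match expression {
        Expression::Times(left, right) => evaluate(left) * evaluate(right),
        Expression::Plus(left, right) => evaluate(left) + evaluate(right),
        Expression::Constant(value) => *value,
    }
}

/// Hand-built tree for the prefix expression "+ 1 * 2 3".
fn example_tree() -> Expression {
    Expression::Plus(
        Box::new(Expression::Constant(1)),
        Box::new(Expression::Times(
            Box::new(Expression::Constant(2)),
            Box::new(Expression::Constant(3)),
        )),
    )
}
```

Each Box adds one heap allocation per node, which is the price for giving the recursive enum a known size.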
§Whitespace
For whitespace we can use the manger::common::Whitespace struct. This will consume any UTF-8 character that is identified as a whitespace character by the char::is_whitespace function.
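Because the definition is delegated to char::is_whitespace, whitespace here means the Unicode White_Space property, not just ASCII spaces and tabs. The std-only sketch below shows which characters that covers; skip_whitespace is a hypothetical helper and is not part of manger.

```rust
/// Hypothetical helper: drops every leading character for which
/// `char::is_whitespace` returns true and returns the remaining input.
fn skip_whitespace(source: &str) -> &str {
    source.trim_start_matches(char::is_whitespace)
}

/// Counts how many leading characters would be skipped.
fn leading_whitespace_count(source: &str) -> usize {
    source.chars().take_while(|c| c.is_whitespace()).count()
}
```

For example, a no-break space (U+00A0) and an ideographic space (U+3000) count as whitespace, while a zero width space (U+200B) does not.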
§Either
When the input can match one of two possibilities, there are two ways to express this. Each is appropriate in different scenarios.
§Macro
Using the consume_enum macro you can create an enum which can be consumed in a number of ways, with one variant per option. If you need to see which of the different options was selected, this should be your choice.
§Either<L, R>
You can also use the Either<L, R> type (either::Either) to represent the either relationship. This option is preferred if we do not care about which option is selected.
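The either relationship can be pictured as trying the left alternative first and falling back to the right one. The sketch below is std-only and does not use manger or the either crate: the local Either enum, the either function and the number and letter parsers are all hypothetical, and the left-before-right order is an assumption about how such a type would typically behave.

```rust
// Local stand-in for the `either::Either` type, defined here so the
// sketch has no external dependencies.
#[derive(Debug, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// Tries `left` first; if it fails, tries `right` on the same input.
fn either<'a, L, R>(
    source: &'a str,
    left: impl Fn(&'a str) -> Option<(L, &'a str)>,
    right: impl Fn(&'a str) -> Option<(R, &'a str)>,
) -> Option<(Either<L, R>, &'a str)> {
    if let Some((value, rest)) = left(source) {
        return Some((Either::Left(value), rest));
    }
    right(source).map(|(value, rest)| (Either::Right(value), rest))
}

/// Consumes one or more ASCII digits as a u32.
fn number(source: &str) -> Option<(u32, &str)> {
    let len = source.chars().take_while(|c| c.is_ascii_digit()).count();
    let value = source[..len].parse::<u32>().ok()?;
    Some((value, &source[len..]))
}

/// Consumes a single ASCII letter.
fn letter(source: &str) -> Option<(char, &str)> {
    let first = source.chars().next().filter(|c| c.is_ascii_alphabetic())?;
    Some((first, &source[first.len_utf8()..]))
}

/// Consumes either a number or a single letter.
fn number_or_letter(source: &str) -> Option<(Either<u32, char>, &str)> {
    either(source, number, letter)
}
```

Compared with a dedicated enum, this avoids declaring a new type for a one-off choice, at the cost of less descriptive variant names.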
Modules§
- chars: Types for consuming individual UTF-8 characters.
- common: Types for common structures within consuming.
Macros§
- consume_enum: A macro used for defining the way an enum should be consumed. It will implement Consumable for this enum.
- consume_struct: A macro used for defining the way a struct should be consumed. It will implement Consumable for this struct.
Structs§
- ConsumeError: A list of errors that occurred while consuming from a source.
- ConsumeIter: Iterator over a source for a Consumable type T.
Enums§
- ConsumeErrorType: One item in ConsumeError. These can occur while consuming from a source.
Traits§
- Consumable: Trait that defines whether a type can be interpreted from a source string or not. It is the trait that defines most behaviour for manger.
- ConsumeSource: Trait that exposes some functions for easier consuming syntax on &str.
- SelfConsumable: Trait which allows for consuming of instances and literals from a string.