RustyParser
A Generic Parser generator and Pattern Matching Library written in Rust
Example
rusty_parser/src/example/example1.rs
// import rusty_parser
use rusty_parser as rp;
// for assert_eq!()
use std::any::type_name;
use std::any::type_name_of_val;
// trait Parser; must be imported for .parse( ... ) method
use rp::Parser;
Structures
Every Parser implements the trait Parser<It>, where It is the type of the iterator that the Parser works on.
The trait Parser has an associated type Output, which is the type of the output: the data extracted from the input string.
The trait Parser has the following methods: parse(...) and match_pattern(...).
parse(...) takes an iterator and returns ParseResult<Self::Output, It>.
match_pattern(...) is used when you only want to check whether the pattern matches, without extracting data.
For some parsers, like repeat, it is expensive to call parse(...) to get the output, since it invokes Vec::insert internally.
ParseResult is a struct representing the result of parsing.
Note that Output must be a tuple (including the empty tuple ()).
Even if the Parser extracts only one element, the output must be a tuple.
Since parse(...) internally clones the iterator, the iterator must be cheaply cloneable.
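The structure described above can be illustrated with a minimal, standard-library-only sketch. The names SketchParser, SketchResult, and OneChar are hypothetical stand-ins invented for this illustration; they are not the crate's actual definitions, and the real ParseResult may have a different shape.

```rust
use std::str::Chars;

// Hypothetical stand-in for ParseResult: the extracted output (None on
// failure) plus the iterator positioned after the consumed input.
struct SketchResult<Output, It> {
    output: Option<Output>,
    it: It,
}

// Hypothetical stand-in for the Parser trait.
trait SketchParser<It: Iterator + Clone> {
    // Always a tuple, even when only one value is extracted.
    type Output;

    // Extracts data from the input.
    fn parse(&self, it: It) -> SketchResult<Self::Output, It>;

    // Only checks whether the pattern matches; the output is ().
    fn match_pattern(&self, it: It) -> SketchResult<(), It>;
}

// Matches exactly one given character.
struct OneChar(char);

impl<'a> SketchParser<Chars<'a>> for OneChar {
    type Output = (char,);

    fn parse(&self, it: Chars<'a>) -> SketchResult<(char,), Chars<'a>> {
        // Clone the iterator so the original position survives a failure.
        let mut probe = it.clone();
        match probe.next() {
            Some(c) if c == self.0 => SketchResult { output: Some((c,)), it: probe },
            _ => SketchResult { output: None, it },
        }
    }

    fn match_pattern(&self, it: Chars<'a>) -> SketchResult<(), Chars<'a>> {
        let res = self.parse(it);
        SketchResult { output: res.output.map(|_| ()), it: res.it }
    }
}
```

The clone inside parse is why the iterator type needs to be cheap to clone: every attempt that may fail keeps a copy of the starting position.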
Basic Parsers
one: consumes one character if it is equal to c.
let parser = one
let a_parser = one;
Output: (Iterator::Item,)
range: consumes one character if it is in the range r.
let parser = range
Output: (Iterator::Item,)
string: consumes multiple characters if they are equal to s.
let parser = string
Output: ()
Dictionary: build Trie from a list of strings
// let mut parser = rp::DictBTree::new();
let mut parser = new;
parser.insert;
parser.insert;
parser.insert;
// this will match as long as possible
let res = parser.parse;
assert_eq!;
// 'hello_world' is parsed, so the rest is "_abcdefg"
assert_eq!;
// match 'hello' only
let res = parser.parse;
assert_eq!;
Output: the generic type you supply
There are two types of Dictionary, DictBTree and DictHashMap, which differ in the Trie implementation.
Each has its own pros and cons (memory usage and search time complexity), so you can choose whichever fits your use case.
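To make the "match as long as possible" behaviour concrete, here is a small trie sketch using only the standard library. TrieNode, its methods, and the i32 values are hypothetical and are not the crate's DictBTree or DictHashMap. The children are stored in a BTreeMap; swapping in a HashMap is the kind of trade-off the two dictionary types offer.

```rust
use std::collections::BTreeMap;

// Hypothetical trie node, not the crate's implementation.
#[derive(Default)]
struct TrieNode {
    children: BTreeMap<char, TrieNode>,
    // Some(value) if an inserted key ends at this node.
    value: Option<i32>,
}

impl TrieNode {
    // Adds `key` to the trie and attaches `value` to its final node.
    fn insert(&mut self, key: &str, value: i32) {
        let mut node = self;
        for c in key.chars() {
            node = node.children.entry(c).or_default();
        }
        node.value = Some(value);
    }

    // Returns the value of the longest inserted key that is a prefix of
    // `input`, together with the unconsumed remainder.
    fn longest_match<'a>(&self, input: &'a str) -> Option<(i32, &'a str)> {
        let mut node = self;
        let mut best = node.value.map(|v| (v, input));
        for (idx, c) in input.char_indices() {
            match node.children.get(&c) {
                Some(child) => node = child,
                None => break,
            }
            // Remember the deepest node seen so far that ends a key.
            if let Some(v) = node.value {
                best = Some((v, &input[idx + c.len_utf8()..]));
            }
        }
        best
    }
}
```

With the keys "hello" and "hello_world" inserted, the input "hello_world_abcdefg" matches the longer key and leaves "_abcdefg", while "hello_wo" falls back to "hello" because the walk never reaches the end of the longer key.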
Combinators
seq: sequence of parsers
let a_parser = one;
let b_parser = one;
// parser sequence
// 'a', and then 'b'
let ab_parser = a_parser.seq;
let res = ab_parser.parse;
assert_eq!;
assert_eq!;
Output: ( L0, L1, ..., R0, R1, ... ), where (L0, L1, ...) are the outputs of the first parser and (R0, R1, ...) are the outputs of the second parser.
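The flattening of the two output tuples can be sketched with a helper trait. Everything here (the Concat trait, the free functions seq and one, and the use of &str instead of an iterator) is an assumption made for a self-contained illustration and does not reflect how the crate implements it.

```rust
// Hypothetical helper trait: joins two output tuples into one flat tuple.
trait Concat<Rhs> {
    type Joined;
    fn concat(self, rhs: Rhs) -> Self::Joined;
}

// () ++ (B,) = (B,)
impl<B> Concat<(B,)> for () {
    type Joined = (B,);
    fn concat(self, rhs: (B,)) -> (B,) {
        rhs
    }
}

// (A,) ++ (B,) = (A, B)
impl<A, B> Concat<(B,)> for (A,) {
    type Joined = (A, B);
    fn concat(self, rhs: (B,)) -> (A, B) {
        (self.0, rhs.0)
    }
}

// (A, B) ++ (C,) = (A, B, C)
impl<A, B, C> Concat<(C,)> for (A, B) {
    type Joined = (A, B, C);
    fn concat(self, rhs: (C,)) -> (A, B, C) {
        (self.0, self.1, rhs.0)
    }
}

// Runs `left`, then `right` on the remaining input; fails if either fails.
fn seq<'a, L, R>(
    left: impl Fn(&'a str) -> Option<(L, &'a str)>,
    right: impl Fn(&'a str) -> Option<(R, &'a str)>,
) -> impl Fn(&'a str) -> Option<(<L as Concat<R>>::Joined, &'a str)>
where
    L: Concat<R>,
{
    move |input: &'a str| {
        let (l, rest) = left(input)?;
        let (r, rest) = right(rest)?;
        Some((l.concat(r), rest))
    }
}

// Matches one specific character and outputs it as a 1-tuple.
fn one<'a>(expected: char) -> impl Fn(&'a str) -> Option<((char,), &'a str)> {
    move |input: &'a str| {
        let mut chars = input.chars();
        match chars.next() {
            Some(c) if c == expected => Some(((c,), chars.as_str())),
            _ => None,
        }
    }
}
```

Chaining seq(seq(one('a'), one('b')), one('c')) yields a flat (char, char, char) rather than a nested ((char, char), char), which is the point of the output rule above.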
or_: or combinator
let a_parser = one;
let b_parser = one;
// ordered choice
// if 'a' is not matched, then try 'b'
// the order is preserved: if both parsers can match, the first one wins
let ab_parser = a_parser.or_;
// 'a' is matched
let res = ab_parser.parse;
assert_eq!;
assert_eq!;
// continue parsing from the rest
// 'a' is not matched, but 'b' is matched
let res = ab_parser.parse;
assert_eq!;
assert_eq!;
// continue parsing from the rest
// 'a' is not matched, 'b' is not matched; failed
let res = ab_parser.parse;
assert_eq!;
assert_eq!;
Output: the Output of the first and second parser.
Note that the outputs of both parsers must be the same type.
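A rough sketch of ordered choice, assuming parsers are plain closures over &str. The function names or_ and prefix below are hypothetical free functions written for this illustration, not the crate's methods. It shows both constraints: the two branches share one output type, and the first branch takes priority.

```rust
// Tries `first`; only if it fails is `second` tried on the same input.
// Both parsers must produce the same output type `T`.
fn or_<'a, T>(
    first: impl Fn(&'a str) -> Option<(T, &'a str)>,
    second: impl Fn(&'a str) -> Option<(T, &'a str)>,
) -> impl Fn(&'a str) -> Option<(T, &'a str)> {
    move |input: &'a str| first(input).or_else(|| second(input))
}

// Matches a literal prefix and outputs a fixed tag value as a 1-tuple.
fn prefix<'a>(
    pat: &'static str,
    value: i32,
) -> impl Fn(&'a str) -> Option<((i32,), &'a str)> {
    move |input: &'a str| input.strip_prefix(pat).map(|rest| ((value,), rest))
}
```

For example, or_(prefix("a", 1), prefix("ab", 2)) applied to "abc" produces the tag 1 and leaves "bc": the second branch would also match, but it is never tried.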
map: map the output of the parser
let a_parser = one;
// map the output
// (Character Type You Entered,) --> (i32, )
let int_parser = a_parser.map;
let res = int_parser.parse;
assert_eq!;
assert_eq!;
Output: return type of the closure (must be a tuple)
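As a hedged illustration of the tuple-in, tuple-out rule, the following sketch maps a (char,) output to an (i32,) output. The free functions map and any_char are hypothetical and operate on &str for brevity; they are not the crate's API.

```rust
// Applies `f` to the output of `parser`, leaving the remaining input as is.
fn map<'a, T, U>(
    parser: impl Fn(&'a str) -> Option<(T, &'a str)>,
    f: impl Fn(T) -> U,
) -> impl Fn(&'a str) -> Option<(U, &'a str)> {
    move |input: &'a str| parser(input).map(|(out, rest)| (f(out), rest))
}

// Consumes any single character and outputs it as a 1-tuple.
fn any_char<'a>() -> impl Fn(&'a str) -> Option<((char,), &'a str)> {
    move |input: &'a str| {
        let mut chars = input.chars();
        chars.next().map(|c| ((c,), chars.as_str()))
    }
}
```

A call such as map(any_char(), |(c,): (char,)| (c as i32,)) destructures the incoming 1-tuple and returns a new 1-tuple, so the mapped parser still satisfies the "Output must be a tuple" requirement.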
repeat: repeat the parser multiple times
let a_parser = one;
// repeat 'a' 3 to 5 times (inclusive)
let multiple_a_parser = a_parser.repeat;
let res = multiple_a_parser.parse;
// four 'a's are parsed
assert_eq!;
assert_eq!;
Output:
- if Output of the repeated parser is (), then Output is ()
- if Output of the repeated parser is (T,), then Output is Vec<T>
- otherwise, Vec< Output of the Repeated Parser >
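The bounded, greedy repetition can be sketched as follows. The free functions repeat and one_char, and the explicit min and max parameters, are assumptions for this illustration; the sketch covers only the case where each repetition yields a single value collected into a Vec.

```rust
// Applies `item` greedily, at most `max` times.
// Fails if fewer than `min` repetitions match.
fn repeat<'a, T>(
    item: impl Fn(&'a str) -> Option<(T, &'a str)>,
    min: usize,
    max: usize,
) -> impl Fn(&'a str) -> Option<(Vec<T>, &'a str)> {
    move |input: &'a str| {
        let mut out = Vec::new();
        let mut rest = input;
        while out.len() < max {
            match item(rest) {
                Some((value, next)) => {
                    out.push(value);
                    rest = next;
                }
                None => break,
            }
        }
        if out.len() >= min {
            Some((out, rest))
        } else {
            None
        }
    }
}

// Matches one specific character and outputs it.
fn one_char<'a>(expected: char) -> impl Fn(&'a str) -> Option<(char, &'a str)> {
    move |input: &'a str| {
        let mut chars = input.chars();
        match chars.next() {
            Some(c) if c == expected => Some((c, chars.as_str())),
            _ => None,
        }
    }
}
```

With bounds 3 and 5, an input starting with four 'a's yields a Vec of four characters, two 'a's fail, and seven 'a's stop after five and leave the rest unconsumed.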
void_: ignore the output of the parser
Forces the output to be ().
It internally calls match_pattern(...) instead of parse(...).
This is useful when you only want to check whether the pattern matches.
For more information, see match_pattern(...) above.
let a_parser = one;
let a_parser = a_parser.map;
let multiple_a_parser = a_parser.repeat;
let multiple_a_void_parser = multiple_a_parser.void_;
// ignore the output of parser
// this internally calls 'match_pattern(...)' instead of 'parse(...)'
let res = multiple_a_void_parser.parse;
assert_eq!;
assert_eq!;
Output
: ()
For complex, highly recursive patterns
By default, all the 'parser-generating' member functions consume self and return a new Parser, and Parser::parse(&self) takes an immutable reference to Self.
However, in some cases you may want to define a recursive parser, which involves a 'reference-of-parser' or 'virtual-class-like' structure.
Luckily, the Rust standard library provides wrappers for these cases; Rc, RefCell, and Box are the most common ones.
RustyParser provides BoxedParser, RCedParser, and RefCelledParser, which are Parser wrappers for Box, Rc, and RefCell respectively.
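To show why the three standard wrappers are needed together for recursion, here is a standard-library-only sketch of a matcher for balanced parentheses. The Matcher trait and the Fail, Shared, and Nested types are hypothetical and are not the crate's BoxedParser, RCedParser, or RefCelledParser. Box<dyn Matcher> erases the concrete type, RefCell allows reassignment through a shared reference, and Rc lets the grammar hold a handle to itself.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical object-safe trait: returns the unconsumed input on success.
trait Matcher {
    fn run<'a>(&self, input: &'a str) -> Option<&'a str>;
}

// Placeholder used before the real grammar is assigned; always fails.
struct Fail;

impl Matcher for Fail {
    fn run<'a>(&self, _input: &'a str) -> Option<&'a str> {
        None
    }
}

// Shared, reassignable handle to a type-erased matcher.
#[derive(Clone)]
struct Shared(Rc<RefCell<Box<dyn Matcher>>>);

impl Shared {
    fn new() -> Self {
        let boxed: Box<dyn Matcher> = Box::new(Fail);
        Shared(Rc::new(RefCell::new(boxed)))
    }

    // Replaces the wrapped matcher; every clone of the handle sees the change.
    fn assign(&self, m: impl Matcher + 'static) {
        let boxed: Box<dyn Matcher> = Box::new(m);
        *self.0.borrow_mut() = boxed;
    }
}

impl Matcher for Shared {
    fn run<'a>(&self, input: &'a str) -> Option<&'a str> {
        self.0.borrow().run(input)
    }
}

// Grammar: nested := '(' nested ')' nested | ''
struct Nested {
    inner: Shared,
}

impl Matcher for Nested {
    fn run<'a>(&self, input: &'a str) -> Option<&'a str> {
        match input.strip_prefix('(') {
            Some(rest) => {
                let rest = self.inner.run(rest)?;
                let rest = rest.strip_prefix(')')?;
                self.inner.run(rest)
            }
            // Empty alternative: match nothing.
            None => Some(input),
        }
    }
}

// Ties the knot: the grammar stored in the handle refers back to the handle.
// Note: this creates an Rc reference cycle, so the matcher is never freed.
fn balanced() -> Shared {
    let handle = Shared::new();
    handle.assign(Nested { inner: handle.clone() });
    handle
}
```

The handle has to exist before the grammar that refers to it, which is why it starts out holding a placeholder and is assigned afterwards.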
boxed: a Box<dyn Parser> wrapper
let hello_parser = string;
let digit_parser = range.void_; // force the output to be ()
// this will wrap the parser into Box< dyn Parser >
let mut boxed_parser = hello_parser.boxed;
// Note. boxed_parser is mutable
let target_string = "hello0123";
let res_hello = boxed_parser.parse;
// success
assert_eq!;
assert_eq!;
// now change boxed_parser to digit_parser
boxed_parser = digit_parser.boxed;
// this is same as:
// boxed_parser.assign(digit_parser);
let res_digit = boxed_parser.parse;
// success
assert_eq!;
assert_eq!;
Output: the Output of the child parser
refcelled: a RefCell<Parser> wrapper
RefCelledParser is useful when combined with BoxedParser or RCedParser, since it provides interior mutability.
let hello_parser = string;
let digit_parser = range.void_;
// this will wrap the parser into Box< dyn Parser >
let boxed_parser = hello_parser.boxed;
let refcelled_parser = boxed_parser.refcelled;
// Note. refcelled_parser is immutable
let target_string = "hello0123";
let res_hello = refcelled_parser.parse;
// success
assert_eq!;
assert_eq!;
// now change refcelled_parser to digit_parser
refcelled_parser // RefCelledParser
.refcelled_parser // &RefCell<BoxedParser>
.borrow_mut // RefMut<BoxedParser> --> &mut BoxedParser
.assign; // assign new parser
let res_digit = refcelled_parser.parse;
// success
assert_eq!;
assert_eq!;
Output: the Output of the child parser
rced: a Rc<Parser> wrapper
RCedParser is used to share the same parser.
let hello_parser = string;
let digit_parser = range.void_;
// this will wrap the parser into Box< dyn Parser >
let boxed_parser = hello_parser.boxed;
let refcelled_parser = boxed_parser.refcelled;
// Note. refcelled_parser is immutable
let rced_parser1 = refcelled_parser.rced;
let rced_parser2 = clone;
// rced_parser2 is now pointing to the same parser as rced_parser1
let target_string = "hello0123";
let res_hello = rced_parser1.parse;
// success
assert_eq!;
assert_eq!;
// now change rced_parser1 to digit_parser
rced_parser1 // RCedParser
.rced_parser // &Rc<RefCelledParser>
.refcelled_parser // &RefCell<BoxedParser>
.borrow_mut // RefMut<BoxedParser> --> &mut BoxedParser
.assign; // assign new parser
// rced_parser2 should also be digit_parser
let res_digit = rced_parser2.parse;
// success
assert_eq!;
assert_eq!;
Output: the Output of the child parser