Trait parsell::Boxable
[−]
[src]
pub trait Boxable<Ch, Str, Output> { fn more_boxable(&mut self, string: &mut Str) -> ParseResult<(), Output>; fn done_boxable(&mut self) -> Output; }
A trait for boxable parsers.
Regular languages can be parsed in constant memory, so do not require any heap allocation (other than the heap allocation peformed by user code such as creating buffers). Context-free languages require allocating unbounded memory. In order to support streaming input, the state of the parser must be saved on the heap, and restored when more input arrives.
In Rust, heap-allocated data is often kept in a Box<T>
, where T
is a trait. In the case
of parsers, the library needs to save and restore a stateful parser, for which the obvious
type is Box<Stateful<Ch, Str, Output>
. There are two issues with this...
Firstly, the lifetimes mentioned in the type of the parser may change over time. For example, the program:
let this: &'a str = ...;
let that: &'b str = ...;
let parser = character(char::is_alphanumeric).star(ignore).buffer();
match parser.init_str(this).unwrap() {
Continue(stateful) => {
let boxed: Box<Stateful<char, Chars<'a>, Cow<'a, str>>> = Box::new(stateful);
match boxed.more_str(that) {
Done(result: Cow<'b,str>) => ...
does not typecheck, because the type of boxed
is fixed as containing parsers for input &'a str
,
but it was fed input of type &'b str
. To fix this, we change the type of the box to be
polymorphic in the lifetime of the parser:
let this: &'a str = ...;
let that: &'b str = ...;
let parser = character(char::is_alphanumeric).star(ignore).buffer();
match parser.init_str(this).unwrap() {
Continue(stateful) => {
let boxed: Box<for<'c> Stateful<char, Chars<'c>, Cow<'c, str>>> = Box::new(stateful);
match boxed.more_str(that) {
Done(result: Cow<'b,str>) => ...
Secondly, the Stateful
trait is not
object-safe,
so cannot be boxed and unboxed safely. In order to address this, there is a trait
Boxable<Ch, Str, Output>
, which represents stateful parsers, but is object-safe
and so can be boxed and unboxed safely.
The type Box<Boxable<Ch, Str, Output>>
implements the trait
Stateful<Ch, Str, Output>
, so boxes can be used as parsers,
which allows stateful parsers to heap-allocate their state.
Boxable parsers are usually used in recursive-descent parsing, for context-free grammars that cannot be parsed as a regular language. For example, consider a simple type for trees:
struct Tree(Vec<Tree>);
which can be parsed from a well-nested sequence of parentheses, for example
(()())
can be parsed as Tree(vec![Tree(vec![]),Tree(vec![])])
.
The desired implementation is:
fn is_lparen(ch: char) -> bool { ch == '(' }
fn is_rparen(ch: char) -> bool { ch == ')' }
fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) }
fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) }
fn mk_err<T>(_: Option<char>) -> Result<T,String> { Err(String::from("Expected a ( or ).")) }
fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Tree { Tree(children) }
let LPAREN = character(is_lparen);
let RPAREN = character(is_rparen).map(mk_ok).or_else(CHARACTER.map(mk_err));
let TREE = LPAREN
.and_then_try(TREE.star(mk_vec))
.try_and_then_try(RPAREN)
.try_map3(mk_tree);
but this doesn't work because it gives the definition of TREE
in terms of itself,
and Rust doesn't allow this kind of cycle.
Instead, the solution is to define a struct TreeParser
, and then implement UncommittedInfer<char, Chars<'a>>
for it. The type of the state of a TreeParser
is a box containing an appropriate
BoxableParserState
trait:
type TreeParserState = Box<for<'b> Boxable<char, Chars<'b>, Result<Tree, String>>>;
The implementation of UncommittedInfer<char, Chars<'b>>
for TreeParser
is mostly straightfoward:
struct Tree(Vec<Tree>); impl StaticMarker for Tree {} type TreeParserState = InState<TreeParser, Box<for<'a> Boxable<char, Chars<'a>, Result<Tree, String>>>>; struct TreeParser; impl Parser for TreeParser {} impl<'a> HasOutput<char, Chars<'a>> for TreeParser { type Output = Result<Tree, String>; } impl<'a> Uncommitted<char, Chars<'a>, Result<Tree, String>> for TreeParser { type State = TreeParserState; fn init(&self, data: &mut Chars<'a>) -> Option<ParseResult<TreeParserState, Result<Tree, String>>> { // ... parser goes here...` } }
The important thing is that the definiton of parse
can make use of TREE
, so the parser can call itself
recursively, then box up the result state. Boxing up the state makes use of a method parser.boxed(mk_box)
where mk_box
is a function that boxes up boxable state:
type TreeParserState = InState<TreeParser, Box<for<'a> Boxable<char, Chars<'a>, Result<Tree, String>>>>; fn is_lparen(ch: char) -> bool { ch == '(' } fn is_rparen(ch: char) -> bool { ch == ')' } fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) } fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) } fn mk_err<T>(_: Option<char>) -> Result<T,String> { Err(String::from("Expected a ( or ).")) } fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Tree { Tree(children) } fn mk_box<P>(state: P) -> TreeParserState where P: 'static+for<'a> Boxable<char, Chars<'a>, Result<Tree,String>> { TreeParser.in_state(Box::new(state)) } struct TreeParser; impl Parser for TreeParser {} impl<'a> HasOutput<char, Chars<'a>> for TreeParser { type Output = Result<Tree, String>; } impl<'a> Uncommitted<char, Chars<'a>, Result<Tree, String>> for TreeParser { type State = TreeParserState; fn init(&self, data: &mut Chars<'a>) -> Option<ParseResult<TreeParserState, Result<Tree, String>>> { let LPAREN = character(is_lparen); let RPAREN = character(is_rparen).map(mk_ok).or_else(CHARACTER.map(mk_err)); let parser = LPAREN .and_then_try(TreeParser.star(mk_vec)) .try_and_then_try(RPAREN) .try_map3(mk_tree) .boxed(mk_box); parser.init(data) } } let TREE = TreeParser; match TREE.init_str("((").unwrap() { Continue(parsing) => match parsing.more_str(")())") { Done(result) => assert_eq!(result, Ok(Tree(vec![Tree(vec![]),Tree(vec![])]))), _ => panic!("Can't happen"), }, _ => panic!("Can't happen"), }
The reason for making Boxable<S>
a different trait from StatefulInfer<S>
is that it provides weaker safety guarantees. StatefulInfer<S>
enforces that
clients cannot call parse
after done
, but Boxable<S>
does not.
Required Methods
fn more_boxable(&mut self, string: &mut Str) -> ParseResult<(), Output>
fn done_boxable(&mut self) -> Output
Implementors
impl<P, Ch, Str, Output> Boxable<Ch, Str, Output> for BoxableState<P> where P: Stateful<Ch, Str, Output>