Trait parsell::Boxable [] [src]

pub trait Boxable<Ch, Str, Output> {
    fn more_boxable(&mut self, string: &mut Str) -> ParseResult<(), Output>;
    fn done_boxable(&mut self) -> Output;
}

A trait for boxable parsers.

Regular languages can be parsed in constant memory, so do not require any heap allocation (other than the heap allocation peformed by user code such as creating buffers). Context-free languages require allocating unbounded memory. In order to support streaming input, the state of the parser must be saved on the heap, and restored when more input arrives.

In Rust, heap-allocated data is often kept in a Box<T>, where T is a trait. In the case of parsers, the library needs to save and restore a stateful parser, for which the obvious type is Box<Stateful<Ch, Str, Output>. There are two issues with this...

Firstly, the lifetimes mentioned in the type of the parser may change over time. For example, the program:

let this: &'a str = ...;
let that: &'b str = ...;
let parser = character(char::is_alphanumeric).star(ignore).buffer();
match parser.init_str(this).unwrap() {
    Continue(stateful) => {
        let boxed: Box<Stateful<char, Chars<'a>, Cow<'a, str>>> = Box::new(stateful);
        match boxed.more_str(that) {
            Done(result: Cow<'b,str>) => ...

does not typecheck, because the type of boxed is fixed as containing parsers for input &'a str, but it was fed input of type &'b str. To fix this, we change the type of the box to be polymorphic in the lifetime of the parser:

let this: &'a str = ...;
let that: &'b str = ...;
let parser = character(char::is_alphanumeric).star(ignore).buffer();
match parser.init_str(this).unwrap() {
    Continue(stateful) => {
        let boxed: Box<for<'c> Stateful<char, Chars<'c>, Cow<'c, str>>> = Box::new(stateful);
        match boxed.more_str(that) {
            Done(result: Cow<'b,str>) => ...

Secondly, the Stateful trait is not object-safe, so cannot be boxed and unboxed safely. In order to address this, there is a trait Boxable<Ch, Str, Output>, which represents stateful parsers, but is object-safe and so can be boxed and unboxed safely.

The type Box<Boxable<Ch, Str, Output>> implements the trait Stateful<Ch, Str, Output>, so boxes can be used as parsers, which allows stateful parsers to heap-allocate their state.

Boxable parsers are usually used in recursive-descent parsing, for context-free grammars that cannot be parsed as a regular language. For example, consider a simple type for trees:

struct Tree(Vec<Tree>);

which can be parsed from a well-nested sequence of parentheses, for example (()()) can be parsed as Tree(vec![Tree(vec![]),Tree(vec![])]). The desired implementation is:

fn is_lparen(ch: char) -> bool { ch == '(' }
fn is_rparen(ch: char) -> bool { ch == ')' }
fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) }
fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) }
fn mk_err<T>(_: Option<char>) -> Result<T,String> { Err(String::from("Expected a ( or ).")) }
fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Tree { Tree(children) }
let LPAREN = character(is_lparen);
let RPAREN = character(is_rparen).map(mk_ok).or_else(CHARACTER.map(mk_err));
let TREE = LPAREN
     .and_then_try(TREE.star(mk_vec))
     .try_and_then_try(RPAREN)
     .try_map3(mk_tree);

but this doesn't work because it gives the definition of TREE in terms of itself, and Rust doesn't allow this kind of cycle.

Instead, the solution is to define a struct TreeParser, and then implement UncommittedInfer<char, Chars<'a>> for it. The type of the state of a TreeParser is a box containing an appropriate BoxableParserState trait:

type TreeParserState = Box<for<'b> Boxable<char, Chars<'b>, Result<Tree, String>>>;

The implementation of UncommittedInfer<char, Chars<'b>> for TreeParser is mostly straightfoward:

struct Tree(Vec<Tree>);
impl StaticMarker for Tree {}
type TreeParserState = InState<TreeParser, Box<for<'a> Boxable<char, Chars<'a>, Result<Tree, String>>>>;
struct TreeParser;
impl Parser for TreeParser {}
impl<'a> HasOutput<char, Chars<'a>> for TreeParser { type Output = Result<Tree, String>; }
impl<'a> Uncommitted<char, Chars<'a>, Result<Tree, String>> for TreeParser {
    type State = TreeParserState;
    fn init(&self, data: &mut Chars<'a>) -> Option<ParseResult<TreeParserState, Result<Tree, String>>> {
        // ... parser goes here...`
    }
}

The important thing is that the definiton of parse can make use of TREE, so the parser can call itself recursively, then box up the result state. Boxing up the state makes use of a method parser.boxed(mk_box) where mk_box is a function that boxes up boxable state:

type TreeParserState = InState<TreeParser, Box<for<'a> Boxable<char, Chars<'a>, Result<Tree, String>>>>;
fn is_lparen(ch: char) -> bool { ch == '(' }
fn is_rparen(ch: char) -> bool { ch == ')' }
fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) }
fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) }
fn mk_err<T>(_: Option<char>) -> Result<T,String> { Err(String::from("Expected a ( or ).")) }
fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Tree { Tree(children) }
fn mk_box<P>(state: P) -> TreeParserState
    where P: 'static+for<'a> Boxable<char, Chars<'a>, Result<Tree,String>>
{
    TreeParser.in_state(Box::new(state))
}
struct TreeParser;
impl Parser for TreeParser {}
impl<'a> HasOutput<char, Chars<'a>> for TreeParser { type Output = Result<Tree, String>; }
impl<'a> Uncommitted<char, Chars<'a>, Result<Tree, String>> for TreeParser {
    type State = TreeParserState;
    fn init(&self, data: &mut Chars<'a>) -> Option<ParseResult<TreeParserState, Result<Tree, String>>> {
        let LPAREN = character(is_lparen);
        let RPAREN = character(is_rparen).map(mk_ok).or_else(CHARACTER.map(mk_err));
        let parser = LPAREN
            .and_then_try(TreeParser.star(mk_vec))
            .try_and_then_try(RPAREN)
            .try_map3(mk_tree)
            .boxed(mk_box);
        parser.init(data)
    }
}
let TREE = TreeParser;
match TREE.init_str("((").unwrap() {
    Continue(parsing) => match parsing.more_str(")())") {
        Done(result) => assert_eq!(result, Ok(Tree(vec![Tree(vec![]),Tree(vec![])]))),
         _ => panic!("Can't happen"),
    },
    _ => panic!("Can't happen"),
}

The reason for making Boxable<S> a different trait from StatefulInfer<S> is that it provides weaker safety guarantees. StatefulInfer<S> enforces that clients cannot call parse after done, but Boxable<S> does not.

Required Methods

Implementors