Trait parsimonious::Boxable [] [src]

pub trait Boxable<S> {
    type Output;
    fn parse_boxable(&mut self, value: S) -> (S, Option<Self::Output>);
    fn done_boxable(&mut self) -> Self::Output;
}

A trait for boxable parsers.

Regular languages can be parsed in constant memory, so do not require any heap allocation (other than the heap allocation peformed by user code such as creating buffers). Context-free languages require allocating unbounded memory. In order to support streaming input, the state of the parser must be saved on the heap, and restored when more input arrives.

In Rust, heap-allocated data is often kept in a Box<T>, where T is a trait. In the case of parsers, the library needs to save and restore a stateful parser, for which the obvious type is Box<Stateful<S,Output=T>. There are two issues with this...

Firstly, the lifetimes mentioned in the type of the parser may change over time. For example, the program:

fn check_results (self, rest: &'b str, result: String) {
   assert_eq!(rest,"!"); assert_eq!(result,"abc123");
}
let parser = character(char::is_alphanumeric).star(String::new);
let stateful = parser.init();
let boxed: Box<Stateful<&'a str,Output=String>> = Box::new(stateful);
let stuff: &'b str = "abc123!";
match boxed.parse(stuff) {
   Done(rest,result) => self.check_results(rest,result),
   _ => println!("can't happen"),
}

does not typecheck, because the type of boxed is fixed as containing parsers for input &'a str, but it was fed input of type &'b str. To fix this, we change the type of the box to be polymorphic in the lifetime of the parser:

fn check_results (self, rest: &'b str, result: String) {
   assert_eq!(rest,"!"); assert_eq!(result,"abc123");
}
let parser = character(char::is_alphanumeric).star(String::new);
let stateful = parser.init();
let boxed: Box<for <'a> Stateful<&'a str,Output=String>> = Box::new(stateful);
let stuff: &'b str = "abc123!";
match boxed.parse(stuff) {
   Done(rest,result) => self.check_results(rest,result),
   _ => println!("can't happen"),
}

Secondly, the Stateful trait is not object-safe, so cannot be boxed and unboxed safely. In order to address this, there is a trait Boxable<S,Output=T>, which represents stateful parsers, but is object-safe and so can be boxed and unboxed safely:

fn check_results (self, rest: &'b str, result: String) {
   assert_eq!(rest,"!"); assert_eq!(result,"abc123");
}
let parser = character(char::is_alphanumeric).star(String::new);
let stateful = parser.init();
let boxed: Box<for <'a> Boxable<&'a str,Output=String>> = Box::new(stateful.boxable());
let stuff: &'b str = "abc123!";
match boxed.parse(stuff) {
   Done(rest,result) => self.check_results(rest,result),
   _ => println!("can't happen"),
}

The type Box<Boxable<S,Output=T>> implements the trait Stateful<S,Output=T>, so boxes can be used as parsers, which allows stateful parsers to heap-allocate their state.

Boxable parsers are usually used in recursive-descent parsing, for context-free grammars that cannot be parsed as a regular language. For example, consider a simple type for trees:

struct Tree(Vec<Tree>);

which can be parsed from a well-nested sequence of parentheses, for example (()()) can be parsed as Tree(vec![Tree(vec![]),Tree(vec![])]). The desired implementation is:

fn is_lparen(ch: char) -> bool { ch == '(' }
fn is_rparen(ch: char) -> bool { ch == ')' }
fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) }
fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) }
fn mk_err<T>() -> Result<T,String> { Err(String::from("Expected a ( or ).")) }
fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Result<Tree,String> {
    Ok(Tree(children))
}
let LPAREN = character(is_lparen);
let RPAREN = character(is_rparen).map(mk_ok).or_emit(mk_err);
let TREE = LPAREN
    .and_then_try(TREE.star(mk_vec))
    .try_and_then_try(RPAREN)
    .try_map3(mk_tree);

but this doesn't work because it gives the definition of TREE in terms of itself, and Rust doesn't allow this kind of cycle.

Instead, the solution is to define a struct TreeParser, and then implement Uncommitted<&str> for it. The type of the state of a TreeParser is a box containing an appropriate BoxableParserState trait:

type TreeParserState = Box<for<'b> Boxable<&'b str, Output=Tree>>;

The implementation of Uncommitted<&str> for TreeParser is mostly straightfoward:

struct Tree(Vec<Tree>);
struct TreeParser;
type TreeParserState = Box<for<'b> Boxable<&'b str, Output=Result<Tree,String>>>;
impl Parser for TreeParser {}
impl<'a> Uncommitted<&'a str> for TreeParser {
    type Output = Result<Tree,String>;
    type State = TreeParserState;
    fn parse(&self, data: &'a str) -> MaybeParseResult<Self::State,&'a str> {
        // ... parser goes here...`
    }
}

The important thing is that the definiton of parse can make use of TREE, so the parser can call itself recursively, then box up the result state:

struct Tree(Vec<Tree>);
struct TreeParser;
type TreeParserState = Box<for<'b> Boxable<&'b str, Output=Result<Tree,String>>>;
impl Parser for TreeParser {}
impl<'a> Uncommitted<&'a str> for TreeParser {
    type Output = Result<Tree,String>;
    type State = TreeParserState;
    fn parse(&self, data: &'a str) -> MaybeParseResult<Self::State,&'a str> {
        fn is_lparen(ch: char) -> bool { ch == '(' }
        fn is_rparen(ch: char) -> bool { ch == ')' }
        fn mk_vec() -> Result<Vec<Tree>,String> { Ok(Vec::new()) }
        fn mk_ok<T>(ok: T) -> Result<T,String> { Ok(ok) }
        fn mk_err<T>() -> Result<T,String> { Err(String::from("Expected a ( or ).")) }
        fn mk_tree(_: char, children: Vec<Tree>, _: char) -> Result<Tree,String> {
            Ok(Tree(children))
        }
        fn mk_box<P>(parser: P) -> TreeParserState
        where P: 'static+for<'a> Stateful<&'a str, Output=Result<Tree,String>> {
            Box::new(parser.boxable())
        }
        let LPAREN = character(is_lparen);
        let RPAREN = character(is_rparen).map(mk_ok).or_emit(mk_err);
        let parser = LPAREN
            .and_then_try(TreeParser.star(mk_vec))
            .try_and_then_try(RPAREN)
            .try_map3(mk_tree);
        parser.parse(data).map(mk_box)
    }
}
let TREE = TreeParser;
match TREE.parse("((") {
    Commit(Continue("",parsing)) => match parsing.parse(")()))") {
        Done(")",result) => assert_eq!(result,Ok(Tree(vec![Tree(vec![]),Tree(vec![])]))),
         _ => panic!("can't happen"),
    },
    _ => panic!("can't happen"),
}

The reason for making Boxable<S> a different trait from Stateful<S> is that it provides weaker safety guarantees. Stateful<S> enforces that clients cannot call parse after done, but Boxable<S> does not.

Associated Types

Required Methods

Implementors