gll 0.0.1

GLL parsing framework.
docs.rs failed to build gll-0.0.1
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build: gll-0.0.2

GLL parsing framework

Build Status Latest Version Rust Documentation

Usage

Easiest way to get started is through gll-macros:

[dependencies]
gll = "0.0.1"
gll-macros = "0.0.1"
#![feature(decl_macro, generators, generator_trait)]

extern crate gll;
extern crate gll_macros;

As an example, this is what you might write for a JSON-like syntax, that uses plain identifiers instead of string literals for field names, and allows values to be parenthesized Rust expressions:

mod json_like {
    ::gll_macros::proc_macro_parser! {
        Value =
            Null:"null" |
            False:"false" |
            True:"true" |
            Literal:LITERAL |
            Array:{ "[" elems:Value* % "," "]" } |
            Object:{ "{" fields:Field* % "," "}" } |
            InterpolateRust:{ "(" TOKEN_TREE+ ")" };
        Field = name:IDENT ":" value:Value;
    }
}

You can also use a build script to generate the parser (TODO: document).

To parse a string with that grammar:

let tokens = string.parse().unwrap();
json_like::Value::parse_with(tokens, |parser, result| {
    let value = result.unwrap();
    // ...
});

Grammar

All grammars contain a set of named rules, with the syntax Name = rule;. (the order between the rules doesn't matter)

Rules are made out of:

  • grouping, using {...}
  • string literals, matching input characters / tokens exactly
  • character ranges: 'a'..='d' is equivalent to "a"|"b"|"c"|"d"
    • only in scannerless mode
  • builtin rules: IDENT, PUNCT, LITERAL, TOKEN_TREE
    • only in proc macro mode
  • named rules, referred to by their name
  • concatenation: A B - "A followed by B"
  • alternation: A | B - "either A or B"
  • optionals: A? - "either A or nothing"
  • lists: A* - "zero or more As", A+ - "one or more As"
    • optional separator: A* % "," - "comma-separated As"

Parts of a rule can be labeled with field names, to allow later access to them:

LetDecl = "let" pat:Pat { "=" init:Expr }? ";"; produces:

// Note: generic parameters omitted for brevity.
struct LetDecl {
    pat: Handle<Pat>,
    init: Option<Handle<Expr>>,
}

One Rust-specific convention is that alternation fields are enum variants.

Expr = Lit:LITERAL | Add:{ a:Expr "+" b:Expr }; produces:

enum Expr {
    Lit(Handle<LITERAL>),
    Add {
        a: Handle<Expr>,
        b: Handle<Expr>,
    },
}

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this crate by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.