Struct wast::parser::Parser

pub struct Parser<'a> { /* private fields */ }

An in-progress parser for the tokens of a WebAssembly text file.

A Parser is the argument to the Parse trait and is how the input stream is interacted with to parse new items. Cloning or copying a Parser yields a handle to the same stream of tokens, so you cannot clone a Parser and then parse two items independently.

For more information about a Parser see its methods.
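The shared-stream behavior of copies can be pictured with a toy model. The sketch below is not the crate's implementation; `ToyBuffer`, `ToyParser`, and `bump` are hypothetical names, and the only assumption is that the handle is a copyable reference to a buffer whose position lives behind interior mutability.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for the token buffer; it owns the position.
struct ToyBuffer {
    tokens: Vec<String>,
    pos: Cell<usize>,
}

/// Hypothetical stand-in for `Parser`: a cheap, copyable handle.
#[derive(Clone, Copy)]
struct ToyParser<'a> {
    buf: &'a ToyBuffer,
}

impl ToyBuffer {
    fn new(input: &str) -> Self {
        ToyBuffer {
            tokens: input.split_whitespace().map(String::from).collect(),
            pos: Cell::new(0),
        }
    }

    fn parser(&self) -> ToyParser<'_> {
        ToyParser { buf: self }
    }
}

impl<'a> ToyParser<'a> {
    /// Returns the current token and advances the position shared by all copies.
    fn bump(self) -> Option<&'a str> {
        let i = self.buf.pos.get();
        let tok = self.buf.tokens.get(i)?;
        self.buf.pos.set(i + 1);
        Some(tok.as_str())
    }
}
```

In this model, copying the handle and calling `bump` on each copy in turn yields consecutive tokens rather than the same token twice, which is why a copy cannot be used to re-parse from an earlier position.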

Implementations§

impl<'a> Parser<'a>

pub fn is_empty(self) -> bool

Returns whether there are no more Token tokens to parse from this Parser.

This indicates that either we’ve reached the end of the input, or we’re a sub-Parser inside of a parenthesized expression and we’ve hit the ) token.

Note that if false is returned there may be more comments. Comments and whitespace are not considered for whether this parser is empty.

pub fn parse<T: Parse<'a>>(self) -> Result<T>

Parses a T from this Parser.

This method has a trivial definition (it simply calls T::parse) but is here for syntactic purposes. This is what you’ll call 99% of the time in a Parse implementation in order to parse sub-items.

Typically you always want to use ? with the result of this method; you should not handle errors yourself and then decide what else to parse. To handle branches in parsing, use Parser::peek.

§Examples

A good example of using parse is to see how the TableType type is parsed in this crate. A TableType is defined in the official specification as tabletype and is defined as:

tabletype ::= lim:limits et:reftype

so to parse a TableType we recursively need to parse a Limits and a RefType:

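// Import paths assume a recent `wast` release.
use wast::core::{Limits, RefType};
use wast::parser::{Parse, Parser, Result};

struct TableType<'a> {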
    limits: Limits,
    elem: RefType<'a>,
}

impl<'a> Parse<'a> for TableType<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // parse the `lim` then `et` in sequence
        Ok(TableType {
            limits: parser.parse()?,
            elem: parser.parse()?,
        })
    }
}

pub fn peek<T: Peek>(self) -> Result<bool>

Performs a cheap test to see whether the current token in this stream is T.

This method can be used to efficiently determine what next to parse. The Peek trait is defined for types which can be used to test if they’re the next item in the input stream.

Nothing is actually parsed in this method, nor does this mutate the state of this Parser. Instead, this simply performs a check.

This method is frequently combined with the Parser::lookahead1 method to automatically produce nice error messages if some tokens aren’t found.

§Examples

For an example of using the peek method let’s take a look at parsing the Limits type. This is defined in the official spec as:

limits ::= n:u32
         | n:u32 m:u32

which means that it’s either one u32 token or two, so we need to know whether to consume two tokens or one:

use wast::parser::{Parse, Parser, Result};

struct Limits {
    min: u32,
    max: Option<u32>,
}

impl<'a> Parse<'a> for Limits {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // Always parse the first number...
        let min = parser.parse()?;

        // ... and then test if there's a second number before parsing
        let max = if parser.peek::<u32>()? {
            Some(parser.parse()?)
        } else {
            None
        };

        Ok(Limits { min, max })
    }
}
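For readers without the crate at hand, the same peek-then-parse pattern can be tried in a standalone sketch. Everything here is hypothetical (`Toks`, `peek_u32`, `parse_u32`, `parse_limits`) and works on whitespace-separated text rather than real WebAssembly tokens; it only illustrates that the peek step inspects the current token without consuming it.

```rust
/// Hypothetical cursor over whitespace-separated tokens.
struct Toks<'a> {
    toks: Vec<&'a str>,
    pos: usize,
}

impl<'a> Toks<'a> {
    fn new(input: &'a str) -> Self {
        Toks { toks: input.split_whitespace().collect(), pos: 0 }
    }

    /// Peek: is the current token a u32? Never advances the cursor.
    fn peek_u32(&self) -> bool {
        self.toks.get(self.pos).map_or(false, |t| t.parse::<u32>().is_ok())
    }

    /// Parse: consume the current token as a u32, or report an error.
    fn parse_u32(&mut self) -> Result<u32, String> {
        let tok = self
            .toks
            .get(self.pos)
            .ok_or_else(|| "expected u32, found end of input".to_string())?;
        let n = tok
            .parse::<u32>()
            .map_err(|_| format!("expected u32, found `{}`", tok))?;
        self.pos += 1;
        Ok(n)
    }
}

/// limits ::= n:u32 | n:u32 m:u32
fn parse_limits(t: &mut Toks<'_>) -> Result<(u32, Option<u32>), String> {
    // The first number is mandatory.
    let min = t.parse_u32()?;
    // The second is consumed only if the peek says it is there.
    let max = if t.peek_u32() { Some(t.parse_u32()?) } else { None };
    Ok((min, max))
}
```

With the input `1 funcref` the sketch returns a minimum of 1 and no maximum, leaving `funcref` unconsumed for whatever parses next.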

pub fn peek2<T: Peek>(self) -> Result<bool>

Same as the Parser::peek method, except checks the next token, not the current token.

pub fn peek3<T: Peek>(self) -> Result<bool>

Same as the Parser::peek2 method, except checks the token after the next one, not the next token.
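The relationship between the three peek methods can be sketched as fixed offsets from the current position. This is a toy model under that assumption, with hypothetical names (`ToyParser`, `peek_at`, `field_kind`) and string comparison standing in for the Peek trait.

```rust
/// Hypothetical cursor over pre-split tokens; not the real `Parser`.
struct ToyParser<'a> {
    toks: &'a [&'a str],
    pos: usize,
}

impl<'a> ToyParser<'a> {
    /// Checks the token `offset` places past the current one without advancing.
    fn peek_at(&self, offset: usize, expected: &str) -> bool {
        self.toks.get(self.pos + offset).map_or(false, |t| *t == expected)
    }

    /// Current token.
    fn peek(&self, expected: &str) -> bool {
        self.peek_at(0, expected)
    }

    /// Next token.
    fn peek2(&self, expected: &str) -> bool {
        self.peek_at(1, expected)
    }

    /// Token after the next one.
    fn peek3(&self, expected: &str) -> bool {
        self.peek_at(2, expected)
    }
}

/// Decides which kind of field follows by looking past the opening `(`.
fn field_kind(p: &ToyParser<'_>) -> &'static str {
    if p.peek("(") && p.peek2("func") {
        "func"
    } else if p.peek("(") && p.peek2("import") {
        "import"
    } else {
        "unknown"
    }
}
```

Looking one token past `(` is a typical reason to reach for the second variant: the parenthesis alone does not say which item follows.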

pub fn lookahead1(self) -> Lookahead1<'a>

A helper structure to perform a sequence of peek operations and if they all fail produce a nice error message.

This method purely exists for conveniently producing error messages and provides no functionality that Parser::peek doesn’t already give. The Lookahead1 structure has one main method Lookahead1::peek, which is the same method as Parser::peek. The difference is that the Lookahead1::error method needs no arguments.

§Examples

Let’s look at the parsing of Index. This type is either a u32 or an Id and is used in name resolution primarily. The official grammar for an index is:

idx ::= x:u32
      | v:id

Which is to say that an index is either a u32 or an Id. When parsing an Index we can do:

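// Import paths assume a recent `wast` release.
use wast::parser::{Parse, Parser, Result};
use wast::token::Id;

enum Index<'a> {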
    Num(u32),
    Id(Id<'a>),
}

impl<'a> Parse<'a> for Index<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        let mut l = parser.lookahead1();
        if l.peek::<Id>()? {
            Ok(Index::Id(parser.parse()?))
        } else if l.peek::<u32>()? {
            Ok(Index::Num(parser.parse()?))
        } else {
            // produces error message of `expected identifier or u32`
            Err(l.error())
        }
    }
}
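How such an error message can be assembled is easiest to see in a standalone sketch. The code below is a toy model, not the crate's `Lookahead1`: `ToyLookahead`, `ToyIndex`, `is_id`, and `is_u32` are hypothetical, and the assumption is simply that every failed peek records a description that `error` later joins.

```rust
/// Hypothetical stand-in for `Lookahead1`: remembers every alternative tried.
struct ToyLookahead<'a> {
    current: Option<&'a str>,
    attempts: Vec<&'static str>,
}

impl<'a> ToyLookahead<'a> {
    fn new(current: Option<&'a str>) -> Self {
        ToyLookahead { current, attempts: Vec::new() }
    }

    /// Tests the current token; on failure records `what` for the error message.
    fn peek(&mut self, what: &'static str, check: fn(&str) -> bool) -> bool {
        if self.current.map_or(false, check) {
            return true;
        }
        self.attempts.push(what);
        false
    }

    /// Builds the message from the recorded attempts; takes no arguments.
    fn error(&self) -> String {
        match self.attempts.as_slice() {
            [] => "unexpected token".to_string(),
            [one] => format!("expected {}", one),
            [init @ .., last] => format!("expected {} or {}", init.join(", "), last),
        }
    }
}

#[derive(Debug, PartialEq)]
enum ToyIndex<'a> {
    Num(u32),
    Id(&'a str),
}

fn is_id(t: &str) -> bool {
    t.len() > 1 && t.starts_with('$')
}

fn is_u32(t: &str) -> bool {
    t.parse::<u32>().is_ok()
}

fn parse_index(tok: Option<&str>) -> Result<ToyIndex<'_>, String> {
    let mut l = ToyLookahead::new(tok);
    if l.peek("identifier", is_id) {
        // A successful peek implies the token exists.
        Ok(ToyIndex::Id(tok.unwrap()))
    } else if l.peek("u32", is_u32) {
        Ok(ToyIndex::Num(tok.unwrap().parse().unwrap()))
    } else {
        Err(l.error())
    }
}
```

Because the message is derived from the peeks that were actually attempted, adding a new alternative to the `if` chain updates the error text without further changes.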

pub fn parens<T>(self, f: impl FnOnce(Parser<'a>) -> Result<T>) -> Result<T>

Parse an item surrounded by parentheses.

WebAssembly’s text format is all based on s-expressions, so naturally you’re going to want to parse a lot of parenthesized things! As noted in the documentation of Parse you typically don’t parse your own surrounding ( and ) tokens, but the parser above you parsed them for you. This is the method that the parser above you uses.

This method will parse a ( token, and then call f on a sub-parser which when finished asserts that a ) token is the next token. This requires that f consumes all tokens leading up to the paired ).

Usage will often simply be parser.parens(|p| p.parse())? to automatically parse a type within parentheses, but you can, as always, go crazy and do whatever you’d like too.

§Examples

A good example of this is to see how a Module is parsed. This isn’t the exact definition, but it’s close enough!

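// Import paths assume a recent `wast` release.
use wast::core::ModuleField;
use wast::kw;
use wast::parser::{Parse, Parser, Result};

struct Module<'a> {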
    fields: Vec<ModuleField<'a>>,
}

impl<'a> Parse<'a> for Module<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // Modules start out with a `module` keyword
        parser.parse::<kw::module>()?;

        // And then everything else is `(field ...)`, so while we've got
        // items left we continuously parse parenthesized items.
        let mut fields = Vec::new();
        while !parser.is_empty() {
            fields.push(parser.parens(|p| p.parse())?);
        }
        Ok(Module { fields })
    }
}

pub fn parens_depth(&self) -> usize

Return the depth of nested parens we’ve parsed so far.

This is a low-level method that is only useful for implementing recursion limits in custom parsers.
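A recursion limit built on a paren-depth counter might look like the following standalone sketch. It is a toy model with hypothetical names (`ToyParser`, `count_atoms`, the `max_depth` parameter) in which every non-whitespace character is one token; it assumes only that the depth rises when `(` is consumed and falls when the matching `)` is consumed.

```rust
/// Hypothetical parser where each non-whitespace character is one token.
struct ToyParser {
    toks: Vec<char>,
    pos: usize,
    depth: usize,
}

impl ToyParser {
    fn new(input: &str) -> Self {
        ToyParser {
            toks: input.chars().filter(|c| !c.is_whitespace()).collect(),
            pos: 0,
            depth: 0,
        }
    }

    /// True at end of input or when the next token is `)`.
    fn is_empty(&self) -> bool {
        matches!(self.toks.get(self.pos).copied(), None | Some(')'))
    }

    /// Number of `(` tokens currently open.
    fn parens_depth(&self) -> usize {
        self.depth
    }

    /// Consumes `(`, runs `f`, then requires the matching `)`.
    fn parens<T>(
        &mut self,
        f: impl FnOnce(&mut Self) -> Result<T, String>,
    ) -> Result<T, String> {
        if self.toks.get(self.pos).copied() != Some('(') {
            return Err("expected `(`".to_string());
        }
        self.pos += 1;
        self.depth += 1;
        let result = f(self)?;
        if self.toks.get(self.pos).copied() != Some(')') {
            return Err("expected `)`".to_string());
        }
        self.pos += 1;
        self.depth -= 1;
        Ok(result)
    }
}

/// Counts atoms in a nested list, refusing to recurse past `max_depth`.
fn count_atoms(p: &mut ToyParser, max_depth: usize) -> Result<usize, String> {
    if p.parens_depth() > max_depth {
        return Err("nesting too deep".to_string());
    }
    let mut n = 0;
    while !p.is_empty() {
        if p.toks[p.pos] == '(' {
            n += p.parens(|inner| count_atoms(inner, max_depth))?;
        } else {
            p.pos += 1;
            n += 1;
        }
    }
    Ok(n)
}
```

Checking the depth at the top of the recursive function turns pathological inputs such as thousands of nested parentheses into an ordinary parse error instead of a stack overflow.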

pub fn step<F, T>(self, f: F) -> Result<T>
where F: FnOnce(Cursor<'a>) -> Result<(T, Cursor<'a>)>,

A low-level parsing method you probably won’t use.

This is used to implement parsing of the most primitive types in the core module. You probably don’t want to use this directly; prefer something like Parser::parse or Parser::parens.

pub fn error(self, msg: impl Display) -> Error

Creates an error whose line/column information is pointing at the current token.

This is used to produce human-readable error messages which point to the right location in the input stream, and the msg here is arbitrary text used to associate with the error and indicate why it was generated.

pub fn error_at(self, span: Span, msg: impl Display) -> Error

Creates an error whose line/column information is pointing at the given span.

pub fn cur_span(&self) -> Span

Returns the span of the current token.

pub fn prev_span(&self) -> Span

Returns the span of the previous token.

pub fn register_annotation<'b>(self, annotation: &'b str) -> impl Drop + 'b
where 'a: 'b,

Registers a new known annotation with this parser to allow parsing annotations with this name.

WebAssembly annotations are a proposal for the text format which allows decorating the text format with custom structured information. By default all annotations are ignored when parsing, but the whole purpose of them is to sometimes parse them!

To support parsing text annotations this method is used to allow annotations and their tokens to not be skipped. Once an annotation is registered with this method, then while the return value has not been dropped (e.g. for the rest of the scope where this function is called) annotations with the name annotation will be part of the token stream and not implicitly skipped.

§Skipping annotations

The behavior of skipping unknown/unregistered annotations can be somewhat subtle and surprising, so if you’re interested in parsing annotations it’s important to understand this method and where to call it.

Generally when parsing tokens you’ll be bottoming out in various Cursor methods. These are all documented as advancing the stream as much as possible to the next token, skipping “irrelevant stuff” like comments, whitespace, etc. The Cursor methods will also skip unknown annotations. This means that if you parse any token, it will skip over any number of annotations that are unknown at all times.

To parse an annotation you must, before parsing any token of the annotation, register the annotation via this method. This includes the beginning ( token, which is otherwise skipped if the annotation isn’t marked as registered. Typically parsers parse the contents of an s-expression, so this means that the outer parser of an s-expression must register the custom annotation name, rather than the inner parser.

§Return

This function returns an RAII guard which, when dropped, will unregister the annotation given. Parsing the annotation is only supported while the returned value is still alive, and once dropped the parser will go back to skipping annotations with the name annotation.
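The guard pattern itself can be sketched without the crate. The types below (`Registry`, `Guard`) are hypothetical, and the per-name counter is an assumption made so that nested registrations of the same name behave sensibly; the sketch only shows registration on creation and unregistration on drop.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

/// Hypothetical registry of annotation names that must not be skipped.
#[derive(Default)]
struct Registry {
    known: RefCell<HashMap<String, usize>>,
}

/// Hypothetical guard: unregisters its annotation when dropped.
struct Guard<'r> {
    registry: &'r Registry,
    name: String,
}

impl Registry {
    /// Registers `name` and returns a guard that keeps it registered.
    fn register(&self, name: &str) -> Guard<'_> {
        *self.known.borrow_mut().entry(name.to_string()).or_insert(0) += 1;
        Guard { registry: self, name: name.to_string() }
    }

    fn is_registered(&self, name: &str) -> bool {
        self.known.borrow().contains_key(name)
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        let mut known = self.registry.known.borrow_mut();
        // Decrement the count and remember how many registrations remain.
        let remaining = match known.get_mut(&self.name) {
            Some(count) => {
                *count -= 1;
                *count
            }
            None => return,
        };
        if remaining == 0 {
            known.remove(&self.name);
        }
    }
}
```

One Rust detail matters when using any guard like this: binding the result to a named variable such as `_r` keeps it alive until the end of the scope, whereas `let _ = ...` drops it at the end of the statement, which would unregister the name immediately.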

§Example

Let’s see an example of how the @name annotation is parsed for modules to get an idea of how this works:

struct Module<'a> {
    name: Option<NameAnnotation<'a>>,
}

impl<'a> Parse<'a> for Module<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // Modules start out with a `module` keyword
        parser.parse::<kw::module>()?;

        // Next may be `(@name "foo")`. Typically this annotation would be
        // skipped, but we don't want it skipped, so we register it.
        // Note that the parse implementation of
        // `Option<NameAnnotation>` is the one that consumes the
        // parentheses here.
        let _r = parser.register_annotation("name");
        let name = parser.parse()?;

        // ... and normally you'd otherwise parse module fields here ...

        Ok(Module { name })
    }
}

Another example is how we parse the @custom annotation. This is parsed as part of ModuleField, so note how the annotation is registered before we parse the parentheses of the annotation.

struct Module<'a> {
    fields: Vec<ModuleField<'a>>,
}

impl<'a> Parse<'a> for Module<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // Modules start out with a `module` keyword
        parser.parse::<kw::module>()?;

        // register the `@custom` annotation *first* before we start
        // parsing fields, because each field is contained in
        // parentheses and to parse the parentheses of an annotation we
        // have to know not to skip it.
        let _r = parser.register_annotation("custom");

        let mut fields = Vec::new();
        while !parser.is_empty() {
            fields.push(parser.parens(|p| p.parse())?);
        }
        Ok(Module { fields })
    }
}

enum ModuleField<'a> {
    Custom(Custom<'a>),
    // ...
}

impl<'a> Parse<'a> for ModuleField<'a> {
    fn parse(parser: Parser<'a>) -> Result<Self> {
        // Note that because we have previously registered the `@custom`
        // annotation with the parser we know that `peek` methods like
        // this, working on the annotation token, are able to
        // return `true`.
        if parser.peek::<annotation::custom>()? {
            return Ok(ModuleField::Custom(parser.parse()?));
        }

        // .. typically we'd parse other module fields here...

        Err(parser.error("unknown module field"))
    }
}