Module autumn::parsers

Commonly used parsers that can be combined with each other

The functions here are either parsers in their own right or functions that produce parsers as their output.

Specifying characters

The following parsers are provided that parse a single character matching a certain category of characters. They all produce a Span of the character they match.

  • any_character will match any Unicode character.
  • alphabetic will match an ASCII alphabetic character.
  • alphanumeric will match an ASCII alphanumeric character.
  • digit will match an ASCII decimal digit.
  • whitespace will match any Unicode whitespace character. space is also provided to match one or more Unicode whitespace characters.

These can be combined to produce many basic parsers.

/// Parses C-like identifiers
fn identifier(source: &str, location: Span) -> ParseResult<String> {
    alphabetic
        .or("_")
        .and(alphanumeric.or("_").multiple().maybe())
        .copy_string()
        .parse(source, location)
}

/// Parses integers
fn integer(source: &str, location: Span) -> ParseResult<String> {
    digit.multiple().copy_string().parse(source, location)
}

/// Parses float literals
fn float(source: &str, location: Span) -> ParseResult<String> {
    digit
        .multiple()
        .and(".".and(digit.multiple()).maybe())
        .or(".".and(digit.multiple()))
        .copy_string()
        .parse(source, location)
}
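
As a rough guide to which characters each category accepts, the sketch below maps the single-character parsers listed above onto standard-library predicates. It is a standalone illustration and not autumn's implementation: the categories helper is hypothetical, and the mapping assumes the descriptions given above (ASCII for alphabetic, alphanumeric and digit; Unicode for whitespace).

```rust
/// Hypothetical helper: lists which of the single-character categories
/// described above would accept `c`, assuming each behaves like the
/// corresponding standard-library predicate.
fn categories(c: char) -> Vec<&'static str> {
    // Every Unicode character qualifies for `any_character`.
    let mut matched = vec!["any_character"];
    if c.is_ascii_alphabetic() {
        matched.push("alphabetic");
    }
    if c.is_ascii_alphanumeric() {
        matched.push("alphanumeric");
    }
    if c.is_ascii_digit() {
        matched.push("digit");
    }
    if c.is_whitespace() {
        matched.push("whitespace");
    }
    matched
}
```

Under these assumptions a non-ASCII letter is accepted only by any_character, while a no-break space (U+00A0) is also accepted by whitespace.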

Specific characters or strings

To parse a specific literal character or string, the corresponding &str or String can be used directly as a parser. These types will all parse themselves from the input and produce a Span corresponding to the matched characters.

The empty parser is also provided to parse no characters and produce an empty Concat.

/// Parse C18 storage class specifiers
fn storage_class(source: &str, location: Span) -> ParseResult<String> {
    "auto"
        .or("extern")
        .or("register")
        .or("static")
        .or("typedef")
        .or("_Thread_local")
        .copy_string()
        .parse(source, location)
}
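
What it means for a literal to "parse itself" can be pictured with plain string operations. The sketch below is a standalone model rather than autumn code: match_literal and match_any are hypothetical helpers, a byte range stands in for Span, and match_any takes the first alternative that matches, which is a simplification and may differ from how or treats several matching alternatives.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a literal parser: if `source` continues with
/// `literal` at byte offset `start`, return the byte range that was matched.
fn match_literal(source: &str, start: usize, literal: &str) -> Option<Range<usize>> {
    // `get` yields None when `start` is out of range or not on a char boundary.
    let rest = source.get(start..)?;
    if rest.starts_with(literal) {
        Some(start..start + literal.len())
    } else {
        None
    }
}

/// Tries each literal in turn and returns the first match, loosely
/// mirroring a chain of `.or(...)` calls over string literals.
fn match_any(source: &str, start: usize, literals: &[&str]) -> Option<Range<usize>> {
    literals
        .iter()
        .find_map(|literal| match_literal(source, start, literal))
}
```

In this model the matched range plays the role of the Span that a literal parser produces, and a call such as match_any(source, 0, &["auto", "extern"]) corresponds to "auto".or("extern").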

Parsing a character rather than a list of characters

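The character parser consumes any single Unicode character and produces a char rather than a Span. The condition combinator can be used to restrict which characters are matched. As the example shows, a char literal can also be used directly as a parser that matches that specific character and produces a char.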

/// Parse simple single-character operators
fn operator(source: &str, location: Span) -> ParseResult<char> {
    '+'.or('-').or('*').or('/').or('%').parse(source, location)
}
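
The difference between producing a char and producing a Span can be illustrated with standard-library types alone. The sketch below is a standalone illustration and not autumn code: char_and_span is a hypothetical helper, and a byte range stands in for Span, whose actual representation is not described here.

```rust
use std::ops::Range;

/// Hypothetical helper: reads the character starting at byte offset `start`
/// and returns it both as a `char` value and as the byte range it occupies.
fn char_and_span(source: &str, start: usize) -> Option<(char, Range<usize>)> {
    // None if `start` is out of range, not on a char boundary, or at the end.
    let c = source.get(start..)?.chars().next()?;
    Some((c, start..start + c.len_utf8()))
}
```

The char is a value that can be compared or mapped directly, whereas the range only identifies where in the source the character sits; for a multi-byte character the range covers more than one byte.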

Closures as parsers

Although the Rust compiler will happily accept any function with the appropriate signature as a parser, it needs additional help to determine that a closure may also be a parser. The closure function simply ensures that a closure is treated as a parser.

/// Parse an exact number of a specific character
fn counted(character: char, count: usize) -> impl Parser<Span> {
    closure(move |source, location| {
        if count > 0 {
            any_character
                // Only accept the requested character
                .str_condition(move |c| c.starts_with(character))
                .and(counted(character, count - 1))
                .parse(source, location)
        } else {
            empty.parse(source, location)
        }
    })
}
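
The pattern used by counted, a function that captures its arguments in a closure and returns that closure as a parser, can be modelled without the library. In the sketch below the parser shape is simplified to Fn(&str, usize) -> Option<usize>; that signature, the name counted_sketch, and the use of a loop in place of recursion are all assumptions made for illustration and are not autumn's API.

```rust
/// Simplified model of a parser-producing function. The returned closure
/// captures `character` and `count`; given a source and a starting byte
/// offset it returns the offset just past `count` copies of `character`,
/// or `None` if the input does not match.
fn counted_sketch(character: char, count: usize) -> impl Fn(&str, usize) -> Option<usize> {
    move |source: &str, start: usize| -> Option<usize> {
        let mut chars = source.get(start..)?.chars();
        for _ in 0..count {
            // Fail on end of input or on any other character.
            if chars.next()? != character {
                return None;
            }
        }
        Some(start + count * character.len_utf8())
    }
}
```

A caller would build the parser once, for example let three_a = counted_sketch('a', 3), and then apply it to different inputs and offsets.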

Parsers that produce errors

The error and throw functions can be used when the output of a parse must be checked for validity, or when a successful parse itself indicates an error in the source. The value parser can be used to produce a successful result once a check shows that no error has occurred.

#[derive(Clone)]
struct InvalidIdentifier(String);

/// Parses C-like identifiers
fn identifier(
    source: &str,
    location: Span,
) -> ParseResult<Option<String>, InvalidIdentifier> {
    alphabetic
        .or("_")
        .and(alphanumeric.or("_").multiple().maybe())
        .copy_string()
        .map(Some)
        .on_none(
            any_character
                .str_condition(|c| !c.chars().any(char::is_whitespace))
                .multiple()
                .copy_string()
                .and_then(|identifier| throw(None, InvalidIdentifier(identifier)))
        )
        .catch()
        .parse(source, location)
}
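
The check-then-report idea behind this example can be shown in plain Rust: either a valid value is produced, or an error carrying the offending text is reported. The sketch below does not use autumn: check_identifier is a hypothetical helper, InvalidIdentifier is redefined locally so the block stands alone, and treating the whole whitespace-delimited token as the unit of validation is an assumption that may not match the parser above exactly.

```rust
#[derive(Debug, PartialEq)]
struct InvalidIdentifier(String);

/// Hypothetical validator: takes the leading run of non-whitespace characters
/// from `source` and classifies it as a C-like identifier or an invalid one.
fn check_identifier(source: &str) -> Result<String, InvalidIdentifier> {
    let token: String = source.chars().take_while(|c| !c.is_whitespace()).collect();
    let mut chars = token.chars();
    // The first character must be an ASCII letter or an underscore.
    let valid_start = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    // Every remaining character must be an ASCII letter, digit or underscore.
    let valid_rest = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if valid_start && valid_rest {
        Ok(token)
    } else {
        Err(InvalidIdentifier(token))
    }
}
```

Here the Ok branch corresponds to the value that a successful parse produces, and the Err branch corresponds to the InvalidIdentifier error passed to throw.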

Functions

alphabetic

Parses a single ASCII alphabetic character from the input

alphanumeric

Parses a single ASCII alphabetic or digit character from the input

any_character

Parses a single character from the input as an element of a Span

character

Parses a single character from the input

closure

Converts a parser-like closure into an actual parser

digit

Parses a single ASCII digit character from the input

empty

A parser that consumes no input and produces an empty Concat

error

Creates a parser that consumes no input and produces the given error as the result of parsing

ref_parser

Takes a reference to a parser and produces a complete parser

space

Parses one or more Unicode whitespace characters from the input

throw

Creates a parser that consumes no input and produces the given error as an exception

value

Creates a parser that consumes no input and produces the given value as the result of parsing

whitespace

Parses a single Unicode whitespace character from the input