Trait Lexer

pub trait Lexer: Debug {
    type Token: Sized + Debug;
    type State: Sized + Copy + Debug + Default;
    type Error: LexerError<Self::State>;

    // Required methods
    fn parse<'a>(
        &'a self,
        state: Self::State,
        parsers: &[BoxDynLexerParseFn<'a, Self>],
    ) -> LexerParseResult<Self::State, Self::Token, Self::Error>;
    fn iter<'iter>(
        &'iter self,
        parsers: &'iter [BoxDynLexerParseFn<'iter, Self>],
    ) -> Box<dyn Iterator<Item = Result<Self::Token, Self::Error>> + 'iter>;
}

The Lexer trait is implemented by stream types that support parsing into tokens.

The trait itself requires:

  • a token type that the Lexer will produce

  • a stream state (often just a byte offset) that can be tracked during parsing

  • an error type that implements LexerError, so that the lexer can report a failure when a token parse fails

The Lexer parses its stream by matching data in the stream to tokens using parser functions. Each such function is invoked with a reference to the stream being parsed, the stream state, and the next character in the stream (the one pointed to by the stream state).

For a LexerOfStr stream, the signature is:

   fn parse(stream: &LexerOfStr<P, T, E>, pos: P, ch: char) ->
              LexerParseResult<P, T, E>

where

   LexerParseResult<P, T, E> = Result<Option<(P, T)>, E>

Parsing functions examine the character they are given, and possibly more characters by accessing the stream using the provided state. If they match, they return an Ok result containing the token they have parsed and an updated state that is beyond the matched token.

If the parser function does not match, it returns an Ok result of None.

If the parser function hits a fatal error (for example, a stream indicates a network disconnection) then it must return an Err with the appropriate error (of its provided Error type).
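The three outcomes described above can be sketched in a minimal, self-contained form. This sketch does not use the crate's own types: ParseResult, Token, StrStream and parse_number are hypothetical stand-ins that mirror the shape of LexerParseResult and of a parser function over a string stream, with a byte offset as the stream state. Treating numeric overflow as a fatal error is an assumption made purely for illustration.

```rust
/// Local stand-in for the crate's result alias: `Ok(Some((state, token)))`
/// on a match, `Ok(None)` on a mismatch, `Err(e)` on a fatal error.
type ParseResult<P, T, E> = Result<Option<(P, T)>, E>;

/// Hypothetical token type for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Number(u32),
}

/// Hypothetical stream: a string slice, with a byte offset as its state.
#[derive(Debug)]
struct StrStream<'a> {
    text: &'a str,
}

/// Matches a run of ASCII digits starting at byte offset `pos`.
/// `ch` is expected to be the character found at `pos`.
fn parse_number(stream: &StrStream, pos: usize, ch: char) -> ParseResult<usize, Token, String> {
    if !ch.is_ascii_digit() {
        return Ok(None); // mismatch: another parser may still match here
    }
    let rest = &stream.text[pos..];
    // Length in bytes of the leading run of digits.
    let len = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
    match rest[..len].parse::<u32>() {
        // Match: return the token and the state just beyond it.
        Ok(value) => Ok(Some((pos + len, Token::Number(value)))),
        // Assumption for this sketch: overflow is reported as a fatal error.
        Err(e) => Err(format!("bad number at byte {pos}: {e}")),
    }
}
```

Here a call at offset 0 of "42 x" yields the number token with the state advanced past both digits, while a call at the following space yields Ok(None).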

Parser functions are provided to the Lexer as an array of Box dyn functions, such as:

      let parsers = [
           Box::new(parse_char_fn) as BoxDynLexerParseFn<OurLexer>,
           Box::new(parse_value_fn),
           Box::new(parse_whitespace_fn),
      ];

Note that the use of ‘as Box…’ is required: without it, type inference on Box::new() gives the first element the precise type of parse_char_fn, whereas the more general dyn Fn type is what is required for the array.
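The effect of the cast can be seen in isolation with a small sketch that does not depend on this crate. ParseFn, parse_digit and parse_space are hypothetical names, and the parser signature is simplified to a plain &str stream. The sketch shows the cast on the first element and, as an alternative that supplies the same type information, an explicit annotation on the binding.

```rust
/// Hypothetical stand-in for a boxed parser function type.
type ParseFn = Box<dyn Fn(&str, usize, char) -> Option<(usize, &'static str)>>;

fn parse_digit(_text: &str, pos: usize, ch: char) -> Option<(usize, &'static str)> {
    if ch.is_ascii_digit() { Some((pos + 1, "digit")) } else { None }
}

fn parse_space(_text: &str, pos: usize, ch: char) -> Option<(usize, &'static str)> {
    if ch == ' ' { Some((pos + 1, "space")) } else { None }
}

/// The cast on the first element fixes the array's element type as `ParseFn`;
/// the remaining elements are then coerced to that type.
fn parsers_with_cast() -> [ParseFn; 2] {
    let parsers = [Box::new(parse_digit) as ParseFn, Box::new(parse_space)];
    parsers
}

/// Alternative: a type annotation on the binding lets every element coerce.
fn parsers_with_annotation() -> [ParseFn; 2] {
    let parsers: [ParseFn; 2] = [Box::new(parse_digit), Box::new(parse_space)];
    parsers
}
```

Without either the cast or the annotation, each Box::new(...) would have a distinct function-item type and the array elements would not unify.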

This trait is provided in part to group the types for a lexical parser together, enabling simpler type inference and less turbofish syntax in clients of the lexical analysis.

Required Associated Types

type Token: Sized + Debug

The Token type is the type of the token to be returned by the Lexer; it is used as part of the result of the Lexer parse functions.

type State: Sized + Copy + Debug + Default

The State of the stream that is used and returned by the parse functions; it must be Copy, as it is copied constantly throughout the parsing process.

This can be, for example, a crate::StreamCharPos.

type Error: LexerError<Self::State>

The error type returned by the parser functions in the lexical analyzer.

Required Methods

fn parse<'a>(
    &'a self,
    state: Self::State,
    parsers: &[BoxDynLexerParseFn<'a, Self>],
) -> LexerParseResult<Self::State, Self::Token, Self::Error>

This attempts to parse the next token found at the state of the Lexer stream, by applying the parsers in order.

An error is returned if the token cannot be parsed.
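The idea of applying the parsers in order can be sketched as follows. This is not the crate's implementation: Token, ParseResult, ParseFn, parse_digit, parse_space and parse_next are hypothetical, the stream is a plain &str with a byte offset as its state, and reporting the end of the stream as Ok(None) is an assumption of the sketch.

```rust
/// Hypothetical token type for this sketch.
#[derive(Debug, PartialEq)]
enum Token {
    Digit(u32),
    Space,
}

// Local stand-ins for the crate's result and boxed-parser aliases.
type ParseResult = Result<Option<(usize, Token)>, String>;
type ParseFn = Box<dyn Fn(&str, usize, char) -> ParseResult>;

fn parse_digit(_text: &str, pos: usize, ch: char) -> ParseResult {
    Ok(ch.to_digit(10).map(|d| (pos + ch.len_utf8(), Token::Digit(d))))
}

fn parse_space(_text: &str, pos: usize, ch: char) -> ParseResult {
    Ok(if ch == ' ' { Some((pos + 1, Token::Space)) } else { None })
}

/// Tries each parser in order at `pos`; the first match wins.
fn parse_next(text: &str, pos: usize, parsers: &[ParseFn]) -> ParseResult {
    // Assumption: running off the end of the stream is reported as `Ok(None)`.
    let ch = match text.get(pos..).and_then(|rest| rest.chars().next()) {
        Some(ch) => ch,
        None => return Ok(None),
    };
    for parser in parsers {
        // `?` passes a fatal error from a parser straight to the caller.
        if let Some(matched) = parser(text, pos, ch)? {
            return Ok(Some(matched));
        }
    }
    // Every parser reported a mismatch, so no token can be parsed here.
    Err(format!("no parser matched {ch:?} at byte {pos}"))
}
```

Because the first match wins, the order of the parsers matters whenever two of them can match at the same position.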

fn iter<'iter>(
    &'iter self,
    parsers: &'iter [BoxDynLexerParseFn<'iter, Self>],
) -> Box<dyn Iterator<Item = Result<Self::Token, Self::Error>> + 'iter>

This creates an iterator over all of the tokens in the Lexer stream, by applying the parsers in order at the current stream position whenever the ‘next’ method is invoked.

The iterator returns None when the end of stream is reached, otherwise it returns a result of the token or an error, depending on the success of the parsers.
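An iterator with this behaviour can be sketched as a small struct that remembers the current position between calls to next. All names here (Token, ParseFn, TokenIter, tokens and the two parsers) are hypothetical and independent of the crate. Stopping after the first error is a choice made for this sketch so that the iterator always terminates; it is not a statement about what the crate's iterator does after an error.

```rust
/// Hypothetical token type for this sketch.
#[derive(Debug, PartialEq)]
enum Token {
    Digit(u32),
    Space,
}

type ParseResult = Result<Option<(usize, Token)>, String>;
type ParseFn = Box<dyn Fn(&str, usize, char) -> ParseResult>;

fn parse_digit(_text: &str, pos: usize, ch: char) -> ParseResult {
    Ok(ch.to_digit(10).map(|d| (pos + ch.len_utf8(), Token::Digit(d))))
}

fn parse_space(_text: &str, pos: usize, ch: char) -> ParseResult {
    Ok(if ch == ' ' { Some((pos + 1, Token::Space)) } else { None })
}

/// Iterator state: the stream, the current position, and the parsers.
struct TokenIter<'a> {
    text: &'a str,
    pos: usize,
    parsers: &'a [ParseFn],
    failed: bool,
}

fn tokens<'a>(text: &'a str, parsers: &'a [ParseFn]) -> TokenIter<'a> {
    TokenIter { text, pos: 0, parsers, failed: false }
}

impl<'a> Iterator for TokenIter<'a> {
    type Item = Result<Token, String>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.failed {
            return None; // sketch-only choice: stop after the first error
        }
        // End of stream: the iterator is exhausted.
        let ch = self.text.get(self.pos..)?.chars().next()?;
        for parser in self.parsers {
            match parser(self.text, self.pos, ch) {
                Ok(Some((next_pos, token))) => {
                    self.pos = next_pos; // advance past the matched token
                    return Some(Ok(token));
                }
                Ok(None) => continue, // mismatch: try the next parser
                Err(e) => {
                    self.failed = true;
                    return Some(Err(e));
                }
            }
        }
        self.failed = true;
        Some(Err(format!("no parser matched {ch:?} at byte {}", self.pos)))
    }
}
```

Collecting tokens("1 2", &parsers) into a Vec then gives one Result per token, in stream order.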

Dyn Compatibility

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors

impl<'a, P, T, E> Lexer for LexerOfStr<'a, P, T, E>
where P: PosnInCharStream, T: Debug + Clone, E: LexerError<P>,

type Token = T

type Error = E

type State = P