Module rgo::lexer

Lexer

A Lexer converts a source string into a list of tokens, which may later be used to construct an Abstract Syntax Tree.

Notes

  • We want meaningful errors from the start. That means reporting the line and column number on error and returning Results instead of panicking (later on, we may use unwinding to speed up lexical analysis in the non-erroneous case). A sketch of this approach follows these notes.

  • It is unclear whether we should operate on Unicode chars (char) or on plain bytes (u8). chars are more convenient to display and offer a clean API; bytes are (most likely) faster to work with.

  • I'm not sure what the best way to store tokens is: a slice into the original source, an interned string...? Probably an interned string; that's what rustc uses, and it speeds up comparisons, which are going to be very frequent. It probably reduces allocations, too, and we're allocating a lot. We'd have to benchmark to be sure.
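
The sketch below illustrates the first two points under stated assumptions: it tracks line and column, operates on raw bytes, and returns a Result rather than panicking. Every name in it (LexError, next_token, the Token variant) is hypothetical and does not describe rgo's actual API, and it only lexes integer literals to keep the example short.

    #[derive(Debug, PartialEq)]
    enum Token<'src> {
        // A slice into the original source; an interned string is the
        // alternative discussed above.
        Integer(&'src str),
    }

    #[derive(Debug)]
    struct LexError {
        line: usize,
        column: usize,
        message: String,
    }

    struct Lexer<'src> {
        src: &'src [u8], // plain bytes; switch to char iteration if Unicode demands it
        pos: usize,
        line: usize,
        column: usize,
    }

    impl<'src> Lexer<'src> {
        fn new(src: &'src str) -> Self {
            Lexer { src: src.as_bytes(), pos: 0, line: 1, column: 1 }
        }

        // Ok(None) at end of input, Ok(Some(token)) on success, and a LexError
        // carrying the offending position otherwise.
        fn next_token(&mut self) -> Result<Option<Token<'src>>, LexError> {
            // Skip ASCII whitespace, keeping line and column up to date.
            while let Some(&b) = self.src.get(self.pos) {
                match b {
                    b'\n' => { self.pos += 1; self.line += 1; self.column = 1; }
                    b' ' | b'\t' | b'\r' => { self.pos += 1; self.column += 1; }
                    _ => break,
                }
            }
            let start = self.pos;
            match self.src.get(self.pos) {
                None => Ok(None),
                Some(b) if b.is_ascii_digit() => {
                    while self.src.get(self.pos).map_or(false, |b| b.is_ascii_digit()) {
                        self.pos += 1;
                        self.column += 1;
                    }
                    // The slice is pure ASCII, so the conversion cannot fail.
                    let text = std::str::from_utf8(&self.src[start..self.pos]).unwrap();
                    Ok(Some(Token::Integer(text)))
                }
                Some(&b) => Err(LexError {
                    line: self.line,
                    column: self.column,
                    message: format!("unexpected byte 0x{:02x}", b),
                }),
            }
        }
    }

Storing the token text as a slice into the source avoids allocation; interning, as discussed above, would trade the lifetime parameter for cheap comparisons.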

Reexports

pub use token::*;

Structs

Lexer

Functions

tokenize

Convenience function to collect all the tokens from a string.
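
A usage sketch only: whether tokenize returns a Vec of plain tokens or of tokens paired with their spans, and whether that type implements Debug, are assumptions here.

    extern crate rgo;

    use rgo::lexer::tokenize;

    fn main() {
        // Collect every token from a small source string and print the stream.
        let tokens = tokenize("1 + 2");
        for token in &tokens {
            println!("{:?}", token);
        }
    }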