Low-level Rust lexer.
Tokens produced by this lexer are not yet ready for parsing the Rust syntax; for that, see librustc_parse::lexer, which converts this basic token stream into wide tokens used by the actual parser.
The purpose of this crate is to convert raw sources into a labeled sequence of well-known token types, so that building an actual Rust token stream is easier.
The main entity of this crate is the TokenKind enum, which represents common lexeme types.
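To make the "token kind plus size, no parsed data" shape concrete, here is a minimal, self-contained sketch in the same spirit. This is an illustrative simplification, not the real rustc_lexer implementation: the TokenKind variants and classification rules below are hypothetical stand-ins for the crate's much richer ones.

```rust
// Illustrative sketch only: a token carries its kind and its length,
// but no parsed data, mirroring the shape described above.

#[derive(Debug)]
enum TokenKind {
    Whitespace,
    Ident,
    Unknown,
}

struct Token {
    kind: TokenKind,
    len: usize, // how many chars the token spans
}

// Classify the first token of a non-empty input by scanning a run of
// characters that share the first character's class.
fn first_token(input: &str) -> Token {
    let mut chars = input.chars();
    let first = chars.next().expect("non-empty input");
    let (kind, continues): (TokenKind, fn(char) -> bool) = if first.is_whitespace() {
        (TokenKind::Whitespace, char::is_whitespace)
    } else if first.is_alphabetic() || first == '_' {
        (TokenKind::Ident, |c| c.is_alphanumeric() || c == '_')
    } else {
        (TokenKind::Unknown, |_| false)
    };
    let len = 1 + chars.take_while(|&c| continues(c)).count();
    Token { kind, len }
}

fn main() {
    let t = first_token("hello world");
    assert!(matches!(t.kind, TokenKind::Ident));
    assert_eq!(t.len, 5); // "hello" spans 5 chars
}
```

A caller can lex a whole string by repeatedly taking the first token and advancing the input by `token.len`, which is exactly how the crate's tokenize-style iterator relates to first_token.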
Modules§
- unescape - Utilities for validating string and char literals and turning them into the values they represent.
Structs§
- Token - Parsed token. It doesn’t contain information about data that has been parsed, only the type of the token and its size.
Enums§
- Base - Base of numeric literal encoding according to its prefix.
- LiteralKind
- TokenKind - Enum representing common lexeme types.
Functions§
- first_token - Parses the first token from the provided input string.
- is_id_continue - True if c is valid as a non-first character of an identifier. See the Rust language reference for a formal definition of a valid identifier name.
- is_id_start - True if c is valid as a first character of an identifier. See the Rust language reference for a formal definition of a valid identifier name.
- is_whitespace - True if c is considered whitespace according to the Rust language definition. See the Rust language reference for definitions of these classes.
- strip_shebang - rustc allows files to have a shebang, e.g. “#!/usr/bin/rustrun”, but a shebang isn’t part of Rust syntax, so this function skips the line if it starts with a shebang (“#!”). The line won’t be skipped if it represents valid Rust syntax (e.g. “#![deny(missing_docs)]”).
- tokenize - Creates an iterator that produces tokens from the input string.
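The strip_shebang behavior described above can be sketched in a few lines. This is a hedged simplification under assumptions, not the real implementation: it treats any `#!` followed by `[` as an inner attribute and keeps it, whereas the actual function performs more careful checks.

```rust
// Illustrative sketch: skip a leading shebang line, but keep text that is
// actually Rust syntax (an inner attribute like "#![deny(missing_docs)]").
// Returns the number of chars to skip, or None if there is no shebang.
fn strip_shebang(input: &str) -> Option<usize> {
    let rest = input.strip_prefix("#!")?;
    // "#![...]" is an inner attribute, part of Rust syntax, not a shebang.
    if rest.trim_start().starts_with('[') {
        return None;
    }
    // Skip everything up to (but not including) the first newline.
    Some(2 + rest.find('\n').unwrap_or(rest.len()))
}

fn main() {
    // "#!/usr/bin/rustrun" is 18 chars long, all of which are skipped.
    assert_eq!(strip_shebang("#!/usr/bin/rustrun\nfn main() {}"), Some(18));
    // A valid inner attribute is not treated as a shebang.
    assert_eq!(strip_shebang("#![deny(missing_docs)]"), None);
    assert_eq!(strip_shebang("fn main() {}"), None);
}
```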