This crate provides several utilities for creating regex-based lexers. A lexer uses `RegexSet`s from the `regex` crate for maximum efficiency. Use this when you want to spin up a lexer quickly.
Here is a quick example to get you started:
```rust
use relex::*;

#[derive(Debug, Clone, PartialEq)]
enum MyToken {
    Whitespace,
    ID,
    Float,
    Eof,
    Unrecognized,
}

impl TokenKind for MyToken {
    fn eof() -> Self { MyToken::Eof }
    fn unrecognized() -> Self { MyToken::Unrecognized }
}

let mut lexer = RecognizerBuilder::new()
    .token(Rule::new(MyToken::Whitespace, r"\s+").unwrap().skip(true))
    .token(Rule::new(MyToken::ID, r"[A-Za-z]+").unwrap())
    // this one captures groups because it calls `.capture(true)`
    .token(Rule::new(MyToken::Float, r"(\d+)(?:\.(\d+))?").unwrap().capture(true))
    .build()
    .into_lexer(" abc", 0);

let token = lexer.next().unwrap();
assert_eq!(token.kind, MyToken::ID);
assert_eq!(token.text, "abc");
assert_eq!(token.skipped[0].kind, MyToken::Whitespace);
```
Structs§
- Lexer - A stateful lexer, i.e. one that keeps track of its current position.
- Recognizer - A recognizer that houses a bunch of Rules.
- RecognizerBuilder - A convenient builder-pattern struct that creates Lexers.
- Rule - Represents a lexer rule: i.e., a regex that can produce tokens, and whether they are skippable or not. A Rule also stores the token kind.
- Token - Represents a detected token.
- TokenCapture - Represents information for a given capture for a given token.
Traits§
- TokenKind - You must implement this trait for your own custom token kinds.
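Implementing the trait typically looks like the `MyToken` impl in the example above. The self-contained sketch below repeats that impl alongside a local trait definition whose shape is *inferred* from the example; relex's actual `TokenKind` trait may declare additional items, so treat this as an illustration rather than the crate's exact signature.

```rust
// Hypothetical local stand-in for relex's TokenKind trait; the two
// required constructors are inferred from the crate example above.
trait TokenKind {
    // The token kind the lexer emits at end of input.
    fn eof() -> Self;
    // The token kind the lexer emits for text no rule matches.
    fn unrecognized() -> Self;
}

#[derive(Debug, Clone, PartialEq)]
enum MyToken {
    Whitespace,
    ID,
    Float,
    Eof,
    Unrecognized,
}

impl TokenKind for MyToken {
    fn eof() -> Self { MyToken::Eof }
    fn unrecognized() -> Self { MyToken::Unrecognized }
}
```

The trait only asks which variants mark "end of input" and "no rule matched", so the lexer can synthesize those tokens itself without knowing anything else about your enum.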