Crate relex


This crate provides several utilities for creating regex-based lexers. A lexer is backed by a `RegexSet` from the regex crate for efficiency. Use this crate when you want to spin up a lexer quickly.

Here is a quick example to get you started:

use relex::*;

#[derive(Debug, Clone, PartialEq)]
enum MyToken {
  Whitespace,
  ID,
  Float,
  Eof,
  Unrecognized,
}
impl TokenKind for MyToken {
  fn eof() -> Self { MyToken::Eof }
  fn unrecognized() -> Self { MyToken::Unrecognized }
}

let mut lexer = RecognizerBuilder::new()
  .token(Rule::new(MyToken::Whitespace, r"\s+").unwrap().skip(true))
  .token(Rule::new(MyToken::ID, r"[A-Za-z]+").unwrap())
  .token(Rule::new(MyToken::Float, r"(\d+)(?:\.(\d+))?").unwrap().capture(true)) // this one captures groups
                                                                                 // because it calls `.capture(true)`
  .build()
  .into_lexer(" abc", 0);

let token = lexer.next().unwrap();
assert_eq!(token.kind, MyToken::ID);
assert_eq!(token.text, "abc");
assert_eq!(token.skipped[0].kind, MyToken::Whitespace);
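Building on the snippet above, a typical driver loop might look like the following. This is a sketch only: it assumes `next()` returns `Option<Token>` and that a token of the `eof()` kind is produced once the input is exhausted, as the `TokenKind::eof` method suggests; consult the Lexer docs for the exact iteration contract.

```rust
let mut lexer = RecognizerBuilder::new()
  .token(Rule::new(MyToken::Whitespace, r"\s+").unwrap().skip(true))
  .token(Rule::new(MyToken::ID, r"[A-Za-z]+").unwrap())
  .build()
  .into_lexer("foo bar", 0);

loop {
  let token = lexer.next().unwrap();
  // Stop once the end-of-input token is reached.
  if token.kind == MyToken::Eof {
    break;
  }
  println!("{:?}: {:?}", token.kind, token.text);
}
```

Unrecognized input does not abort the loop: per the `TokenKind::unrecognized` method, it should surface as a token of the `unrecognized()` kind, which you can report or skip as you see fit.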

Structs

Lexer

A stateful lexer, i.e. one that keeps track of its current position in the input.

Recognizer

A recognizer that holds a set of Rules

RecognizerBuilder

A convenient builder-pattern struct for constructing a Recognizer, which can then be turned into a Lexer

Rule

Represents a lexer rule: a regex that can produce tokens, the kind of token it produces, and whether those tokens are skippable.

Token

Represents a detected token

TokenCapture

Represents information for a given capture for a given token.

Traits

TokenKind

You must implement this trait for your own custom token kinds. For example:
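Mirroring the snippet at the top of this page, an implementation provides the two required associated functions (the comments describe the roles implied by the method names; check the trait docs for the exact contract):

```rust
use relex::TokenKind;

#[derive(Debug, Clone, PartialEq)]
enum MyToken {
  // ...your other token kinds...
  Eof,
  Unrecognized,
}

impl TokenKind for MyToken {
  // The kind produced at end of input.
  fn eof() -> Self { MyToken::Eof }
  // The kind produced when no rule matches.
  fn unrecognized() -> Self { MyToken::Unrecognized }
}
```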