klex (kujira-lexer)
A simple lexer (tokenizer) generator for Rust.
English | 日本語はこちら (Japanese version)
Overview
klex generates Rust lexer code from a single definition file. You describe token patterns with regular expressions, and it outputs Rust source that includes a Token struct and a Lexer struct.
Installation
From crates.io
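If klex is published on crates.io under that name (an assumption), installation would be the standard Cargo invocation:

```shell
cargo install klex
```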
From source
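A sketch of a source build, assuming a standard Cargo project layout (the repository URL is not reproduced here):

```shell
git clone <repository-url>   # replace with the actual klex repository
cd klex
cargo install --path .
```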
Usage
As a library
// NOTE: import paths and function signatures below are reconstructed
// and may differ from the actual crate API; consult the crate docs.
use klex::{parse_spec, generate_lexer};
use std::fs;

// Read input file (path is illustrative)
let input = fs::read_to_string("example.klex").expect("failed to read input file");
// Parse the input
let spec = parse_spec(&input).expect("failed to parse spec");
// Generate Rust code
let output = generate_lexer(&spec);
// Write output
fs::write("lexer.rs", &output).expect("failed to write output file");
Command line tool
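A plausible invocation, assuming the binary is named klex and writes the generated Rust source to stdout (the actual interface may differ):

```shell
klex example.klex > lexer.rs
```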
Input file format
An input file consists of three sections separated by %%:
(Rust code here – e.g. use statements)
%%
(Rules here – token patterns written as regular expressions)
%%
(Rust code here – e.g. main function or tests)
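Putting the three sections together, a minimal definition file might look like the following (contents are illustrative; the rule syntax is described in the next section):

```
use std::fmt;

%%

[0-9]+ -> NUMBER
\+     -> PLUS

%%

fn main() {
    // code that drives the generated lexer
}
```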
Writing rules
Write one rule per line in the following form:
<regex pattern> -> <TOKEN_NAME>
Examples:
[0-9]+ -> NUMBER
[a-zA-Z_][a-zA-Z0-9_]* -> IDENTIFIER
\+ -> PLUS
\- -> MINUS
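To make the behavior of these four rules concrete, here is a small hand-written tokenizer that recognizes the same patterns. This is only an illustration of what the rules match, not the code klex generates:

```rust
// Recognizes the four example rules: NUMBER, IDENTIFIER, PLUS, MINUS.
// Returns (token name, matched text) pairs; whitespace is skipped.
fn tokenize(input: &str) -> Vec<(String, String)> {
    let chars: Vec<char> = input.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_digit() {
            // [0-9]+ -> NUMBER (longest match)
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            tokens.push(("NUMBER".into(), chars[start..i].iter().collect()));
        } else if c.is_ascii_alphabetic() || c == '_' {
            // [a-zA-Z_][a-zA-Z0-9_]* -> IDENTIFIER
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            tokens.push(("IDENTIFIER".into(), chars[start..i].iter().collect()));
        } else if c == '+' {
            // \+ -> PLUS
            tokens.push(("PLUS".into(), "+".into()));
            i += 1;
        } else if c == '-' {
            // \- -> MINUS
            tokens.push(("MINUS".into(), "-".into()));
            i += 1;
        } else {
            panic!("unexpected character: {c}");
        }
    }
    tokens
}

fn main() {
    for (kind, text) in tokenize("123 + abc") {
        println!("{kind}: {text}");
    }
}
```

Note that both repetition rules take the longest possible match, which is how regex-based lexers conventionally resolve `[0-9]+`-style patterns.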
Generated Token struct
The generated lexer produces tokens with the following shape:
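A plausible shape is sketched below; the field names and types here are assumptions for illustration, not klex's actual output:

```rust
// Hypothetical Token shape -- field names and types are assumptions.
#[derive(Debug)]
pub struct Token {
    pub kind: String, // token name from the matching rule, e.g. "NUMBER"
    pub text: String, // the matched source text (lexeme)
}

fn main() {
    let t = Token {
        kind: "NUMBER".to_string(),
        text: "123".to_string(),
    };
    println!("{:?}", t);
}
```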
Examples
See example.klex for a minimal definition file.
Generate a lexer
Use the generated lexer
The generated file exports a Lexer struct and related constants:
let input = "123 + abc".to_string();
let mut lexer = Lexer::new(input);
while let Some(token) = lexer.next_token() {
    println!("{:?}", token); // loop body is illustrative
}
Tests
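Assuming a standard Cargo project layout, the test suite runs with:

```shell
cargo test
```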
License
MIT License