Note: tokenizer_py 0.2.0 has been yanked.
Python-like Tokenizer in Rust
This project implements a Python-like tokenizer in Rust.
It can tokenize a string into a sequence of tokens, which are
represented by the Token enum. The supported tokens are:
- Token::Name: a name token, such as a function or variable name.
- Token::Number: a number token, such as a literal integer or floating-point number.
- Token::String: a string token, such as a single- or double-quoted string.
- Token::OP: an operator token, such as an arithmetic or comparison operator.
- Token::Indent: an indent token, indicating that a block of code is being indented.
- Token::Dedent: a dedent token, indicating that a block of code is being dedented.
- Token::Comment: a comment token, such as a single-line or multi-line comment.
- Token::NewLine: a newline token, indicating a new line in the source code.
- Token::NL: a token indicating a new line, kept for compatibility with the original tokenizer.
- Token::EndMarker: an end-of-file marker.
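To illustrate how such an enum might be consumed, here is a minimal stand-alone sketch. The String payloads and the describe helper are assumptions for illustration only, not the crate's exact definition:

```rust
// Minimal stand-alone sketch of a Python-style token enum.
// Payload types are assumptions; the real crate's Token enum may differ.
#[derive(Debug, PartialEq)]
enum Token {
    Name(String),
    Number(String),
    OP(String),
    NewLine,
    EndMarker,
}

// Render a token roughly the way Python's tokenize module prints it.
fn describe(tok: &Token) -> String {
    match tok {
        Token::Name(s) => format!("NAME {:?}", s),
        Token::Number(s) => format!("NUMBER {:?}", s),
        Token::OP(s) => format!("OP {:?}", s),
        Token::NewLine => "NEWLINE".to_string(),
        Token::EndMarker => "ENDMARKER".to_string(),
    }
}

fn main() {
    let toks = vec![
        Token::Name("x".to_string()),
        Token::OP("=".to_string()),
        Token::Number("42".to_string()),
        Token::NewLine,
        Token::EndMarker,
    ];
    for t in &toks {
        println!("{}", describe(t));
    }
}
```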
The tokenizer recognizes the following tokens:

- Whitespace: spaces, tabs, and newlines.
- Numbers: integers and floating-point numbers.
  - float: floating-point numbers.
  - int: integer numbers.
  - complex: complex numbers.
- Names: identifiers and keywords.
- Strings: single- and double-quoted strings.
  - basic string: single- and double-quoted strings.
  - format string: Python format strings.
  - byte string: Python byte strings.
  - raw string: raw strings.
  - multi-line string: single- and double-quoted multi-line strings.
  - combined string: strings with a combined prefix.
- Operators: arithmetic, comparison, and other operators.
- Comments: single-line comments.
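To make these categories concrete, here is a toy, self-contained classifier that assigns one of the categories above to a single lexeme. It is an illustration only, not the crate's implementation, which handles many more cases (string prefixes, indentation, multi-character operators, and so on):

```rust
// Toy classifier for illustration; the real tokenizer is far more thorough.
fn classify(lexeme: &str) -> &'static str {
    match lexeme.chars().next() {
        None => "empty",
        Some('#') => "comment",
        Some('"') | Some('\'') => "string",
        Some(c) if c.is_ascii_digit() => {
            if lexeme.contains('.') {
                "float"
            } else if lexeme.ends_with('j') {
                "complex" // Python spells complex literals with a trailing j
            } else {
                "int"
            }
        }
        Some(c) if c.is_alphabetic() || c == '_' => "name",
        _ => "operator",
    }
}

fn main() {
    for lx in ["x1", "3.14", "42", "2j", "\"hi\"", "+", "# note"] {
        println!("{} -> {}", lx, classify(lx));
    }
}
```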
The tokenizer also provides a tokenize
method that takes a string as input and returns a Result containing a vector
of tokens.
Usage
Add this to your Cargo.toml:
[dependencies]
tokenizer_py = "0.2.0"
Examples
Example of using the tokenizer to tokenize the string "hello world":

use tokenizer_py::{tokenize, Token};

let tokens = tokenize("hello world").unwrap();
assert_eq!(tokens, vec![
    Token::Name("hello".to_string()),
    Token::Name("world".to_string()),
    Token::NewLine,
    Token::EndMarker,
]);
Example of using the BinaryExp structure to evaluate the binary expression "10 + 10"
use tokenizer_py::{tokenize, Token};

// Structure representing a binary expression; its definition is omitted
// here. It is assumed to hold the left operand, operator, and right
// operand tokens and to evaluate them in an execute method.
let mut tokens = tokenize("10 + 10").unwrap();
let _ = tokens.pop(); // Remove Token::EndMarker
let _ = tokens.pop(); // Remove Token::NewLine
let binexp = BinaryExp::new(
    tokens.pop().unwrap(),
    tokens.pop().unwrap(),
    tokens.pop().unwrap(),
);
assert_eq!(binexp.execute(), Ok(20)); // Checking the execution result
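Since the BinaryExp definition itself is not shown above, here is a self-contained sketch of an evaluator in the same spirit. It uses its own minimal Token enum with String payloads, which is an assumption about the crate's representation, and returns Option rather than whatever error type the original uses:

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Number(String),
    OP(String),
}

// Binary expression over two number tokens, e.g. 10 + 10.
struct BinaryExp {
    left: Token,
    center: Token,
    right: Token,
}

impl BinaryExp {
    fn new(left: Token, center: Token, right: Token) -> Self {
        BinaryExp { left, center, right }
    }

    // Evaluate `left <op> right` for integer operands; None on any mismatch.
    fn execute(&self) -> Option<i64> {
        match (&self.left, &self.center, &self.right) {
            (Token::Number(l), Token::OP(op), Token::Number(r)) => {
                let l: i64 = l.parse().ok()?;
                let r: i64 = r.parse().ok()?;
                match op.as_str() {
                    "+" => Some(l + r),
                    "-" => Some(l - r),
                    "*" => Some(l * r),
                    _ => None,
                }
            }
            _ => None,
        }
    }
}

fn main() {
    let exp = BinaryExp::new(
        Token::Number("10".to_string()),
        Token::OP("+".to_string()),
        Token::Number("10".to_string()),
    );
    println!("{:?}", exp.execute()); // Some(20)
}
```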