§SAS Lexer
A lexer for the SAS programming language.
§Usage
```rust
use sas_lexer::{lex_program, LexResult, TokenIdx};

let source = "data mydata; set mydataset; run;";

let LexResult { buffer, .. } = lex_program(&source).unwrap();

let tokens: Vec<TokenIdx> = buffer.iter_tokens().collect();

for token in tokens {
    println!("{:?}", buffer.get_token_raw_text(token, &source));
}
```
§Features
macro_sep
: Enables a special virtual `MacroSep` token that is emitted between open code and macro statements when there is no "natural" separator, or when a semicolon is missing between two macro statements (a coding error). A downstream parser may use it as a reliable terminating token for dynamic open code and thus avoid lookaheads. "Dynamic" means the statement has macro statements in it, like `data %if cond %then %do; t1 %end; %else %do; t2 %end;;`.
serde
: Enables serialization and deserialization of the `ResolvedTokenInfo` struct using the `serde` library. For an example of usage, see the Python bindings crate `sas-lexer-py`.

opti_stats
: Enables some additional statistics during lexing, used for performance tuning. Not intended for general use.
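The `macro_sep` behavior described above can be illustrated with a small, self-contained sketch. This is not the crate's implementation; the token kinds here are simplified stand-ins for the real token stream, showing only the idea of emitting a virtual separator before a macro statement that is not preceded by a natural separator:

```rust
// Hypothetical, simplified token kinds -- not the crate's TokenType enum.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    OpenCode,  // any open-code token
    Semicolon, // a "natural" separator
    MacroStmt, // start of a macro statement such as %if
    MacroSep,  // the virtual separator this sketch emits
}

// Insert a virtual MacroSep before a macro statement that directly follows
// open code (or another macro statement) with no semicolon in between.
fn insert_macro_sep(tokens: &[Tok]) -> Vec<Tok> {
    let mut out = Vec::with_capacity(tokens.len());
    for (i, &t) in tokens.iter().enumerate() {
        if t == Tok::MacroStmt && i > 0 && tokens[i - 1] != Tok::Semicolon {
            out.push(Tok::MacroSep);
        }
        out.push(t);
    }
    out
}

fn main() {
    // Like `data %if ...`: a macro statement follows open code without a semicolon.
    let stream = [Tok::OpenCode, Tok::MacroStmt, Tok::Semicolon];
    assert_eq!(
        insert_macro_sep(&stream),
        vec![Tok::OpenCode, Tok::MacroSep, Tok::MacroStmt, Tok::Semicolon]
    );
    println!("ok");
}
```

With such a separator in the stream, a parser can end a dynamic open-code statement as soon as it sees `MacroSep`, instead of looking ahead to decide where the statement stops.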
§License
Licensed under the Affero GPL v3 license.
Modules§
Structs§
- LexResult
- Result of lexing
- ResolvedTokenInfo - A struct with all token information, usable without the TokenizedBuffer.
- TokenIdx - A token index, used to get actual token data via the tokenized buffer.
- TokenInfo - A struct to hold information about the tokens in the tokenized buffer.
- TokenizedBuffer - A special structure produced by the lexer that stores the full information about lexed tokens and lines. A struct of arrays, used to optimize memory usage and cache locality.
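To make the struct-of-arrays idea concrete, here is a minimal sketch. The field names and methods are hypothetical, not the crate's actual internals: instead of one `Vec` of token structs, each attribute lives in its own parallel array, and a token index is just a position into those arrays:

```rust
// Hypothetical struct-of-arrays token buffer (field names are illustrative).
struct SoaBuffer {
    starts: Vec<u32>,   // byte offset where each token begins in the source
    lengths: Vec<u32>,  // byte length of each token
    channels: Vec<u8>,  // channel id of each token
}

impl SoaBuffer {
    fn push(&mut self, start: u32, len: u32, channel: u8) {
        self.starts.push(start);
        self.lengths.push(len);
        self.channels.push(channel);
    }

    // A token "index" addresses the same slot in every parallel array.
    fn raw_text<'a>(&self, idx: usize, source: &'a str) -> &'a str {
        let s = self.starts[idx] as usize;
        &source[s..s + self.lengths[idx] as usize]
    }
}

fn main() {
    let source = "data mydata;";
    let mut buf = SoaBuffer { starts: vec![], lengths: vec![], channels: vec![] };
    buf.push(0, 4, 0); // "data"
    buf.push(5, 6, 0); // "mydata"
    assert_eq!(buf.raw_text(1, source), "mydata");
    println!("ok");
}
```

Packing each attribute contiguously keeps like data together in memory, which is what gives the layout its cache-locality and memory-usage advantages over an array of structs.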
Enums§
- Payload - Enum representing various types of extra data associated with a token.
- TokenChannel - Token channel.
- TokenType - What you'd expect: the token types.
Functions§
- lex_program - Lex the source code of an entire program.