# leekscript-rs
A LeekScript parser implemented in Rust using sipha 2.0 (PEG parser with green/red syntax trees).
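The green/red tree idea can be sketched in a few lines — a toy illustration of the concept, not this crate's (or sipha's) actual types: green nodes are immutable and store only widths, while red nodes wrap a green node with an absolute offset computed on demand.

```rust
// Toy green/red tree. Green nodes are position-independent; Red nodes
// ("cursors") derive absolute offsets while walking down the tree.
#[derive(Debug)]
enum Green {
    Token { text: String },
    Node { children: Vec<Green> },
}

impl Green {
    fn width(&self) -> usize {
        match self {
            Green::Token { text } => text.len(),
            Green::Node { children } => children.iter().map(Green::width).sum(),
        }
    }
}

struct Red<'a> {
    green: &'a Green,
    offset: usize, // absolute position in the source
}

impl<'a> Red<'a> {
    fn children(&self) -> Vec<Red<'a>> {
        let mut out = Vec::new();
        let mut offset = self.offset;
        if let Green::Node { children } = self.green {
            for child in children {
                out.push(Red { green: child, offset });
                offset += child.width();
            }
        }
        out
    }
}

fn main() {
    // "var x" as one node with three tokens
    let tree = Green::Node {
        children: vec![
            Green::Token { text: "var".into() },
            Green::Token { text: " ".into() },
            Green::Token { text: "x".into() },
        ],
    };
    let root = Red { green: &tree, offset: 0 };
    let offsets: Vec<usize> = root.children().iter().map(|c| c.offset).collect();
    println!("{:?}", offsets); // prints [0, 3, 4]
}
```

Because green nodes carry no positions, subtrees can be shared and reused across edits; only the cheap red layer is recomputed.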
## Status
- Phase 1 (lexer) — Done: token stream (keywords, identifiers, numbers, strings, operators, brackets, comments). Use `parse_tokens()`.
- Phase 2 — Done: primary expressions (number, string, identifier, parenthesized). Use `parse_expression()`.
- Phase 3 — Done: program = list of statements (var/global/const/let, if/while/for/do-while, return/break/continue, blocks, expression statements). Use `parse()`.
- Phase 4 — Done: top-level statements include `include`, function declarations, and class declarations; the program root is a single node with statement children.
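The Phase 1 token categories can be pictured as a small enum with a classifier — a hypothetical illustration only; the crate's real token types live in its own `TokenType`-style definition.

```rust
// Hypothetical token categories mirroring the Phase 1 list; keyword set
// and classification rules here are illustrative, not the crate's.
#[derive(Debug, PartialEq)]
enum TokenKind {
    Keyword,
    Identifier,
    Number,
    String,
    Operator,
    Bracket,
    Comment,
}

fn classify(lexeme: &str) -> TokenKind {
    const KEYWORDS: &[&str] = &[
        "var", "global", "const", "let", "if", "while", "for", "do",
        "return", "break", "continue", "function", "class", "include",
    ];
    if lexeme.starts_with("//") {
        TokenKind::Comment
    } else if lexeme.starts_with('"') {
        TokenKind::String
    } else if !lexeme.is_empty() && lexeme.chars().all(|c| c.is_ascii_digit()) {
        TokenKind::Number
    } else if KEYWORDS.contains(&lexeme) {
        TokenKind::Keyword
    } else if !lexeme.is_empty() && lexeme.chars().all(|c| "([{)]}".contains(c)) {
        TokenKind::Bracket
    } else if !lexeme.is_empty() && lexeme.chars().all(|c| "+-*/=<>!&|%".contains(c)) {
        TokenKind::Operator
    } else {
        TokenKind::Identifier
    }
}

fn main() {
    assert_eq!(classify("var"), TokenKind::Keyword);
    assert_eq!(classify("42"), TokenKind::Number);
    assert_eq!(classify("weapon"), TokenKind::Identifier);
    println!("ok");
}
```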
## CLI
The `leekscript` binary supports `format` and `validate` (with more subcommands to come):
- Format from stdin to stdout
- Format a file in place
- Check whether formatting would change a file (exit code 1 if it would)
- Validate syntax and run semantic analysis (scopes, types, deprecations)
- Canonical formatting: normalize indentation, braces, and semicolons
By default, `format` prints the syntax tree as-is (round-trip). Use `--canonical` to normalize layout (indentation, brace style, semicolons). `--preserve-comments` (the default) keeps comments and whitespace in the output. See `leekscript format --help`.
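Putting the pieces above together, invocations might look like the following — illustrative only, built from the subcommands and flags named in this README rather than verified against the CLI:

```sh
# round-trip format, stdin to stdout (illustrative)
leekscript format < program.leek

# normalized layout (illustrative)
leekscript format --canonical < program.leek

# syntax + semantic checks (illustrative)
leekscript validate program.leek
```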
## Library usage
```rust
// Item names follow this README; the `leekscript` crate path and the
// `FormatOptions` type name are assumptions.
use leekscript::{format, parse, parse_tokens, FormatOptions};

let source = "var x = 1;";

// Token stream only (Phase 1)
let out = parse_tokens(source)?;
let root = out.syntax_root.unwrap();

// Full parse
let root = parse(source)?.syntax_root.expect("no syntax root");

// Format
let options = FormatOptions::default();
let formatted = format(&root, &options);
```
## Examples
- Parse and print the syntax tree for the example `.leek` files
- Validate a program (optionally pass a path to a `.leek` file)
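A hypothetical `.leek` input of the kind these commands accept, exercising the constructs listed under Status (exact LeekScript syntax details here are assumptions, not taken from this repo's examples):

```leekscript
include("utils");

global COUNT = 3;

function greet(name) {
    var message = "Hello, " + name;
    if (COUNT > 0) {
        return message;
    }
    return null;
}

class Leek {
    // fields and methods per Phase 4
}
```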
## Tests
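Assuming the standard Cargo layout, the test suite runs with:

```sh
cargo test
```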
## Architecture
See `ARCHITECTURE.md` for the grammar phases (token stream → expression → program), the analysis pipeline (ScopeBuilder → Validator → TypeChecker → DeprecationChecker), and how `DocumentAnalysis` ties parsing, analysis, and the definition map together for the LSP.
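The pass pipeline can be sketched as a sequence of analyses that each read the input and append diagnostics — a toy model, not the crate's actual trait: the pass names (ScopeBuilder, DeprecationChecker) come from this README, everything else is invented.

```rust
// Toy multi-pass analysis pipeline. Each pass inspects the source and
// appends diagnostics; real passes would walk the syntax tree instead.
struct Diagnostic {
    pass: &'static str,
    message: String,
}

trait Pass {
    fn name(&self) -> &'static str;
    fn run(&self, source: &str, diagnostics: &mut Vec<Diagnostic>);
}

struct ScopeBuilder;
impl Pass for ScopeBuilder {
    fn name(&self) -> &'static str { "ScopeBuilder" }
    fn run(&self, source: &str, diagnostics: &mut Vec<Diagnostic>) {
        // Invented rule: flag a marker string standing in for an unresolved name.
        if source.contains("undefined_var") {
            diagnostics.push(Diagnostic { pass: self.name(), message: "unresolved name".into() });
        }
    }
}

struct DeprecationChecker;
impl Pass for DeprecationChecker {
    fn name(&self) -> &'static str { "DeprecationChecker" }
    fn run(&self, source: &str, diagnostics: &mut Vec<Diagnostic>) {
        // Invented rule: flag a marker string standing in for a deprecated call.
        if source.contains("old_api") {
            diagnostics.push(Diagnostic { pass: self.name(), message: "deprecated call".into() });
        }
    }
}

fn analyze(source: &str) -> Vec<Diagnostic> {
    let passes: Vec<Box<dyn Pass>> = vec![Box::new(ScopeBuilder), Box::new(DeprecationChecker)];
    let mut diagnostics = Vec::new();
    for pass in &passes {
        pass.run(source, &mut diagnostics);
    }
    diagnostics
}

fn main() {
    let diags = analyze("var x = old_api();");
    for d in &diags {
        println!("[{}] {}", d.pass, d.message); // prints [DeprecationChecker] deprecated call
    }
}
```

Running the passes in a fixed order lets later passes (e.g. type checking) assume earlier results (e.g. resolved scopes) already exist.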
## Reference
The grammar and token set are aligned with the LeekScript Java compiler (lexer in `LexicalParser.java`, token types in `TokenType.java`).