minipg 0.2.0

A modern parser generator supporting ANTLR4 grammars with code generation for Rust, Python, and JavaScript
Documentation

minipg - Mini Parser Generator

A fast, Rust-native ANTLR4-compatible parser generator focused on the Rust ecosystem. Generate standalone parsers for Rust and Python from ANTLR4 grammars with no runtime dependencies.

✨ Features

🚀 Performance

  • Fast code generation - Sub-millisecond for typical grammars
  • Linear O(n) scaling with grammar complexity
  • Low memory usage - <100 KB for code generation
  • No runtime dependencies - Generates standalone parsers

🌍 Language Support (3 Core Languages)

  • Rust - Primary target with optimized DFA generation and inline attributes ✅
  • Python - Type hints and dataclasses (Python 3.10+) ✅
  • JavaScript - Modern ES6+ with error recovery ✅

🎯 ANTLR4 Compatible

  • Character Classes - Full Unicode support with escapes (\u0000-\uFFFF) ✅
  • Non-Greedy Quantifiers - .*?, .+?, .??
  • Lexer Commands - -> skip, -> channel(NAME), -> mode(NAME)
  • Lexer Modes & Channels - Mode stack management and channel routing ✅
  • Labels - Element labels (id=ID) and list labels (ids+=ID) ✅
  • Named Actions - @header, @members
  • Actions & Predicates - Embedded actions and semantic predicates ✅
  • Fragments - Reusable lexer components ✅
  • Parameterized Rules - Arguments, returns, and local variables ✅
  • Grammar Imports - import X; syntax ✅
  • Grammar Options - options {...} blocks ✅

🔧 Core Features

  • Standalone Code Generation - No runtime dependencies
  • Semantic Analysis - Undefined rules, duplicates, left recursion detection
  • Rich Diagnostics - Detailed error messages with location information
  • Visitor Pattern - Flexible AST traversal
  • CLI Tool - Easy-to-use command-line interface
  • Comprehensive Testing - 150+ tests with 100% pass rate

Architecture

minipg is organized as a single crate with modular structure:

  • core: Core types, traits, and error handling
  • ast: Abstract Syntax Tree definitions and visitor patterns
  • parser: Grammar file parser (lexer + parser)
  • analysis: Semantic analysis and validation
  • codegen: Code generation for target languages (Rust, Python, JavaScript)
  • CLI: Command-line interface with binary

See ARCHITECTURE.md for detailed design documentation.

Installation

From crates.io

cargo install minipg

From Source

git clone https://github.com/yingkitw/minipg
cd minipg
cargo install --path .

Usage

Generate a Parser

# Generate Rust parser
minipg generate grammar.g4 -o output/ -l rust

# Generate Python parser
minipg generate grammar.g4 -o output/ -l python

# Generate JavaScript parser
minipg generate grammar.g4 -o output/ -l javascript

Validate a Grammar

minipg validate grammar.g4

Show Grammar Information

minipg info grammar.g4

Grammar Syntax

minipg supports ANTLR4-compatible syntax with advanced features:

grammar Calculator;

// Parser rules
expr: term (('+' | '-') term)*;
term: factor (('*' | '/') factor)*;
factor: NUMBER | '(' expr ')';

// Lexer rules with character classes
NUMBER: [0-9]+;
IDENTIFIER: [a-zA-Z_][a-zA-Z0-9_]*;

// Non-greedy quantifiers for comments
BLOCK_COMMENT: '/*' .*? '*/' -> skip;
LINE_COMMENT: '//' .*? '\n' -> skip;

// Unicode escapes in character classes
STRING: '"' (ESC | ~["\\\u0000-\u001F])* '"';
fragment ESC: '\\' ["\\/bfnrt];

// Lexer commands
WS: [ \t\r\n]+ -> skip;

Comparisons

minipg vs ANTLR4 vs Pest

Feature minipg ANTLR4 Pest
Language Rust Java Rust
Runtime Dependency None (standalone) Requires runtime library Requires runtime library
Grammar Syntax ANTLR4 compatible ANTLR4 (native) PEG
Target Languages Rust, Python, JavaScript Java, Python, JS, C#, C++, Go, Swift Rust only
Code Generation Standalone parsers Runtime-based parsers Macro-based
Generation Speed Sub-millisecond Seconds Compile-time
Memory Usage <100 KB Higher (JVM overhead) Low (Rust native)
AST Patterns Visitor/listener Visitor/listener Manual tree walking
Error Recovery Built-in Built-in Stops at first error
Test Coverage 150+ tests, 100% pass Comprehensive Good
Standalone Output ✅ Yes ❌ Requires runtime ❌ Requires runtime
ANTLR4 Grammars ✅ Compatible ✅ Native ❌ Different syntax

Key Advantages of minipg:

  • Fast code generation - Sub-millisecond for typical grammars
  • 🚀 No runtime dependencies - Generates standalone parsers
  • 🦀 Modern Rust - Safety guarantees and modern tooling
  • 📦 Smaller footprint - <100 KB memory usage
  • 🔧 Easy integration - No Java runtime required
  • ANTLR4 compatible - Works with existing ANTLR4 grammars
  • Focused scope - Rust and Python targets, well-tested

Choose minipg if you need:

  • ANTLR4 grammar compatibility
  • Standalone parsers with no runtime dependencies
  • Rust or Python parser generation
  • Fast code generation
  • Modern Rust implementation

Choose ANTLR4 if you need:

  • Mature, battle-tested tooling
  • Extensive documentation and community
  • Java ecosystem integration
  • Runtime-based parsing with advanced features

Choose Pest if you need:

  • Rust-only parsing
  • PEG parsing semantics
  • Compile-time grammar validation
  • Tight Rust macro integration
  • Zero-cost abstractions at compile time

See docs/archive/COMPARISON_WITH_ANTLR4RUST.md and docs/archive/COMPARISON_WITH_PEST.md for detailed comparisons.

Documentation

Development

Building

cargo build

Running Tests

cargo test --all

✅ All Tests Passing!

minipg has comprehensive test coverage with 186+ tests passing at 100% success rate:

  • 106 unit tests - Core functionality and parsing
  • 19 integration tests - Full pipeline (parse → analyze → generate)
  • 21 analysis tests - Semantic analysis, ambiguity detection, reachability
  • 21 codegen tests - Multi-language code generation
  • 19 compatibility tests - ANTLR4 feature compatibility
  • 13 feature tests - Advanced grammar features
  • 9 example tests - Real-world grammar examples

Grammar Test Suite: minipg can successfully parse and generate code from a wide variety of ANTLR4 grammars, including:

  • ✅ All example grammars in the repository
  • ✅ Real-world grammars from the grammars-v4 repository
  • ✅ Complex grammars with advanced features (modes, channels, actions)
  • ✅ Multi-language code generation validation

All tests pass successfully, demonstrating robust grammar parsing and code generation capabilities.

Running with Logging

RUST_LOG=info cargo run -- generate grammar.g4

Project Status

  • Current Version: 0.2.0 (Simplified & Focused)
  • Status: Core Features Complete ✅
  • Test Suite: 150+ tests with 100% pass rate
    • ✅ Grammar parsing tests
    • ✅ Code generation tests (Rust, Python, JavaScript)
    • ✅ Integration tests
    • ✅ ANTLR4 compatibility tests
  • Target Languages: 3 core languages (Rust, Python, JavaScript)
  • Grammar Support:
    • ✅ CompleteJSON.g4
    • ✅ SQL.g4
    • ✅ 15+ example grammars
    • ✅ ANTLR4 grammars-v4 compatible
  • ANTLR4 Compatibility: High - supports most common features
  • Core Features:
    • ✅ Rule arguments: rule[Type name]
    • ✅ Return values: returns [Type name]
    • ✅ Local variables: locals [Type name]
    • ✅ List labels (ids+=ID)
    • ✅ Named actions (@header, @members)
    • ✅ Lexer modes and channels
    • ✅ Grammar composition and imports

See TODO.md for current tasks and docs/archive/ROADMAP.md for the complete roadmap.

License

Apache-2.0