Parsanol-rs

A high-performance PEG (Parsing Expression Grammar) parser library for Rust with packrat memoization and arena allocation.

Purpose

Parsanol-rs is a generic, domain-agnostic PEG parser library written in Rust. It provides high-performance parsing capabilities with a focus on:

Speed: Packrat memoization for O(n) parsing complexity
Memory efficiency: Arena allocation for zero-copy AST construction
Developer experience: Fluent API for building grammars, rich error reporting
Flexibility: Transform system for converting parse trees to typed Rust structs via derive macros

Features

Quick Start - Get started in minutes
Backend Abstraction - Extensible backend trait system
Bytecode Backend - Optional VM backend for linear patterns
Parser DSL - Fluent API for grammar definition
Capture Atoms - Extract named values during parsing
Scope Atoms - Isolated capture contexts
Dynamic Atoms - Runtime-determined parsing via callbacks
Streaming with Captures - Memory-efficient parsing with capture support
Transform System - Convert parse trees to typed structs
Derive Macros - Automatic typed AST generation
Streaming Builder - Single-pass parsing with custom output
Parallel Parsing - Multi-file parsing with rayon
Infix Expression Parsing - Built-in operator precedence
Rich Error Reporting - Tree-structured error messages
Source Location Tracking - Line/column tracking through transforms
Grammar Composition - Import and compose grammars
Ruby FFI - Optional Ruby bindings
WASM Support - Optional WebAssembly bindings

Bytecode Backend

Parsanol-rs supports two parsing backends:

Packrat (default): Memoization-based parser with O(n) time complexity for all grammars
Bytecode VM: Stack-based virtual machine with optimization passes

Backend Comparison

Both backends produce identical parsing results for all valid inputs. The difference lies in performance characteristics:

Aspect	Packrat	Bytecode VM
Time Complexity	Guaranteed O(n)	O(n) to O(2^n) depending on grammar
Memory Usage	Higher (memoization table)	Lower (stack-based)
Compilation	None required	Pre-compilation needed
Nested Repetitions	Handles efficiently	Can be exponential
Simple Patterns	Good	Excellent
Predictability	Consistent performance	Varies by grammar

Performance Characteristics

Packrat Backend:

Memoization stores parse results at each position
Guarantees O(n) time complexity regardless of grammar structure
Memory overhead scales with input size and grammar complexity
Ideal when predictable performance is required

Bytecode VM Backend:

Stack-based execution with backtracking
O(n) for linear patterns (most common case)
Can exhibit O(2^n) behavior for pathological patterns like (a*)*
Lower memory footprint, good for memory-constrained environments
Pre-compilation enables optimization passes

Decision Matrix

Grammar Type	Recommended Backend	Reason
JSON, XML, config files	Either	Linear patterns, both perform well
Programming languages	Packrat	Complex grammar with nested structures
Log parsing	Bytecode	Simple patterns, streaming potential
Nested repetitions `(a)`	Packrat	Avoids exponential backtracking
Memory-constrained	Bytecode	Lower memory footprint
Need predictable O(n)	Packrat	Guaranteed linear time

Automatic Selection

Use Backend::Auto (the default) to let parsanol analyze your grammar:

// Automatic selection (default)
let mut parser = Parser::auto(grammar);

// Or explicitly:
let mut parser = Parser::new(grammar, Backend::Auto);

// Check the analysis
let analysis = parser.analysis();
println!("Has nested repetitions: {}", analysis.has_nested_repetition);
println!("Recommended: {:?}", analysis.recommended_backend());

Why Nested Repetitions Are the Criterion

The backend selection is based on a single hard rule:

Has nested repetitions (e.g., (a*)*) → Packrat
Otherwise → Bytecode

This is the only criterion because nested repetitions are the only pattern that causes exponential time complexity in the bytecode backend. Here's why:

The Algorithmic Problem:

When a repetition contains another repetition, the parser must try all possible ways to divide the input. For pattern (a*)* on input "aaa":

Division 1: (aaa)           - outer * matches 1 group
Division 2: (aa)(a)         - outer * matches 2 groups
Division 3: (a)(aa)         - outer * matches 2 groups (different split)
Division 4: (a)(a)(a)       - outer * matches 3 groups
... and so on

The number of ways to partition n characters is O(2^n). The bytecode VM tries each possibility via backtracking, leading to exponential time.

Why Packrat Solves It:

Packrat memoizes results by (position, rule). Once (a*) is evaluated at position i, the result is cached. Subsequent evaluations at the same position are O(1) cache hits. This guarantees O(n) total time.

Why Other Patterns Don't Matter:

Pattern	Time Impact	Backend Difference
Overlapping choices (`"a" \| "aa"`)	Linear backtracking	Both handle identically
Deep nesting	Stack depth increases	Both handle fine
Many alternatives	More choice points	Linear in alternative count
Left recursion	Infinite loop	Both fail - not a backend issue

How the Analysis Works

The grammar analysis is deliberately simple:

pub struct GrammarAnalysis {
    /// Total atoms in the grammar
    pub atom_count: usize,
    /// Whether any Repetition contains another Repetition
    pub has_nested_repetition: bool,
}

The algorithm iterates through all atoms and checks: "Is this a Repetition whose inner atom is also a Repetition?"

for atom in &grammar.atoms {
    if let Atom::Repetition { atom: inner_idx, .. } = atom {
        if let Some(inner) = grammar.get_atom(*inner_idx) {
            if matches!(inner, Atom::Repetition { .. }) {
                has_nested_repetition = true;
                break;
            }
        }
    }
}

This is O(atoms) time and detects the only pattern that matters for backend selection.

When to Override Auto-Selection

The auto-selection only considers time complexity. You may want to manually select based on:

Scenario	Manual Selection	Rationale
Memory-constrained (embedded, WASM)	`Backend::Bytecode`	Lower memory: O(depth) vs O(n×rules)
Very large files (>100MB)	`Backend::Bytecode`	Packrat table grows with input size
Predictable latency required	`Backend::Packrat`	Guaranteed O(n), no pathological cases
Streaming parsing	`Backend::Bytecode`	Packrat requires full input in memory
Incremental re-parsing	`Backend::Packrat`	Memo table can be reused for unchanged portions
Grammar has nested repetitions but input is bounded	Either	If input is always small, exponential doesn't matter
Testing/debugging	`Backend::Packrat`	Consistent behavior across all inputs

// Memory-constrained environment
let mut parser = Parser::bytecode(grammar);

// Safety-critical with guaranteed O(n)
let mut parser = Parser::packrat(grammar);

// Explicit choice regardless of analysis
let mut parser = Parser::new(grammar, Backend::Packrat);

Problematic Grammar Patterns

The following patterns can cause exponential O(2^n) behavior in the Bytecode backend. They are safe with Packrat due to memoization. If your grammar contains these, use Packrat explicitly or rely on Backend::Auto.

Critical Pattern: Nested Repetitions

(a*)*     // CRITICAL: Outer * tries O(2^n) ways to divide input
(a+)+     // Same issue
((a|b)*)* // Even worse with choice

// Safe alternatives:
a*        // Single repetition - O(n)
(a b)*    // Fixed sequence inside - O(n)

Moderate Pattern: Overlapping Choice Prefixes

// Problematic: All start with 'a'
("a" | "aa" | "aaa")+

// Better: Distinct first characters
("a" | "b" | "c")+

Safe Pattern: Deep Recursion (Both handle well)

expr = term (("+" | "-") term)*
// Recursive but structured - both backends handle efficiently

Analyzing Your Grammar

Use the GrammarAnalysis API to check for nested repetitions:

use parsanol::portable::{
    parser_dsl::{str, re, GrammarBuilder},
    bytecode::{Backend, GrammarAnalysis, Parser},
};

fn main() {
    let grammar = GrammarBuilder::new()
        .rule("expr", re(r"[0-9]+"))
        .build();

    // Analyze the grammar
    let analysis = GrammarAnalysis::analyze(&grammar);

    // The only field that matters for backend selection
    if analysis.has_nested_repetition {
        println!("⚠️  Nested repetitions detected - use Packrat!");
    } else {
        println!("✅ No nested repetitions - Bytecode is efficient");
    }

    // Get recommendation (hard rule: nested repetition → Packrat, else → Bytecode)
    println!("Recommended: {:?}", analysis.recommended_backend());
}

GrammarAnalysis Fields:

Field	Type	Purpose
`atom_count`	`usize`	Number of atoms in grammar (informational)
`has_nested_repetition`	`bool`	The criterion - if true, use Packrat

The recommended_backend() Method:

Returns Backend::Packrat if has_nested_repetition is true, otherwise Backend::Bytecode. This is what Backend::Auto uses internally.

Using the Bytecode Backend

use parsanol::portable::{
    parser_dsl::{str, re, GrammarBuilder},
    bytecode::{Backend, Parser},
};

let grammar = GrammarBuilder::new()
    .rule("number", re(r"[0-9]+"))
    .build();

// Create parser with bytecode backend
let mut parser = Parser::new(grammar, Backend::Bytecode);
let result = parser.parse("42");

// Or use auto-selection (analyzes grammar complexity)
let mut parser = Parser::auto(grammar);
let result = parser.parse("42");

Known Differences

Both backends produce identical results for the vast majority of patterns. However, there are edge cases where behavior differs:

Alternatives in Sequences: For patterns like ("a" | "aa") "b" on input "aab":

Packrat: May succeed due to memoization re-evaluation
Bytecode: Fails (standard PEG semantics - once "a" succeeds, "aa" is not tried)

This difference only affects patterns with:

Alternatives containing overlapping prefixes ("a" vs "aa")
The alternative is followed by content that fails
The later alternative would allow the following content to succeed

For most practical grammars, this difference never manifests. Use Backend::Auto to let parsanol choose the appropriate backend.

Backend Abstraction

Parsanol provides a trait-based backend abstraction for extensibility. You can implement custom backends or use the built-in ones interchangeably.

Using the ParsingBackend Trait

use parsanol::portable::backend::{ParsingBackend, PackratBackend, BytecodeBackend, Backend};

// Use Packrat backend for predictable O(n) performance
let mut packrat = PackratBackend::new();
let result = packrat.parse(&grammar, input)?;

// Use Bytecode backend for lower memory usage
let mut bytecode = BytecodeBackend::new();
let result = bytecode.parse(&grammar, input)?;

// Configure backends
let packrat = PackratBackend::new()
    .with_max_recursion_depth(500)
    .with_timeout_ms(5000);

let bytecode = BytecodeBackend::new()
    .with_auto_fallback(true);  // Falls back to Packrat for complex grammars

Runtime Backend Selection

use parsanol::portable::backend::Backend;

// Select backend at runtime
let backend_type = Backend::default_for_grammar(&grammar);

match backend_type {
    Backend::Packrat => {
        let mut parser = PackratBackend::new();
        parser.parse(&grammar, input)?
    }
    Backend::Bytecode => {
        let mut parser = BytecodeBackend::new();
        parser.parse(&grammar, input)?
    }
};

Backend Characteristics

Each backend documents its performance characteristics:

use parsanol::portable::backend::{ParsingBackend, PackratBackend};

let backend = PackratBackend::new();
let chars = backend.characteristics();

println!("Time: {}", chars.time_complexity);        // "O(n)"
println!("Memory: {}", chars.memory_complexity);    // "O(n × r)"
println!("Memoization: {}", chars.uses_memoization); // true
println!("Streaming: {}", chars.supports_streaming); // false
println!("Incremental: {}", chars.supports_incremental); // true
println!("Safe: {}", chars.safe_for_all_grammars);  // true

Implementing Custom Backends

use parsanol::portable::backend::{ParsingBackend, BackendCharacteristics, BackendResult};
use parsanol::portable::grammar::Grammar;

struct MyCustomBackend;

impl ParsingBackend for MyCustomBackend {
    fn parse(&mut self, grammar: &Grammar, input: &str) -> BackendResult {
        // Custom parsing logic here
        todo!()
    }

    fn name(&self) -> &'static str {
        "my-custom"
    }

    fn characteristics(&self) -> BackendCharacteristics {
        BackendCharacteristics {
            time_complexity: "O(n log n)",
            memory_complexity: "O(n)",
            uses_memoization: false,
            supports_streaming: true,
            supports_incremental: false,
            safe_for_all_grammars: true,
        }
    }
}

Dynamic Backend Dispatch

For runtime polymorphism:

use parsanol::portable::backend::{DynBackend, PackratBackend, BytecodeBackend};

fn get_backend(use_packrat: bool) -> DynBackend {
    if use_packrat {
        Box::new(PackratBackend::new())
    } else {
        Box::new(BytecodeBackend::new())
    }
}

let mut backend: DynBackend = get_backend(true);
let result = backend.parse(&grammar, input)?;

Quick Start Examples

Using the bytecode backend explicitly:

use parsanol::portable::{
    parser_dsl::{str, re, GrammarBuilder},
    bytecode::{Backend, Parser},
};

let grammar = GrammarBuilder::new()
    .rule("number", re(r"[0-9]+"))
    .build();

// Create parser with bytecode backend
let mut parser = Parser::new(grammar, Backend::Bytecode);
let result = parser.parse("42");

Using packrat backend explicitly:

let mut parser = Parser::new(grammar, Backend::Packrat);
let result = parser.parse("42");

Optimization Passes

The bytecode backend applies 11 optimization passes automatically:

DeadCodeElimination - Remove unreachable code
JumpChainSimplification - Simplify jump chains
JumpToReturnSimplification - Direct returns
JumpToFailSimplification - Direct failures
CombineAdjacentChars - Char merging
SpanOptimization - CharSet* to Span
FullCaptureOptimization - Capture pairs to FullCapture
TestCharOptimization - Choice patterns to TestChar
TestSetOptimization - Choice patterns to TestSet
TailCallOptimization - Tail calls to jumps
LookaheadOptimization - Choice to PredChoice for predicates

Bytecode VM Architecture

Grammar (Atoms) ──► Compiler ──► Program (bytecode)
                                    │
                                    ▼
Input ──────────────────────────► VM ──► AstNode

The bytecode VM uses:

Backtracking stack: For choice point management
Capture stack: For building AST nodes
Instruction pointer: Sequential execution
Optimization passes: Peephole optimization on compiled bytecode

Instruction Set

The VM supports 28 instructions covering all PEG operations:

Category	Instructions
Matching	`Char`, `CharSet`, `String`, `Regex`, `Any`, `Custom`
Control Flow	`Jump`, `Call`, `Return`, `End`
Backtracking	`Choice`, `Commit`, `PartialCommit`, `BackCommit`, `Fail`, `FailTwice`
Captures	`OpenCapture`, `CloseCapture`, `FullCapture`
Tests	`TestChar`, `TestSet`, `TestAny`
Special	`Behind`, `Span`, `NoOp`, `PredChoice`

Architecture

┌─────────────────────────────────────────────────────────────┐
│                    PARSANOL-RS                              │
│              (Generic PEG Parser Library)                   │
├─────────────────────────────────────────────────────────────┤
│  • Parser combinators (PEG atoms)                           │
│  • Grammar representation                                   │
│  • Packrat memoization                                      │
│  • Arena allocation                                         │
│  • Infix expression parsing                                 │
│  • Rich error reporting (tree structure)                    │
│  • Transform DSL (pattern matching)                         │
│  • Derive macros for typed ASTs                             │
│  • Optional Ruby FFI / WASM bindings                        │
└─────────────────────────────────────────────────────────────┘
          ▲                                    ▲
          │ (build ON TOP)                     │ (build ON TOP)
          │                                    │
┌─────────┴──────────┐               ┌─────────┴─────────┐
│   parsanol-express │               │   Your Language   │
│   (EXPRESS lexer)  │               │   (Your DSL)      │
└────────────────────┘               └───────────────────┘

[!IMPORTANT] Parsanol-rs is a GENERIC parser library. It has no knowledge of any specific domain (EXPRESS, Ruby, JSON, YAML, etc.). Domain-specific parsers should be built ON TOP of this library.

Workspace Structure

This repository uses a Cargo workspace with two crates:

parsanol-rs/
├── parsanol/              # Main parser library
│   ├── src/
│   └── Cargo.toml
├── parsanol-derive/       # Derive macros (always included)
│   ├── src/
│   └── Cargo.toml
├── examples/              # 39 example parsers
└── Cargo.toml             # Workspace root

Installation

Add this to your Cargo.toml:

[dependencies]
parsanol = "0.1"

The parsanol-derive crate is automatically included as a dependency, providing the #[derive(FromAst)] macro for typed AST conversion.

Optional Features

ruby - Enable Ruby FFI bindings (requires magnus, rb-sys)
wasm - Enable WebAssembly bindings (requires wasm-bindgen, js-sys)
parallel - Enable parallel parsing (requires rayon)

[dependencies]
parsanol = { version = "0.1", features = ["ruby", "parallel"] }

Quick Start

Basic Parsing

use parsanol::portable::{Grammar, PortableParser, AstArena, parser_dsl::*};

// Build a simple grammar
let grammar = GrammarBuilder::new()
    .rule("greeting", str("hello").then(str("world")))
    .build();

let input = "helloworld";
let mut arena = AstArena::for_input(input.len());
let mut parser = PortableParser::new(&grammar, input, &mut arena);

match parser.parse() {
    Ok(ast) => println!("Parsed successfully: {:?}", ast),
    Err(e) => println!("Parse error: {:?}", e),
}

Calculator with Operator Precedence

use parsanol::portable::{
    GrammarBuilder, PortableParser, AstArena, Grammar,
    parser_dsl::{str, re, ref_, seq, choice, dynamic},
    infix::{InfixBuilder, Assoc},
};

fn build_calculator_grammar() -> Grammar {
    let mut builder = GrammarBuilder::new();

    // Define atoms
    builder = builder.rule("number", re(r"[0-9]+"));
    builder = builder.rule("primary", choice(vec![
        dynamic(seq(vec![
            dynamic(str("(")),
            dynamic(ref_("expr")),
            dynamic(str(")")),
        ])),
        dynamic(ref_("number")),
    ]));

    // Build infix with precedence
    let expr_atom = InfixBuilder::new()
        .primary(ref_("primary"))
        .op("*", 2, Assoc::Left)
        .op("/", 2, Assoc::Left)
        .op("+", 1, Assoc::Left)
        .op("-", 1, Assoc::Left)
        .build(&mut builder);

    builder.update_rule("expr", expr_atom);
    builder.build()
}

Parser DSL

Atom Types

Atom	Description	Example
`str("literal")`	Match exact string	`str("hello")`
`re("pattern")`	Match regex pattern	`re(r"[0-9]+")`
`any()`	Match any single character	`any()`
`ref_("rule")`	Reference to named rule	`ref_("expr")`
`seq([...])`	Sequence of atoms	`seq(vec![a, b, c])`
`choice([...])`	Alternative atoms	`choice(vec![a, b])`
`cut()`	Commit to this branch (prevent backtracking)	`cut()`
`capture("name", atom)`	Extract named value during parsing	`capture("id", re(r"[a-z]+"))`
`scope(atom)`	Create isolated capture context	`scope(seq([...]))`
`dynamic(callback)`	Runtime-determined parsing via callback	`dynamic(callback_id)`

Combinators

All atoms implement the ParsletExt trait with these methods:

use parsanol::portable::parser_dsl::*;

// Sequence: A >> B
let parser = str("hello").then(str("world"));

// Alternative: A | B
let parser = str("foo").or(str("bar"));

// Repetition
let parser = str("a").repeat(1, None);    // One or more
let parser = str("a").repeat(0, Some(3)); // Zero to three
let parser = str("a").many();              // Zero or more
let parser = str("a").many1();             // One or more
let parser = str("a").optional();          // Zero or one

// Named capture
let parser = re(r"[0-9]+").label("number");

// Ignore (don't include in AST)
let parser = str(" ").ignore();

// Lookahead (don't consume)
let parser = str("hello").lookahead();     // Positive: must match
let parser = str("hello").not_ahead();     // Negative: must NOT match

Grammar Macro

For declarative grammar definition:

use parsanol::portable::parser_dsl::grammar;

let grammar = grammar! {
    "hello" => str("hello"),
    "world" => str("world"),
    "greeting" => ref_("hello").then(ref_("world")),
};

Capture Atoms

Capture atoms extract named values during parsing, similar to regex named groups. They work with all backends (Packrat, Bytecode, Streaming).

Basic Usage

use parsanol::portable::{
    parser_dsl::{capture, dynamic, re, seq, GrammarBuilder},
    PortableParser, AstArena,
};

let grammar = GrammarBuilder::new()
    .rule("greeting", seq(vec![
        capture("word", dynamic(re(r"[a-zA-Z]+"))),
    ]))
    .build();

let mut arena = AstArena::for_input(64);
let mut parser = PortableParser::packrat(grammar);
let result = parser.parse_from_pos(0, "hello world", &mut arena)?;

// Access captures
if let Some(text) = result.get_capture("word", "hello world") {
    println!("Captured: {}", text); // Prints: "hello"
}

Capture API

// Get a single capture by name
let value = result.get_capture("name", input);

// Get all capture names
for name in result.capture_names() {
    println!("Capture: {}", name);
}

// Check if capture exists
if result.has_capture("name") {
    // ...
}

Backend Compatibility

Backend	Capture Support	Notes
Packrat	Full	Native support
Bytecode	Full	Uses capture instructions
Streaming	Full	Captures persist across chunks

Scope Atoms

Scope atoms create isolated capture contexts. Captures made inside a scope are discarded when the scope exits, preventing pollution of the parent context.

Use Cases

Nested parsing where inner captures shouldn't affect outer state
Repetitive patterns where each iteration starts fresh
Context isolation in recursive grammars

Basic Usage

use parsanol::portable::parser_dsl::{scope, seq, capture, dynamic, re, GrammarBuilder};

let grammar = GrammarBuilder::new()
    .rule("outer", seq(vec![
        capture("outer_name", dynamic(re(r"[a-z]+"))),
        scope(seq(vec![
            capture("inner_name", dynamic(re(r"[0-9]+"))),
        ])),
        // "inner_name" is NOT available here
    ]))
    .build();

Dynamic Atoms

Dynamic atoms enable runtime-determined parsing via registered callbacks. This allows context-sensitive parsing where the grammar itself depends on input or previously captured values.

Registering a Callback

use parsanol::portable::{
    Grammar, Atom, Parser,
    dynamic::{DynamicCallback, DynamicContext, register_dynamic_callback},
    parser_dsl::*,
};

struct KeywordCallback;

impl DynamicCallback for KeywordCallback {
    fn call(&self, ctx: &DynamicContext) -> Option<Atom> {
        // Access current position
        let pos = ctx.pos();
        // Access input
        let input = ctx.input();
        // Access captures made so far
        if let Some(lang) = ctx.get_capture("language") {
            match lang {
                "ruby" => Some(Atom::Str { pattern: "def".into() }),
                "python" => Some(Atom::Str { pattern: "lambda".into() }),
                _ => None,
            }
        } else {
            None
        }
    }

    fn description(&self) -> &str {
        "keyword_callback"
    }
}

let callback_id = register_dynamic_callback(Box::new(KeywordCallback));

Using Dynamic Atoms in Grammars

let grammar = GrammarBuilder::new()
    .rule("keyword", dynamic_with_id(callback_id))
    .build();

Backend Compatibility

Backend	Dynamic Support	Notes
Packrat	Full	Native support (recommended)
Bytecode	Fallback	Uses Packrat internally
Streaming	Fallback	Uses Packrat internally

Note: For heavy dynamic atom usage, prefer the Packrat backend for best performance.

Streaming with Captures

The streaming parser supports captures while maintaining bounded memory usage. Captures persist across streaming parse operations.

Basic Usage

use parsanol::portable::{
    parser_dsl::{capture, dynamic, re, GrammarBuilder},
    streaming::{StreamingParser, ChunkConfig},
    arena::AstArena,
};
use std::io::Cursor;

let grammar = GrammarBuilder::new()
    .rule("word", capture("word", dynamic(re(r"[a-zA-Z]+"))))
    .build();

let config = ChunkConfig {
    chunk_size: 65536,  // 64 KB chunks
    window_size: 2,      // Keep 2 chunks in memory
};

let mut parser = StreamingParser::new(&grammar, config);
let mut arena = AstArena::for_input(65536);
let mut cursor = Cursor::new(input.as_bytes());

let result = parser.parse_from_reader(&mut cursor, &mut arena)?;

if let Some(captures) = &result.capture_state {
    for name in captures.names() {
        if let Some(value) = captures.get(&name) {
            println!("{} = {:?}", name, value.get_text(input));
        }
    }
}

Chunk Configuration

Preset	Chunk Size	Window	Use Case
`small()`	16 KB	2	Real-time feeds
`medium()`	64 KB	3	Default
`large()`	256 KB	4	Log files
`huge()`	1 MB	5	Large files

Performance Notes

Memory: O(chunk_size × window_size + capture_state)
Captures accumulate during parse, available at end
For very large captures, use reset() to process incrementally

Transform System

The transform system converts generic parse trees into typed Rust data structures, similar to Parslet’s transformation system.

Value Types

The Value enum represents transformed data:

pub enum Value {
    Nil,
    Bool(bool),
    Int(i64),
    Float(f64),
    String(String),
    Array(Vec<Value>),
    Hash(HashMap<String, Value>),
}

Basic Transformations

use parsanol::portable::transform::{Transform, Value, TransformError};

let transform = Transform::new()
    // Transform "int" captures by doubling
    .rule("int", |v| {
        let n = v.as_int().ok_or_else(|| TransformError::Custom("not int".into()))?;
        Ok(Value::int(n * 2))
    });

let value = Value::hash(vec![("int", Value::int(21))]);
let result = transform.apply(&value)?;
assert_eq!(result.as_int(), Some(42));

Pattern Matching

Pattern-based transformations similar to Parslet:

use parsanol::portable::transform::{Transform, Pattern, Value};

let transform = Transform::new()
    // Match hash with specific fields
    .pattern(
        Pattern::hash()
            .field("left", "l")
            .field("op", Pattern::str("+"))
            .field("right", "r"),
        |bindings| {
            let l = bindings.get_int("l")?;
            let r = bindings.get_int("r")?;
            Ok(Value::int(l + r))
        }
    );

Pattern Types

Pattern	Description	Example
`Pattern::simple("x")`	Match any leaf value and bind to variable	`Pattern::simple("n")` matches `42`
`Pattern::str("value")`	Match a specific string value	`Pattern::str("+")` matches `"+"`
`Pattern::int(n)`	Match a specific integer	`Pattern::int(42)` matches `42`
`Pattern::sequence("x")`	Match an array and bind to variable	`Pattern::sequence("items")`
`Pattern::subtree("x")`	Match anything and bind to variable	`Pattern::subtree("node")`
`Pattern::hash()`	Match a hash with specific fields	See example above

Converting AST to Value

use parsanol::portable::transform::{ast_to_value, Value};

// After parsing
let ast = parser.parse()?;
let value = ast_to_value(&ast, &arena, input);

// Now apply transforms
let result = transform.apply(&value)?;

Derive Macros

The FromAst derive macro automatically generates code to convert Value types into typed Rust structs and enums. This eliminates boilerplate code for AST transformation.

Basic Usage

use parsanol::derive::FromAst;
use parsanol::portable::transform::Value;

#[derive(FromAst, Debug)]
pub enum Expr {
    #[parsanol(tag = "number")]
    Number(i64),

    #[parsanol(tag = "binop")]
    BinOp {
        left: Box<Expr>,
        op: String,
        right: Box<Expr>,
    },
}

// Convert Value to typed Expr
let value: Value = /* ... parsed value ... */;
let expr: Expr = value.try_into()?;

Container Attributes

Attribute	Description
`#[parsanol(rule = "name")]`	Specify the grammar rule name

Variant Attributes (for enums)

Attribute	Description
`#[parsanol(tag = "literal")]`	Match by literal tag string
`#[parsanol(tag_expr = expr)]`	Match by expression (for dynamic tags)

Field Attributes

Attribute	Description
`#[parsanol(field = "name")]`	Map to different hash field name
`#[parsanol(default)]`	Use `Default::default()` if missing
`#[parsanol(default = expr)]`	Use expression if missing

Complete Example

use parsanol::derive::FromAst;
use parsanol::portable::transform::Value;

#[derive(FromAst, Debug)]
#[parsanol(rule = "statement")]
pub enum Statement {
    #[parsanol(tag = "assignment")]
    Assignment {
        #[parsanol(field = "name")]
        variable: String,
        value: Box<Expr>,
    },

    #[parsanol(tag = "return")]
    Return {
        #[parsanol(default)]
        value: Option<Box<Expr>>,
    },

    #[parsanol(tag = "if")]
    If {
        condition: Box<Expr>,
        then_block: Vec<Statement>,
        #[parsanol(default)]
        else_block: Option<Vec<Statement>>,
    },
}

// Usage
fn parse_statement(value: Value) -> Result<Statement, parsanol::derive::FromAstError> {
    value.try_into()
}

Single-Field Tuple Structs

Single-field tuple structs automatically get transparent conversion:

#[derive(FromAst)]
pub struct Identifier(pub String);

// Value::String("foo") directly converts to Identifier("foo")

Error Handling

use parsanol::derive::FromAstError;

match value.try_into() {
    Ok(expr) => println!("Parsed: {:?}", expr),
    Err(FromAstError::MissingField(field)) => {
        eprintln!("Missing field: {}", field);
    }
    Err(FromAstError::UnknownTag) => {
        eprintln!("Unknown tag in enum");
    }
    Err(e) => eprintln!("Conversion error: {}", e),
}

Streaming Builder

The streaming builder API allows single-pass parsing without intermediate AST construction. This is ideal for:

Maximum performance (eliminates AST allocation)
Custom output formats
Memory-constrained environments

Implementing StreamingBuilder

use parsanol::portable::streaming_builder::{StreamingBuilder, BuildResult, BuildError};

// Custom builder that collects all strings
struct StringCollector {
    strings: Vec<String>,
}

impl StreamingBuilder for StringCollector {
    type Output = Vec<String>;

    fn on_string(&mut self, value: &str, _offset: usize, _length: usize) -> BuildResult<()> {
        self.strings.push(value.to_string());
        Ok(())
    }

    fn finish(&mut self) -> BuildResult<Self::Output> {
        Ok(std::mem::take(&mut self.strings))
    }
}

Using parse_with_builder

use parsanol::portable::{Grammar, PortableParser, AstArena};

let grammar = /* ... */;
let input = "hello world";
let mut arena = AstArena::for_input(input.len());
let mut parser = PortableParser::new(&grammar, input, &mut arena);

// Create builder
let mut builder = StringCollector { strings: vec![] };

// Parse with streaming builder
let result = parser.parse_with_builder(&mut builder)?;
// result: Vec<String>

Built-in Builders

Several useful builders are provided:

Builder	Description
`DebugBuilder`	Collects all events as strings for debugging
`BuilderStringCollector`	Collects all string values
`BuilderNodeCounter`	Counts nodes by type

Ruby Integration

The streaming builder works with Ruby callbacks via FFI:

require 'parsanol'

class MyBuilder
  include Parsanol::BuilderCallbacks

  def initialize
    @strings = []
  end

  def on_string(value, offset, length)
    @strings << value
  end

  def finish
    @strings
  end
end

builder = MyBuilder.new
result = Parsanol::Native.parse_with_builder(grammar_json, input, builder)

Parallel Parsing

Parse multiple inputs in parallel using rayon for linear speedup on multi-core systems.

Enabling Parallel Feature

[dependencies]
parsanol = { version = "0.1", features = ["parallel"] }

Batch Parallel Parsing

use parsanol::portable::{Grammar, parse_batch_parallel};

let grammar = /* ... */;
let inputs = vec!["file1.exp", "file2.exp", "file3.exp"];

// Parse all inputs in parallel
let results = parse_batch_parallel(&grammar, &inputs);

// Results are in same order as inputs
for (i, result) in results.iter().enumerate() {
    match result {
        Ok(ast) => println!("File {} parsed successfully", i),
        Err(e) => eprintln!("File {} failed: {}", i, e),
    }
}

Parallel Configuration

use parsanol::portable::parallel::{parse_batch_parallel, ParallelConfig};

let config = ParallelConfig::new()
    .with_num_threads(4)        // Use 4 threads
    .with_min_chunk_size(10);   // Minimum inputs per thread

let results = parse_batch_parallel(&grammar, &inputs);

Performance

Scenario	Speedup
8 cores, 100 files	~8x faster than sequential
4 cores, 50 files	~4x faster than sequential
Single core	Same as sequential (graceful fallback)

When the parallel feature is not enabled, the functions fall back to sequential parsing automatically.

Infix Expression Parsing

Built-in support for parsing infix expressions with operator precedence and associativity.

Using InfixBuilder

use parsanol::portable::infix::{InfixBuilder, Assoc};

let mut builder = GrammarBuilder::new();

let expr_idx = InfixBuilder::new()
    .primary(ref_("atom"))           // Base expression (numbers, parens)
    .op("*", 2, Assoc::Left)         // Higher precedence
    .op("/", 2, Assoc::Left)
    .op("+", 1, Assoc::Left)         // Lower precedence
    .op("-", 1, Assoc::Left)
    .op("^", 3, Assoc::Right)        // Right-associative
    .build(&mut builder);

Associativity

Associativity	Meaning	Example
`Assoc::Left`	Left-to-right evaluation	`a` `-` `b` `-` `c` = `(a` `-` `b)` `-` `c`
`Assoc::Right`	Right-to-left evaluation	`a` `=` `b` `=` `c` = `a` `=` `(b` `=` `c)`
`Assoc::NonAssoc`	Cannot chain	`a` `<` `b` `<` `c` is an error

Rich Error Reporting

Tree-structured error messages similar to Parslet for better debugging.

Basic Usage

use parsanol::portable::error::{RichError, ErrorBuilder, Span};

// Create rich errors
let error = ErrorBuilder::new("Failed to parse expression")
    .at(10, 2, 5)  // offset, line, column
    .context("expression")
    .child(
        ErrorBuilder::new("Expected '+' or '-'")
            .at(10, 2, 5)
            .build(),
    )
    .build();

// Print as ASCII tree
println!("{}", error.ascii_tree());

Example Output

Error at line 3, column 5:
`- Failed to parse expression (in expression)
   `- Expected '+' or '-'

Source Context

// Format error with source code context
let formatted = error.format_with_source(input);
println!("{}", formatted);

Output:

Error at line 3, column 5:
let x = foo bar
            ^
`- Failed to parse expression (in expression)
   `- Expected '+' or '-'

Source Location Tracking

Track source positions through the parsing and transformation pipeline.

Using SourceSpan

use parsanol::portable::source_location::{SourceSpan, SourcePosition};
use parsanol::portable::transform::{ast_to_value_with_span};

// Create a span from offsets
let span = SourceSpan::from_offsets(input, 10, 20);
println!("Line {}, Column {}", span.start.line, span.start.column);

// Merge adjacent spans
let merged = span1.merge(&span2);

// Check overlap
if span1.overlaps(&span2) {
    // Spans overlap
}

// Transform AST with source spans preserved
let (value, spans) = ast_to_value_with_span(&ast, &arena, input);

Grammar Composition

Build complex grammars by importing and composing smaller grammars.

Importing Grammars

use parsanol::portable::parser_dsl::*;

let mut builder = GrammarBuilder::new();

// Import another grammar with a prefix
builder.import(&expression_grammar, Some("expr"));
builder.import(&type_grammar, Some("type"));

// Reference imported rules
let combined = seq(vec![
    ref_("expr:root"),  // References expression_grammar's root
    str(":"),
    ref_("type:root"),  // References type_grammar's root
]);

builder.rule("typed_expr", combined);
let grammar = builder.build();

Ruby FFI

Parsanol-rs can be compiled as a Ruby extension for use with parsanol-ruby.

Features

The Ruby FFI provides:

26x faster parsing than pure Ruby (Parslet)
Single parse() API - no confusing options
Lazy line/column - zero overhead unless needed
Streaming Builder - single-pass parsing with callbacks

Building for Ruby

# Build with Ruby support
cargo build --features ruby

# The resulting library can be loaded as a Ruby extension

Ruby API

require 'parsanol/native'

# Serialize grammar once
grammar = str('hello').as(:greeting) >> str(' ').maybe >> match('[a-z]').repeat(1).as(:name)
grammar_json = Parsanol::Native.serialize_grammar(grammar)

# Parse - simple and clean
result = Parsanol::Native.parse(grammar_json, "hello world")
# => {greeting: "hello"@0, name: "world"@6}

# Line/column available when needed (computed lazily)
result[:greeting].line_and_column  # => [1, 1]
result[:name].line_and_column      # => [1, 7]

Lazy Line/Column

Slice objects support lazy line/column computation:

slice.offset - character position (always available, zero cost)
slice.content - string value (always available, zero cost)
slice.line_and_column - [line, column] tuple (computed lazily, cached)

This provides zero overhead for users who don't need position info, while keeping line/column always available when needed.

Streaming Builder (Ruby)

For maximum performance, use the streaming builder API:

require 'parsanol'

# Define a builder class
class StringCollector
  include Parsanol::BuilderCallbacks

  def initialize
    @strings = []
  end

  def on_string(value, offset, length)
    @strings << value
  end

  def on_int(value)
    @strings << value.to_s
  end

  def finish
    @strings
  end
end

# Parse with streaming builder
builder = StringCollector.new
result = Parsanol::Native.parse_with_builder(grammar_json, input, builder)
# result: ["42", "+", "8"]

See parsanol-ruby for full documentation.

WASM Support

Parsanol-rs can be compiled to WebAssembly for use in browsers or Node.js.

Building for WASM

# Install wasm-pack
cargo install wasm-pack

# Build for web
wasm-pack build --features wasm --target web

JavaScript API

import { Parser, Grammar } from 'parsanol';

const grammar = Grammar.fromJson({
  atoms: [
    { Str: { pattern: "hello" } }
  ],
  root: 0
});

const parser = new Parser(grammar);
const result = parser.parse("hello");

Debug Tools

Parser Tracing

Enable tracing for debugging:

let (result, trace) = parser.parse_with_trace();

// Print trace
println!("{}", trace.format(&grammar));

Grammar Visualization

use parsanol::portable::debug::GrammarVisualizer;

let viz = GrammarVisualizer::new(&grammar);

// Generate Mermaid diagram
println!("{}", viz.to_mermaid());

// Generate GraphViz DOT
println!("{}", viz.to_dot());

Performance

Parsanol-rs is designed for high performance:

18-44x Faster than pure Ruby parsers (Parslet)
99.5% Fewer Allocations through arena allocation
O(n) Parsing via packrat memoization
SIMD Optimization: Fast character matching via memchr
AHash: Fast hashing for cache lookups
SmallVec: Stack-allocated small collections

Benchmarks

Parser	Input Size	Time
parsanol-rs (Ruby Transform)	1KB JSON	~50µs
parsanol-rs (Serialized)	1KB JSON	~30µs
parsanol-rs (Native)	1KB JSON	~20µs
Pure Ruby (Parslet)	1KB JSON	~800µs

Security

Parsanol-rs includes built-in protection against denial-of-service attacks.

Default Limits

Limit	Default Value	Description
`max_input_size`	100 MB	Maximum input size in bytes
`max_recursion_depth`	1000	Maximum recursion depth for nested structures

Custom Limits

For untrusted input, configure custom limits:

use parsanol::portable::{PortableParser, AstArena, Grammar, ParseError};

// For untrusted input, use stricter limits
let mut parser = PortableParser::with_limits(
    &grammar,
    input,
    &mut arena,
    10 * 1024 * 1024,  // 10 MB max input
    100,                // 100 max recursion depth
);

match parser.parse() {
    Ok(ast) => { /* success */ },
    Err(ParseError::InputTooLarge { input_size, max_size }) => {
        eprintln!("Input too large: {} > {}", input_size, max_size);
    },
    Err(ParseError::RecursionLimitExceeded { depth, max_depth }) => {
        eprintln!("Recursion too deep: {} > {}", depth, max_depth);
    },
    Err(e) => { /* other errors */ },
}

Best Practices

Always limit input size when parsing untrusted data
Use external timeouts for network services (e.g., tokio::time::timeout)
Monitor memory usage in production environments

See SECURITY.md for complete security documentation.

Module Reference

Core Modules

Module	Description
`portable::parser`	PEG parsing engine with packrat memoization
`portable::grammar`	Grammar representation and serialization
`portable::ast`	AST node types
`portable::arena`	Arena allocator for AST nodes
`portable::cache`	Dense cache for memoization
`portable::parser_dsl`	Fluent API for grammar definition
`portable::transform`	Transform system for converting parse trees
`portable::error`	Rich error reporting
`portable::infix`	Infix expression parsing with precedence
`portable::debug`	Debugging and visualization tools
`portable::source_location`	Source span tracking with line/column info
`portable::streaming`	Streaming parser support for large inputs
`portable::streaming_builder`	Single-pass parsing with custom builders
`portable::parallel`	Parallel parsing for batch processing
`portable::incremental`	Incremental parsing for editor integration
`portable::visitor`	AST visitor pattern implementation
`portable::source_map`	Source map generation for debugging

Examples

See the examples/ directory for 39 complete examples demonstrating real-world parsing scenarios:

Expression Parsers

Example	Description
`calculator-pattern`	Parse expressions with pattern-based transforms
`calculator-transform`	Parse and evaluate expressions with native transforms
`boolean-algebra`	Parse boolean expressions with AND, OR, NOT operators
`expression-evaluator`	Evaluate expressions with variables and function calls
`prec-calc`	Precedence climbing algorithm for infix expressions

Data Formats

Example	Description
`json-pattern`	JSON parser with pattern matching
`json-transform`	JSON parser with native transforms
`csv-pattern`	CSV parser handling quoted fields (pattern mode)
`csv-transform`	CSV parser handling quoted fields (transform mode)
`ini`	INI configuration file parser
`simple-xml`	XML parser with tag matching
`markup`	Lightweight markup language parser
`toml`	TOML configuration file parser
`yaml`	YAML subset parser
`markdown`	Markdown subset parser with headers and lists
`iso-8601`	ISO 8601 date/time/duration parser
`iso-6709`	ISO 6709 geographic coordinate parser

URLs & Network

Example	Description
`url`	URL parser with scheme, host, path components
`email`	Email address parser with validation
`ip-address`	IPv4/IPv6 address parser with validation

Code & Templates

Example	Description
`erb`	ERB template parser for Ruby templates
`sexp`	S-expression parser for Lisp-style syntax
`minilisp`	MiniLisp parser demonstrating recursive grammars

Text Processing

Example	Description
`balanced-parens`	Balanced parentheses parser
`string-literal`	String literal parser with escape sequences
`sentence`	Sentence parser with Unicode support
`comments`	Comment parser (line and block comments)

Error Handling

Example	Description
`error-reporting`	Rich error reporting with tree structure
`error-recovery`	Error recovery strategies
`deepest-errors`	Deepest error point tracking
`nested-errors`	Nested error tree visualization

Advanced Features

Example	Description
`streaming`	Streaming parser for large inputs
`incremental`	Incremental parsing for editor integration
`linter`	Code linter with custom validation
`custom-atoms`	Custom atom registration
`modularity`	Grammar composition from modules

Run examples with:

cargo run --example calculator-transform
cargo run --example json-pattern
cargo run --example url

Full documentation and interactive examples available at the website.

API Stability

The API is currently in active development. Version 0.x indicates that breaking changes may occur.

Stable APIs:

Grammar and GrammarBuilder
PortableParser basic parsing
AstArena and AstNode
Parser DSL combinators
Streaming builder trait and built-in builders
Parallel parsing functions

Experimental APIs (may change):

Transform and pattern matching
Rich error reporting
Infix expression parsing
Debug/trace tools

Documentation

Architecture

See docs/ARCHITECTURE.md for the overall system architecture.

Development

docs/refactoring-plan.md - Current refactoring roadmap
docs/continuation-prompt.md - Prompt for continuing work
docs/MIGRATION.md - Migration guide from Parslet

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests at GitHub.

Development Setup

# Clone the repository
git clone https://github.com/parsanol/parsanol-rs.git
cd parsanol-rs

# Build (workspace)
cargo build

# Run tests (234 unit tests)
cargo test --lib

# Run all examples
cargo build --examples

# Run benchmarks
cargo bench

# Check code quality
cargo clippy --lib -- -D warnings
cargo fmt --check

Testing

The test suite consists of multiple types of tests:

Unit tests: 234 tests covering internal functionality of each module (parser, arena, cache, transform, derive, etc.).

Integration tests: Located in tests/ directory, test end-to-end parsing scenarios.

Examples: 39 runnable parsers in examples/ directory demonstrating real-world usage. Examples are compiled and tested via cargo build --examples.

Documentation tests (doc tests): Code examples in documentation comments. Note that many doc tests are marked with ignore because they show incomplete code snippets (e.g., method signatures or pseudocode) rather than complete runnable examples. This is intentional - the doc tests illustrate API patterns, while the examples/ directory contains fully runnable code that is verified by CI.

To run all tests:

# Unit + integration tests
cargo test

# Include doc tests (most will be ignored as designed)
cargo test --doc

# Test examples compile
cargo build --examples

# Run ignored doc tests (will fail if not complete)
cargo test -- --ignored

Release Process

This project uses release-plz for automated releases.

How It Works

Push to main → release-plz creates/updates a Release PR
Review and merge the Release PR → Version is updated in main
After merge → release-plz automatically:
- Creates a git tag (e.g., v0.1.2)
- Publishes to crates.io
- Creates a GitHub release
Build artifacts → CI builds native libraries and uploads them to the GitHub release

Maintainer Workflow

Normal Release (Recommended)

Just push commits with conventional commit messages:

git commit -m "feat: add new parser combinator"
git push origin main

release-plz will:

Create a Release PR with version bump (e.g., 0.1.1 → 0.1.2 for feat:)
Wait for you to review and merge
Publish automatically after merge

Manual Release

If you need to trigger a release manually:

Go to Actions → Release workflow
Click Run workflow
Select action:
- auto (default): Let release-plz decide
- release-pr: Just create/update the Release PR
- release: Force a release immediately

Version Bump Rules

release-plz uses conventional commits:

Commit Type	Version Bump
`feat:`	Minor (0.1.0 → 0.2.0)
`fix:`	Patch (0.1.0 → 0.1.1)
`feat!:` or `fix!:`	Major (0.1.0 → 1.0.0)
`docs:`, `chore:`, etc.	No bump (changelog only)

What Gets Released

crates.io: parsanol crate
GitHub Release: With release notes
Build Artifacts: Native libraries for Linux, macOS, Windows (x64, ARM64)

Troubleshooting

"Already published" error:

release-plz sees an existing tag and thinks the version is already published
Solution: Ensure Cargo.toml version matches what you want to publish

No Release PR created:

Check that commits follow conventional commit format
Check GitHub Actions logs for the release-pr job

Publish failed:

Check crates.io API token is valid
Check version doesn't already exist on crates.io

FFI Feature Testing

This crate supports optional Ruby and WebAssembly (WASM) FFI features. These must be tested explicitly.

[!IMPORTANT] FFI features require additional setup and may not compile/link in all environments. Always verify FFI code compiles before pushing to CI.

Ruby FFI Testing

The Ruby FFI uses the magnus crate to provide Ruby bindings.

Prerequisites:

Ruby 3.0+ installed
Ruby development headers (macOS: brew install ruby, Ubuntu: sudo apt-get install ruby-dev)

Testing:

# Compile-time check (no Ruby required for linking)
cargo check --features ruby
cargo clippy --features ruby --lib -- -D warnings

# Full integration tests (requires Ruby runtime)
# Note: These tests are marked #[ignore] - run manually
cargo test --features ruby --test ruby_ffi -- --ignored

Test coverage:

tests/ruby_ffi.rs - Comprehensive tests for RubyBuilder, RubyObject trait
Magnus type annotations (e.g., funcall::<&str, (), Value>)
Error handling from Ruby callbacks
Parse result conversion

WebAssembly FFI Testing

The WASM FFI uses wasm-bindgen for JavaScript bindings.

Prerequisites:

wasm-pack installed (cargo install wasm-pack)

Testing:

# Compile-time check
cargo check --features wasm
cargo clippy --features wasm --lib -- -D warnings

# Full WASM build and test
wasm-pack build --features wasm
wasm-pack test --node --features wasm

Test coverage:

tests/wasm_ffi.rs - Tests for WASM exports, grammar serialization
JsValue conversions
Error handling for WASM

CI Integration

CI automatically tests FFI features:

# From .github/workflows/ci.yml
strategy:
  matrix:
    feature: ["", "logging", "ruby", "wasm"]

The Ruby and WASM feature tests run on every push to catch FFI regressions early.

Release Process

This project uses release-plz for automated releases.

How It Works

┌─────────────────────────────────────────────────────────────────────────────┐
│                         RELEASE-PLZ WORKFLOW                                │
└─────────────────────────────────────────────────────────────────────────────┘

  Push to main
       │
       ▼
  ┌─────────────────┐
  │  release-pr job │  Creates/updates Release PR
  └────────┬────────┘
           │
           ▼
  ┌─────────────────┐
  │   Release PR    │  Contains version bump + changelog
  │   (on GitHub)   │
  └────────┬────────┘
           │
           │  Maintainer reviews and merges
           ▼
  ┌─────────────────┐
  │  release job    │  Runs release-plz release
  └────────┬────────┘
           │
           ├──────────────────────────────┐
           │                              │
           ▼                              ▼
  ┌─────────────────┐          ┌─────────────────┐
  │  Create tag     │          │ Publish to      │
  │  (v0.1.2)       │          │ crates.io       │
  └────────┬────────┘          └─────────────────┘
           │
           ▼
  ┌─────────────────┐
  │  GitHub Release │  With release notes
  └────────┬────────┘
           │
           ▼
  ┌─────────────────┐
  │  Build jobs     │  Build native libraries
  └────────┬────────┘
           │
           ▼
  ┌─────────────────┐
  │  Update Release │  Upload artifacts
  └─────────────────┘

Maintainer Workflow

Normal Release (Recommended)

Just push commits with conventional commit messages:

git commit -m "feat: add new parser combinator"
git push origin main

release-plz will:

Create a Release PR with version bump (e.g., 0.1.1 → 0.1.2 for feat:)
Wait for you to review and merge
Publish automatically after merge

Manual Release

If you need to trigger a release manually:

Go to Actions → Release workflow
Click Run workflow
Select action:
- auto (default): Let release-plz decide
- release-pr: Just create/update the Release PR
- release: Force a release immediately

Version Bump Rules

release-plz uses conventional commits:

Commit Type	Version Bump
`feat:`	Minor (0.1.0 → 0.2.0)
`fix:`	Patch (0.1.0 → 0.1.1)
`feat!:` or `fix!:`	Major (0.1.0 → 1.0.0)
`docs:`, `chore:`, etc.	No bump (changelog only)

What Gets Released

crates.io: parsanol crate
GitHub Release: With release notes
Build Artifacts: Native libraries for Linux, macOS, Windows (x64, ARM64)

Troubleshooting

"Already published" error:

release-plz sees an existing tag and thinks the version is already published
Solution: Ensure Cargo.toml version matches what you want to publish

No Release PR created:

Check that commits follow conventional commit format
Check GitHub Actions logs for the release-pr job

Publish failed:

Check that the crates.io environment is configured in repository settings
Check that trusted publishing is enabled

parsanol 0.4.0

Parsanol-rs

Purpose

Features

Bytecode Backend

Backend Comparison

Performance Characteristics

Decision Matrix

Automatic Selection

Why Nested Repetitions Are the Criterion

How the Analysis Works

When to Override Auto-Selection

Problematic Grammar Patterns

Analyzing Your Grammar

Using the Bytecode Backend

Known Differences

Backend Abstraction

Using the ParsingBackend Trait

Runtime Backend Selection

Backend Characteristics

Implementing Custom Backends

Dynamic Backend Dispatch

Quick Start Examples

Optimization Passes

Bytecode VM Architecture

Instruction Set

Architecture

Workspace Structure

Installation

Optional Features

Quick Start

Basic Parsing

Calculator with Operator Precedence

Parser DSL

Atom Types

Combinators

Grammar Macro

Capture Atoms

Basic Usage

Capture API

Backend Compatibility

Scope Atoms

Use Cases

Basic Usage

Dynamic Atoms

Registering a Callback

Using Dynamic Atoms in Grammars

Backend Compatibility

Streaming with Captures

Basic Usage

Chunk Configuration

Performance Notes

Transform System

Value Types

Basic Transformations

Pattern Matching

Pattern Types

Converting AST to Value

Derive Macros

Basic Usage

Container Attributes

Variant Attributes (for enums)

Field Attributes

Complete Example

Single-Field Tuple Structs

Error Handling

Streaming Builder

Implementing StreamingBuilder

Using parse_with_builder

Built-in Builders

Ruby Integration

Parallel Parsing

Enabling Parallel Feature

Batch Parallel Parsing

Parallel Configuration

Performance

Infix Expression Parsing

Using InfixBuilder

Associativity

Rich Error Reporting