camxes-rs 0.2.0

A Parsing Expression Grammar (PEG) parser generator with enhanced error reporting and semantic actions
camxes-rs-0.2.0 has been yanked.

camxes-rs


A Parsing Expression Grammar (PEG) parser generator with enhanced error reporting and semantic actions support.

⚠️ Version 0.2.0 Breaking Changes

If you're upgrading from 0.1.x, read CHANGELOG.md for migration instructions. The main changes are:

  • ParseResult now has 4 fields instead of 3 (added error position tracking)
  • Access parse result at index 3 instead of index 2

Features

  • Zero-Copy Parsing: Efficient parsing without unnecessary string allocations
  • Enhanced Error Reporting: Track furthest error position for better diagnostics
  • Semantic Actions: Build typed ASTs with bottom-up reducers
  • Embedded Lojban Grammar: Full camxes-style Lojban PEG included
  • Thread-Safe: Designed for concurrent use
  • Rich Debugging: Detailed logging via the log crate

Installation

Add this to your Cargo.toml:

[dependencies]
camxes-rs = "0.2.0"

Quick Start

Basic Usage

use camxes_rs::peg::grammar::Peg;

fn main() {
    // Define your grammar
    let grammar = r#"
    expression <- term (('+' / '-') term)*
    term <- factor (('*' / '/') factor)*
    factor <- number / '(' expression ')'
    number <- [0-9]+
    "#;

    // Create parser
    let parser = Peg::new("expression", grammar).unwrap();
    
    // Parse input
    let result = parser.parse("2+3*4");
    
    // Access the result (note: index 3 in version 0.2.0)
    match result.3.as_ref() {
        Ok(nodes) => println!("Parse succeeded with {} nodes", nodes.len()),
        Err(err) => println!("Parse failed at position {}", err.position),
    }
}

Using the Embedded Lojban Grammar

use camxes_rs::peg::grammar::Peg;
use camxes_rs::LOJBAN_GRAMMAR;

fn main() {
    let (start_rule, grammar_text) = LOJBAN_GRAMMAR;
    let parser = Peg::new(start_rule, grammar_text).unwrap();
    
    let result = parser.parse("mi klama le zarci");
    match result.3.as_ref() {
        Ok(_) => println!("Valid Lojban!"),
        Err(err) => println!("Parse error at position {}", err.position),
    }
}

Semantic Actions (Building ASTs)

use camxes_rs::peg::grammar::Peg;
use camxes_rs::peg::{parse_with_semantics, ReducerTable, SemanticNode};

fn main() {
    let grammar = r#"number <- [0-9]+"#;
    let parser = Peg::new("number", grammar).unwrap();
    
    // Define reducers to build typed values
    let mut reducers = ReducerTable::new();
    reducers.insert("number", |input, span, _children| {
        let text = &input[span.0..span.1];
        let value: i32 = text.parse().unwrap();
        SemanticNode::Int(value)
    });
    
    let result = parse_with_semantics(&parser, "42", &reducers).unwrap();
    println!("Parsed value: {:?}", result);
}
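
To make "bottom-up" concrete, here is a minimal, self-contained sketch of how a reducer table can be applied to a parse tree. It does not use camxes-rs: `Span` and `ParseNode` are local copies of the shapes documented in the API Reference, while `Value`, `Reducer`, `reduce`, `reduce_number`, `reduce_sum`, and `default_reducers` are hypothetical names invented for illustration. The crate's own `parse_with_semantics` may work differently.

```rust
use std::collections::HashMap;

// Local stand-ins mirroring the documented shapes (not imported from camxes_rs).
#[derive(Debug, Clone, Copy)]
pub struct Span(pub usize, pub usize);

#[derive(Debug)]
pub enum ParseNode {
    Terminal { span: Span },
    NonTerminal { name: String, span: Span, children: Vec<ParseNode> },
}

// Hypothetical value type produced by the reducers in this sketch.
#[derive(Debug, PartialEq)]
pub enum Value {
    Int(i64),
    Token(String),
    List(Vec<Value>),
}

// A reducer sees the input, the node's span, and the already-reduced children.
type Reducer = fn(&str, Span, Vec<Value>) -> Value;

fn reduce_number(input: &str, span: Span, _children: Vec<Value>) -> Value {
    // Falls back to 0 on overflow for brevity; real code should report an error.
    Value::Int(input[span.0..span.1].parse().unwrap_or(0))
}

fn reduce_sum(_input: &str, _span: Span, children: Vec<Value>) -> Value {
    // Operator tokens are skipped; only integer children contribute.
    Value::Int(children.iter().map(|v| match v { Value::Int(n) => *n, _ => 0 }).sum())
}

fn default_reducers() -> HashMap<&'static str, Reducer> {
    let mut table: HashMap<&'static str, Reducer> = HashMap::new();
    table.insert("number", reduce_number);
    table.insert("sum", reduce_sum);
    table
}

/// Post-order walk: children are reduced first, then the rule's reducer runs.
fn reduce(node: &ParseNode, input: &str, reducers: &HashMap<&str, Reducer>) -> Value {
    match node {
        ParseNode::Terminal { span } => Value::Token(input[span.0..span.1].to_string()),
        ParseNode::NonTerminal { name, span, children } => {
            let values: Vec<Value> =
                children.iter().map(|c| reduce(c, input, reducers)).collect();
            match reducers.get(name.as_str()) {
                Some(reducer) => reducer(input, *span, values),
                None => Value::List(values), // rules without a reducer pass children through
            }
        }
    }
}
```

Because every child is reduced before its parent, each reducer only ever handles finished values and never has to inspect raw parse nodes.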

Grammar Syntax

The parser supports standard PEG operators:

Operator  Description      Example
<-        Definition       rule <- expression
/         Ordered choice   a / b
*         Zero or more     [0-9]*
+         One or more      [a-z]+
?         Optional         [A-Z]?
&         And-predicate    &[a-z]
!         Not-predicate    ![0-9]
()        Grouping         (a / b)
[]        Character class  [a-zA-Z0-9]
.         Any character    .
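
Ordered choice and the predicates are the operators that most often surprise people coming from regular expressions or context-free grammars. The sketch below hand-codes their standard PEG semantics over string literals. It is independent of camxes-rs, and the function names (`literal`, `ordered_choice`, `not_predicate`) are hypothetical helpers, not part of the crate's API.

```rust
/// Match the literal `lit` at byte offset `pos`; return the offset after it.
fn literal(input: &str, pos: usize, lit: &str) -> Option<usize> {
    input
        .get(pos..)?
        .starts_with(lit)
        .then(|| pos + lit.len())
}

/// Ordered choice `a / b / ...`: alternatives are tried left to right and the
/// first one that matches wins, even if a later one would match more input.
fn ordered_choice(input: &str, pos: usize, alternatives: &[&str]) -> Option<usize> {
    alternatives.iter().find_map(|lit| literal(input, pos, lit))
}

/// Not-predicate `!lit`: succeeds without consuming input when `lit` does not match.
fn not_predicate(input: &str, pos: usize, lit: &str) -> Option<usize> {
    match literal(input, pos, lit) {
        Some(_) => None,
        None => Some(pos),
    }
}
```

With these definitions, `ordered_choice("ab", 0, &["a", "ab"])` stops after `a`, while swapping the alternatives consumes both characters. This is why longer alternatives are usually listed first in PEG rules.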

API Reference

ParseResult Structure (v0.2.0)

pub struct ParseResult(
    pub u32,                                      // cost
    pub usize,                                    // consumed position
    pub usize,                                    // error position (furthest failure)
    pub Arc<Result<Vec<ParseNode>, ParseError>>, // parse result
);
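
The error-position field lends itself to caret-style diagnostics. The following sketch destructures all four fields and points at the furthest failure. It uses local stand-in types rather than the crate's own: `ParseNode` is reduced to a unit struct, `ParseError` carries only the `position` field used in the Quick Start, and `caret_diagnostic` and `describe` are hypothetical helpers. It also assumes positions are byte offsets into the input, which this README does not state explicitly.

```rust
use std::sync::Arc;

// Simplified local stand-ins for the documented types (assumptions, not the crate's definitions).
#[derive(Debug)]
pub struct ParseNode;

#[derive(Debug)]
pub struct ParseError {
    pub position: usize,
}

pub struct ParseResult(
    pub u32,                                     // cost
    pub usize,                                   // consumed position
    pub usize,                                   // error position (furthest failure)
    pub Arc<Result<Vec<ParseNode>, ParseError>>, // parse result
);

/// Render the input with a caret under `error_pos` (assumed to be a byte offset).
fn caret_diagnostic(input: &str, error_pos: usize) -> String {
    let clamped = error_pos.min(input.len());
    // Count characters, not bytes, so the caret lines up for non-ASCII input.
    let column = input.char_indices().take_while(|(i, _)| *i < clamped).count();
    format!("{}\n{}^", input, " ".repeat(column))
}

/// Summarize a result by destructuring all four fields.
fn describe(input: &str, result: &ParseResult) -> String {
    let ParseResult(_cost, consumed, error_pos, outcome) = result;
    match outcome.as_ref() {
        Ok(nodes) => format!("ok: {} nodes, consumed {} bytes", nodes.len(), consumed),
        Err(_) => format!("error at {}\n{}", error_pos, caret_diagnostic(input, *error_pos)),
    }
}
```

Destructuring by pattern rather than by index makes a future change in field count a compile error instead of a silent bug.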

ParseNode

pub enum ParseNode {
    Terminal { span: Span },
    NonTerminal {
        name: String,
        span: Span,
        children: Vec<ParseNode>,
    },
}

pub struct Span(pub usize, pub usize);  // (start, end)
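
Since nodes store spans rather than copied text, consumers typically walk the tree and slice the original input. Below is a minimal sketch of such a walk, using local copies of the two types above. `collect_rule` is a hypothetical helper, and the sketch assumes spans are byte offsets, as the slicing in the semantic-actions example suggests.

```rust
// Local copies of the documented shapes (not imported from camxes_rs).
#[derive(Debug, Clone, Copy)]
pub struct Span(pub usize, pub usize); // (start, end)

#[derive(Debug)]
pub enum ParseNode {
    Terminal { span: Span },
    NonTerminal { name: String, span: Span, children: Vec<ParseNode> },
}

/// Depth-first search that collects the source text of every node named `rule`.
/// The returned slices borrow from `input`, so no text is copied.
fn collect_rule<'a>(node: &ParseNode, input: &'a str, rule: &str, out: &mut Vec<&'a str>) {
    if let ParseNode::NonTerminal { name, span, children } = node {
        if name.as_str() == rule {
            out.push(&input[span.0..span.1]);
        }
        for child in children {
            collect_rule(child, input, rule, out);
        }
    }
}
```

Terminals carry no name, so the walk only inspects `NonTerminal` nodes and recurses through their children.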

Key Functions

  • Peg::new(start_rule, grammar) - Create a parser from grammar text
  • parser.parse(input) - Parse input string
  • parse_with_semantics(parser, input, reducers) - Parse and build AST

Debugging

Enable debug logging to see detailed parsing information:

RUST_LOG=camxes_rs=debug cargo run

Or in code:

env_logger::builder()
    .filter_level(log::LevelFilter::Debug)
    .init();

Multi-threaded Usage

For web servers or multi-threaded applications, create one Peg instance per thread:

use std::collections::HashMap;
use std::sync::Arc;
use camxes_rs::peg::grammar::Peg;
use camxes_rs::LOJBAN_GRAMMAR;

// In your server initialization
let grammar_texts: Arc<HashMap<i32, String>> = Arc::new({
    let mut map = HashMap::new();
    map.insert(1, LOJBAN_GRAMMAR.1.to_string());
    map
});

// In each worker thread
let mut parsers = HashMap::new();
for (lang_id, grammar_text) in grammar_texts.iter() {
    match Peg::new("text", grammar_text) {
        Ok(parser) => {
            parsers.insert(*lang_id, parser);
        }
        Err(e) => {
            log::error!("Failed to initialize parser: {}", e);
        }
    }
}
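
The snippet above leaves thread creation implicit. As a rough, self-contained sketch of the whole pattern, the code below shares the grammar text through an `Arc` and lets each worker build its own private parser table. `StandInParser` is a hypothetical placeholder for `Peg` (it only counts lines containing `<-`), and `spawn_workers` is likewise an invented name, so treat this as an outline of the threading structure only.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical placeholder for a parser built from grammar text.
struct StandInParser {
    rule_count: usize,
}

impl StandInParser {
    fn new(grammar: &str) -> Result<StandInParser, String> {
        let rule_count = grammar.lines().filter(|l| l.contains("<-")).count();
        if rule_count == 0 {
            Err("grammar has no rules".to_string())
        } else {
            Ok(StandInParser { rule_count })
        }
    }
}

/// Spawn `workers` threads. Grammar text is shared read-only; parsers are per-thread.
/// Returns the total number of rules each worker managed to load.
fn spawn_workers(grammars: Arc<HashMap<i32, String>>, workers: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let grammars = Arc::clone(&grammars);
            thread::spawn(move || {
                let mut parsers = HashMap::new();
                for (lang_id, text) in grammars.iter() {
                    match StandInParser::new(text) {
                        Ok(parser) => {
                            parsers.insert(*lang_id, parser);
                        }
                        Err(e) => eprintln!("failed to initialize parser {}: {}", lang_id, e),
                    }
                }
                parsers.values().map(|p| p.rule_count).sum::<usize>()
            })
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Only the immutable grammar strings cross thread boundaries here, so the parser type itself does not need to be `Send` or `Sync` for this layout to work.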

Migration from 0.1.x

See CHANGELOG.md for detailed migration instructions.

Quick summary:

  • Change result.2 → result.3 to access the parse result
  • Update tuple destructuring: ParseResult(cost, pos, result) → ParseResult(cost, pos, error_pos, result)

License

MIT

Contributing

Contributions are welcome! This crate is part of the tersmu project.
