# camxes-rs
A Parsing Expression Grammar (PEG) parser generator with enhanced error reporting and semantic actions support.
## ⚠️ Version 0.2.0 Breaking Changes
**If you're upgrading from 0.1.x**, read [CHANGELOG.md](CHANGELOG.md) for migration instructions. The main changes are:
- `ParseResult` now has 4 fields instead of 3 (added error position tracking)
- Access parse result at index **3** instead of index **2**
## Features
- **Zero-Copy Parsing**: Efficient parsing without unnecessary string allocations
- **Enhanced Error Reporting**: Track furthest error position for better diagnostics
- **Semantic Actions**: Build typed ASTs with bottom-up reducers
- **Embedded Lojban Grammar**: Full camxes-style Lojban PEG included
- **Thread-Safe**: Designed for concurrent use
- **Rich Debugging**: Detailed logging via the `log` crate
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
camxes-rs = "0.2.0"
```
## Quick Start
### Basic Usage
```rust
use camxes_rs::peg::grammar::Peg;

fn main() {
    // Define your grammar
    let grammar = r#"
        expression <- term (('+' / '-') term)*
        term <- factor (('*' / '/') factor)*
        factor <- number / '(' expression ')'
        number <- [0-9]+
    "#;

    // Create parser
    let parser = Peg::new("expression", grammar).unwrap();

    // Parse input
    let result = parser.parse("2+3*4");

    // Access the result (note: index 3 in version 0.2.0)
    match result.3.as_ref() {
        Ok(nodes) => println!("Parse succeeded with {} nodes", nodes.len()),
        Err(err) => println!("Parse failed at position {}", err.position),
    }
}
```
### Using the Embedded Lojban Grammar
```rust
use camxes_rs::peg::grammar::Peg;
use camxes_rs::LOJBAN_GRAMMAR;

fn main() {
    let (start_rule, grammar_text) = LOJBAN_GRAMMAR;
    let parser = Peg::new(start_rule, grammar_text).unwrap();

    let result = parser.parse("mi klama le zarci");
    match result.3.as_ref() {
        // The nodes are not needed here, so the binding is ignored
        Ok(_) => println!("Valid Lojban!"),
        Err(err) => println!("Parse error at position {}", err.position),
    }
}
```
### Semantic Actions (Building ASTs)
```rust
use camxes_rs::peg::grammar::Peg;
use camxes_rs::peg::{parse_with_semantics, ReducerTable, SemanticNode};

fn main() {
    let grammar = r#"number <- [0-9]+"#;
    let parser = Peg::new("number", grammar).unwrap();

    // Define reducers to build typed values
    let mut reducers = ReducerTable::new();
    reducers.insert("number", |input, span, _children| {
        let text = &input[span.0..span.1];
        let value: i32 = text.parse().unwrap();
        SemanticNode::Int(value)
    });

    let result = parse_with_semantics(&parser, "42", &reducers).unwrap();
    println!("Parsed value: {:?}", result);
}
```
## Grammar Syntax
The parser supports standard PEG operators:

| Operator | Description | Example |
|----------|-------------|---------|
| `<-` | Definition | `rule <- expression` |
| `/` | Ordered choice | `a / b` |
| `*` | Zero or more | `[0-9]*` |
| `+` | One or more | `[a-z]+` |
| `?` | Optional | `[A-Z]?` |
| `&` | And-predicate | `&[a-z]` |
| `!` | Not-predicate | `![0-9]` |
| `()` | Grouping | `(a / b)` |
| `[]` | Character class | `[a-zA-Z0-9]` |
| `.` | Any character | `.` |
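
To make the semantics of ordered choice and the not-predicate concrete, here is a minimal hand-written sketch. It is independent of this crate's implementation: the `Matcher` alias and the helpers `lit`, `choice`, and `not` are hypothetical names invented for illustration only.

```rust
// Hand-rolled illustration of PEG semantics; NOT this crate's API.
// A matcher takes the input and a start offset and returns the end offset on success.
type Matcher = Box<dyn Fn(&str, usize) -> Option<usize>>;

/// Match a literal string at `pos`.
fn lit(s: &'static str) -> Matcher {
    Box::new(move |input: &str, pos: usize| {
        let rest = input.get(pos..)?;
        if rest.starts_with(s) { Some(pos + s.len()) } else { None }
    })
}

/// Ordered choice `a / b`: try `a`; only if it fails, try `b`.
fn choice(a: Matcher, b: Matcher) -> Matcher {
    Box::new(move |input: &str, pos: usize| a(input, pos).or_else(|| b(input, pos)))
}

/// Not-predicate `!a`: succeed without consuming input if and only if `a` fails.
fn not(a: Matcher) -> Matcher {
    Box::new(move |input: &str, pos: usize| match a(input, pos) {
        Some(_) => None,
        None => Some(pos),
    })
}

fn main() {
    // 'a' / 'ab': the first alternative wins, so only one character is consumed.
    let first_wins = choice(lit("a"), lit("ab"));
    println!("{:?}", first_wins("ab", 0)); // prints Some(1)

    // !'x' succeeds on "ab" and consumes nothing.
    let no_x = not(lit("x"));
    println!("{:?}", no_x("ab", 0)); // prints Some(0)
}
```

The first case shows why alternative order matters in a PEG: unlike a context-free grammar, `'a' / 'ab'` never tries the longer alternative once the shorter one has matched.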
## API Reference
### ParseResult Structure (v0.2.0)
```rust
pub struct ParseResult(
    pub u32,   // cost
    pub usize, // consumed position
    pub usize, // error position (furthest failure)
    pub Arc<Result<Vec<ParseNode>, ParseError>>, // parse result
);
```
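
The furthest-failure position is easier to act on once it is turned into a line and column. The sketch below assumes the position is a byte offset into the input, which this README does not state explicitly; `line_col` is a hypothetical helper, not part of the crate.

```rust
/// Convert a byte offset into a 1-based (line, column) pair.
/// Assumption: `offset` is a byte offset; columns are counted in characters.
fn line_col(input: &str, offset: usize) -> (usize, usize) {
    // Clamp so an offset past the end reports the end of the input.
    let mut end = offset.min(input.len());
    // Step back to a character boundary so slicing cannot panic.
    while !input.is_char_boundary(end) {
        end -= 1;
    }
    let before = &input[..end];
    let line = before.matches('\n').count() + 1;
    let col = match before.rfind('\n') {
        Some(nl) => before[nl + 1..].chars().count() + 1,
        None => before.chars().count() + 1,
    };
    (line, col)
}

fn main() {
    let input = "mi klama\nle zarci";
    // Pretend the parser reported its furthest failure at byte 12.
    let (line, col) = line_col(input, 12);
    println!("error at line {line}, column {col}"); // prints: error at line 2, column 4
}
```

With a real parse, the offset would come from the third field of `ParseResult` (`result.2`).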
### ParseNode
```rust
pub enum ParseNode {
    Terminal { span: Span },
    NonTerminal {
        name: String,
        span: Span,
        children: Vec<ParseNode>,
    },
}

pub struct Span(pub usize, pub usize); // (start, end)
```
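
A parse tree of this shape can be walked with a simple recursive function. The sketch below redeclares the two types locally so that it compiles on its own, and builds a tree by hand instead of calling the parser. `outline` is a hypothetical helper, and the spans are assumed to be end-exclusive byte ranges, as the reducer example's `&input[span.0..span.1]` suggests.

```rust
// Local copies of the types shown above, so the sketch is self-contained.
pub struct Span(pub usize, pub usize);

pub enum ParseNode {
    Terminal { span: Span },
    NonTerminal { name: String, span: Span, children: Vec<ParseNode> },
}

/// Render a node and its descendants as an indented outline (hypothetical helper).
fn outline(node: &ParseNode, input: &str, depth: usize, out: &mut String) {
    let indent = "  ".repeat(depth);
    match node {
        // Terminals show the slice of input they cover.
        ParseNode::Terminal { span } => {
            out.push_str(&format!("{indent}{:?}\n", &input[span.0..span.1]));
        }
        // Non-terminals show the rule name, then recurse into the children.
        ParseNode::NonTerminal { name, children, .. } => {
            out.push_str(&format!("{indent}{name}\n"));
            for child in children {
                outline(child, input, depth + 1, out);
            }
        }
    }
}

fn main() {
    let input = "42";
    // Hand-built tree for `number <- [0-9]+`; a real tree would come from the parser.
    let tree = ParseNode::NonTerminal {
        name: "number".to_string(),
        span: Span(0, 2),
        children: vec![
            ParseNode::Terminal { span: Span(0, 1) },
            ParseNode::Terminal { span: Span(1, 2) },
        ],
    };
    let mut out = String::new();
    outline(&tree, input, 0, &mut out);
    print!("{out}");
}
```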
### Key Functions
- `Peg::new(start_rule, grammar)` - Create a parser from grammar text
- `parser.parse(input)` - Parse input string
- `parse_with_semantics(parser, input, reducers)` - Parse and build AST
## Debugging
Enable debug logging to see detailed parsing information:
```bash
RUST_LOG=camxes_rs=debug cargo run
```
Or in code:
```rust
env_logger::builder()
    .filter_level(log::LevelFilter::Debug)
    .init();
```
## Multi-threaded Usage
For web servers or multi-threaded applications, create one `Peg` instance per thread:
```rust
use std::collections::HashMap;
use std::sync::Arc;
use camxes_rs::peg::grammar::Peg;
use camxes_rs::LOJBAN_GRAMMAR;

// In your server initialization
let grammar_texts: Arc<HashMap<i32, String>> = Arc::new({
    let mut map = HashMap::new();
    map.insert(1, LOJBAN_GRAMMAR.1.to_string());
    map
});

// In each worker thread
let mut parsers = HashMap::new();
for (lang_id, grammar_text) in grammar_texts.iter() {
    match Peg::new("text", grammar_text) {
        Ok(parser) => {
            parsers.insert(*lang_id, parser);
        }
        Err(e) => {
            log::error!("Failed to initialize parser: {}", e);
        }
    }
}
```
## Migration from 0.1.x
See [CHANGELOG.md](CHANGELOG.md) for detailed migration instructions.
**Quick summary:**
- Change `result.2` → `result.3` to access parse result
- Update tuple destructuring: `ParseResult(cost, pos, result)` → `ParseResult(cost, pos, error_pos, result)`
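
As a rough illustration of the four-field pattern, the sketch below uses stand-in types that mirror the 0.2.0 layout documented in this README. `ParseNode` and `ParseError` are simplified placeholders and `describe` is a hypothetical helper; the real definitions live in the crate.

```rust
use std::sync::Arc;

// Stand-ins mirroring the documented 0.2.0 layout; NOT the crate's real types.
pub struct ParseNode;
pub struct ParseError {
    pub position: usize,
}
pub struct ParseResult(
    pub u32,   // cost
    pub usize, // consumed position
    pub usize, // error position (furthest failure), new in 0.2.0
    pub Arc<Result<Vec<ParseNode>, ParseError>>, // parse result, now at index 3
);

/// Summarise a result using the four-field destructuring pattern.
fn describe(result: &ParseResult) -> String {
    // 0.1.x code would have written `ParseResult(cost, pos, outcome)` here.
    let ParseResult(cost, pos, error_pos, outcome) = result;
    match outcome.as_ref() {
        Ok(nodes) => format!("ok: {} nodes, consumed {pos}, cost {cost}", nodes.len()),
        Err(err) => format!("failed at {}, furthest failure {error_pos}", err.position),
    }
}

fn main() {
    let ok = ParseResult(1, 5, 5, Arc::new(Ok(vec![ParseNode])));
    println!("{}", describe(&ok));

    let bad = ParseResult(1, 0, 3, Arc::new(Err(ParseError { position: 3 })));
    println!("{}", describe(&bad));
}
```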
## License
MIT. See [LICENSE](LICENSE).
## Contributing
Contributions are welcome! This crate is part of the [tersmu](https://github.com/lojban/tersmu) project.
## Links
- [Documentation](https://docs.rs/camxes-rs)
- [Crates.io](https://crates.io/crates/camxes-rs)
- [Repository](https://github.com/lojban/tersmu)
- [Issue Tracker](https://github.com/lojban/tersmu/issues)