<div align="center">

_A Rust implementation of the Python library [lextrail](https://github.com/miftahmoha/lextrail)._
</div>

## Features
- Zero dependencies
- Parses all context-free grammars, including ambiguous grammars
- Returns tokens constrained to a specified vocabulary if needed
- Native Rust performance
## Quick Start
### Installation
``` bash
cargo add lextrail
```
## Usage Modes
The library supports two ways to generate constrained text, depending on your use case:
### Trail
Use a **Trail** object when you want to generate the complete next element without vocabulary constraints.
**CFG**
```rust
use lextrail::guide::trail_cfg;
let example = r#"
start: expression
expression: term (("+" | "-") term)
term: factor (("*" | "/") factor)
factor: NUMBER
NUMBER: /-?[0-9]+/
"#;
let (schema, mut state) = trail_cfg(example).expect("Expected `Trail`, but got a `TrailError`.");
```
**Regex**
```rust
use lextrail::guide::trail_rex;
// `example` is a string containing the regular expression to follow.
let (schema, mut state) = trail_rex(example).expect("Expected `Trail`, but got a `TrailError`.");
```
You can also mix terminals and regular expressions in a single expression using `trail_exp`.
```rust
use lextrail::guide::trail_exp;
let example = r#"/[0-9]\.[0-9]/ "+" /[0-9]\.[0-9]/"#;
let (schema, mut state) = trail_exp(example).expect("Expected `Trail`, but got a `TrailError`.");
```
**JSON**
_This is an experimental version. Not intended for production use._
- Currently supported keywords: `type`, `enum`, `const`, `properties`, `required`, `items`, `prefixItems`, `oneOf`
- Constraint intersection (e.g., combining `prefixItems` with `items`, or `const` with `enum`) is not yet implemented
```rust
use lextrail::json::trail_json;
let example = r#"
{
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string"}
},
"required": ["email"]
}
}
}
"#;
let (schema, mut state) = trail_json(example).expect("Expected `Trail`, but got a `TrailError`.");
```
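Then run a random simulation: at each step, request the next proposals, pick one at random, and stop when no proposals remain. This example uses the `rand` crate.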
```rust
use rand::prelude::IteratorRandom;
use rand::rng;
use lextrail::guide::get_next_proposals;
let (mut response, mut value) = (Vec::new(), String::new());
loop {
let values = get_next_proposals(&schema, &mut state, &value).expect("Expected `Vec<String>`, but got a `TrailError`.");
if values.is_empty() {
break;
}
value = values.into_iter().choose(&mut rng()).unwrap();
response.push(value.clone());
}
println!("{}", response.join(""));
```
_You can pretty-print JSON output using `lextrail::json::format_json_instance`._
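The simulation loop above can also be factored into a small reusable driver. The sketch below is std-only and does not call lextrail: `run_first_choice`, its `max_steps` cap, and the scripted stub closure are all hypothetical names introduced for illustration. In real use, the closure would wrap a call to `get_next_proposals`. It picks the first proposal instead of a random one, which makes runs reproducible.

```rust
/// Hypothetical helper, not part of lextrail: repeatedly asks `next` for
/// proposals (given the previously chosen value), picks the first one, and
/// stops when no proposals remain or `max_steps` is reached.
fn run_first_choice<F>(mut next: F, max_steps: usize) -> Vec<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let mut response = Vec::new();
    let mut value = String::new();
    for _ in 0..max_steps {
        let values = next(&value);
        // An empty proposal list signals that generation is finished.
        let Some(first) = values.into_iter().next() else {
            break;
        };
        value = first;
        response.push(value.clone());
    }
    response
}

fn main() {
    // Scripted stub standing in for a real proposal source.
    let script = vec![
        vec!["1".to_string(), "2".to_string()],
        vec!["+".to_string()],
        vec!["3".to_string()],
    ];
    let mut step = 0;
    let out = run_first_choice(
        |_previous| {
            let proposals = script.get(step).cloned().unwrap_or_default();
            step += 1;
            proposals
        },
        16,
    );
    println!("{}", out.join("")); // prints "1+3"
}
```

The step cap is a design choice for this sketch: grammars with unbounded repetition can keep producing proposals indefinitely, so a bound guarantees termination.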
### ASM
Use an **ASM** object when you need to constrain the next token to a predefined vocabulary.
#### Example
```rust
use lextrail::assemble::asm_cfg;
let example = r#"
start: L0
L0: ("A" | "B")+ L1
L1: ("C" | "D") L2
L2: "E" L3*
L3: /FGH/
"#;
let vocabulary = vec![String::from("AD"), String::from("EF"), String::from("GH")];
let (schema, mut state) = asm_cfg(example, vocabulary).expect("Expected `ASM`, but got a `TrailError`.");
```
When you run a simulation, every proposal is an element of the provided vocabulary.
```rust
use rand::prelude::IteratorRandom;
use rand::rng;
use lextrail::assemble::get_next_tokens;
let (mut response, mut value) = (Vec::new(), String::new());
loop {
let values = get_next_tokens(&schema, &mut state, &value).expect("Expected `Vec<String>`, but got a `TrailError`.");
if values.is_empty() {
break;
}
value = values.into_iter().choose(&mut rng()).unwrap();
response.push(value.clone());
}
assert_eq!(response, vec![String::from("AD"), String::from("EF"), String::from("GH"), String::new()]);
println!("{}", response.join(""));
```
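To build intuition for why only `AD`, `EF`, and `GH` are proposed in that order, here is a toy, std-only sketch of vocabulary filtering. It is not lextrail's algorithm: it assumes a small hand-written list of strings matched by the grammar above, and the function `allowed_tokens` is a hypothetical name. A token is kept when appending it to the text generated so far still yields a prefix of some accepted string.

```rust
/// Toy illustration (not lextrail's algorithm): keep the vocabulary tokens
/// that extend `generated` into a prefix of at least one accepted string.
fn allowed_tokens<'a>(
    accepted: &[&str],
    generated: &str,
    vocabulary: &[&'a str],
) -> Vec<&'a str> {
    vocabulary
        .iter()
        .copied()
        .filter(|token| {
            let candidate = format!("{generated}{token}");
            accepted.iter().any(|s| s.starts_with(&candidate))
        })
        .collect()
}

fn main() {
    // A few strings matched by the example grammar, listed by hand.
    let accepted = ["ADE", "ADEFGH", "BCE", "ABDEFGH"];
    let vocabulary = ["AD", "EF", "GH"];

    println!("{:?}", allowed_tokens(&accepted, "", &vocabulary)); // prints ["AD"]
    println!("{:?}", allowed_tokens(&accepted, "AD", &vocabulary)); // prints ["EF"]
    println!("{:?}", allowed_tokens(&accepted, "ADEF", &vocabulary)); // prints ["GH"]
}
```

Note that a token such as `EF` spans two grammar elements (the terminal `"E"` and the start of `/FGH/`), which is why filtering works on prefixes rather than on whole grammar elements.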
An ASM can be built from any of the supported formats:
```rust
// CFG
use lextrail::assemble::asm_cfg;
asm_cfg(.., .., vec![..]);
// REGEX
use lextrail::assemble::asm_rex;
asm_rex(.., .., vec![..]);
// MIXED
use lextrail::assemble::asm_exp;
asm_exp(.., .., vec![..]);
// JSON
use lextrail::json::asm_json;
asm_json(.., .., vec![..]);
```
## Playground
_I've built a playground to showcase the different simulations; see the Python [implementation](https://github.com/miftahmoha/lextrail)._