A Rust implementation of the Python library lextrail.
Features
- Zero dependencies
- Parses all context-free grammars, including ambiguous grammars
- Returns tokens constrained to a specified vocabulary if needed
- Native Rust performance
Quick Start
Installation
Usage Modes
The library supports two ways to generate constrained text, depending on your use case:
Trail
Use a Trail object when you want to generate the complete next element without vocabulary constraints.
CFG
use trail_cfg;
let example = r#"
start: expression
expression: term (("+" | "-") term)
term: factor (("*" | "/") factor)
factor: NUMBER
NUMBER: /-?[0-9]+/
"#;
let = trail_cfg.expect;
Regex
use trail_rex;
let example = r#"[a-z]+@[a-z]+\.(com|org|net)"#;
let = trail_rex.expect;
You can also combine both TERMINAL and REGEX expressions using trail_exp.
use trail_exp;
let example = r#"/[0-9]\.[0-9]/ "+" /[0-9]\.[0-9]/"#;
let = trail_exp.expect;
JSON
This is an experimental version. Not intended for production use.
- Currently supported keywords: type, enum, const, properties, required, items, prefixItems, oneOf
- Constraint intersection (e.g., combining prefixItems with items, or const with enum) is not yet implemented
use trail_json;
let example = r#"
{
"type": "object",
"properties": {
"user": {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string"}
},
"required": ["email"]
}
}
}
"#;
let = trail_json.expect;
Then, run a random simulation.
use IteratorRandom;
use rng;
use get_next_proposals;
let = ;
loop
println!;
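To make the shape of such a simulation concrete without depending on the crate's API, here is a minimal, self-contained sketch. Everything in it is an assumption made for illustration: the `State` enum, `proposals`, `advance`, `simulate`, and the tiny `XorShift` generator are hypothetical stand-ins, not lextrail types or functions, and the hard-coded automaton mimics the toy grammar `("A" | "B")+ ("C" | "D") "E"`. The standard library is used in place of the `rand` crate so the snippet compiles on its own.

```rust
/// Hypothetical stand-in for a parser state (not a lextrail type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum State {
    Start,
    Letters,
    Closing,
    Done,
}

/// Elements that may legally come next in the given state.
fn proposals(state: State) -> &'static [&'static str] {
    match state {
        State::Start => &["A", "B"],
        State::Letters => &["A", "B", "C", "D"],
        State::Closing => &["E"],
        State::Done => &[],
    }
}

/// Move to the state reached after emitting `choice`.
fn advance(state: State, choice: &str) -> State {
    match (state, choice) {
        (State::Start | State::Letters, "A" | "B") => State::Letters,
        (State::Letters, "C" | "D") => State::Closing,
        (State::Closing, "E") => State::Done,
        // Not reached when `choice` comes from `proposals(state)`.
        _ => state,
    }
}

/// Small xorshift generator, used only to keep the sketch dependency-free.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Pick a random proposal at each step until none remain or the step
/// budget is exhausted, and return the concatenated output.
fn simulate(seed: u64, max_steps: usize) -> String {
    let mut rng = XorShift(seed.max(1)); // xorshift needs a non-zero seed
    let mut state = State::Start;
    let mut output = String::new();
    for _ in 0..max_steps {
        let options = proposals(state);
        if options.is_empty() {
            break; // the grammar is complete
        }
        let index = (rng.next_u64() % options.len() as u64) as usize;
        let choice = options[index];
        output.push_str(choice);
        state = advance(state, choice);
    }
    output
}
```

Calling `simulate(42, 50)` returns a string built only from the proposed elements. The step budget matters because the `+` repetition could otherwise keep choosing "A" or "B" indefinitely.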
You can pretty-print JSON output using lextrail::json::format_json_instance.
ASM
Use an ASM object when you need to constrain the next token to a predefined vocabulary.
Example
use asm_cfg;
let example = r#"
start: L0
L0: ("A" | "B")+ L1
L1: ("C" | "D") L2
L2: "E" L3*
L3: /FGH/
"#;
let asm = asm_cfg.expect;
When you run a simulation, every proposal is an element of the provided vocabulary.
use IteratorRandom;
use rng;
use get_next_tokens;
let = ;
loop
assert_eq!;
println!;
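The idea of constraining proposals to a vocabulary can be illustrated independently of the library. The sketch below is a deliberate simplification and not lextrail's algorithm: `allowed_tokens` is a hypothetical helper that keeps only the vocabulary tokens that are a prefix of some string the grammar expects next. A real implementation would presumably also accept tokens that span several terminals, which this sketch ignores.

```rust
/// Hypothetical helper (not part of lextrail): keep the vocabulary tokens
/// that are a non-empty prefix of at least one expected string.
fn allowed_tokens<'a>(expected: &[&str], vocabulary: &[&'a str]) -> Vec<&'a str> {
    vocabulary
        .iter()
        .copied()
        .filter(|token| {
            // An empty token would make no progress, so it is rejected.
            !token.is_empty() && expected.iter().any(|exp| exp.starts_with(*token))
        })
        .collect()
}
```

For instance, if the grammar expects the terminal `FGH` next and the vocabulary is `["F", "FG", "FGH", "GH", "E"]`, the helper returns `["F", "FG", "FGH"]`: `GH` and `E` cannot start the expected text, so they are filtered out.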
An ASM can be built from any of the supported formats.
// CFG
use asm_cfg;
asm_cfg;
// REGEX
use asm_rex;
asm_rex;
// MIXED
use asm_exp;
asm_exp;
// JSON
use asm_json;
asm_json;
Playground
I've built a playground to showcase the different simulations; see the Python implementation.