chasa
Parser combinators with explicit rollback control (cut) and streaming-friendly inputs.
This crate is a small parser-combinator core used in the YuLang workspace. The design goal is to keep backtracking behavior predictable:
- cut: a branch-pruning signal (and root-level commit trigger for streaming inputs)
- rollback: combinator-driven (not "failure implies rollback")
- errors: accumulated, and rolled back together with input when backtracking
TL;DR
use chasa::prelude::*;
// Parse "let x" and extract the identifier
let mut input = "let x";
let name = parse_ok_once(&mut input, tag("let").right(ws1).right(any)).unwrap();
assert_eq!(name, 'x');
Key combinators:
- tag("...") – match an exact string
- ws1 / ws – match whitespace (one-or-more / zero-or-more)
- any – match any single character
- right(q) – parse both, return right result
- many() – repeat zero or more times
- sep(s) – parse separated list
Most combinators are available as methods (via ParserOnce / ParserMut / Parser),
and the prelude imports those traits so you can write p.right(q) / p.many() / p.sep(...).
What makes chasa different?
1) Explicit cut for branch pruning and error recovery
cut is a control signal that prevents backtracking across a commit point. This is useful for:
Error recovery: Once you've seen a keyword, the rest of the syntax becomes mandatory.
use chasa::prelude::*;
let mut input = "let x";
// After seeing "let", we MUST see an identifier (no backtracking to other branches)
let ident = one_of.;
let var_decl = tag.cut.right.right;
let result = parse_ok_once.unwrap;
assert_eq!;
Better error messages: Instead of "expected A or B or C", you get "expected identifier after 'let'".
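The effect of a commit point can be illustrated without the crate. The following is a minimal, self-contained sketch in plain Rust, not chasa's implementation: `Outcome`, `var_decl`, `number`, and `statement` are hypothetical names invented for this illustration. It models a soft failure (the caller may try another branch) and a hard failure (a commit point was passed, so alternatives are skipped).

```rust
/// Outcome of a parser in this sketch (hypothetical type, not chasa's).
#[derive(Debug, PartialEq)]
enum Outcome<T> {
    Ok(T),
    /// Failed before any commit point: the caller may try another branch.
    Soft(String),
    /// Failed after a commit point: alternatives must not be tried.
    Hard(String),
}

/// Parses `let <ident>`. Once "let " has matched, the branch is committed.
fn var_decl(input: &str) -> Outcome<String> {
    let Some(rest) = input.strip_prefix("let ") else {
        return Outcome::Soft("expected 'let'".to_string());
    };
    // Commit point: from here on, a failure is hard.
    let ident: String = rest
        .chars()
        .take_while(|c| c.is_ascii_alphabetic())
        .collect();
    if ident.is_empty() {
        Outcome::Hard("expected identifier after 'let'".to_string())
    } else {
        Outcome::Ok(ident)
    }
}

/// Fallback branch: parses a run of ASCII digits.
fn number(input: &str) -> Outcome<String> {
    let digits: String = input
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    if digits.is_empty() {
        Outcome::Soft("expected number".to_string())
    } else {
        Outcome::Ok(digits)
    }
}

/// Tries `var_decl` first and falls back to `number` only on a soft failure.
fn statement(input: &str) -> Outcome<String> {
    match var_decl(input) {
        Outcome::Soft(_) => number(input),
        other => other,
    }
}
```

In this sketch, `statement("42")` falls through to `number`, while `statement("let 123")` stops with the hard error "expected identifier after 'let'" and never reports "expected number".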
2) Streaming inputs for large files
StreamInput allows parsing large files or network streams without loading everything into memory.
use chasa::prelude::*;
use File;
use ;
// Parse a large file lazily (only buffering what's needed)
let file = open.unwrap;
let mut input = new;
// After `cut()` commits, already-parsed data is dropped from the buffer
let my_parser = any.;
let result = parse_ok_once.unwrap;
3) Combinator-driven rollback (not automatic)
In many libraries, "failure" automatically implies rollback. In chasa, rollback is a semantic choice made by each combinator:
- maybe(p) / p.or_not() – roll back on soft failure
- lookahead(p) – always roll back (peek without consuming)
- many(p) – roll back only the final terminating attempt
This keeps the control surface small and predictable: you can usually tell what rolls back by looking at the combinator you used.
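To make the three rollback policies concrete, here is a minimal sketch in plain Rust. It is not chasa's code: the `Cursor` type and its methods are hypothetical and only borrow the combinator names. The primitive `tag` deliberately does not restore the position on failure, so any rollback comes from the wrapping combinator.

```rust
/// A cursor over a string slice (hypothetical type for this sketch).
struct Cursor<'a> {
    src: &'a str,
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(src: &'a str) -> Self {
        Cursor { src, pos: 0 }
    }

    /// The input that has not been consumed yet.
    fn rest(&self) -> &'a str {
        &self.src[self.pos..]
    }

    /// Primitive: consume `s` character by character.
    /// On a mismatch the matched prefix stays consumed (no rollback here).
    fn tag(&mut self, s: &str) -> bool {
        for expected in s.chars() {
            match self.rest().chars().next() {
                Some(c) if c == expected => self.pos += c.len_utf8(),
                _ => return false,
            }
        }
        true
    }

    /// `maybe`-like: restore the position only if `p` fails.
    fn maybe(&mut self, p: impl FnOnce(&mut Self) -> bool) -> bool {
        let checkpoint = self.pos;
        let ok = p(self);
        if !ok {
            self.pos = checkpoint;
        }
        ok
    }

    /// `lookahead`-like: always restore the position.
    fn lookahead(&mut self, p: impl FnOnce(&mut Self) -> bool) -> bool {
        let checkpoint = self.pos;
        let ok = p(self);
        self.pos = checkpoint;
        ok
    }

    /// `many`-like: repeat `p`, rolling back only the final failing attempt.
    fn many(&mut self, mut p: impl FnMut(&mut Self) -> bool) -> usize {
        let mut count = 0;
        loop {
            let checkpoint = self.pos;
            if !p(self) {
                self.pos = checkpoint;
                return count;
            }
            if self.pos == checkpoint {
                // `p` succeeded without consuming: stop to avoid looping forever.
                return count;
            }
            count += 1;
        }
    }
}
```

On the input "ababa!", `many` over `tag("ab")` counts two matches and leaves "a!", because the third attempt consumed 'a' before failing and was rolled back. A bare `tag("ax")` at that point fails and leaves "!", which shows that failure alone does not restore the input in this model.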
Showcase
This README includes two example styles:
- Combinator style: chain methods to build parsers declaratively
- Imperative style: use In methods (run, choice) for explicit control flow
Example 1: S-expressions (combinator style)
This example uses normal Rust functions as parsers. Functions automatically implement ParserOnce / ParserMut / Parser in this crate, so recursion stays ergonomic.
use chasa::prelude::*;
assert_eq!;
Note: In is the input wrapper that bundles the underlying input, error accumulator, and cut flag. Parsers receive In and return their output.
Example 2: key = value (imperative style)
This example uses In methods (run, choice) for more explicit control flow. This style is useful when combinator chains become too long or when you need conditional branching.
use chasa::prelude::*;
assert_eq!;
assert_eq!;
Quick API tour
Entry points:
- parse_ok_once(&mut input, parser) – run a ParserOnce and return Result<T, Error>
- parse_ok_mut(&mut input, &mut parser) – run a ParserMut by mutable reference
- parse_ok(&mut input, &parser) – run a Parser by shared reference
- parser.test_ok(input) – ergonomic helper for quick experiments (input doesn't need to be &mut)
Building blocks:
- Items: any, item(c), one_of("abc"), none_of("xyz")
- Tags: tag("keyword")
- Whitespace: ws, ws1
- Sequencing: then, right, left, between
- Choice: or, choice
- Repetition: many, many1, many_map
- Separated lists: sep, sep1, sep_map, sep_reduce
- Lookahead: lookahead, not
- Control: cut, maybe, label
Quickstart
1) Match a fixed string
use chasa::prelude::*;
let mut input = "let x";
// tag("let"): match "let"
// right(ws1): match whitespace and discard it
// right(any): match any character and return it
let name = parse_ok_once(&mut input, tag("let").right(ws1).right(any)).unwrap();
assert_eq!(name, 'x');
assert_eq!(input, "");
2) Repeat and collect
many() collects Option<T> outputs into any O: FromIterator<T>.
use chasa::prelude::*;
let mut input = "aaab";
let out: String = parse_ok_once(&mut input, item('a').many()).unwrap();
assert_eq!(out, "aaa");
assert_eq!(input, "b"); // 'b' remains (terminating attempt is rolled back)
Important: The terminating attempt (the final item('a') that fails on 'b') is rolled back, so 'b' remains in the input.
3) Separated lists
use chasa::prelude::*;
let mut input = "a,a,";
let comma = item.to;
let out: String = parse_ok_once.unwrap;
assert_eq!;
assert_eq!; // trailing comma is consumed
Note: Trailing separators are allowed by default (matching common formats like JSON arrays, Rust struct literals). Use .no_trail() to forbid them:
use chasa::prelude::*;
let mut input = "a,a"; // no trailing comma
let comma = item.to;
let out: String = parse_ok_once.unwrap;
assert_eq!;
4) Try-with-rollback (maybe)
maybe(p) runs p and rolls back on soft failure (failure without cut).
use chasa::prelude::*;
let mut input = "b";
let out = parse_ok_once(&mut input, maybe(item('a'))).unwrap();
assert_eq!(out, None);
assert_eq!(input, "b"); // rolled back because 'a' didn't match
5) Commit with cut
use chasa::prelude::*;
let mut input = "let 123"; // invalid: expected identifier after "let"
let var_decl = tag.cut.right.right;
let err = parse_ok_once.unwrap_err;
// Error message will report failure at position after "let ",
// not at the beginning (because `cut` prevented backtracking)
What is a parser combinator?
A parser combinator is a function (or value) that consumes some input and returns either:
- a value (success), or
- a failure (and sometimes an error).
The "combinator" part is that you build bigger parsers by composing smaller ones:
sequencing (then / left / right), repetition (many), separation (sep), choice (or),
and lookahead (lookahead / not).
In chasa, parsers are plain values implementing traits such as ParserOnce.
They run on an In wrapper, which bundles:
- the underlying input
- an error accumulator (Merger)
- the current cut flag
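For readers new to the idea, composition can be sketched with plain closures. This is a generic illustration, not chasa's design: chasa uses traits and the In wrapper, whereas here a parser is assumed to be any function from `&str` to an optional `(value, remaining input)` pair. The function names mirror chasa's combinators for readability only.

```rust
/// Matches exactly one expected character.
fn item(expected: char) -> impl Fn(&str) -> Option<(char, &str)> {
    move |input| {
        let c = input.chars().next()?;
        if c == expected {
            Some((c, &input[c.len_utf8()..]))
        } else {
            None
        }
    }
}

/// Sequencing: run `p`, then `q`, and keep only the result of `q`.
fn right<A, B>(
    p: impl Fn(&str) -> Option<(A, &str)>,
    q: impl Fn(&str) -> Option<(B, &str)>,
) -> impl Fn(&str) -> Option<(B, &str)> {
    move |input| {
        let (_, rest) = p(input)?;
        q(rest)
    }
}

/// Repetition: apply `p` zero or more times and collect the results.
fn many<A>(
    p: impl Fn(&str) -> Option<(A, &str)>,
) -> impl Fn(&str) -> Option<(Vec<A>, &str)> {
    move |mut input| {
        let mut out = Vec::new();
        while let Some((value, rest)) = p(input) {
            if rest.len() == input.len() {
                // `p` made no progress: stop to avoid looping forever.
                break;
            }
            out.push(value);
            input = rest;
        }
        Some((out, input))
    }
}
```

With these definitions, `right(item('('), many(item('a')))` is itself a parser: applied to "(aab" it returns the two 'a' characters and leaves "b" as the remaining input.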
Design details
cut is a branch-pruning signal (and root-level commit trigger)
cut is not "input was consumed". It is "do not backtrack across this point".
Additionally, when called in a root cut scope, it triggers Input::commit,
allowing streaming inputs to drop already-accepted prefixes.
use chasa::prelude::*;
// Stream the input lazily (no full buffer needed)
let mut input = new;
// Root cut commits the accepted prefix (here: after matching 'a')
let p = item.cut.right;
let out = parse_ok_once.unwrap;
assert_eq!;
assert_eq!; // 'a' has been dropped from the buffer
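Why a commit lets a streaming input free memory can be shown with a small buffer model. The sketch below is an assumption-laden illustration in plain Rust, not chasa's StreamInput or its Input::commit: `Buffered` and all of its methods are hypothetical. Items read from the source are kept so that a rollback can replay them, and `commit` discards everything before the cursor.

```rust
use std::collections::VecDeque;

/// Buffers characters pulled from an iterator so that parsing can backtrack
/// (hypothetical type for this sketch).
struct Buffered<I: Iterator<Item = char>> {
    source: I,
    buffer: VecDeque<char>, // characters read since the last commit
    cursor: usize,          // read position inside `buffer`
}

impl<I: Iterator<Item = char>> Buffered<I> {
    fn new(source: I) -> Self {
        Buffered {
            source,
            buffer: VecDeque::new(),
            cursor: 0,
        }
    }

    /// Reads the next character, pulling from the source only when the
    /// buffer has been exhausted.
    fn next_char(&mut self) -> Option<char> {
        if self.cursor == self.buffer.len() {
            let c = self.source.next()?;
            self.buffer.push_back(c);
        }
        let c = self.buffer[self.cursor];
        self.cursor += 1;
        Some(c)
    }

    /// Remembers the current position for a later rollback.
    fn checkpoint(&self) -> usize {
        self.cursor
    }

    /// Returns to a position taken since the last commit.
    fn rollback(&mut self, checkpoint: usize) {
        self.cursor = checkpoint;
    }

    /// Drops everything before the cursor. Checkpoints taken before this
    /// call are no longer valid, which is why only a root-level commit
    /// should do this.
    fn commit(&mut self) {
        self.buffer.drain(..self.cursor);
        self.cursor = 0;
    }

    /// Number of characters currently held in memory.
    fn buffered_len(&self) -> usize {
        self.buffer.len()
    }
}
```

The design point is that memory use is bounded by the distance between the last commit and the furthest character read, not by the total input size.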
Errors are accumulated, and can be rolled back too
chasa uses Merger to keep the "best" error span (lexicographic (start, end)),
storing all errors that occurred at that span.
Backtracking combinators roll back the error accumulator together with the input, so speculative branches don't pollute successful parses.
Example: If choice((p, q, r)) tries p and it fails softly, both the input position and any errors from p are rolled back before trying q.
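The interaction between error accumulation and rollback can be modeled in a few lines. This is a hedged sketch, not chasa's Merger: `SpanError` and `BestErrors` are hypothetical types, and the sketch assumes that the "best" span is the lexicographically greatest `(start, end)` pair, which the text above does not spell out.

```rust
/// A diagnostic attached to a byte span (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct SpanError {
    start: usize,
    end: usize,
    message: String,
}

/// Keeps only the errors reported at the best span seen so far.
/// Assumption: "best" means the greatest `(start, end)` in lexicographic order.
#[derive(Debug, Default, Clone)]
struct BestErrors {
    best: Option<(usize, usize)>,
    errors: Vec<SpanError>,
}

impl BestErrors {
    /// Records an error, discarding it if a better span is already known.
    fn push(&mut self, err: SpanError) {
        let span = (err.start, err.end);
        match self.best {
            // A worse span than the current best: ignore the error.
            Some(best) if span < best => {}
            // Same span: keep every error reported there.
            Some(best) if span == best => self.errors.push(err),
            // First error, or a strictly better span: start over.
            _ => {
                self.best = Some(span);
                self.errors = vec![err];
            }
        }
    }

    /// Snapshot taken before a speculative branch.
    fn checkpoint(&self) -> BestErrors {
        self.clone()
    }

    /// Restores the snapshot, forgetting errors from the abandoned branch.
    fn rollback(&mut self, checkpoint: BestErrors) {
        *self = checkpoint;
    }
}
```

A backtracking combinator in this model would take a checkpoint of both the input position and the accumulator before running a branch, and restore both if the branch fails softly. Cloning the whole accumulator is the simplest way to express that here; a real implementation would likely use something cheaper.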
Where to look next
- prelude: start here for imports
- parser: combinators and traits
- input: input abstractions and streaming inputs
- parse: helpers like parse_ok_once / test_ok