Available on crate feature embed-guide only.
capture!, bind!, and choosing bind shapes.
AI assistance: This chapter was drafted with AI assistance while the library is still young. The guide is expected to improve over time as APIs and examples stabilize. If anything looks wrong or confusing, please report it on GitHub.
§Capture and Binds
Most marser grammars use matchers to describe input and capture! to turn the
matched input into Rust values.
This chapter explains:
- what capture! does
- when to use bind!, bind_span!, and bind_slice!
- how single, repeated, and optional binds behave
- how to avoid common bind-shape mistakes
- when zero-copy parsing with bind_slice! is useful
If you prefer the short version first, jump to Quick reference (bind shapes) and then come back for the worked explanations.
§Before you memorize syntax
Three ideas explain most of this page:
- capture! runs matcher-shaped grammar and then builds parser output.
- Every bind target has a shape: exactly one (name), repeated (*name), or optional (?name).
- That shape should follow control flow: repeated grammar needs *, optional grammar needs ?, and always-run grammar can use a plain name.
When a bind error is confusing, check it against these three ideas first. Most such errors come from violating one of them.
§Mental model
capture! is the bridge between matcher syntax and parser output:
input
  -> matcher grammar runs
  -> binders write capture slots
  -> result expression receives bound values
  -> parser returns output

The macro shape is:

capture!(grammar => output_expression)

The grammar side is a matcher expression. It can use literals, ranges, tuple
sequences, one_of(...), many(...), optional(...), commit_on(...), and bind
forms.
The output_expression side is ordinary Rust. It can use names introduced by
binds in the grammar.
§What expands behind the scenes
You should usually think in terms of the source macro, not the generated
Rust. But it helps to know the broad shape of what capture! builds:
- It scans the grammar for bind!, bind_span!, bind_slice!, and use_binds!.
- It groups bindings by shape into three buckets:
  - single (name)
  - repeated (*name)
  - optional (?name)
- It assigns each binding name a slot inside those buckets.
- It rewrites each bind site into an internal binder helper that writes into the appropriate slot while the matcher runs.
- It builds a Capture parser whose constructor receives the filled buckets and evaluates your => result expression.
Conceptually, this:
capture!(
    (
        bind!(identifier(), name),
        optional((':', bind!(ty_parser(), ?ty))),
    ) => Declaration { name, ty }
)

turns into something closer to:

Capture::new(
    |single_props, multiple_props, optional_props| {
        (
            bind_result(identifier(), single_props.name_slot),
            optional((':', bind_result(ty_parser(), optional_props.ty_slot))),
        )
    },
    |single_values, _multiple_values, optional_values| {
        let name = /* read required single slot */;
        let ty = /* read optional slot */;
        Declaration { name, ty }
    },
)

That is not the literal emitted code, but it is the right mental model:
- the grammar becomes a matcher tree
- binds become writes into typed slots
- the result expression becomes a constructor over those slots
Internally, the buckets are tuple-shaped match results split into
(single, multiple, optional), and snapshots of the same layout are what make
use_binds! work inside diagnostic factories.
Why this matters:
- repeated compatible binds can be merged into one slot
- incompatible sigils / explicit types can be rejected at macro time
- backtracking can subtract captures from the current result when a branch is abandoned
- use_binds! can read a stable snapshot of earlier captures without changing the parser result model
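The backtracking point can be pictured with a small standalone model. This is a sketch in plain Rust, not marser's internals: `Captures`, `Mark`, `mark`, and `rollback` are hypothetical names, and the real buckets are typed per binding rather than fixed to `&str`.

```rust
/// Hypothetical stand-in for the three capture buckets (not marser's real types).
#[derive(Debug, Default, PartialEq)]
struct Captures<'src> {
    single: Option<&'src str>,   // plain `name`
    multiple: Vec<&'src str>,    // `*name`
    optional: Option<&'src str>, // `?name`
}

/// A cheap record of how much had been captured at some point.
#[derive(Clone, Copy)]
struct Mark {
    multiple_len: usize,
    single_set: bool,
    optional_set: bool,
}

impl<'src> Captures<'src> {
    fn mark(&self) -> Mark {
        Mark {
            multiple_len: self.multiple.len(),
            single_set: self.single.is_some(),
            optional_set: self.optional.is_some(),
        }
    }

    /// Undo every write made after `mark` was taken.
    fn rollback(&mut self, mark: Mark) {
        self.multiple.truncate(mark.multiple_len);
        if !mark.single_set {
            self.single = None;
        }
        if !mark.optional_set {
            self.optional = None;
        }
    }
}

/// Simulates a branch that binds two values and is then abandoned.
fn abandoned_branch_demo() -> Captures<'static> {
    let mut caps = Captures::default();
    caps.multiple.push("kept");

    let mark = caps.mark();
    caps.multiple.push("discarded");
    caps.optional = Some("discarded too");
    caps.rollback(mark); // the branch failed, so its captures are subtracted

    caps
}
```

After `abandoned_branch_demo()`, only the capture made before the mark survives, which is the property a backtracking parser needs from its capture storage.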
§Start small
The smallest useful capture binds one parser result and returns it:
use marser::capture;
use marser::parser::Parser;

let p = capture!(bind!('x', ch) => ch);
assert_eq!(p.parse_str("x").unwrap().0, 'x');

Here:
- 'x' is a parser over char input.
- bind!('x', ch) runs that parser and stores its output in ch.
- => ch returns that captured value.
If the input starts with x, this parser returns Some('x'). If it does not,
the parser returns None.
§Parser arguments vs matcher arguments
The bind macros accept different kinds of inner expressions:
- bind!(parser, name) needs a parser, because it stores parser output.
- bind_span!(matcher, span) needs a matcher, because it stores where a matcher succeeded.
- bind_slice!(matcher, text) needs a matcher, because it stores the source slice consumed by a matcher.
Use bind! for meaning, bind_span! for location, and bind_slice! for source
text.
§A progressive example
Suppose a small language has names, comma-separated name lists, and optional type annotations:
name
name, other, third
name: Type
§Step 1: capture source text
Start with a name parser that returns the original source text:
use marser::capture;
use marser::matcher::multiple::many;
use marser::one_of::one_of;
use marser::parser::Parser;

fn name_parser<'src>() -> impl Parser<'src, &'src str, Output = &'src str> + Clone {
    capture!(
        bind_slice!(
            (
                one_of(('a'..='z', 'A'..='Z', '_')),
                many(one_of(('a'..='z', 'A'..='Z', '0'..='9', '_'))),
            ),
            text
        ) => text
    )
}

assert_eq!(name_parser().parse_str("foo12").unwrap().0, "foo12");

The matcher recognizes the spelling of a name. bind_slice! returns the slice of
input that was consumed. This avoids allocating a new String.
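To see what "returns the slice that was consumed" means without any library machinery, here is a hand-written sketch of the same name rule in plain Rust. `take_name` is a hypothetical helper, not a marser function; it only illustrates that the result borrows from the input instead of copying it.

```rust
/// Hand-written equivalent of the name rule: returns the consumed slice and
/// the remaining input, both borrowed from `input` (no allocation).
fn take_name(input: &str) -> Option<(&str, &str)> {
    let mut chars = input.char_indices();

    // First character: ASCII letter or underscore.
    let (_, first) = chars.next()?;
    if !(first.is_ascii_alphabetic() || first == '_') {
        return None;
    }

    // Following characters: ASCII letters, digits, or underscores.
    // `end` is the byte offset of the first character that does not belong.
    let end = chars
        .find(|&(_, c)| !(c.is_ascii_alphanumeric() || c == '_'))
        .map(|(i, _)| i)
        .unwrap_or(input.len());

    Some(input.split_at(end))
}
```

Both halves of the returned pair point into the original string, so the name costs nothing beyond a pointer and a length.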
§Step 2: add a span
If later diagnostics need to point at the name, capture the span too. Nesting
bind_span! around bind_slice! is a common pattern (the capture! macro
rewrites the inner bind_slice! for you):
let name_node = capture!(
    bind_span!(
        bind_slice!(
            (
                one_of(('a'..='z', 'A'..='Z', '_')),
                many(one_of(('a'..='z', 'A'..='Z', '0'..='9', '_'))),
            ),
            text
        ),
        span
    ) => Name { text, span }
);

bind_span! stores the consumed (start, end) positions. The errors chapter
uses spans to add extra diagnostic labels; see
Errors and Recovery.
§Step 3: collect repeated values
A repeated bind uses *name and produces a Vec<_>:
use marser::capture;
use marser::matcher::multiple::many;
use marser::one_of::one_of;
use marser::parser::Parser;

fn name_parser<'src>() -> impl Parser<'src, &'src str, Output = &'src str> + Clone {
    capture!(
        bind_slice!(
            (
                one_of(('a'..='z', 'A'..='Z', '_')),
                many(one_of(('a'..='z', 'A'..='Z', '0'..='9', '_'))),
            ),
            text
        ) => text
    )
}

fn name_list_parser<'src>() -> impl Parser<'src, &'src str, Output = Vec<&'src str>> + Clone {
    let name = name_parser();
    capture!(
        (
            bind!(name.clone(), *names),
            many((',', bind!(name.clone(), *names))),
        ) => names
    )
}

assert_eq!(
    name_list_parser().parse_str("a,b,c").unwrap().0,
    vec!["a", "b", "c"]
);

Each successful bind!(name.clone(), *names) appends one value to names.
§Step 4: capture optional syntax
An optional bind uses ?name and produces an Option<_>:
use marser::capture;
use marser::matcher::optional::optional;
use marser::parser::{token_parser, Parser};

fn annotated_digit<'src>() -> impl Parser<'src, &'src str, Output = (u32, Option<u32>)> + Clone {
    let digit = token_parser(
        |c: &char| c.is_ascii_digit(),
        |c| c.to_digit(10).unwrap(),
    );
    capture!((
        bind!(digit.clone(), lhs),
        optional((':', bind!(digit.clone(), ?rhs))),
    ) => (lhs, rhs))
}

assert_eq!(annotated_digit().parse_str("1:2").unwrap().0, (1, Some(2)));
assert_eq!(annotated_digit().parse_str("3").unwrap().0, (3, None));

The earlier steps used slices for names; here a tiny digit grammar keeps the
optional-bind idea obvious: if the ':' branch is present, rhs is Some(...).
If it is absent, rhs is None.
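For comparison, the same rule can be written by hand in plain Rust. This is only a sketch: `digit` and `annotated_digit` below are standalone functions, not marser parsers, and how marser treats a half-matched optional branch such as "4:x" is an assumption here (the sketch simply backtracks and leaves the input untouched).

```rust
/// Reads one ASCII digit from the front of `input`.
fn digit(input: &str) -> Option<(u32, &str)> {
    let c = input.chars().next()?;
    let value = c.to_digit(10)?;
    Some((value, &input[c.len_utf8()..]))
}

/// Hand-written equivalent of `digit (':' digit)?`.
/// Returns the output and the remaining input.
fn annotated_digit(input: &str) -> Option<((u32, Option<u32>), &str)> {
    // Required part: always runs, so `lhs` is a plain value.
    let (lhs, rest) = digit(input)?;

    // Optional part: if it fails, `rest` is left as it was and `rhs` stays None.
    let branch = rest.strip_prefix(':').and_then(digit);

    match branch {
        Some((rhs, rest)) => Some(((lhs, Some(rhs)), rest)),
        None => Some(((lhs, None), rest)),
    }
}
```

The `Option<u32>` in the output type plays the role of the `?rhs` target: the function can succeed without the branch ever producing a value.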
§The three bind macros
§bind!
bind!(parser, name) runs a parser and stores its output.
Use it when you need the parsed meaning of a grammar part:
capture!(
    (bind!(name.clone(), key), ':', bind!(expr.clone(), value))
        => Pair { key, value }
)

Use bind! when the inner parser already produces the value you want: an AST
node, a normalized number, an enum variant, or any other semantic value.
§bind_span!
bind_span!(matcher, span) runs a matcher and stores only the consumed span:
use marser::capture;
use marser::parser::Parser;

fn at_sign_span<'src>() -> impl Parser<'src, &'src str, Output = (usize, usize)> + Clone {
    capture!(bind_span!('@', at_sign) => at_sign)
}

assert_eq!(at_sign_span().parse_str("@").unwrap().0, (0, 1));

Use it when diagnostics or later AST nodes need a source location, but not the matched text.
§bind_slice!
bind_slice!(matcher, text) runs a matcher and stores the input slice covered by
that matcher:
use marser::capture;
use marser::matcher::one_or_more::one_or_more;
use marser::parser::Parser;

fn digits_slice<'src>() -> impl Parser<'src, &'src str, Output = &'src str> + Clone {
    capture!(bind_slice!(one_or_more('0'..='9'), digits) => digits)
}

assert_eq!(digits_slice().parse_str("4096").unwrap().0, "4096");

Use it when exact source spelling matters: identifiers, number literals, invalid fragments, comments, and lossless syntax trees.
bind_slice! captures raw text, not semantic meaning. For example, a string
literal slice still contains escape syntax, and a number slice still needs to be
validated or converted if later code needs a number.
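The gap between raw text and meaning can be made concrete with two small conversions in plain Rust. Both helpers are hypothetical, and the escape set handled by `decode_escapes` (`\n`, `\"`, `\\`) is an assumption chosen only for illustration.

```rust
/// Converts a raw number slice (as a slice bind would hand it over) into a value.
fn number_value(raw: &str) -> Result<f64, std::num::ParseFloatError> {
    raw.parse::<f64>()
}

/// Decodes a tiny, assumed escape set from the text between the quotes of a
/// string literal. Returns None on an unknown or dangling escape.
fn decode_escapes(raw: &str) -> Option<String> {
    let mut out = String::with_capacity(raw.len());
    let mut chars = raw.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // A backslash must be followed by a known escape character.
        match chars.next()? {
            'n' => out.push('\n'),
            '"' => out.push('"'),
            '\\' => out.push('\\'),
            _ => return None,
        }
    }
    Some(out)
}
```

A grammar can keep the borrowed slice in its output and run a conversion like this later, or call it directly in the `=>` expression when the owned value is what the AST needs.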
§Bind shapes
Every bind target has one of three shapes:
bind!(parser, name)   // exactly one; result sees name: T
bind!(parser, *names) // zero or more; result sees names: Vec<T>
bind!(parser, ?name)  // zero or one; result sees name: Option<T>

The same shapes work with bind_span!, bind_slice!, and the span target in
bind!(parser, value, span).
Choose the shape based on how many times that bind can run on a successful parse path:
- Use plain name when the grammar must execute the bind exactly once.
- Use *name when the bind appears in repeated grammar or should collect many occurrences.
- Use ?name when the grammar may succeed without executing the bind.
§A fast decision rule
When you are choosing a bind form, ask:
- Do I want the semantic output of a parser? Use bind!.
- Do I only want the location of syntax? Use bind_span!.
- Do I want the exact source text that matched? Use bind_slice!.
Then ask how many times that bind can run on a successful path:
- exactly once -> name
- zero or more times -> *name
- zero or one time -> ?name
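The second question is mechanical enough to write down as a lookup. The following is purely an illustration in plain Rust; `Shape`, `choose_shape`, and `result_type` are hypothetical names and nothing like them is claimed to exist in marser.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Shape {
    Single,   // plain `name`
    Repeated, // `*name`
    Optional, // `?name`
}

/// Picks a shape from how often a bind can run on a successful parse path.
/// `max_runs == None` means "no upper bound".
fn choose_shape(min_runs: usize, max_runs: Option<usize>) -> Shape {
    match (min_runs, max_runs) {
        (1, Some(1)) => Shape::Single,
        (0, Some(1)) => Shape::Optional,
        _ => Shape::Repeated,
    }
}

/// The type the result expression sees for a bind whose inner output is `inner`.
fn result_type(shape: Shape, inner: &str) -> String {
    match shape {
        Shape::Single => inner.to_string(),
        Shape::Repeated => format!("Vec<{inner}>"),
        Shape::Optional => format!("Option<{inner}>"),
    }
}
```

For example, a bind inside `many(...)` can run zero or more times, so `choose_shape(0, None)` gives `Shape::Repeated` and the result expression sees a `Vec`.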
§Bind placement rules
capture! stores values in generated slots:
- a plain bind writes one required slot
- an optional bind writes one optional slot
- a repeated bind appends to a vector
That means bind shape must follow control flow.
capture! is already defensive about some mistakes. Several invalid forms are rejected at macro expansion time, not later at runtime. The compile-fail tests under tests/ui/ cover examples such as:
- mixing incompatible sigils for the same binding name (x vs *x vs ?x)
- giving the same binding conflicting explicit as types
- using the same identifier for both the value and span targets in bind!(..., value, span)
- passing extra trailing arguments to bind!
The remaining mistakes to watch for are the ones where the syntax is valid but the chosen bind shape does not match the grammar path.
Good repeated bind:
use marser::capture;
use marser::matcher::multiple::many;
use marser::parser::{token_parser, Parser};

fn many_digits<'src>() -> impl Parser<'src, &'src str, Output = Vec<u32>> + Clone {
    let digit = token_parser(
        |c: &char| c.is_ascii_digit(),
        |c| c.to_digit(10).unwrap(),
    );
    capture!(many(bind!(digit, *digits)) => digits)
}

let _ = many_digits();

Bad repeated bind (shape mismatch: many can run the bind multiple times, but
the target is a single value, not *digits):

capture!(
    many(bind!(digit, digit)) => digit
)

This kind of shape mismatch is logically wrong because the grammar can execute the bind many times while the target only has room for one value.
Good optional bind:
use marser::capture;
use marser::matcher::optional::optional;
use marser::parser::{token_parser, Parser};

fn optional_sign<'src>() -> impl Parser<'src, &'src str, Output = Option<char>> + Clone {
    let sign_parser = token_parser(
        |c: &char| *c == '+' || *c == '-',
        |c| *c,
    );
    capture!(optional(bind!(sign_parser, ?sign)) => sign)
}

assert_eq!(optional_sign().parse_str("-").unwrap().0, Some('-'));
assert_eq!(optional_sign().parse_str("").unwrap().0, None);

Bad optional bind (shape mismatch: optional may skip the bind, but sign is
not optional):

capture!(
    optional(bind!(sign_parser, sign)) => sign
)

This kind of shape mismatch is logically wrong because the grammar can succeed
without ever assigning the required sign slot.
When a shape mismatch is not rejected at macro time, the failure usually shows up later during parsing or output construction:
- a single or optional slot was set more than once
- a required single slot was never set before output construction
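Those two failure modes follow directly from what a required slot is. The sketch below models one in plain Rust. `SingleSlot` and `SlotError` are hypothetical, and whether marser reports such a failure as an error value or a panic is not stated in this chapter, so the `Result` here is just one way to make the failure visible.

```rust
#[derive(Debug, PartialEq)]
enum SlotError {
    SetTwice,
    NeverSet,
}

/// Hypothetical model of a required (plain `name`) slot.
#[derive(Debug)]
struct SingleSlot<T> {
    value: Option<T>,
}

impl<T> SingleSlot<T> {
    fn new() -> Self {
        Self { value: None }
    }

    /// Called each time the bind runs during matching.
    fn set(&mut self, value: T) -> Result<(), SlotError> {
        if self.value.is_some() {
            return Err(SlotError::SetTwice);
        }
        self.value = Some(value);
        Ok(())
    }

    /// Called once, when the output expression is built.
    fn finish(self) -> Result<T, SlotError> {
        self.value.ok_or(SlotError::NeverSet)
    }
}

/// Models a plain bind placed inside repetition: one `set` per matched digit.
fn plain_bind_in_many(digits: &[u32]) -> Result<u32, SlotError> {
    let mut slot = SingleSlot::new();
    for &d in digits {
        slot.set(d)?;
    }
    slot.finish()
}
```

The model only behaves when the repetition happens to run exactly once, which is why a plain target inside `many(...)` or `optional(...)` is a latent bug even if a particular test input passes.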
§Binds inside choices
one_of(...) tries alternatives from left to right. Be careful when alternatives
bind different required names.
This is usually wrong:
capture!(
    one_of((
        bind!(string_parser, string),
        bind!(number_parser, number),
    )) => Value::from_parts(string, number)
)

Only one branch runs, so the other required bind is unset.
As a rule of thumb: if branches mean different semantic cases, do not try to
share several unrelated required bind names across one outer capture!.
Instead, make each branch a parser that builds its own output, then choose between those parsers:
let string_value = capture!(bind!(string_parser, value) => Value::String(value));
let number_value = capture!(bind!(number_parser, value) => Value::Number(value));
let value = one_of((string_value, number_value));

If alternatives are different spellings of the same concept, bind the same name and shape from each branch only when that shape is valid for the result.
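The branch-local pattern has a direct analogue in plain Rust, shown here as a sketch. The `Value` enum and the three functions are hypothetical stand-ins, and the string rule (double quotes, no escapes) is simplified on purpose.

```rust
#[derive(Debug, PartialEq)]
enum Value<'src> {
    String(&'src str),
    Number(u32),
}

/// Branch 1: a double-quoted string without escapes; builds its own `Value`.
fn string_value(input: &str) -> Option<Value<'_>> {
    let inner = input.strip_prefix('"')?.strip_suffix('"')?;
    Some(Value::String(inner))
}

/// Branch 2: an unsigned decimal number; also builds its own `Value`.
fn parse_number(input: &str) -> Option<Value<'_>> {
    input.parse::<u32>().ok().map(Value::Number)
}

/// The choice: try branches left to right. No branch needs to know which
/// names the other branch would have bound.
fn value(input: &str) -> Option<Value<'_>> {
    string_value(input).or_else(|| parse_number(input))
}
```

Because each branch returns a finished `Value`, there is no required slot left unset when the other branch is the one that matches.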
§Value plus span
bind! can capture parser output and the span consumed by that parser:
bind!(parser, value, span)
bind!(parser, *values, *spans)
bind!(parser, ?value, ?span)

Use this when you need both parsed meaning and source location:
use marser::capture;
use marser::parser::{token_parser, Parser};

#[derive(Debug, PartialEq)]
struct IdentNode {
    ident: char,
    span: (usize, usize),
}

fn ident_with_span<'src>() -> impl Parser<'src, &'src str, Output = IdentNode> + Clone {
    let identifier_parser = token_parser(
        |c: &char| c.is_ascii_lowercase(),
        |c| *c,
    );
    capture!(
        bind!(identifier_parser, ident, ident_span) => IdentNode {
            ident,
            span: ident_span,
        }
    )
}

assert_eq!(
    ident_with_span().parse_str("x").unwrap().0,
    IdentNode {
        ident: 'x',
        span: (0, 1),
    }
);

Keep the value and span shapes the same unless you have a specific reason not to.
If the parser can run many times, both values and spans usually belong in
vectors. If the parser is optional, both usually belong in Option.
§Typed bind targets
Bind targets can include an explicit type when inference needs help:
bind!(digit, *digits as char)
bind!(maybe_sign, ?sign as char)
bind_slice!(number_matcher, text as &'src str)

The sigil still controls the outer shape:
- name as T gives T
- *name as T gives Vec<T>
- ?name as T gives Option<T>
Use explicit types sparingly. They are most useful when Rust cannot infer a closure output, slice type, or repeated capture type.
§use_binds! in diagnostic factories
Most of this chapter is about building the result of a parser, but the same captured values can also help build diagnostics.
use_binds!(|ctx| { ... }) is meant for inline-error factories such as
err_if_no_match(...) and err_if_matched(...). It gives the factory access to
the binds that were already captured earlier in the same capture!, plus a
diagnostic context value.
That is useful when an error message should point back to syntax you already
matched. For example, after reading an opening parenthesis you may want a
“missing closing parenthesis” error that also highlights where the opening (
appeared.
Shape sketch:
capture!(
    (
        bind_span!('(', open_paren_span),
        /* ... more grammar ... */
        ')'.err_if_no_match(use_binds!(|ctx| {
            InlineError::new("missing closing parenthesis")
                .with_span(Some(ctx.span()))
                .with_annotation(
                    open_paren_span.copied().unwrap(),
                    "opened here",
                    AnnotationKind::Context,
                )
        }))
    ) => ...
)

Things to remember:
- use_binds! is for diagnostic builders, not normal parser output.
- It only makes sense inside the grammar of capture! (not in the => result expression), where bind snapshots exist.
- The macro expands each site to a __UseBindsSite::<N> type that implements BuildInlineError with the same 'snap snapshot model as SnapshotFactory (hand-written factories can use SnapshotFactory directly).
- It reads the captures that were already established on the successful path up to that point in the grammar.
- The ctx argument gives you the current diagnostic span / insertion point, while the captured names let you refer back to earlier syntax.
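The idea of an error that points back at earlier syntax does not depend on the macro. Here is a minimal sketch in plain Rust under stated assumptions: `Diagnostic` and `check_parens` are hypothetical, marser's `InlineError` and `ctx` are not modeled, and the check only looks for any `)` after the first `(`.

```rust
/// Hypothetical diagnostic type; marser's `InlineError` is not modeled here.
#[derive(Debug, PartialEq)]
struct Diagnostic {
    message: &'static str,
    /// Where the problem was noticed (plays the role of `ctx.span()`).
    span: (usize, usize),
    /// Earlier syntax to point back at, with a label.
    annotation: ((usize, usize), &'static str),
}

/// Checks `( ... )` and reports a missing `)` together with the `(` position.
fn check_parens(input: &str) -> Result<(), Diagnostic> {
    // Earlier capture: the span of the opening parenthesis.
    let open = match input.find('(') {
        Some(i) => (i, i + 1),
        None => return Ok(()),
    };

    if input[open.1..].contains(')') {
        return Ok(());
    }

    // Error construction: current position plus the earlier captured span.
    Err(Diagnostic {
        message: "missing closing parenthesis",
        span: (input.len(), input.len()),
        annotation: (open, "opened here"),
    })
}
```

The `open` variable stands in for the snapshot that `use_binds!` exposes: it was captured on the way to the failure point and is read only to decorate the error.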
If you want the full diagnostic story, continue with Errors and Recovery.
§Zero-copy parsing with bind_slice!
bind_slice! is the zero-copy bind form. Instead of building a new String or
Vec from matched tokens, it stores a borrowed view into the original input.
That is useful for performance:
- fewer allocations
- less copying
- exact source spelling is preserved
It is also useful for tooling. Formatters, diagnostics, and lossless syntax trees often need original text, not normalized values.
Example:
use marser::capture;
use marser::matcher::{one_or_more::one_or_more, optional::optional};
use marser::parser::Parser;

enum NumberLiteral<'a> {
    Raw(&'a str),
}

fn raw_number<'src>() -> impl Parser<'src, &'src str, Output = NumberLiteral<'src>> + Clone {
    capture!(
        bind_slice!(
            (
                optional('-'),
                one_or_more('0'..='9'),
                optional(('.', one_or_more('0'..='9'))),
            ),
            number_text
        ) => NumberLiteral::Raw(number_text)
    )
}

assert!(matches!(
    raw_number().parse_str("-12.34").unwrap().0,
    NumberLiteral::Raw("-12.34")
));

The trade-off is lifetime coupling. The output borrows from the input, so it cannot outlive the source text. If you need owned data, convert the slice at the boundary where ownership is required:
use marser::capture;
use marser::matcher::multiple::many;
use marser::one_of::one_of;
use marser::parser::Parser;

fn ident_to_string<'src>() -> impl Parser<'src, &'src str, Output = String> + Clone {
    let identifier_matcher = many(one_of(('a'..='z', 'A'..='Z', '0'..='9', '_')));
    capture!(bind_slice!(identifier_matcher, text) => String::from(text))
}

assert_eq!(ident_to_string().parse_str("abc").unwrap().0, "abc");
§Bind form matrix
Use this as a compact reference:
bind!(parser, value)
    captures: parser output
    result:   value: T
    use when: exactly one semantic value is required

bind!(parser, *values)
    captures: parser outputs
    result:   values: Vec<T>
    use when: repeated grammar collects many values

bind!(parser, ?value)
    captures: parser output if present
    result:   value: Option<T>
    use when: optional grammar may not run

bind!(parser, value, span)
    captures: parser output and source span
    result:   value: T, span: (Pos, Pos)
    use when: semantic value also needs a diagnostic/source location

bind_span!(matcher, span)
    captures: source span
    result:   span: (Pos, Pos)
    use when: only the location matters

bind_slice!(matcher, text)
    captures: source slice
    result:   text: Inp::Slice
    use when: exact source text should be borrowed

Add * or ? to bind_span! and bind_slice! targets the same way as
bind!: *spans becomes Vec<(Pos, Pos)>, and ?text becomes
Option<Inp::Slice>.
§Common mistakes
§Plain bind inside repetition
Use *items, not item, inside many(...).
§Plain bind inside optional grammar
Use ?item, not item, inside optional(...) unless another part of the
grammar guarantees the bind runs.
§Different required binds in one_of(...)
If only one branch runs, required binds from the other branches are unset. Prefer
branch-local capture! parsers that each produce the same output type.
§Assuming every bind mistake becomes a runtime bug
Some mistakes are caught earlier. capture! already rejects a number of invalid
bind forms during macro expansion, and the trybuild tests in tests/ui/
exercise examples such as incompatible sigils, conflicting explicit types,
duplicate value/span names, and trailing bind! arguments.
§Using bind_slice! when owned data is needed
bind_slice! borrows from input. If the parsed value must outlive the source,
convert to an owned value at the boundary.
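The boundary can be as small as one function. The sketch below uses plain Rust with hypothetical helper names; `first_word` stands in for any parser whose output borrows from its input.

```rust
/// Borrowed result: the returned slice cannot outlive `source`.
fn first_word(source: &str) -> &str {
    source.split_whitespace().next().unwrap_or("")
}

/// Owned result: converts at the boundary, so the source may be dropped.
fn first_word_owned(source: &str) -> String {
    first_word(source).to_owned()
}

/// The source buffer lives only inside this function, so the key must be owned
/// to be returned. Returning `first_word(&source)` here would not compile.
fn read_config_key() -> String {
    let source = String::from("timeout = 30"); // pretend this buffer is short-lived
    let key = first_word_owned(&source);
    drop(source);
    key
}
```

Keeping the parser itself zero-copy and converting only in a wrapper like `first_word_owned` means callers that can hold on to the source text never pay for the allocation.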
§Treating raw slices as normalized values
A slice preserves spelling. It does not decode escapes, validate a number, or intern an identifier by itself.
§Forgetting that * always means a vector
*items is Vec<T> even if it matched exactly once.
§Designing captures
Good capture design keeps parser rules predictable:
- Match structure with matchers and capture only what the AST or diagnostic layer needs.
- Let bind shape follow control flow: repeated grammar gets *, optional grammar gets ?, and required grammar gets a plain bind.
- Prefer branch-local capture! parsers for alternatives that produce different output shapes.
- Prefer bind_slice! for source text you can borrow, especially identifiers, literals, invalid fragments, and lossless parsing.
- Prefer bind_span! when diagnostics only need a location.
- Parse into owned or normalized values only when later code benefits from that representation.
A useful workflow when designing a new capture!:
- Write the matcher grammar first.
- Mark each interesting piece as one, many, or optional.
- Pick bind!, bind_span!, or bind_slice! based on whether you need meaning, location, or source text.
- Only then write the output expression.
That order tends to prevent most bind-shape mistakes before the compiler or tests need to point them out.
For lower-level implementation details, see the Capture, ResultBinder,
SpanBinder, and SliceBinder entries in
Parser and Matcher Reference.
§Quick reference (bind shapes)
bind!(parser, value) // value: T
bind!(parser, *values) // values: Vec<T>
bind!(parser, ?value) // value: Option<T>
bind!(parser, value, span) // value: T, span: (Pos, Pos)
bind!(parser, *values, *spans) // values: Vec<T>, spans: Vec<(Pos, Pos)>
bind!(parser, ?value, ?span) // value: Option<T>, span: Option<(Pos, Pos)>
bind_span!(matcher, span) // span: (Pos, Pos)
bind_span!(matcher, *spans) // spans: Vec<(Pos, Pos)>
bind_span!(matcher, ?span) // span: Option<(Pos, Pos)>
bind_slice!(matcher, text) // text: Inp::Slice
bind_slice!(matcher, *texts) // texts: Vec<Inp::Slice>
bind_slice!(matcher, ?text) // text: Option<Inp::Slice>