marser 0.1.1

Parser combinator toolkit with matcher-level backtracking and rich error reporting.
API reference for parsers and matchers.

<div style="background-color: #fff8e1; border-left: 4px solid #f9a825; padding: 0.75em 1em; margin: 1em 0;">

**AI assistance:** This chapter was drafted with AI assistance while the library is still young. The guide is expected to improve over time as APIs and examples stabilize. If anything looks wrong or confusing, please [report it on GitHub](https://github.com/ArneCode/marser/issues/new).

</div>

# Parser and Matcher Reference

This chapter is a compact map of the parser and matcher building blocks in
`marser`.

If you are still learning how `capture!`, `bind!`, `bind_span!`, and
`bind_slice!` work, read [Capture and Binds](crate::guide::capture_and_binds)
first. This page is meant as a reference once you know the basic grammar-writing
model. For recipe-style snippets (whitespace, lists, recursion), see
[Common patterns](crate::guide::common_patterns).

## Parser vs matcher

`marser` separates parsing into two layers:

- A **matcher** checks that an input shape is present. It may consume input, look
  ahead, record diagnostics, or bind values into a `capture!`.
- A **parser** produces a Rust value. Parsers can be primitive token parsers,
  `capture!` parsers, choices, recursive parsers, or wrappers around other
  parsers.

In day-to-day grammar code:

- Use matchers to describe the syntax shape.
- Use parsers when a rule should produce a typed value.
- Use `capture!` to run matcher logic and build parser output.

## Core parser traits

### `Parser`

`Parser` is implemented by values that can parse input and produce an `Output`.
You normally do not implement it yourself; compose the parser types in this crate
instead.

A parser has three important outcomes:

- `Some(output)` means it matched and produced a value.
- `None` means it did not match at the current input position.
- `Err(FurthestFailError)` means a committed parse failed and should be reported.
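The three outcomes map naturally onto a `Result<Option<T>, E>`. The following is a minimal standalone sketch using only the standard library; `HardError` and `parse_bool` are hypothetical names chosen for illustration and are not `marser` types or signatures.

```rust
/// Hypothetical hard-failure type for this sketch; not marser's `FurthestFailError`.
#[derive(Debug, PartialEq)]
struct HardError {
    /// Byte offset of the first character that did not match.
    pos: usize,
    message: &'static str,
}

/// Toy boolean parser. On success it returns the value and the bytes consumed.
/// Once the first letter of a keyword has matched, a mismatch is a hard error.
fn parse_bool(input: &str) -> Result<Option<(bool, usize)>, HardError> {
    for (word, value) in [("true", true), ("false", false)] {
        if input.starts_with(word) {
            return Ok(Some((value, word.len()))); // matched and produced a value
        }
        // Length of the common prefix between the input and the keyword.
        let common = input
            .chars()
            .zip(word.chars())
            .take_while(|(a, b)| a == b)
            .count();
        if common > 0 {
            // Committed: the input clearly started a keyword, so report it.
            return Err(HardError { pos: common, message: "malformed boolean literal" });
        }
    }
    Ok(None) // ordinary absence: the caller may try another rule
}
```

In this model, `parse_bool("42")` is an ordinary absence that lets a caller try another rule, while `parse_bool("trie")` is reported as an error at offset 2.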

### `ParserCombinator`

Every parser gets these extension methods:

- `memoized()` caches parse results at each input position.
- `recover_with(recovery_parser)` turns a hard failure into fallback output when
  recovery succeeds.
- `add_error_info(error_parser)` enriches hard errors with notes, help, or extra
  labels.
- `ignore_result()` runs the parser as a matcher and discards the output.
- `map_output(f)` maps successful output to another value.
- `erase_types()` boxes a parser behind a stable erased type.

## Parser building blocks

### `Capture`

`Capture` is the parser behind `capture!`. It runs a matcher, collects bound
values, spans, or slices, and calls a constructor to produce output.

Use `capture!` in normal code:

```rust,ignore
capture!(
    ('"', bind_slice!(many(non_quote_char), text), '"') => text
)
```

See [Capture and Binds](crate::guide::capture_and_binds) for the full guide.

### `TokenParser` and `token_parser`

`token_parser(check_fn, parse_fn)` reads one token, checks it with `check_fn`, and
maps it with `parse_fn`. If the check fails, it rewinds and returns `None`.

```rust
use marser::parser::{token_parser, Parser};

fn digit_parser<'src>() -> impl Parser<'src, &'src str, Output = u32> + Clone {
    token_parser(
        |c: &char| c.is_ascii_digit(),
        |c| c.to_digit(10).unwrap(),
    )
}

assert_eq!(digit_parser().parse_str("7").unwrap().0, 7);
```

Use this when one token should produce a transformed value.

### `SingleTokenParser`

`SingleTokenParser::new(token)` parses exactly one token equal to `token` and
returns that token. Character literals also implement `Parser` for `char` input,
so `'{'` can be used directly as a parser.

Use it for exact one-token parser rules.

### `RangeParser`, `Range`, and `RangeInclusive`

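`RangeParser::new(range)` parses one token contained in a range. Rust ranges such
as `'0'..'9'` and `'0'..='9'` can also be parsers for compatible token types.
Under Rust's range semantics the half-open form `'0'..'9'` excludes `'9'`, so
use `'0'..='9'` when all ten digits should match.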

Use ranges for compact token classes such as digits or ASCII letters.
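The difference between the two range forms is easy to overlook. This small check uses only the standard library's `contains` method on ranges and does not involve `marser`; the helper names are hypothetical.

```rust
/// True when `c` lies in the half-open range `'0'..'9'` (the end is excluded).
fn in_half_open(c: char) -> bool {
    ('0'..'9').contains(&c)
}

/// True when `c` lies in the inclusive range `'0'..='9'`.
fn in_inclusive(c: char) -> bool {
    ('0'..='9').contains(&c)
}
```

Here `in_half_open('9')` is `false` while `in_inclusive('9')` is `true`, which is why the inclusive form is the usual choice for a digit class.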

### `MultipleParser`

`MultipleParser::new(parser, combine_fn)` repeatedly runs a parser until it no
longer matches, collects the outputs into a `Vec`, and maps that vector with
`combine_fn`.

Use it for parser-level repetition where the repeated parser output should be
combined outside the matcher layer. Inside `capture!`, matcher-level `many(...)`
is usually more natural.

### `OutputMapper`

`parser.map_output(f)` preserves the matching behavior of the parser and maps
only successful output.

Use it when the parse shape is already right but the output needs a small
conversion.

### `ErrorRecoverer`

`parser.recover_with(recovery_parser)` handles hard failures by rewinding to the
start position and trying the recovery parser. If recovery succeeds, `marser`
records the original error as a collected diagnostic and returns the recovery
output.

Use it when malformed input can be represented explicitly, such as
`Invalid(slice)` or `ErrorNode`.
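The recovery flow described above can be sketched without the library. This is a simplified standalone model, not `marser`'s implementation: `Item`, `parse_number`, and `parse_with_recovery` are hypothetical, and the recovery step here always succeeds by wrapping the raw text.

```rust
/// Hypothetical output type with an explicit variant for malformed input.
#[derive(Debug, PartialEq)]
enum Item {
    Number(u32),
    Invalid(String),
}

/// Primary parser: fails hard on anything that is not a decimal `u32`.
fn parse_number(input: &str) -> Result<Item, String> {
    input
        .parse::<u32>()
        .map(Item::Number)
        .map_err(|err| format!("bad number {input:?}: {err}"))
}

/// On a hard failure, re-read the same input from its start with a fallback,
/// keep the original error as a collected diagnostic, and return the fallback.
fn parse_with_recovery(input: &str, diagnostics: &mut Vec<String>) -> Item {
    match parse_number(input) {
        Ok(item) => item,
        Err(err) => {
            diagnostics.push(err);
            Item::Invalid(input.to_string())
        }
    }
}
```

The caller still receives a value for every input, and the diagnostics list records what went wrong along the way.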

### `Memoized`

`parser.memoized()` caches success or absence for a parser at an input position.
Successful memoized outputs are returned as `Rc<Output>`.

Use it for expensive or recursive rules that may be reached repeatedly from the
same position.
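To illustrate why memoized output is shared through `Rc`, here is a minimal standalone memo table keyed by input position. It is an assumption-laden sketch rather than `marser`'s `Memoized` type: `Memo`, `parse_at`, and the `calls` counter are hypothetical.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Toy memo table: stores absence (`None`) or a shared output per start position.
struct Memo<T> {
    cache: HashMap<usize, Option<Rc<T>>>,
    /// Number of times the underlying parser actually ran.
    calls: usize,
}

impl<T> Memo<T> {
    fn new() -> Self {
        Memo { cache: HashMap::new(), calls: 0 }
    }

    /// Runs `parse` at `pos` at most once; later calls reuse the cached result.
    fn parse_at(&mut self, pos: usize, parse: impl Fn(usize) -> Option<T>) -> Option<Rc<T>> {
        if let Some(hit) = self.cache.get(&pos) {
            return hit.clone(); // cloning an `Rc` is cheap and needs no `T: Clone`
        }
        self.calls += 1;
        let result = parse(pos).map(Rc::new);
        self.cache.insert(pos, result.clone());
        result
    }
}
```

Because the cached value may be handed out many times, sharing it behind `Rc` avoids requiring the output type to be cloneable.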

### `Deferred`, `DeferredWeak`, and `recursive`

`recursive(...)` creates a parser that can refer to itself while it is being
built.

```rust,ignore
let value = recursive(|value| {
    one_of((object(value.clone()), array(value), string, number))
});
```

Use it for nested languages such as JSON values, parenthesized expressions, and
blocks.

### `Erased`

`parser.erase_types()` stores the parser behind a boxed trait object. This can
make large combinator types easier to name at the cost of dynamic dispatch.

Use it when parser types become too large or unwieldy.

### `OneOf`

`one_of((a, b, c))` is ordered choice. As a parser, every branch must produce the
same output type. Alternatives are tried left to right.

```rust,ignore
one_of((object, array, string, number, boolean, null))
```

Use it for grammar alternatives.
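Because the choice is ordered, the position of an alternative can change the result. The sketch below models ordered choice with plain function pointers; `ToyParser`, `first_match`, `eq`, and `eq_eq` are hypothetical helpers and do not reflect how `marser` implements `one_of`.

```rust
/// Toy parser: given the remaining input, return an output and the rest, or `None`.
type ToyParser<T> = fn(&str) -> Option<(T, &str)>;

/// Ordered choice: try each alternative left to right and keep the first match.
fn first_match<'a, T>(alternatives: &[ToyParser<T>], input: &'a str) -> Option<(T, &'a str)> {
    alternatives.iter().find_map(|parse| parse(input))
}

/// Matches a single `=`.
fn eq(input: &str) -> Option<(&'static str, &str)> {
    input.strip_prefix("=").map(|rest| ("=", rest))
}

/// Matches `==`.
fn eq_eq(input: &str) -> Option<(&'static str, &str)> {
    input.strip_prefix("==").map(|rest| ("==", rest))
}
```

With the alternatives ordered `[eq_eq, eq]`, the input `==1` yields `"=="`; with `[eq, eq_eq]` the shorter branch wins and `=1` is left over. Listing the longer or more specific alternative first is the usual remedy.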

### `Labeled`

`parser.with_label("value")` attaches a display label to a parser. When the
parser fails softly, the label can appear in expected-token diagnostics.

Use labels at user-facing grammar boundaries, such as `"object"`, `"array"`, or
`"string literal"`.

## Core matcher traits

### `Matcher`

`Matcher` is implemented by values that can match input inside a `Capture`. You
normally compose existing matchers rather than implementing it yourself.

A matcher returns `true` for success and `false` for ordinary absence. It may
also return `Err(FurthestFailError)` when a committed match fails.

### `MatcherCombinator`

Every matcher gets these extension methods:

- `add_error_info(error_parser)` enriches hard errors from the matcher.
- `try_insert_if_missing(message)` records a synthetic missing-token error when
  the matcher fails during real error collection.

## Matcher building blocks

### Tuple sequences

A tuple such as `('(', value, ')')` is a sequential matcher. Each element must
match in order.

Use tuples for ordinary grammar sequencing. Tuples are supported up to the arity
implemented by the crate.

### `()`

The unit value is an empty matcher. It always succeeds and consumes nothing.

Use it as a no-op, or as the first part of `commit_on((), matcher)` when you want
to commit immediately.

### `AnyToken`

`AnyToken` consumes one token and succeeds if input remains. It fails at the end
of input.

Use it for catch-all recovery, unknown tokens, or end-of-input checks with
`negative_lookahead(AnyToken)`.

### `StringMatcher`, `&str`, and `char`

`StringMatcher::new(text)` matches a fixed run of `char` tokens. String slices
and character literals also implement `Matcher`, so `"true"` and `'{'` can often
be used directly.

Use these for literal syntax.

### `Range` and `RangeInclusive`

Rust ranges also implement `Matcher` for compatible token streams. They consume
one token when it is inside the range.

```rust
use marser::matcher::one_or_more::one_or_more;

let _digit_run = one_or_more('0'..='9');
```

Use ranges for character or token classes.

### `many`

`many(matcher)` is greedy zero-or-more repetition. It always succeeds, stops when
the inner matcher fails, and also stops if the inner matcher succeeds without
making progress.

Use it for whitespace, comma tails, repeated digits, and similar syntax.

When binding inside `many(...)`, use a repeated bind such as `bind!(parser,
*items)`.
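The progress check is what keeps zero-or-more repetition from looping forever on a matcher that succeeds without consuming input. A minimal standalone model, assuming a matcher is a function from a byte position to an optional new position (`toy_many`, `digit`, and `empty` are hypothetical names):

```rust
/// Greedy zero-or-more repetition over a toy matcher.
/// Always succeeds and returns the position after the last successful match.
fn toy_many(input: &str, mut pos: usize, matcher: impl Fn(&str, usize) -> Option<usize>) -> usize {
    while let Some(next) = matcher(input, pos) {
        if next == pos {
            break; // the matcher succeeded without progress: stop instead of spinning
        }
        pos = next;
    }
    pos
}

/// Matches one ASCII digit at `pos`.
fn digit(input: &str, pos: usize) -> Option<usize> {
    match input.as_bytes().get(pos) {
        Some(b) if b.is_ascii_digit() => Some(pos + 1),
        _ => None,
    }
}

/// Always succeeds and consumes nothing, like an empty matcher.
fn empty(_input: &str, pos: usize) -> Option<usize> {
    Some(pos)
}
```

In this model `toy_many("123a", 0, digit)` stops at position 3, and `toy_many("abc", 0, empty)` terminates immediately at position 0.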

### `one_or_more`

`one_or_more(matcher)` requires at least one successful match, then behaves like
greedy repetition.

Use it when at least one item is required.

### `optional`

`optional(matcher)` tries the inner matcher once and always succeeds.

Use it for syntax that may or may not be present. When binding inside
`optional(...)`, use an optional bind such as `bind!(parser, ?item)`.

### `positive_lookahead`

`positive_lookahead(matcher)` checks that the inner matcher would match at the
current position, then restores the input position.

Use it to make a decision without consuming input.

### `negative_lookahead`

`negative_lookahead(matcher)` succeeds when the inner matcher does not match at
the current position, then restores the input position.

Use it to reject trailing input, stop before a delimiter, or express "not
followed by" constraints.
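Both lookahead forms share one mechanism: run the inner matcher, then restore the saved position. The following standalone sketch models that with a hypothetical `Cursor` type; it is not `marser`'s API, and the method names only mirror the matcher names for readability.

```rust
/// Toy input cursor; `pos` is a byte offset into `input`.
struct Cursor<'a> {
    input: &'a str,
    pos: usize,
}

impl<'a> Cursor<'a> {
    /// Consumes `literal` if the remaining input starts with it.
    fn eat(&mut self, literal: &str) -> bool {
        if self.input[self.pos..].starts_with(literal) {
            self.pos += literal.len();
            true
        } else {
            false
        }
    }

    /// Consumes one character if any input remains.
    fn any_token(&mut self) -> bool {
        match self.input[self.pos..].chars().next() {
            Some(ch) => {
                self.pos += ch.len_utf8();
                true
            }
            None => false,
        }
    }

    /// Reports whether `matcher` would match, then restores the position.
    fn positive_lookahead(&mut self, matcher: impl FnOnce(&mut Self) -> bool) -> bool {
        let start = self.pos;
        let matched = matcher(self);
        self.pos = start;
        matched
    }

    /// Succeeds when `matcher` does not match; the position is restored either way.
    fn negative_lookahead(&mut self, matcher: impl FnOnce(&mut Self) -> bool) -> bool {
        !self.positive_lookahead(matcher)
    }

    /// End-of-input check expressed as "not followed by any token".
    fn at_end(&mut self) -> bool {
        self.negative_lookahead(|cur| cur.any_token())
    }
}
```

The `at_end` helper is the toy counterpart of the `negative_lookahead(AnyToken)` idiom mentioned under `AnyToken`.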

### `ParserMatcher`

`ParserMatcher::new(parser, expected_output)` runs a parser as a matcher and
succeeds only when the parser output equals `expected_output`.

Use it when a parser already recognizes the syntax but a matcher needs to check a
specific parsed value.
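Conceptually this is "parse, then compare". A minimal standalone sketch of that idea, where `parser_matches` is a hypothetical helper and the toy parser is any function returning an optional output:

```rust
/// Runs `parse` on `input` and succeeds only if its output equals `expected`.
fn parser_matches<T: PartialEq>(
    parse: impl Fn(&str) -> Option<T>,
    input: &str,
    expected: &T,
) -> bool {
    parse(input).as_ref() == Some(expected)
}
```

For example, with a parser that reads a `u32`, the check passes for `"42"` against `42` and fails both for `"7"` (wrong value) and for `"x"` (no parse at all).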

### `IgnoreResult`

`parser.ignore_result()` runs a parser as a matcher and succeeds when the parser
returns any output. The output is discarded.

Use it when parser recognition behavior is useful inside a matcher but the output
does not need to be bound.

### `commit_on`

`commit_on(prefix, rest)` first matches `prefix`. If `prefix` succeeds, failure
inside `rest` becomes a hard error instead of ordinary absence.

Use it after the grammar has seen enough input to know which rule the user meant.
For example, after seeing `{`, an object parser should report errors inside the
object rather than silently trying another value parser.
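The effect of committing can be shown with a toy model in which both `prefix` and `rest` are literal strings. This is a simplified illustration under that assumption, not `marser`'s `commit_on`; `Outcome` and `toy_commit_on` are hypothetical.

```rust
/// Hypothetical three-way result for this sketch.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Both parts matched; holds the number of bytes consumed.
    Matched(usize),
    /// The prefix did not match, so another rule may be tried.
    Absent,
    /// The prefix matched but the rest did not; holds the failing offset.
    HardError { at: usize },
}

/// Matches `prefix` then `rest`, committing once `prefix` has matched.
fn toy_commit_on(input: &str, prefix: &str, rest: &str) -> Outcome {
    match input.strip_prefix(prefix) {
        None => Outcome::Absent,
        Some(after_prefix) if after_prefix.starts_with(rest) => {
            Outcome::Matched(prefix.len() + rest.len())
        }
        Some(_) => Outcome::HardError { at: prefix.len() },
    }
}
```

With `prefix = "{"` and `rest = "}"`, the input `[1]` is an ordinary absence, while `{1` is a hard error at offset 1 because the opening brace already identified the rule.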

### `ErrorContextualizer`

`matcher.add_error_info(error_parser)` wraps a matcher so hard failures can be
enriched. The `error_parser` returns a function that mutates the
`FurthestFailError`.

Use it for extra notes, help, or labels that require local context.

The same wrapper is available for parsers through `ParserCombinator`.

### `InsertOnErrorMatcher`

`matcher.try_insert_if_missing(message)` wraps a matcher so that, during real
error collection, a soft failure can be treated as an inserted missing element
and recorded as a `MissingError`.

Use it for diagnostics like "missing closing bracket" where continuing the match
produces better follow-up errors.

### `IfErrorMatcher`

`if_error(matcher)` and `if_error_else_fail(matcher)` only run the inner matcher
when a real error handler is active. Outside error collection they return a fixed
success or failure result.

Use these for grammar pieces that are meaningful only while building diagnostics.

### `UnwantedMatcher`

`unwanted(matcher, message)` succeeds when the inner matcher succeeds, but records
an `UnwantedError` for the consumed span.

Use it to recognize and report explicitly forbidden syntax while still allowing
recovery to continue.
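The idea of "match it, but complain about it" can be sketched as follows. This standalone model assumes the unwanted syntax is a literal string and that diagnostics are collected in a `Vec`; `Unwanted` and `toy_unwanted` are hypothetical and are not `marser`'s `UnwantedError` or `unwanted`.

```rust
use std::ops::Range;

/// Hypothetical diagnostic for syntax that was recognized but is not allowed.
#[derive(Debug, PartialEq)]
struct Unwanted {
    span: Range<usize>,
    message: &'static str,
}

/// Matches `literal` at byte offset `pos`. On success it returns the new
/// position and records a diagnostic covering the consumed span.
fn toy_unwanted(
    input: &str,
    pos: usize,
    literal: &str,
    message: &'static str,
    diagnostics: &mut Vec<Unwanted>,
) -> Option<usize> {
    if input.get(pos..)?.starts_with(literal) {
        let end = pos + literal.len();
        diagnostics.push(Unwanted { span: pos..end, message });
        Some(end)
    } else {
        None // nothing unwanted here, so nothing is recorded
    }
}
```

Applied to the trailing comma in `[1,2,]`, the match succeeds and parsing can continue at the closing bracket, with the comma's span reported separately.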

### `OneOf`

`one_of((a, b, c))` also works as a matcher. It tries alternatives from left to
right and succeeds on the first matching branch.

Use it for alternatives inside `capture!`, such as multiple literal keywords or
valid element forms.

### `Labeled`

`matcher.with_label("label")` attaches a display label to a matcher. The label
can appear in expected-token diagnostics.

Use labels when a grammar name is clearer than a raw literal or range.

## Parser repetition vs matcher repetition

Use matcher repetition when you are still describing syntax inside `capture!`:

```rust
use marser::matcher::multiple::many;

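// A tuple is a sequential matcher, so this repeats the exact sequence
// space, newline, tab. To accept any one of the three characters per
// iteration, wrap them in `one_of((' ', '\n', '\t'))` instead.
let _ws = many((' ', '\n', '\t'));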
```

Use parser repetition when each repeated parse should produce a value that is
combined outside the matcher layer:

```rust
use marser::parser::{token_parser, MultipleParser, Parser};

fn decimal_string<'src>() -> impl Parser<'src, &'src str, Output = String> + Clone {
    let digit = token_parser(
        |c: &char| c.is_ascii_digit(),
        |c| c.to_digit(10).unwrap(),
    );
    MultipleParser::new(digit, |digits: Vec<u32>| {
        digits
            .into_iter()
            .map(|d| char::from_digit(d, 10).unwrap())
            .collect::<String>()
    })
}

let _ = decimal_string();
```

The difference is where values live: matcher repetition binds through capture
properties, while parser repetition returns parser output directly.