§Overview
Parsing usually involves going through a sequence of bytes and branching off based on what was seen. This crate simplifies the task of recognizing complex patterns in byte slices.
It is made up of three parts:
- The Pattern trait: represents the pattern to be recognized.
- A list of common functions and combinators for composing Patterns together.
- The Buf struct: a Cursor-like wrapper around an input slice that uses patterns to advance its position.
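To make the relationship between these three parts concrete, here is a minimal, self-contained sketch. The names MiniPattern, Literal, Then, and MiniBuf are hypothetical stand-ins invented for illustration; they are not bparse's actual definitions, and the real trait and struct may differ in both signatures and behavior.

```rust
// Hypothetical, simplified stand-ins; NOT bparse's actual definitions.

/// A pattern reports how many bytes it matches at the start of `input`.
trait MiniPattern {
    fn match_len(&self, input: &[u8]) -> Option<usize>;
}

/// Matches a fixed byte string at the start of the input.
struct Literal<'a>(&'a [u8]);

impl MiniPattern for Literal<'_> {
    fn match_len(&self, input: &[u8]) -> Option<usize> {
        input.starts_with(self.0).then_some(self.0.len())
    }
}

/// A combinator: matches the first pattern, then the second on what remains.
struct Then<A, B>(A, B);

impl<A: MiniPattern, B: MiniPattern> MiniPattern for Then<A, B> {
    fn match_len(&self, input: &[u8]) -> Option<usize> {
        let a = self.0.match_len(input)?;
        let b = self.1.match_len(&input[a..])?;
        Some(a + b)
    }
}

/// A cursor over a byte slice that only moves when a pattern matches.
struct MiniBuf<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> MiniBuf<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        MiniBuf { bytes, pos: 0 }
    }

    /// On a match, returns the matched bytes and advances past them.
    /// On failure, returns `None` and leaves the position unchanged.
    fn consume(&mut self, pattern: impl MiniPattern) -> Option<&'a [u8]> {
        let bytes = self.bytes; // copy the shared reference
        let rest = &bytes[self.pos..];
        let len = pattern.match_len(rest)?;
        self.pos += len;
        Some(&rest[..len])
    }
}
```

In this sketch, consuming Literal(b"ab") from the input ab||cd returns the bytes ab and leaves the cursor at the separator, while a pattern that fails to match leaves the cursor where it was.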
§Example
Recognizing JSON numbers can get tricky. The spec allows for numbers like 12, -398.42, and even 12.4e-3.
Here we incrementally build up a pattern called number that can recognize all JSON number forms.
use bparse::{Buf, Pattern, range, at_least, optional, oneof, end};

// Building blocks for the JSON number grammar.
let sign = optional(oneof(b"-+"));
let onenine = range(b'1', b'9');
let digit = "0".or(onenine);
let digits = at_least(1, digit);
let fraction = optional(".".then(digits));
let exponent = optional("E".then(sign).then(digits).or("e".then(sign).then(digits)));
let integer = onenine
    .then(digits)
    .or("-".then(onenine).then(digits))
    .or("-".then(digit))
    .or(digit);

// The complete pattern: an integer part, then an optional fraction and exponent.
let number = integer.then(fraction).then(exponent);

let input = "234||344.5||0.43e12";
let mut buf = Buf::from(input);

// Consume each number in turn, skipping the `||` separators between them.
let Some(b"234") = buf.consume(number) else {
    panic!();
};
assert!(buf.skip("||"));
let Some(b"344.5") = buf.consume(number) else {
    panic!();
};
assert!(buf.skip("||"));
let Some(b"0.43e12") = buf.consume(number) else {
    panic!();
};

// The cursor should now be at the end of the input.
assert!(buf.consume(end()).is_some());
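For comparison, a recognizer of this kind can also be written by hand with explicit branching, which is the style the combinators are meant to replace. The following is a sketch using only the standard library. The helper names json_number_len and count_digits are hypothetical, and the function follows the JSON number grammar directly; it does not mirror how bparse works internally.

```rust
/// Counts the ASCII digits at the start of `input`.
fn count_digits(input: &[u8]) -> usize {
    input.iter().take_while(|b| b.is_ascii_digit()).count()
}

/// Returns the length of the JSON number at the start of `input`, if any.
/// Hypothetical helper for illustration; not part of bparse.
fn json_number_len(input: &[u8]) -> Option<usize> {
    let mut i = 0;

    // Optional leading minus sign.
    if input.first() == Some(&b'-') {
        i += 1;
    }

    // Integer part: a lone `0`, or a nonzero digit followed by more digits.
    match input.get(i) {
        Some(b'0') => i += 1,
        Some(b'1'..=b'9') => i += count_digits(&input[i..]),
        _ => return None,
    }

    // Optional fraction: `.` followed by at least one digit.
    if input.get(i) == Some(&b'.') {
        let n = count_digits(&input[i + 1..]);
        if n > 0 {
            i += 1 + n;
        }
    }

    // Optional exponent: `e` or `E`, an optional sign, at least one digit.
    if matches!(input.get(i), Some(b'e' | b'E')) {
        let mut j = i + 1;
        if matches!(input.get(j), Some(b'+' | b'-')) {
            j += 1;
        }
        let n = count_digits(&input[j..]);
        if n > 0 {
            i = j + n;
        }
    }

    Some(i)
}
```

Every optional part needs its own index bookkeeping and its own "did enough digits follow?" check, which is the kind of detail that optional, at_least, then, and or are described as expressing declaratively.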
Structs§
- Buf: A byte slice with a cursor. The cursor is advanced using Patterns.
- ByteLookupTable: See oneof
- Choice: See Pattern::or
- Codepoint: See codepoint
- ElementRange: See range()
- Eof: See end
- Not: See not
- One: See one
- Prefix: See prefix
- Repetition: Expresses pattern repetition.
- Sequence: See Pattern::then
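The name ByteLookupTable suggests that byte-set patterns are answered by a table lookup rather than a scan over the candidate bytes. The sketch below shows one plausible shape for that idea using only the standard library. ByteSet and its methods are hypothetical names, and nothing here is taken from bparse's source; in particular, having the inverted set fail on empty input is an assumption.

```rust
/// Hypothetical stand-in for a byte-set pattern; not bparse's `ByteLookupTable`.
struct ByteSet {
    /// `member[b]` is true when byte `b` belongs to the set.
    member: [bool; 256],
}

impl ByteSet {
    /// Builds the table once from the bytes that should match.
    fn new(bytes: &[u8]) -> Self {
        let mut member = [false; 256];
        for &b in bytes {
            member[b as usize] = true;
        }
        ByteSet { member }
    }

    /// Flips membership, so the set matches every byte it previously rejected.
    /// Assumption: an empty input still fails to match after inversion.
    fn inverted(mut self) -> Self {
        for slot in self.member.iter_mut() {
            *slot = !*slot;
        }
        self
    }

    /// Matches one byte at the start of `input` with a single table lookup.
    fn match_len(&self, input: &[u8]) -> Option<usize> {
        match input.first() {
            Some(&b) if self.member[b as usize] => Some(1),
            _ => None,
        }
    }
}
```

The cost of a membership test in this sketch is one array index regardless of how many bytes are in the set, at the price of a 256-entry table per pattern.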
Traits§
- Pattern: Expresses that the implementing type may be used to match a byte slice.
Functions§
- alpha: Returns a pattern that will match any ASCII letter at the start of the input.
- any: Returns a new pattern that matches as many repetitions as possible of the given pattern, including 0.
- at_least: Returns a new pattern that matches at least n repetitions of pattern.
- at_most: Returns a new pattern that matches at most n repetitions of pattern.
- between: Returns a new pattern that matches between lo and hi repetitions of pattern.
- byte: Returns a new pattern that always matches the next byte in the input if it exists.
- codepoint: Returns a new pattern that matches the next UTF-8 codepoint if it exists.
- count: Returns a new pattern that matches exactly n repetitions of pattern.
- digit: Returns a pattern that will match any ASCII digit at the start of the input.
- end: Returns a pattern that matches the end of the slice.
- hex: Returns a pattern that will match any hexadecimal character at the start of the input.
- noneof: Inverse of oneof.
- not: Returns a new pattern that matches only if pattern does not match.
- oneof: Returns a pattern that will match any byte in bytes at the start of the input.
- optional: Returns a new pattern that matches 0 or 1 repetitions of pattern.
- prefix: Returns a new pattern that will match if slice is a prefix of the input.
- range: Returns a new pattern that will match an element in the closed interval [lo, hi].
- utf8: Returns a new pattern that will match the UTF-8 string slice s at the start of the input.
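The repetition functions above (any, at_least, at_most, between, count, optional) can all be read as bounds on a repetition count. The sketch below illustrates that reading with standalone closures. It reuses the function names for readability, but these are hypothetical re-implementations, not bparse's code. The Matcher type, the byte_is helper, the argument order of between, and the greedy behavior of every function other than any are assumptions.

```rust
// Hypothetical re-implementations for illustration; not bparse's code.

/// A matcher returns how many bytes it matched at the start of the input.
type Matcher = Box<dyn Fn(&[u8]) -> Option<usize>>;

/// Matches exactly the byte `b` (hypothetical helper).
fn byte_is(b: u8) -> Matcher {
    Box::new(move |input: &[u8]| (input.first() == Some(&b)).then_some(1))
}

/// Greedily repeats `inner` up to `hi` times and fails if fewer than `lo`
/// repetitions matched.
fn between(lo: usize, hi: usize, inner: Matcher) -> Matcher {
    Box::new(move |input: &[u8]| {
        let mut consumed = 0;
        let mut reps = 0;
        while reps < hi {
            match inner(&input[consumed..]) {
                // A zero-length match would never make progress, so this
                // sketch treats it like a failure to match.
                Some(n) if n > 0 => {
                    consumed += n;
                    reps += 1;
                }
                _ => break,
            }
        }
        (reps >= lo).then_some(consumed)
    })
}

/// Zero or more repetitions, as many as possible.
fn any(inner: Matcher) -> Matcher {
    between(0, usize::MAX, inner)
}

/// At least `n` repetitions.
fn at_least(n: usize, inner: Matcher) -> Matcher {
    between(n, usize::MAX, inner)
}

/// At most `n` repetitions.
fn at_most(n: usize, inner: Matcher) -> Matcher {
    between(0, n, inner)
}

/// Exactly `n` repetitions.
fn count(n: usize, inner: Matcher) -> Matcher {
    between(n, n, inner)
}

/// Zero or one repetition.
fn optional(inner: Matcher) -> Matcher {
    between(0, 1, inner)
}
```

Under these assumptions, any and optional never fail: when the inner matcher does not match, they succeed with a zero-length match, which is what lets patterns such as an optional fraction be chained without special cases.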