Crate bparse

§Overview

Parsing usually involves going through a sequence of bytes and branching based on what has been seen. This crate simplifies the task of recognizing complex patterns in byte slices.

It is made up of three parts:

  1. The Pattern trait: represents the pattern to be recognized.
  2. A list of common functions and combinators for composing Patterns together.
  3. The Buf struct: a Cursor-like wrapper around an input slice that uses patterns to advance its position.
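To make the division of labor between the three parts concrete, here is a minimal, self-contained sketch. It does not use this crate and is not its implementation: the names `Pat`, `Seq`, and `Cursor`, and the `matches` method, are hypothetical stand-ins chosen for illustration only.

```rust
/// Hypothetical stand-in for a pattern trait: reports how many bytes at the
/// start of `input` the pattern matches, or `None` if it does not match.
trait Pat {
    fn matches(&self, input: &[u8]) -> Option<usize>;
}

/// A string literal matches when it is a prefix of the input.
impl Pat for &str {
    fn matches(&self, input: &[u8]) -> Option<usize> {
        input.starts_with(self.as_bytes()).then_some(self.len())
    }
}

/// Hypothetical combinator: the first pattern immediately followed by the second.
struct Seq<A, B>(A, B);

impl<A: Pat, B: Pat> Pat for Seq<A, B> {
    fn matches(&self, input: &[u8]) -> Option<usize> {
        let n = self.0.matches(input)?;
        let m = self.1.matches(&input[n..])?;
        Some(n + m)
    }
}

/// Hypothetical cursor: it advances only when a pattern matches.
struct Cursor<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        Cursor { bytes, pos: 0 }
    }

    /// Returns the matched bytes and moves past them, or leaves the
    /// position untouched when the pattern does not match.
    fn consume<P: Pat>(&mut self, pat: P) -> Option<&'a [u8]> {
        let bytes: &'a [u8] = self.bytes;
        let n = pat.matches(&bytes[self.pos..])?;
        let out = &bytes[self.pos..self.pos + n];
        self.pos += n;
        Some(out)
    }
}
```

In this sketch, consuming `Seq("a", "b")` from a cursor over `b"ab||cd"` yields the two bytes `ab` and moves the position to 2, while a failed match leaves the position where it was.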

§Example

Recognizing JSON numbers can get tricky. The spec allows for numbers like 12, -398.42, and even 12.4e-3. Here we incrementally build up a pattern called number that recognizes all JSON number forms.

use bparse::{Buf, Pattern, range, at_least, optional, oneof, end};

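// An optional '-' or '+', used inside the exponent.
let sign = optional(oneof(b"-+"));
let onenine = range(b'1', b'9');
let digit = "0".or(onenine);
let digits = at_least(1, digit);
// An optional fractional part such as ".42".
let fraction = optional(".".then(digits));
// An optional exponent such as "e-3" or "E12".
let exponent = optional("E".then(sign).then(digits).or("e".then(sign).then(digits)));
// Multi-digit forms are tried before the single-digit fallbacks.
let integer = onenine
    .then(digits)
    .or("-".then(onenine).then(digits))
    .or("-".then(digit))
    .or(digit);
let number = integer.then(fraction).then(exponent);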

let input = "234||344.5||0.43e12";

let mut buf = Buf::from(input);

let Some(b"234") = buf.consume(number) else {
    panic!();
};

assert!(buf.skip("||"));

let Some(b"344.5") = buf.consume(number) else {
    panic!();
};

assert!(buf.skip("||"));

let Some(b"0.43e12") = buf.consume(number) else {
    panic!();
};

assert!(buf.consume(end()).is_some());
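For comparison, the same grammar can be written by hand with the standard library only. This is a sketch under the assumption that alternatives are tried in order and repetition is greedy, which this page does not state. The functions `digit_run` and `json_number_len` are hypothetical and not part of this crate.

```rust
/// Counts the leading ASCII digits of `input`.
fn digit_run(input: &[u8]) -> usize {
    input.iter().take_while(|b| b.is_ascii_digit()).count()
}

/// Returns the length of the JSON number at the start of `input`, if any.
fn json_number_len(input: &[u8]) -> Option<usize> {
    let mut pos = 0;

    // integer: optional '-', then either a lone '0' or a run of digits
    if input.first() == Some(&b'-') {
        pos += 1;
    }
    match input.get(pos) {
        Some(b'0') => pos += 1,
        Some(b'1'..=b'9') => pos += digit_run(&input[pos..]),
        _ => return None,
    }

    // fraction: '.' followed by at least one digit, otherwise skipped
    if input.get(pos) == Some(&b'.') {
        let n = digit_run(&input[pos + 1..]);
        if n > 0 {
            pos += 1 + n;
        }
    }

    // exponent: 'e' or 'E', an optional sign, then at least one digit
    if matches!(input.get(pos), Some(b'e' | b'E')) {
        let mut p = pos + 1;
        if matches!(input.get(p), Some(b'+' | b'-')) {
            p += 1;
        }
        let n = digit_run(&input[p..]);
        if n > 0 {
            pos = p + n;
        }
    }

    Some(pos)
}
```

The hand-written version has to track offsets and partial matches itself (for example, backing out of a trailing `.` or `e` that is not followed by digits). Composing small patterns with `then` and `or` removes that bookkeeping.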

Structs§

Buf
A byte slice with a cursor. The cursor is advanced using Patterns.
ByteLookupTable
See oneof
Choice
See Pattern::or
Codepoint
See codepoint
ElementRange
See range
Eof
See end
Not
See not
One
See one
Prefix
See prefix
Repetition
Expresses pattern repetition.
Sequence
See Pattern::then

Traits§

Pattern
Expresses that the implementing type may be used to match a byte slice.

Functions§

alpha
Returns a pattern that will match any ASCII letter at the start of the input.
any
Returns a new pattern that matches as many repetitions as possible of the given pattern, including 0.
at_least
Returns a new pattern that matches at least n repetitions of pattern.
at_most
Returns a new pattern that matches at most n repetitions of pattern.
between
Returns a new pattern that matches between lo and hi repetitions of pattern.
byte
Returns a new pattern that always matches the next byte in the input if it exists.
codepoint
Returns a new pattern that matches the next UTF-8 codepoint if it exists.
count
Returns a new pattern that matches exactly n repetitions of pattern.
digit
Returns a pattern that will match any ASCII digit at the start of the input.
end
Returns a pattern that matches the end of the slice.
hex
Returns a pattern that will match any hexadecimal character at the start of the input.
noneof
Inverse of oneof.
not
Returns a new pattern that matches only if pattern does not match.
oneof
Returns a pattern that will match any byte in bytes at the start of the input.
optional
Returns a new pattern that matches 0 or 1 repetitions of pattern.
prefix
Returns a new pattern that will match if slice is a prefix of the input.
range
Returns a new pattern that will match an element in the closed interval [lo, hi].
utf8
Returns a new pattern that will match the UTF-8 string slice s at the start of the input.
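Going by the descriptions above, the repetition functions (any, at_least, at_most, between, count, optional) can all be read as one bounded repetition with a lower and an upper limit. The sketch below illustrates that reading with the standard library only. It assumes greedy matching, and `repeat` and `ascii_digit` are hypothetical helpers, not functions of this crate.

```rust
/// Hypothetical helper: applies `step` greedily, at most `hi` times, and
/// returns the number of bytes matched if at least `lo` repetitions succeeded.
fn repeat(
    step: impl Fn(&[u8]) -> Option<usize>,
    lo: usize,
    hi: usize,
    input: &[u8],
) -> Option<usize> {
    let mut pos = 0;
    let mut reps = 0;
    while reps < hi {
        match step(&input[pos..]) {
            // Zero-length matches are not counted, so the loop always terminates.
            Some(n) if n > 0 => {
                pos += n;
                reps += 1;
            }
            _ => break,
        }
    }
    (reps >= lo).then_some(pos)
}

/// Matches a single ASCII digit at the start of the input.
fn ascii_digit(input: &[u8]) -> Option<usize> {
    input.first().filter(|b| b.is_ascii_digit()).map(|_| 1)
}
```

Under this reading, any corresponds to the bounds (0, unbounded), at_least(n) to (n, unbounded), at_most(n) to (0, n), between(lo, hi) to (lo, hi), count(n) to (n, n), and optional to (0, 1).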