Crate bset[][src]

Fast and compact sets of bytes or ASCII characters, useful for searching, parsing and determining membership of a given byte in the given set. They don’t use any allocation, nor even any std features. In fact, all of the provided functions are const, so they can be freely constructed at the compile time

Sets

This crate exports two set types - ByteSet for a set of general bytes and AsciiSet for a set restricted to the range of ASCII characters. The is two times smaller in the memory, but comes with a slight performance trade-off in having to check whether the given character belongs to the ASCII range.

use bset::AsciiSet;

const OP: AsciiSet = AsciiSet::new().add_bytes(b"+-*/%&|^");
assert!(OP.contains(b'%'));

The sets are implemented as an array of pointer-sized bit masks.

Stack of sets

Inspired by this article by Maciej Hirsz, this crate provides a way to stack multiple sets into one structure to gain even more performance. To not slow it down, sets are “obtained” from the given stack at the type level. For this, the module bits contains types B0, B1, …, B7 representing indices of a set in the stack. Because const fns currently don’t support generic functions, the sets are indexed by the order they were added to the stack. Type aliases can be used to identify the sets within the stack:

use bset::{bits::*, ByteSet, ByteStack};
 
const BYTE_STACK: ByteStack<B3> = ByteStack::new()
    .add_set(ByteSet::DIGITS)
    .add_set(ByteSet::ALPHABETIC)
    .add_set(ByteSet::new().add_bytes(b"+-*/%&|^"));
type Digits = B0;
type Alphabetic = B1;
type Operations = B2;
assert!(BYTE_STACK.contains::<Operations>(b'%'));

Again, there are two versions, ByteStack for all bytes and AsciiStack restricted to the ASCII range. Benchmarks show that testing the set membership is about 20% faster with stacked sets. They come with 8 times larger memory size (128/256 bytes vs. 16/32), which does not increase with the stacks added, so when 8 sets (the maximum number) are used in one stack, the memory size is equivalent.

Modules

bits

Types that denote the position of a byte set within a byte stack.

Structs

AnyByteSet

A compact set of bytes. Only particular instances - AsciiSet and ByteSet can be constructed.

AnyByteStack

A compact stack of up to 8 byte sets for fast lookup. Only particular instances - AsciiStack and ByteStack can be constructed.

Constants

ASCII_RANGE_LEN

Range of ASCII characters.

BITS_PER_CHUNK

Number of bytes in one chunk of the mask.

CHUNKS

Number of chunks in the ASCII set.

CHUNK_SIZE

Size of one chunk of the mask in the implementation of byte sets.

Type Definitions

AsciiSet

A compact set of ASCII bytes. Spans only 16 bytes.

AsciiStack

A compact stack of up to 8 ASCII sets for fast lookup.

ByteSet

A compact set of all bytes. Spans only 32 bytes.

ByteStack

A compact stack of up to 8 full byte sets for fast lookup.