[][src]Crate rsdict

'RsDict' data structure that supports both rank and select over a bitmap.

This crate is an implementation of Navarro and Providel, "Fast, Small, Simple Rank/Select On Bitmaps", with heavy inspiration from a Go implementation.

use rsdict::RsDict;

let mut r = RsDict::new();
r.push(false);
r.push(true);
r.push(true);
r.push(false);

// There's one bit set to the left of index 2.
assert_eq!(r.rank(2, true), 1);

// The index of the second (zero-indexed as 1) bit is 3.
assert_eq!(r.select(1, false), Some(3));

Implementation notes

First, we store the bitmap in compressed form. Each block of 64 bits is stored with a variable length code, where the length is determined by the number of bits set in the block (its "class"). Then, we store the classes (i.e. the number of bits set per block) in a separate array, allowing us to iterate forward from a pointer into the variable length buffer.

To allow efficient indexing, we then break up the input into LARGE_BLOCK_SIZE blocks and store a pointer into the variable length buffer per block. As with other rank structures, we also store a precomputed rank from the beginning of the large block.

Finally, we store precomputed indices for selection in separate arrays. For every SELECT_BLOCK_SIZEth bit, we maintain a pointer to the large block this bit falls in. We also do the same for zeros.

Then, we can compute ranks by consulting the large block rank and then iterating over the small block classes before our desired position. Once we've found the boundary small block, we can then decode it and compute the rank within the block. The choice of variable length code allows computing its internal rank without decoding the entire block.

Select works similarly where we start with the large block indices, skip over as many small blocks as possible, and then select within a small block. As with rank, we're able to select within a small block directly.

Structs

RsDict

Data structure for efficiently computing both rank and select queries