Crate vers_vecs

source ·
Expand description

This crate provides a collection of data structures supported by fast implementations of rank and select queries. The data structures are static, meaning that they cannot be modified after they have been created.

§Data structures

  • Bit-Vector with no overhead. The only data structure that can be modified after creation.
  • Succinct Bit-Vector supporting fast rank and select queries.
  • Elias-Fano encoding of monotone sequences supporting constant time predecessor queries.
  • Two Range Minimum Query structures for constant time range minimum queries.

§Performance

Performance was benchmarked against publicly available implementations of the same (or similar) data structures on crates.io at the time of writing. The benchmark results can be found in the Github repository. At the time of writing, this crate is among the fastest implementations of all data structures. Some tradeoffs between average time, worst-case time, and available API features should be taken into consideration when selecting among the fastest libraries (see the Github repository for a discussion).

§Intrinsics

This crate uses compiler intrinsics for bit-manipulation. The intrinsics are supported by all modern x86_64 CPUs, but not by other architectures. Since the data structures depend very heavily on these intrinsics, they are forcibly enabled, which means the crate will not compile on non-x86_64 architectures, and will not work correctly on very old x86_64 CPUs.

The intrinsics in question are popcnt (supported since SSE4.2 resp. SSE4a on AMD, 2007-2008), pdep (supported with BMI2 since Intel Haswell resp. AMD Excavator, in hardware since AMD Zen 3, 2011-2013), and tzcnt (supported with BMI1 since Intel Haswell resp. AMD Jaguar, ca. 2013).

§Safety

This crate uses no unsafe code, with the only exception being compiler intrinsics for bit-manipulation. The intrinsics cannot fail with the provided inputs (provided they are supported by the target machine), so even if they were to be implemented incorrectly, no memory unsafety can occur (only incorrect results).

Re-exports§

Modules§

  • This module contains a simple bit vector implementation with no overhead and a fast succinct bit vector implementation with rank and select queries.
  • Elias-Fano encoding for sorted vectors of u64 values. It reduces the space required to represent all numbers (compression ratio dependent on data) and allows for constant time predecessor queries.
  • Range minimum query data structures. These data structures allow to calculate the index of the minimum element in a range of a static array in constant time. The implementations are located in the binary_rmq and fast_rmq modules.