Skip to main content

Crate hi_sparse_bitset

Crate hi_sparse_bitset 

Source
Expand description

Hierarchical sparse bitset.

Memory consumption does not depend on max index inserted.

Level0       128bit SIMD
              [u8;128]

            ┌           ┐
Level1  Vec │128bit SIMD│
            │ [u16;128] │
            └           ┘
            ┌           ┐
Data    Vec │128bit SIMD│
            └           ┘
────────────────────────────────────────────────────
                       1 0 1   ...   1  ◀══ bit-mask
Level0                 □ Ø □         □  ◀══ index-pointers
                     ┗━│━━━│━━━━━━━━━│┛
                    ╭──╯   ╰──────╮  ╰───────────────────╮
          1 0 0 1   ▽             ▽   1                  ▽
Level1    □ Ø Ø □  ...           ...  □  ...
        ┗━│━━━━━│━━━━━━━┛     ┗━━━━━━━│━━━━━━┛    ┗━━━━━━━━━━━━━━┛
          ╰──╮  ╰─────────────────╮   ╰───────────────╮
             ▽                    ▽                   ▽
Data     1 0 0 0 1 ...       0 0 1 1 0 ...       0 1 0 1 0 ...    ...
       ┗━━━━━━━━━━━━━━━┛   ┗━━━━━━━━━━━━━━━┛   ┗━━━━━━━━━━━━━━━┛

The very structure of BitSet acts as acceleration structure for intersection operation. All operations are incredibly fast - see benchmarks. (insert/contains in “traditional bitset” ballpark, intersection/union - orders of magnitude faster)

It is multi-level structure. Last level contains actual bit-data. Each previous level have bitmask, where each bit corresponds to !is_empty of bitblock in next level.

In addition to “non-empty-marker” bitmasks, there is pointers(indices) to non-empty blocks in the next level. In this way, only blocks with actual data allocated.

For inter-bitset operations, for example, intersection:

  • root level bitmasks AND-ed.
  • resulting bitmask traversed for bits with 1.
  • indexes of bits with 1, used for getting pointers to the next level for each bitset.
  • repeat for next level until the data level, then for each next 1 bit in each level.

Bitmasks allow to cutoff empty tree/hierarchy branches early for intersection operation, and traverse only actual data during iteration.

In addition to this, during the inter-bitset operation, level1 blocks of bitsets are cached for faster access. Empty blocks are skipped and not added to the cache container, which algorithmically speeds up bitblock computations at the data level. This has an observable effect on a merge operation between N non-intersecting bitsets: without this optimization - the data level bitmask would be OR-ed N times; with it - only once.

§Config

Max index BitSet can hold, depends on used bitblocks capacity. The bigger the bitblocks - the higher BitSet index range. The lower - the smaller memory footprint it has.

Max index for 64bit blocks = 262_144; for 256bit blocks = 16_777_216.

Use BitSet with predefined config:

type BitSet = hi_sparse_bitset::BitSet<hi_sparse_bitset::config::_128bit>;

§Inter-bitset operations

Inter-bitset operations can be applied between ANY BitSetInterface. Output of inter-bitset operations are lazy bitsets(which are BitSetInterfaces too). This means that you can combine different operations however you want without ever materializing them into actual BitSet.

Use reduce() to apply inter-bitset operation between elements of bitsets iterator.

Use apply() to apply inter-bitset operation between two bitsets. Also &, |, ^, - over &BitSet, or corresponding BitSetInterface members.

You can define your own inter-bitset operation by implementing BitSetOp.

§Laziness and materialization

You can collect any BitSetInterface into container with From::from(impl BitSetInterface). It is faster then collecting iterator.

§Eager inter-bitset operations

Thou we encourage use of lazy inter-bitset operations whenever possible, there are optimized eager in-place operations in BitSet:

§Cursor

BitSetInterface iterators can return cursor(), pointing to the current iterator position. You can use Cursor to move ANY BitSetInterface iterator to its position with move_to.

You can also build cursor from index.

§Iterator::for_each

BitSetInterface iterators have for_each specialization and stable try_for_each version - traverse. For tight loops, traversing is observably faster than iterating.

§TrustedHierarchy

TrustedHierarchy means that each raised bit in hierarchy bitblock is guaranteed to correspond to non-empty block. That may be not true for difference and symmetric difference operation result.

You can check if bitset has TrustedHierarchy with BitSetBase::TRUSTED_HIERARCHY.

Bitsets with TrustedHierarchy are faster to materialize, compare with Eq and have O(1) is_empty().

§DataBlocks

You can iterate DataBlocks instead of individual indices. DataBlocks can be moved, cloned and iterated for indices.

§Immutable bitset

ImmutableBitset is “linearized” version of BitSet - with almost no “meta” memory overhead. If you don’t need to change your data - prefer it. Also, consider using it when you need to materialize - as intermediate or final step in computation series - since it is often algorithmically faster to build a new bitset, then to mutate existent (intersection as example).

§Serialization

BitSet have fast and compact serialization. Deserialization is the fastest way to fill BitSet, since serialized data store contiguous hierarchy bitblocks.

Current and previous serialization formats see in /doc/serialization_format/.

§Serde

Enable feature serde - for serde serialization support.

§Zero-copy usage

DirectBitset works with any byte source that can provide serialized data. You can use this with serialized data in memory-mapped file.

§CPU extensions

Library uses popcnt/count_ones and tzcnt/trailing_zeros heavily. Make sure you compile with hardware support of these (on x86: target-feature=+popcnt,+bmi1).

DirectBitset and ImmutableBitset can utilize bzhi instructions, when available (on x86: target-feature=+bmi2).

§SIMD

128 and 256 bit configurations use SIMD, powered by wide. Make sure you compile with simd support enabled (on x86: sse2 for _128bit, avx for _256bit) to achieve the best performance. sse2 enabled by default in Rust for most desktop environments

If you don’t need “wide” configurations, you may disable the default feature simd.

Modules§

cache
Cache used during traverse by reduce operations.
config
Configurations for BitSet.
iter
Iteration always return ordered (or sorted) index sequences.
ops
Operations for apply and reduce.

Structs§

Apply
Binary operation application, as lazy bitset.
BitSet
Hierarchical sparse bitset.
DataBlock
Traversable bit block with offset.
DataBlockIter
DataBlock iterator.
DirectBitset
Bitset that work directly with any source of serialized data.
ImmutableBitset
Bitset with serialized-like linear data structure.
MemInfo
BitSet memory usage info.
Reduce
Bitsets iterator reduction, as lazy bitset.

Enums§

AccessError

Constants§

SERIALIZATION_FORMAT_VER
Current serialization format version.

Traits§

BitBlock
Bit block.
BitSetBase
BitSetInterface
Bitset interface.
DirectDataSource
Data source for DirectBitset.

Functions§

apply
Creates a lazy bitset, as BitSetOp application between two bitsets.
reduce
Creates a lazy bitset, as bitsets iterator reduction.
reduce_w_cache
reduce, using specific cache for iteration.