bitpacking 0.1.0

Fast integer compression/decompression via SIMD bit-packing. Port of simdcomp to rust.
docs.rs failed to build bitpacking-0.1.0
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build: bitpacking-0.9.2

Fast Bitpacking algorithms

This crate is a Rust port of Daniel Lemire's simdcomp C library. It contains different implementation of integers compression via bitpacking. Each implementation requires on a different CPU SIMD instruction set, for state of the art performance.

This crate is used by tantivy.

For instance, with SSE3, you can typically expect more than 4 billions ints per seconds. Check the Benchmark for more accurate details.

What is BitPacking ?

Compressing small integers or increasing integers is a very common problem in search engines, databases, analytics. The latter can be reduced to the earlier via delta-encoding (i.e. Encoding the difference between consecutive difference rather than the integers themselves). Traditional compression scheme like LZ4 is really not very efficient to address this problem.

There are different families of solutions to this problem. One of the most straightforward and efficient one is bitpacking :

  • Integers are first grouped into blocks of constant size (e.g. 128 when using the SSE2 implementation).
  • Compute the minimum number of bits b that makes it possible to represent all of the integers. In other words, the smallest b such that all integers in the block are stricly smaller than 2n.
  • The bitpacked representation is then some variation of the concatenation of the integers restricted to their least significant b-bits.

For instance, assuming a block of 4, when encoding 4, 9, 3, 2. Assuming that the highest value in the block is 9, b = 4. All values will then be encoded over 4 bits as follows.

original number binary representation
4 0100
9 1001
3 0011
2 0010

As a result

Usage

This crate contains different implementation, depending on the instruction set available with your processor. The resulting format are not compatible one with each other.

Currently the following are available :

  • scalar: implementation not using any SIMD instruction. This implementation is still very performant
  • SSE2:
  • AVX

Compressing

extern crate bitpacking;
use bitpacking::{SSE3BitPacker, BitPacker};

// ...
let num_bits: u8 = SSE3BitPacker::num_bits(&fake_data);
let mut compressed = vec![0u8; (num_bits as usize) * SSE3BitPacker::BLOCK_LEN / 8];
SSE3BitPacker::compress(&fake_data, &mut compressed[..], num_bits);

Decompressing

extern crate bitpacking;
use bitpacking::{SSE3BitPacker, BitPacker};

// ...
let mut decompressed = vec![0u32; SSE3BitPacker::BLOCK_LEN];
SSE3BitPacker::decompress(&compressed, &mut decompressed[..], num_bits);

assert_eq!(&fake_data, &decompressed);

Benchmark

What kind of software could benefit from this crate?

Make sure to compile with

RUSTFLAGS="-C target-cpu=native"

Reference

Other crates you might want to check out

  • stream vbyte A Stream-VByte implementation
  • mayda Another crate implementation the same algorithms.