bitpacking 0.2.0

Fast integer compression/decompression via SIMD bit-packing. Port of simdcomp to rust.
Documentation

Fast Bitpacking algorithms

This crate is a Rust port of Daniel Lemire's simdcomp C library. It contains different implementation of integers compression via bitpacking. Each implementation requires on a different CPU SIMD instruction set, for state of the art performance.

This crate is used by tantivy.

What is BitPacking ?

Compressing small integers is a very common problem in search engines, databases, and analytics.

Compressing increasing integers can also be reduced to compressing small integer via delta-encoding : instead of encoding the values themselves, one encodes the gaps between consecutive values.

Traditional compression scheme like LZ4 is really suited to address this problem efficiently. Instead, there are different families of solutions to this problem.

One of the most straightforward and efficient one is bitpacking :

  • Integers are first grouped into blocks of constant size (e.g. 128 when using the SSE2 implementation).
  • If not available implicitely, compute the minimum number of bits b that makes it possible to represent all of the integers. In other words, the smallest b such that all integers in the block are stricly smaller than 2n.
  • The bitpacked representation is then some variation of the concatenation of the integers restricted to their least significant b-bits.

For instance, assuming a block of 4, when encoding 4, 9, 3, 2. Assuming that the highest value in the block is 9, b = 4. All values will then be encoded over 4 bits as follows.

original number binary representation
4 0100
9 1001
3 0011
2 0010
... ...

As a result, each integer of this block will only require 4 bits.

Usage

This crate contains different implementation for bitpacking depending on the instruction set available with your processor. The resulting format are not compatible one with each other.

Currently the following are available :

  • scalar: implementation not using any SIMD instruction. A block has a size of 32. This implementation is still more performant naive solutions.
  • SSE3: bitpacking 4 integer at once. (block size of 128). Requires the sse3 feature to be enabled. This feature is enabled by default.
  • AVX: butpacking 8 integers at once. (block size of 256). Delta integration is comparatively expensive. Requires to enable the avx feature.

I recommend using the SSE3 implementation if you are not sure what you are doing and you are targetting x86_64 CPUs that have been produced after 2006.

Make sure to compile with

RUSTFLAGS="-C target-cpu=native"

Compressing small integers

extern crate bitpacking;
use bitpacking::{SSE3BitPacker, BitPacker};

// ...

// We compress one block at a time.
assert_eq!(my_data.len(), SSE3BitPacker::BLOCK_LEN);
// Each `BitPacker` has a method to efficiently compute the minimum number of bits required
// to represent all of the integers in the block.
let num_bits: u8 = SSE3BitPacker::num_bits(&my_data);

// The compressed block len is straightforward : BLOCK_LEN (here 128) * num_bits / 8.
let compressed_len = SSE3BitPacker::compressed_block_size(num_bits);
let mut compressed = vec![0u8; compressed_len];

// compressing
SSE3BitPacker::compress(&fake_data, &mut compressed[..], num_bits);

Decompressing small integers

extern crate bitpacking;
use bitpacking::{SSE3BitPacker, BitPacker};

// ...
let mut decompressed = vec![0u32; SSE3BitPacker::BLOCK_LEN];
SSE3BitPacker::decompress(&compressed, &mut decompressed[..], num_bits);

assert_eq!(&fake_data, &decompressed);

Compressing an increasing sequence of integers

Decompressing an increasing sequence of integers

Benchmark

The following benchmarks have been run on one thread on my laptop's CPU: Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz.

Throughput figures are expressed in term of decompressed 32-bits integers. To get figures expressed in integers/s, divide by 4.

You can reproduce it on your hardware by running the following command.

RUSTFLAGS="-C target-cpu=native" cargo bench

Scalar

num bits operation throughput
01 compress 7158 MB/s
01 compress_delta 4674 MB/s
01 decompress 12891 MB/s
01 decompress_delta 7634 MB/s
02 compress 7039 MB/s
02 compress_delta 4933 MB/s
02 decompress 9902 MB/s
02 decompress_delta 8195 MB/s
03 compress 7026 MB/s
03 compress_delta 4721 MB/s
03 decompress 11316 MB/s
03 decompress_delta 7659 MB/s
04 compress 7889 MB/s
04 compress_delta 4560 MB/s
04 decompress 9303 MB/s
04 decompress_delta 7516 MB/s
05 compress 7130 MB/s
05 compress_delta 4432 MB/s
05 decompress 10232 MB/s
05 decompress_delta 7309 MB/s
06 compress 6973 MB/s
06 compress_delta 4424 MB/s
06 decompress 10725 MB/s
06 decompress_delta 7343 MB/s
07 compress 6535 MB/s
07 compress_delta 4274 MB/s
07 decompress 9541 MB/s
07 decompress_delta 7101 MB/s
08 compress 7659 MB/s
08 compress_delta 4250 MB/s
08 decompress 9170 MB/s
08 decompress_delta 7030 MB/s
09 compress 6711 MB/s
09 compress_delta 4155 MB/s
09 decompress 8030 MB/s
09 decompress_delta 6857 MB/s
10 compress 6539 MB/s
10 compress_delta 4216 MB/s
10 decompress 8449 MB/s
10 decompress_delta 7173 MB/s
11 compress 6452 MB/s
11 compress_delta 4112 MB/s
11 decompress 7858 MB/s
11 decompress_delta 7480 MB/s
12 compress 6760 MB/s
12 compress_delta 4002 MB/s
12 decompress 8541 MB/s
12 decompress_delta 8665 MB/s
13 compress 6101 MB/s
13 compress_delta 3937 MB/s
13 decompress 7602 MB/s
13 decompress_delta 8686 MB/s
14 compress 6107 MB/s
14 compress_delta 3911 MB/s
14 decompress 7856 MB/s
14 decompress_delta 8729 MB/s
15 compress 5713 MB/s
15 compress_delta 3834 MB/s
15 decompress 7404 MB/s
15 decompress_delta 9561 MB/s
16 compress 7788 MB/s
16 compress_delta 3779 MB/s
16 decompress 8901 MB/s
16 decompress_delta 10075 MB/s
17 compress 5738 MB/s
17 compress_delta 3723 MB/s
17 decompress 7225 MB/s
17 decompress_delta 10050 MB/s
18 compress 5755 MB/s
18 compress_delta 3830 MB/s
18 decompress 7421 MB/s
18 decompress_delta 11767 MB/s
19 compress 5491 MB/s
19 compress_delta 3803 MB/s
19 decompress 7110 MB/s
19 decompress_delta 11627 MB/s
20 compress 5601 MB/s
20 compress_delta 3758 MB/s
20 decompress 7423 MB/s
20 decompress_delta 11571 MB/s
21 compress 5078 MB/s
21 compress_delta 3667 MB/s
21 decompress 6911 MB/s
21 decompress_delta 11573 MB/s
22 compress 5127 MB/s
22 compress_delta 3645 MB/s
22 decompress 6996 MB/s
22 decompress_delta 11565 MB/s
23 compress 4888 MB/s
23 compress_delta 3633 MB/s
23 decompress 6730 MB/s
23 decompress_delta 11555 MB/s
24 compress 5450 MB/s
24 compress_delta 3627 MB/s
24 decompress 7566 MB/s
24 decompress_delta 12292 MB/s
25 compress 4588 MB/s
25 compress_delta 3610 MB/s
25 decompress 7869 MB/s
25 decompress_delta 12811 MB/s
26 compress 5322 MB/s
26 compress_delta 3843 MB/s
26 decompress 10381 MB/s
26 decompress_delta 13311 MB/s
27 compress 5104 MB/s
27 compress_delta 3801 MB/s
27 decompress 8799 MB/s
27 decompress_delta 13320 MB/s
28 compress 5582 MB/s
28 compress_delta 3787 MB/s
28 decompress 8754 MB/s
28 decompress_delta 13611 MB/s
29 compress 5021 MB/s
29 compress_delta 3797 MB/s
29 decompress 10984 MB/s
29 decompress_delta 14279 MB/s
30 compress 5988 MB/s
30 compress_delta 3793 MB/s
30 decompress 10706 MB/s
30 decompress_delta 14044 MB/s
31 compress 6851 MB/s
31 compress_delta 3800 MB/s
31 decompress 10099 MB/s
31 decompress_delta 13228 MB/s
32 compress 9590 MB/s
32 compress_delta 3790 MB/s
32 decompress 14084 MB/s
32 decompress_delta 14160 MB/s

SSE3

num bits operation throughput
01 compress 27429 MB/s
01 compress_delta 15311 MB/s
01 decompress 25522 MB/s
01 decompress_delta 22767 MB/s
02 compress 27447 MB/s
02 compress_delta 14747 MB/s
02 decompress 25496 MB/s
02 decompress_delta 22633 MB/s
03 compress 26741 MB/s
03 compress_delta 14575 MB/s
03 decompress 25041 MB/s
03 decompress_delta 22465 MB/s
04 compress 27874 MB/s
04 compress_delta 14296 MB/s
04 decompress 24867 MB/s
04 decompress_delta 22385 MB/s
05 compress 26279 MB/s
05 compress_delta 13955 MB/s
05 decompress 24736 MB/s
05 decompress_delta 22047 MB/s
06 compress 25931 MB/s
06 compress_delta 13599 MB/s
06 decompress 24549 MB/s
06 decompress_delta 22050 MB/s
07 compress 25768 MB/s
07 compress_delta 13298 MB/s
07 decompress 24335 MB/s
07 decompress_delta 21691 MB/s
08 compress 27379 MB/s
08 compress_delta 13034 MB/s
08 decompress 24332 MB/s
08 decompress_delta 21695 MB/s
09 compress 25371 MB/s
09 compress_delta 12695 MB/s
09 decompress 23869 MB/s
09 decompress_delta 21226 MB/s
10 compress 25125 MB/s
10 compress_delta 12371 MB/s
10 decompress 24042 MB/s
10 decompress_delta 21315 MB/s
11 compress 24514 MB/s
11 compress_delta 12261 MB/s
11 decompress 23711 MB/s
11 decompress_delta 21033 MB/s
12 compress 24945 MB/s
12 compress_delta 11972 MB/s
12 decompress 23250 MB/s
12 decompress_delta 21076 MB/s
13 compress 23880 MB/s
13 compress_delta 11831 MB/s
13 decompress 23194 MB/s
13 decompress_delta 20918 MB/s
14 compress 22706 MB/s
14 compress_delta 11497 MB/s
14 decompress 22895 MB/s
14 decompress_delta 20818 MB/s
15 compress 23132 MB/s
15 compress_delta 11276 MB/s
15 decompress 22831 MB/s
15 decompress_delta 20601 MB/s
16 compress 24871 MB/s
16 compress_delta 11052 MB/s
16 decompress 22897 MB/s
16 decompress_delta 20039 MB/s
17 compress 21828 MB/s
17 compress_delta 11011 MB/s
17 decompress 22462 MB/s
17 decompress_delta 20627 MB/s
18 compress 22332 MB/s
18 compress_delta 10691 MB/s
18 decompress 22411 MB/s
18 decompress_delta 20233 MB/s
19 compress 22098 MB/s
19 compress_delta 10934 MB/s
19 decompress 22303 MB/s
19 decompress_delta 20316 MB/s
20 compress 22037 MB/s
20 compress_delta 10939 MB/s
20 decompress 22056 MB/s
20 decompress_delta 20417 MB/s
21 compress 21479 MB/s
21 compress_delta 10871 MB/s
21 decompress 21884 MB/s
21 decompress_delta 20543 MB/s
22 compress 21380 MB/s
22 compress_delta 10843 MB/s
22 decompress 21717 MB/s
22 decompress_delta 20185 MB/s
23 compress 20885 MB/s
23 compress_delta 10979 MB/s
23 decompress 21474 MB/s
23 decompress_delta 20227 MB/s
24 compress 21154 MB/s
24 compress_delta 10675 MB/s
24 decompress 21414 MB/s
24 decompress_delta 20435 MB/s
25 compress 20021 MB/s
25 compress_delta 10754 MB/s
25 decompress 21175 MB/s
25 decompress_delta 19418 MB/s
26 compress 19996 MB/s
26 compress_delta 10881 MB/s
26 decompress 21165 MB/s
26 decompress_delta 20735 MB/s
27 compress 19522 MB/s
27 compress_delta 10780 MB/s
27 decompress 21109 MB/s
27 decompress_delta 20440 MB/s
28 compress 19268 MB/s
28 compress_delta 10871 MB/s
28 decompress 20985 MB/s
28 decompress_delta 19330 MB/s
29 compress 18665 MB/s
29 compress_delta 10769 MB/s
29 decompress 20757 MB/s
29 decompress_delta 20545 MB/s
30 compress 18808 MB/s
30 compress_delta 10528 MB/s
30 decompress 20605 MB/s
30 decompress_delta 19338 MB/s
31 compress 18356 MB/s
31 compress_delta 10750 MB/s
31 decompress 20368 MB/s
31 decompress_delta 19186 MB/s
32 compress 18698 MB/s
32 compress_delta 10600 MB/s
32 decompress 20803 MB/s
32 decompress_delta 20611 MB/s

AVX2

Requires to compile with the avx2 feature.

num bits operation throughput
01 compress 28138 MB/s
01 compress_delta 9466 MB/s
01 decompress 23299 MB/s
01 decompress_delta 18810 MB/s
02 compress 27984 MB/s
02 compress_delta 9223 MB/s
02 decompress 22833 MB/s
02 decompress_delta 18594 MB/s
03 compress 26452 MB/s
03 compress_delta 9710 MB/s
03 decompress 22550 MB/s
03 decompress_delta 18202 MB/s
04 compress 26241 MB/s
04 compress_delta 9304 MB/s
04 decompress 22350 MB/s
04 decompress_delta 18174 MB/s
05 compress 26244 MB/s
05 compress_delta 9192 MB/s
05 decompress 22049 MB/s
05 decompress_delta 17964 MB/s
06 compress 26103 MB/s
06 compress_delta 8817 MB/s
06 decompress 21621 MB/s
06 decompress_delta 17705 MB/s
07 compress 24917 MB/s
07 compress_delta 9598 MB/s
07 decompress 21288 MB/s
07 decompress_delta 17634 MB/s
08 compress 25899 MB/s
08 compress_delta 9189 MB/s
08 decompress 20944 MB/s
08 decompress_delta 17459 MB/s
09 compress 25044 MB/s
09 compress_delta 9223 MB/s
09 decompress 20849 MB/s
09 decompress_delta 17283 MB/s
10 compress 23441 MB/s
10 compress_delta 8784 MB/s
10 decompress 20551 MB/s
10 decompress_delta 17104 MB/s
11 compress 23543 MB/s
11 compress_delta 9703 MB/s
11 decompress 20255 MB/s
11 decompress_delta 16557 MB/s
12 compress 24130 MB/s
12 compress_delta 9019 MB/s
12 decompress 19993 MB/s
12 decompress_delta 16815 MB/s
13 compress 23002 MB/s
13 compress_delta 9300 MB/s
13 decompress 19736 MB/s
13 decompress_delta 16541 MB/s
14 compress 22253 MB/s
14 compress_delta 9064 MB/s
14 decompress 19407 MB/s
14 decompress_delta 16421 MB/s
15 compress 22598 MB/s
15 compress_delta 9709 MB/s
15 decompress 19180 MB/s
15 decompress_delta 16256 MB/s
16 compress 22947 MB/s
16 compress_delta 9645 MB/s
16 decompress 19071 MB/s
16 decompress_delta 16245 MB/s
17 compress 21449 MB/s
17 compress_delta 9904 MB/s
17 decompress 18418 MB/s
17 decompress_delta 16111 MB/s
18 compress 20659 MB/s
18 compress_delta 9617 MB/s
18 decompress 18375 MB/s
18 decompress_delta 16317 MB/s
19 compress 20426 MB/s
19 compress_delta 9831 MB/s
19 decompress 18272 MB/s
19 decompress_delta 16169 MB/s
20 compress 19645 MB/s
20 compress_delta 9975 MB/s
20 decompress 18038 MB/s
20 decompress_delta 16248 MB/s
21 compress 19745 MB/s
21 compress_delta 9749 MB/s
21 decompress 17785 MB/s
21 decompress_delta 16221 MB/s
22 compress 19493 MB/s
22 compress_delta 10282 MB/s
22 decompress 17702 MB/s
22 decompress_delta 16200 MB/s
23 compress 18843 MB/s
23 compress_delta 9994 MB/s
23 decompress 17524 MB/s
23 decompress_delta 15987 MB/s
24 compress 18553 MB/s
24 compress_delta 10976 MB/s
24 decompress 17420 MB/s
24 decompress_delta 15688 MB/s
25 compress 17683 MB/s
25 compress_delta 11073 MB/s
25 decompress 17244 MB/s
25 decompress_delta 16333 MB/s
26 compress 17399 MB/s
26 compress_delta 10946 MB/s
26 decompress 17070 MB/s
26 decompress_delta 16033 MB/s
27 compress 17283 MB/s
27 compress_delta 11372 MB/s
27 decompress 16837 MB/s
27 decompress_delta 15853 MB/s
28 compress 17099 MB/s
28 compress_delta 11061 MB/s
28 decompress 16727 MB/s
28 decompress_delta 15881 MB/s
29 compress 16420 MB/s
29 compress_delta 11029 MB/s
29 decompress 16506 MB/s
29 decompress_delta 15669 MB/s
30 compress 16334 MB/s
30 compress_delta 10821 MB/s
30 decompress 16360 MB/s
30 decompress_delta 16330 MB/s
31 compress 15869 MB/s
31 compress_delta 11153 MB/s
31 decompress 16334 MB/s
31 decompress_delta 16385 MB/s
32 compress 16059 MB/s
32 compress_delta 11042 MB/s
32 decompress 16174 MB/s
32 decompress_delta 16046 MB/s

Reference

Other crates you might want to check out

  • stream vbyte A Stream-VByte implementation
  • mayda Another crate implementation the same algorithms.