Trait bitpacking::BitPacker
source · pub trait BitPacker: Sized + Clone + Copy {
const BLOCK_LEN: usize;
// Required methods
fn new() -> Self;
fn compress(
&self,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize;
fn compress_sorted(
&self,
initial: u32,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize;
fn compress_strictly_sorted(
&self,
initial: Option<u32>,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize;
fn decompress(
&self,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize;
fn decompress_sorted(
&self,
initial: u32,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize;
fn decompress_strictly_sorted(
&self,
initial: Option<u32>,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize;
fn num_bits(&self, decompressed: &[u32]) -> u8;
fn num_bits_sorted(&self, initial: u32, decompressed: &[u32]) -> u8;
fn num_bits_strictly_sorted(
&self,
initial: Option<u32>,
decompressed: &[u32]
) -> u8;
// Provided method
fn compressed_block_size(num_bits: u8) -> usize { ... }
}
Expand description
Examples without delta-encoding
extern crate bitpacking;
use bitpacking::{BitPacker4x, BitPacker};
let bitpacker = BitPacker4x::new();
let num_bits: u8 = bitpacker.num_bits(&my_data);
// A block will be take at most 4 bytes per-integers.
let mut compressed = vec![0u8; 4 * BitPacker4x::BLOCK_LEN];
let compressed_len = bitpacker.compress(&my_data, &mut compressed[..], num_bits);
assert_eq!((num_bits as usize) * BitPacker4x::BLOCK_LEN / 8, compressed_len);
// Decompressing
let mut decompressed = vec![0u32; BitPacker4x::BLOCK_LEN];
bitpacker.decompress(&compressed[..compressed_len], &mut decompressed[..], num_bits);
assert_eq!(&my_data, &decompressed);
Examples with delta-encoding
Delta-encoding makes it possible to store sorted integers in an efficient manner. Rather than encoding the integers directly, the interval (or deltas) between each of them are computed and then encoded.
Decoding then requires to first decode the deltas and then operate a cumulative sum (also called integration or prefix sum) on them.
extern crate bitpacking;
use bitpacking::{BitPacker4x, BitPacker};
// The initial value is used to compute the first delta.
// In most use cases, you will be compressing long increasing
// integer sequences.
//
// You should probably pass an initial value of `0u32` to the
// first block if you do not have any information.
//
// When encoding the second block however, you will want to pass the last
// value of the first block.
let initial_value = 0u32;
let bitpacker = BitPacker4x::new();
let num_bits: u8 = bitpacker.num_bits_sorted(initial_value, &my_data);
// A block will be take at most 4 bytes per-integers.
let mut compressed = vec![0u8; 4 * BitPacker4x::BLOCK_LEN];
let compressed_len = bitpacker.compress_sorted(initial_value, &my_data, &mut compressed[..], num_bits);
assert_eq!((num_bits as usize) * BitPacker4x::BLOCK_LEN / 8, compressed_len);
// Decompressing
let mut decompressed = vec![0u32; BitPacker4x::BLOCK_LEN];
// The initial value must be the same as the one passed
// when compressing the block.
bitpacker.decompress_sorted(initial_value, &compressed[..compressed_len], &mut decompressed[..], num_bits);
assert_eq!(&my_data, &decompressed);
Required Associated Constants§
Required Methods§
sourcefn new() -> Self
fn new() -> Self
Checks the available instructions set on the current CPU and returns the best available implementation.
Calling .new()
is extremely cheap, and does not
require any heap allocation. It is not required to cache
its result too aggressively.
sourcefn compress(
&self,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize
fn compress( &self, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize
Compress a block of u32
.
Assumes that the integers are all lower than 2^num_bits
.
The result is undefined if they are larger.
Returns the amount of bytes of the compressed block.
Panics
- Panics if the compressed destination array is too small
- Panics if
decompressed
length is not exactly theBLOCK_LEN
.
sourcefn compress_sorted(
&self,
initial: u32,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize
fn compress_sorted( &self, initial: u32, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize
Delta encode and compressed the decompressed
array.
Assumes that the elements in the decompressed
array are sorted.
initial
will be used to compute the first delta
.
Panics
- Panics if
initial
is greater thandecompressed[0]
- Panics if
decompressed
is not sorted - Panics if
decompressed
’s length is not exactlyBLOCK_LEN
- Panics if
compressed
is not large enough to receive the compressed data - Panics if the compressed destination array is too small.
Returns the amount of bytes of the compressed block.
Panics
- Panics if the compressed array is too short.
- Panics if the decompressed array is not exactly the
BLOCK_LEN
.
sourcefn compress_strictly_sorted(
&self,
initial: Option<u32>,
decompressed: &[u32],
compressed: &mut [u8],
num_bits: u8
) -> usize
fn compress_strictly_sorted( &self, initial: Option<u32>, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize
Delta encode and compress the decompressed
array.
Assumes that the elements in the decompressed
array are strictly
monotonous, that is, each element is strictly greater than the previous.
This codec can be more efficient that the simply sorted compressor by up to one bit per integer. This has an important impact on saturated or nearly saturated datasets (almost every number appears in sequence), but isn’t very different from the sorted compressor on more sparse datasets.
Panics
- Panics if
initial
is greater or equal todecompressed[0]
- Panics if
decompressed
isn’t strictly monotonic - Panics if
decompressed
’s length is not exactlyBLOCK_LEN
- Panics if
compressed
is not large enough to receive the compressed data
Returns the amount of bytes in the compressed block.
sourcefn decompress(
&self,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize
fn decompress( &self, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize
Decompress the compress
array to the decompressed
array.
Returns the amount of bytes that were consumed.
Panics
Panics if the compressed array is too short, or the decompressed array is too short.
sourcefn decompress_sorted(
&self,
initial: u32,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize
fn decompress_sorted( &self, initial: u32, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize
Decompress thecompress
array to the decompressed
array.
The compressed
array is assumed to have been delta-encoded and compressed.
initial
must be the value that was passed as the initial
argument compressing
the block.
Returns the amount of bytes that have been read.
Panics
- Panics if the compressed array is too short to contain
BLOCK_LEN
elements - Panics if the decompressed array is too short.
sourcefn decompress_strictly_sorted(
&self,
initial: Option<u32>,
compressed: &[u8],
decompressed: &mut [u32],
num_bits: u8
) -> usize
fn decompress_strictly_sorted( &self, initial: Option<u32>, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize
Decompress thecompress
array to the decompressed
array.
The compressed
array is assumed to have been strict-delta-encoded and compressed.
initial
must be the value that was passed as the initial
argument compressing
the block.
Returns the amount of bytes that have been read.
Panics
- Panics if the compressed array is too short to contain
BLOCK_LEN
elements - Panics if the decompressed array is too short.
sourcefn num_bits(&self, decompressed: &[u32]) -> u8
fn num_bits(&self, decompressed: &[u32]) -> u8
Returns the minimum number of bits used to represent the largest integer in the
decompressed
block.
Panics
Panics if decompressed
’s len is not exactly BLOCK_LEN
.
Provided Methods§
sourcefn compressed_block_size(num_bits: u8) -> usize
fn compressed_block_size(num_bits: u8) -> usize
Returns the size of a compressed block.