Trait bitpacking::BitPacker

pub trait BitPacker: Sized + Clone + Copy {
    const BLOCK_LEN: usize;

    // Required methods
    fn new() -> Self;
    fn compress(
        &self,
        decompressed: &[u32],
        compressed: &mut [u8],
        num_bits: u8
    ) -> usize;
    fn compress_sorted(
        &self,
        initial: u32,
        decompressed: &[u32],
        compressed: &mut [u8],
        num_bits: u8
    ) -> usize;
    fn compress_strictly_sorted(
        &self,
        initial: Option<u32>,
        decompressed: &[u32],
        compressed: &mut [u8],
        num_bits: u8
    ) -> usize;
    fn decompress(
        &self,
        compressed: &[u8],
        decompressed: &mut [u32],
        num_bits: u8
    ) -> usize;
    fn decompress_sorted(
        &self,
        initial: u32,
        compressed: &[u8],
        decompressed: &mut [u32],
        num_bits: u8
    ) -> usize;
    fn decompress_strictly_sorted(
        &self,
        initial: Option<u32>,
        compressed: &[u8],
        decompressed: &mut [u32],
        num_bits: u8
    ) -> usize;
    fn num_bits(&self, decompressed: &[u32]) -> u8;
    fn num_bits_sorted(&self, initial: u32, decompressed: &[u32]) -> u8;
    fn num_bits_strictly_sorted(
        &self,
        initial: Option<u32>,
        decompressed: &[u32]
    ) -> u8;

    // Provided method
    fn compressed_block_size(num_bits: u8) -> usize { ... }
}

Examples without delta-encoding

extern crate bitpacking;

use bitpacking::{BitPacker4x, BitPacker};


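// Illustrative input, assumed here because the example does not show
// how `my_data` is built: one block of BLOCK_LEN small integers.
let my_data: Vec<u32> = (0..BitPacker4x::BLOCK_LEN as u32).map(|i| i % 16).collect();

let bitpacker = BitPacker4x::new();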

let num_bits: u8 = bitpacker.num_bits(&my_data);

// A block will take at most 4 bytes per integer.
let mut compressed = vec![0u8; 4 * BitPacker4x::BLOCK_LEN];

let compressed_len = bitpacker.compress(&my_data, &mut compressed[..], num_bits);

assert_eq!((num_bits as usize) * BitPacker4x::BLOCK_LEN / 8, compressed_len);

// Decompressing
let mut decompressed = vec![0u32; BitPacker4x::BLOCK_LEN];
bitpacker.decompress(&compressed[..compressed_len], &mut decompressed[..], num_bits);

assert_eq!(&my_data, &decompressed);

Examples with delta-encoding

Delta-encoding makes it possible to store sorted integers efficiently. Rather than encoding the integers directly, the differences (or deltas) between consecutive integers are computed and then encoded.

Decoding then requires first decoding the deltas and then computing their cumulative sum (also called integration or prefix sum).
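The two steps can be sketched in plain Rust, independently of this crate. The function names `delta_encode` and `delta_decode` are hypothetical and only illustrate the idea; they are not part of the `BitPacker` API, and the sketch assumes sorted input with `initial` no greater than the first value.

```rust
/// Computes the deltas of a sorted slice, using `initial` as the
/// predecessor of the first element (hypothetical helper).
fn delta_encode(initial: u32, values: &[u32]) -> Vec<u32> {
    let mut previous = initial;
    values
        .iter()
        .map(|&value| {
            // Only valid for sorted input with `initial <= values[0]`.
            let delta = value - previous;
            previous = value;
            delta
        })
        .collect()
}

/// Rebuilds the original values with a running (prefix) sum
/// that starts from the same `initial` (hypothetical helper).
fn delta_decode(initial: u32, deltas: &[u32]) -> Vec<u32> {
    let mut current = initial;
    deltas
        .iter()
        .map(|&delta| {
            current += delta;
            current
        })
        .collect()
}
```

With an initial value of 10, the sequence 12, 15, 15, 20 becomes the deltas 2, 3, 0, 5, which are smaller and therefore need fewer bits than the original values.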

extern crate bitpacking;

use bitpacking::{BitPacker4x, BitPacker};



// The initial value is used to compute the first delta.
// In most use cases, you will be compressing long increasing
// integer sequences.
//
// You should probably pass an initial value of `0u32` to the
// first block if you do not have any information.
//
// When encoding the second block however, you will want to pass the last
// value of the first block.
let initial_value = 0u32;

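// Illustrative sorted input, assumed here because the example does not
// show how `my_data` is built: one block of increasing integers.
let my_data: Vec<u32> = (0..BitPacker4x::BLOCK_LEN as u32).map(|i| 3 * i).collect();

let bitpacker = BitPacker4x::new();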

let num_bits: u8 = bitpacker.num_bits_sorted(initial_value, &my_data);

// A block will take at most 4 bytes per integer.
let mut compressed = vec![0u8; 4 * BitPacker4x::BLOCK_LEN];


let compressed_len = bitpacker.compress_sorted(initial_value, &my_data, &mut compressed[..], num_bits);

assert_eq!((num_bits as usize) * BitPacker4x::BLOCK_LEN / 8, compressed_len);

// Decompressing
let mut decompressed = vec![0u32; BitPacker4x::BLOCK_LEN];

// The initial value must be the same as the one passed
// when compressing the block.
bitpacker.decompress_sorted(initial_value, &compressed[..compressed_len], &mut decompressed[..], num_bits);

assert_eq!(&my_data, &decompressed);
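The advice about chaining the initial value across blocks can be illustrated with a small crate-independent sketch. Everything here is an assumption made for illustration: the tiny `BLOCK_LEN` of 4, the helper names `encode_blocks` and `decode_blocks`, and the use of plain `Vec<u32>` deltas instead of bit-packed bytes. Unlike the real packers, it also tolerates a shorter final block.

```rust
/// Hypothetical block length, kept tiny so the example is easy to follow.
const BLOCK_LEN: usize = 4;

/// Delta-encodes a sorted sequence block by block. The first block uses
/// an initial value of 0; each later block uses the last value of the
/// block before it.
fn encode_blocks(values: &[u32]) -> Vec<Vec<u32>> {
    let mut initial = 0u32;
    let mut blocks = Vec::new();
    for block in values.chunks(BLOCK_LEN) {
        let mut previous = initial;
        let mut deltas = Vec::with_capacity(block.len());
        for &value in block {
            deltas.push(value - previous);
            previous = value;
        }
        // The next block starts from the last value of this one.
        initial = previous;
        blocks.push(deltas);
    }
    blocks
}

/// Decodes the blocks, feeding each block the same initial value
/// that was used when encoding it.
fn decode_blocks(blocks: &[Vec<u32>]) -> Vec<u32> {
    let mut initial = 0u32;
    let mut out = Vec::new();
    for deltas in blocks {
        let mut current = initial;
        for &delta in deltas {
            current += delta;
            out.push(current);
        }
        initial = current;
    }
    out
}
```

For the input 1, 3, 6, 10, 11, 15, 20, 26, the second block is encoded relative to 10, so its first delta is 1 rather than 11.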

Required Associated Constants


const BLOCK_LEN: usize

Number of u32 per compressed block

Required Methods


fn new() -> Self

Checks the instruction sets available on the current CPU and returns the best available implementation.

Calling .new() is extremely cheap and does not require any heap allocation, so there is no need to cache its result aggressively.


fn compress( &self, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize

Compress a block of u32.

Assumes that the integers are all less than 2^num_bits. The result is undefined if any of them is larger.

Returns the number of bytes in the compressed block.

Panics
  • Panics if the compressed destination array is too small
  • Panics if decompressed’s length is not exactly BLOCK_LEN.
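As a rough illustration of what storing each integer on num_bits bits means, here is a scalar sketch. It is not this crate's implementation: the names `pack_scalar` and `unpack_scalar` are hypothetical, and the real packers may lay the bits out differently. In this sketch, a value that does not fit in num_bits bits simply loses its high bits, which is one way the round trip can break.

```rust
/// Packs each value into `num_bits` bits, least significant bit first
/// (hypothetical scalar layout, for illustration only).
fn pack_scalar(values: &[u32], num_bits: u8) -> Vec<u8> {
    let total_bits = values.len() * num_bits as usize;
    let mut out = vec![0u8; (total_bits + 7) / 8];
    let mut bit_pos = 0usize;
    for &value in values {
        for bit in 0..num_bits as usize {
            if (value >> bit) & 1 == 1 {
                out[bit_pos / 8] |= 1 << (bit_pos % 8);
            }
            bit_pos += 1;
        }
    }
    out
}

/// Reads back `len` values of `num_bits` bits each.
fn unpack_scalar(packed: &[u8], num_bits: u8, len: usize) -> Vec<u32> {
    let mut bit_pos = 0usize;
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        let mut value = 0u32;
        for bit in 0..num_bits as usize {
            if (packed[bit_pos / 8] >> (bit_pos % 8)) & 1 == 1 {
                value |= 1 << bit;
            }
            bit_pos += 1;
        }
        out.push(value);
    }
    out
}
```

Four values packed on 3 bits each occupy 12 bits, so the sketch produces 2 bytes instead of the 16 bytes of the original `u32` values.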

fn compress_sorted( &self, initial: u32, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize

Delta encode and compress the decompressed array.

Assumes that the elements in the decompressed array are sorted. initial will be used to compute the first delta.

Panics
  • Panics if initial is greater than decompressed[0]
  • Panics if decompressed is not sorted
  • Panics if decompressed’s length is not exactly BLOCK_LEN
  • Panics if compressed is not large enough to receive the compressed data

Returns the number of bytes in the compressed block.

fn compress_strictly_sorted( &self, initial: Option<u32>, decompressed: &[u32], compressed: &mut [u8], num_bits: u8 ) -> usize

Delta encode and compress the decompressed array.

Assumes that the elements in the decompressed array are strictly increasing, that is, each element is strictly greater than the previous one.

This codec can be more efficient than the simply sorted compressor by up to one bit per integer. This has an important impact on saturated or nearly saturated datasets (almost every number appears in sequence), but makes little difference compared to the sorted compressor on sparser datasets.

Panics
  • Panics if initial is greater than or equal to decompressed[0]
  • Panics if decompressed is not strictly increasing
  • Panics if decompressed’s length is not exactly BLOCK_LEN
  • Panics if compressed is not large enough to receive the compressed data

Returns the number of bytes in the compressed block.
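The gain of up to one bit per integer can be seen with a small crate-independent sketch. The helper names are hypothetical, and for simplicity the sketch takes a plain `u32` initial value rather than the `Option<u32>` used by this method; it assumes the storage scheme is "delta minus one", as the description of num_bits_strictly_sorted suggests.

```rust
/// Number of bits needed to represent `value` (0 needs 0 bits).
fn bits_needed(value: u32) -> u8 {
    (32 - value.leading_zeros()) as u8
}

/// Bit width when plain deltas are stored (sorted input).
fn sorted_width(initial: u32, values: &[u32]) -> u8 {
    let mut previous = initial;
    let mut max_delta = 0u32;
    for &value in values {
        max_delta = max_delta.max(value - previous);
        previous = value;
    }
    bits_needed(max_delta)
}

/// Bit width when `delta - 1` is stored. Requires strictly increasing
/// input with `initial < values[0]`, so every delta is at least 1.
fn strictly_sorted_width(initial: u32, values: &[u32]) -> u8 {
    let mut previous = initial;
    let mut max_stored = 0u32;
    for &value in values {
        max_stored = max_stored.max(value - previous - 1);
        previous = value;
    }
    bits_needed(max_stored)
}
```

For the saturated run 1, 2, ..., 8 with an initial value of 0, every delta is 1, so the plain scheme needs 1 bit per integer while the strict scheme needs 0. For a sparse input such as 10, 30, 100, both need 7 bits.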


fn decompress( &self, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize

Decompress the compressed array into the decompressed array.

Returns the number of bytes that were consumed.

Panics

Panics if the compressed array is too short, or the decompressed array is too short.


fn decompress_sorted( &self, initial: u32, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize

Decompress the compressed array into the decompressed array. The compressed array is assumed to have been delta-encoded and compressed.

initial must be the value that was passed as the initial argument when compressing the block.

Returns the number of bytes that have been read.

Panics
  • Panics if the compressed array is too short to contain BLOCK_LEN elements
  • Panics if the decompressed array is too short.

fn decompress_strictly_sorted( &self, initial: Option<u32>, compressed: &[u8], decompressed: &mut [u32], num_bits: u8 ) -> usize

Decompress the compressed array into the decompressed array. The compressed array is assumed to have been strict-delta-encoded and compressed.

initial must be the value that was passed as the initial argument when compressing the block.

Returns the number of bytes that have been read.

Panics
  • Panics if the compressed array is too short to contain BLOCK_LEN elements
  • Panics if the decompressed array is too short.

fn num_bits(&self, decompressed: &[u32]) -> u8

Returns the minimum number of bits needed to represent the largest integer in the decompressed block.

Panics

Panics if decompressed’s len is not exactly BLOCK_LEN.
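One common way to compute such a bit width, shown here as a crate-independent sketch, is to OR all the values together and look at the position of the highest set bit. The function name `num_bits_sketch` is hypothetical and this is not necessarily how the implementors of this trait do it.

```rust
/// Minimum number of bits needed to represent every value in `values`
/// (hypothetical helper; an all-zero block needs 0 bits).
fn num_bits_sketch(values: &[u32]) -> u8 {
    // The OR of all values has the same highest set bit as the maximum.
    let combined = values.iter().fold(0u32, |acc, &value| acc | value);
    (32 - combined.leading_zeros()) as u8
}
```

The OR trick avoids tracking a maximum with comparisons, and it gives the same width because only the highest set bit matters.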


fn num_bits_sorted(&self, initial: u32, decompressed: &[u32]) -> u8

Returns the minimum number of bits needed to represent the largest delta in the decompressed block.

Panics

Panics if decompressed’s len is not exactly BLOCK_LEN.


fn num_bits_strictly_sorted( &self, initial: Option<u32>, decompressed: &[u32] ) -> u8

Returns the minimum number of bits needed to represent the largest value of delta-1 among the deltas of the decompressed block.

Panics

Panics if decompressed’s len is not exactly BLOCK_LEN.

Provided Methods


fn compressed_block_size(num_bits: u8) -> usize

Returns the size, in bytes, of a block compressed with num_bits bits per integer.
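The examples at the top of this page assert that a compressed block occupies num_bits * BLOCK_LEN / 8 bytes. A minimal sketch of that arithmetic follows, assuming a block length of 128 (the constant and the function name `compressed_block_size_sketch` are illustrative, not taken from this trait's source).

```rust
/// Assumed block length for illustration; each implementor defines its own.
const BLOCK_LEN: usize = 128;

/// Size in bytes of a block of `BLOCK_LEN` integers stored on
/// `num_bits` bits each (hypothetical helper).
fn compressed_block_size_sketch(num_bits: u8) -> usize {
    BLOCK_LEN * num_bits as usize / 8
}
```

With these assumptions, 4 bits per integer gives 64 bytes, and the worst case of 32 bits gives 512 bytes, which matches the "4 bytes per integer" buffer size used in the examples.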

Object Safety

This trait is not object safe.

Implementors