lance_encoding::encoder

Trait CompressionStrategy

pub trait CompressionStrategy:
    Send
    + Sync
    + Debug {
    // Required methods
    fn create_block_compressor(
        &self,
        field: &Field,
        data: &DataBlock,
    ) -> Result<(Box<dyn BlockCompressor>, ArrayEncoding)>;
    fn create_fixed_per_value(
        &self,
        field: &Field,
        data: &DataBlock,
    ) -> Result<Box<dyn FixedPerValueCompressor>>;
    fn create_variable_per_value(
        &self,
        field: &Field,
        data: &DataBlock,
    ) -> Result<Box<dyn VariablePerValueCompressor>>;
    fn create_miniblock_compressor(
        &self,
        field: &Field,
        data: &DataBlock,
    ) -> Result<Box<dyn MiniBlockCompressor>>;
}

A trait to pick which compression to use for the given data.

There are several different kinds of compression.

  • Block compression is the most generic, but the most difficult to use efficiently.
  • Fixed-per-value compression results in a fixed number of bits for each value. It is used for wide fixed-width types like vector embeddings.
  • Variable-per-value compression results in two buffers: one buffer of offsets and one buffer of data bytes. It is used for wide variable-width types like strings, variable-length lists, binary, etc.
  • Mini-block compression results in a small block of opaque data for chunks of rows. Each block is somewhere between 0 and 16KiB in size. This is used for narrow data types (both fixed and variable length) where we can fit many values into a 16KiB block.
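
The split between "narrow" and "wide" types above can be illustrated with a rough sketch. Everything in it is hypothetical and not part of lance_encoding: the `CompressionKind` enum, the `pick_kind` function, and in particular the `MIN_VALUES_PER_MINIBLOCK` cutoff are assumptions made for illustration. Only the 16KiB upper bound comes from the description above. Block compression is left out because the description gives no rule for when it applies.

```rust
/// Hypothetical labels for three of the compression kinds described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompressionKind {
    MiniBlock,
    FixedPerValue,
    VariablePerValue,
}

/// Upper bound on the size of one mini-block (from the description above).
pub const MAX_MINIBLOCK_BYTES: u64 = 16 * 1024;

/// ASSUMPTION: a type counts as "narrow" if at least this many values fit
/// in one mini-block. The real cutoff used by the library is not stated.
pub const MIN_VALUES_PER_MINIBLOCK: u64 = 64;

/// Pick a compression kind from the average value size and whether the
/// type is fixed-width. A sketch of the narrow/wide split only.
pub fn pick_kind(avg_bytes_per_value: u64, fixed_width: bool) -> CompressionKind {
    // Guard against division by zero for empty or zero-sized values.
    let values_per_block = MAX_MINIBLOCK_BYTES / avg_bytes_per_value.max(1);
    if values_per_block >= MIN_VALUES_PER_MINIBLOCK {
        // Narrow: many values fit into a single mini-block.
        CompressionKind::MiniBlock
    } else if fixed_width {
        // Wide and fixed-width, e.g. vector embeddings.
        CompressionKind::FixedPerValue
    } else {
        // Wide and variable-width, e.g. long strings or binary.
        CompressionKind::VariablePerValue
    }
}
```

Under these assumptions, a 4-byte integer column maps to `MiniBlock`, while a 768-dimensional `f32` embedding (3072 bytes per value) maps to `FixedPerValue`.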

Required Methods

fn create_block_compressor( &self, field: &Field, data: &DataBlock, ) -> Result<(Box<dyn BlockCompressor>, ArrayEncoding)>

Create a block compressor for the given data

fn create_fixed_per_value( &self, field: &Field, data: &DataBlock, ) -> Result<Box<dyn FixedPerValueCompressor>>

Create a fixed-per-value compressor for the given data

fn create_variable_per_value( &self, field: &Field, data: &DataBlock, ) -> Result<Box<dyn VariablePerValueCompressor>>

Create a variable-per-value compressor for the given data

fn create_miniblock_compressor( &self, field: &Field, data: &DataBlock, ) -> Result<Box<dyn MiniBlockCompressor>>

Create a mini-block compressor for the given data
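
All four methods share one shape: the strategy inspects the data and returns a boxed compressor chosen for it. The sketch below shows that pattern with simplified stand-in types so that it compiles on its own. `ToyStrategy`, `ToyCompressor`, `PlainU32`, `ConstantU32`, and `ToyDefaultStrategy` are hypothetical names, not lance_encoding types, and the real methods take a `&Field` and a `&DataBlock` rather than a `&[u32]`.

```rust
use std::fmt::Debug;

/// Stand-in for a compressor trait object. NOT the lance_encoding trait.
pub trait ToyCompressor: Debug {
    fn compress(&self, values: &[u32]) -> Vec<u8>;
}

/// Writes every value as 4 little-endian bytes (no compression).
#[derive(Debug)]
pub struct PlainU32;

impl ToyCompressor for PlainU32 {
    fn compress(&self, values: &[u32]) -> Vec<u8> {
        values.iter().flat_map(|v| v.to_le_bytes()).collect()
    }
}

/// Writes only the first value; only valid when all values are equal.
#[derive(Debug)]
pub struct ConstantU32;

impl ToyCompressor for ConstantU32 {
    fn compress(&self, values: &[u32]) -> Vec<u8> {
        values
            .first()
            .map(|v| v.to_le_bytes().to_vec())
            .unwrap_or_default()
    }
}

/// Stand-in with the same bounds as the trait above (Send + Sync + Debug),
/// reduced to a single factory method over a plain slice.
pub trait ToyStrategy: Send + Sync + Debug {
    fn create_compressor(&self, data: &[u32]) -> Result<Box<dyn ToyCompressor>, String>;
}

#[derive(Debug, Default)]
pub struct ToyDefaultStrategy;

impl ToyStrategy for ToyDefaultStrategy {
    fn create_compressor(&self, data: &[u32]) -> Result<Box<dyn ToyCompressor>, String> {
        if data.is_empty() {
            return Err("no data to inspect".to_string());
        }
        // Look at the data, then choose the compressor that suits it.
        let all_equal = data.windows(2).all(|w| w[0] == w[1]);
        if all_equal {
            Ok(Box::new(ConstantU32))
        } else {
            Ok(Box::new(PlainU32))
        }
    }
}
```

Returning a trait object keeps the caller independent of the concrete compressor, so a strategy can change its selection rules without affecting call sites.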

Implementors