Module compression

Compression traits and definitions for Lance 2.1

In Lance 2.1 the first step of encoding is structural encoding, where we shred inputs into leaf arrays and take care of the validity / offsets structure. We then pick a structural encoding (mini-block or full-zip) and, finally, compress the data.

This module defines the traits for the compression step. Each structural encoding has its own compression strategy.

Miniblock compression is a block-based approach for small data. Because we accept some read amplification and always decompress entire blocks, we are able to use opaque compression.
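To illustrate what "opaque" means here, the following is a minimal sketch and not the Lance API. The `ToyBlockCodec` trait, the `RunLength` codec, and `read_value` are hypothetical names invented for this example, and run-length encoding is only a stand-in for whatever algorithm a real strategy would choose. The point is that the compressed bytes have no addressable per-value structure, so reading one value means decompressing the whole block.

```rust
/// Hypothetical stand-in for an opaque block codec (not a Lance trait).
trait ToyBlockCodec {
    fn compress(&self, block: &[u8]) -> Vec<u8>;
    fn decompress(&self, compressed: &[u8]) -> Vec<u8>;
}

/// Toy run-length codec: the output is a sequence of (run length, byte) pairs.
struct RunLength;

impl ToyBlockCodec for RunLength {
    fn compress(&self, block: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut i = 0;
        while i < block.len() {
            let byte = block[i];
            let mut run = 1usize;
            // Extend the run while the next byte matches, capped at 255
            // so the length fits in a single byte.
            while i + run < block.len() && block[i + run] == byte && run < 255 {
                run += 1;
            }
            out.push(run as u8);
            out.push(byte);
            i += run;
        }
        out
    }

    fn decompress(&self, compressed: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        for pair in compressed.chunks_exact(2) {
            out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
        }
        out
    }
}

/// Reading a single value requires decompressing the entire block,
/// which is the read amplification described above.
fn read_value(codec: &dyn ToyBlockCodec, compressed: &[u8], index: usize) -> u8 {
    codec.decompress(compressed)[index]
}
```

With a small block this cost is bounded, which is why an opaque codec is acceptable in the mini-block case.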

Fullzip compression is a per-value approach. It requires that values are transparently compressed, so that each individual value can still be located later.
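As a contrast, here is a rough sketch of a transparent per-value scheme, again under stated assumptions. `FixedWidth`, `compress_per_value`, and `decompress_one` are hypothetical names, and the frame-of-reference encoding with a fixed 2-byte slot is just one simple technique with the required property, not necessarily what Lance uses. Because every compressed value occupies the same number of bytes, value `i` can be found at byte offset `i * WIDTH` without touching its neighbors.

```rust
/// Bytes used per compressed value (an assumption of this sketch).
const WIDTH: usize = 2;

/// Hypothetical container: each value is stored as `value - base`
/// in a fixed-width little-endian slot.
struct FixedWidth {
    base: u32,
    bytes: Vec<u8>,
}

/// Compress each value independently. Returns `None` if the input is
/// empty or if some delta does not fit in `WIDTH` bytes.
fn compress_per_value(values: &[u32]) -> Option<FixedWidth> {
    let base = *values.iter().min()?;
    let mut bytes = Vec::with_capacity(values.len() * WIDTH);
    for &v in values {
        let delta = u16::try_from(v - base).ok()?;
        bytes.extend_from_slice(&delta.to_le_bytes());
    }
    Some(FixedWidth { base, bytes })
}

/// Locate and decode a single value without decompressing the others.
fn decompress_one(data: &FixedWidth, index: usize) -> Option<u32> {
    let start = index.checked_mul(WIDTH)?;
    let end = start.checked_add(WIDTH)?;
    let slot = data.bytes.get(start..end)?;
    let delta = u16::from_le_bytes([slot[0], slot[1]]);
    Some(data.base + u32::from(delta))
}
```

The trade-off is that a scheme like this usually compresses less aggressively than an opaque codec, in exchange for random access to individual values.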

Structs§

DefaultCompressionStrategy
DefaultDecompressionStrategy

Traits§

BlockCompressor
Trait for compression algorithms that compress an entire block of data into one opaque and self-described chunk.
BlockDecompressor
CompressionStrategy
Trait for picking which compression to use for the given data.
DecompressionStrategy
FixedPerValueDecompressor
MiniBlockDecompressor
VariablePerValueDecompressor