The top-level encoding module for Lance files.
Lance files are encoded using a FieldEncodingStrategy, which chooses what encoder to use for each field.
The current strategy is the StructuralEncodingStrategy, which uses “structural” encoding. A tree of encoders is built up for each field. The struct and list encoders simply pull off the validity and offsets and collect them. Then, in the primitive leaf encoder, the validity, offsets, and values are accumulated in an accumulation buffer. Once enough data has been collected, the primitive encoder uses either a miniblock encoding or a full zip encoding to create a page of data from the accumulation buffer.
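The accumulate-then-flush flow described above can be sketched with a toy leaf encoder. This is an illustrative sketch only, not the crate's implementation: `ToyLeafEncoder`, `PageLayout`, `Page`, `flush_threshold`, and `large_value_bytes` are hypothetical names, offsets are left out for brevity, and choosing the layout from the average value size is an assumption (the text above does not state how the encoder picks between miniblock and full zip).

```rust
/// Hypothetical page layouts, named after the two encodings mentioned above.
#[derive(Debug, PartialEq)]
pub enum PageLayout {
    MiniBlock,
    FullZip,
}

/// A page produced by draining the accumulation buffer.
#[derive(Debug)]
pub struct Page {
    pub layout: PageLayout,
    pub num_values: usize,
    pub num_bytes: usize,
}

/// Toy primitive leaf encoder: accumulates validity and values, then emits a page.
pub struct ToyLeafEncoder {
    validity: Vec<bool>,
    values: Vec<Vec<u8>>,
    buffered_bytes: usize,
    /// Assumption: flush once this many value bytes have been buffered.
    flush_threshold: usize,
    /// Assumption: average value size at which full zip is preferred.
    large_value_bytes: usize,
}

impl ToyLeafEncoder {
    pub fn new(flush_threshold: usize, large_value_bytes: usize) -> Self {
        Self {
            validity: Vec::new(),
            values: Vec::new(),
            buffered_bytes: 0,
            flush_threshold,
            large_value_bytes,
        }
    }

    /// Buffer one (possibly null) value; return a page once enough data is collected.
    pub fn push(&mut self, value: Option<&[u8]>) -> Option<Page> {
        match value {
            Some(bytes) => {
                self.validity.push(true);
                self.buffered_bytes += bytes.len();
                self.values.push(bytes.to_vec());
            }
            None => {
                self.validity.push(false);
                self.values.push(Vec::new());
            }
        }
        if self.buffered_bytes >= self.flush_threshold {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Drain the accumulation buffer into a page and pick a layout for it.
    pub fn flush(&mut self) -> Page {
        let num_values = self.validity.len();
        let num_bytes = self.buffered_bytes;
        let avg = if num_values == 0 { 0 } else { num_bytes / num_values };
        let layout = if avg >= self.large_value_bytes {
            PageLayout::FullZip
        } else {
            PageLayout::MiniBlock
        };
        self.validity.clear();
        self.values.clear();
        self.buffered_bytes = 0;
        Page { layout, num_values, num_bytes }
    }
}
```

In this sketch a caller pushes values one at a time and writes out whatever page `push` returns; a final call to `flush` would emit any data still buffered at the end of the input.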
Structs
- BatchEncoder - A batch encoder that encodes RecordBatch objects by delegating to field encoders for each top-level field in the batch.
- ColumnIndexSequence - Keeps track of the current column index and makes a mapping from field id to column index
- EncodedBatch - An encoded batch of data and a page table describing it
- EncodedColumn
- EncodedPage - An encoded page of data
- EncodingOptions - Options that control the encoding process
- OutOfLineBuffers - A tool to reserve space for buffers that are not in-line with the data
- StructuralEncodingStrategy - An encoding strategy used for 2.1+ files
Constants
- MIN_PAGE_BUFFER_ALIGNMENT - The minimum alignment for a page buffer. Writers must respect this.
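One way a writer might respect a minimum buffer alignment is to pad the current file offset up to the next aligned position before writing each page buffer. The sketch below is a generic illustration under that assumption: `align_up` and `padding_for` are hypothetical helpers, not part of this module, and the alignment is passed as a parameter (assumed to be a power of two) because the value of MIN_PAGE_BUFFER_ALIGNMENT is not stated here.

```rust
/// Round `offset` up to the next multiple of `alignment`.
/// Assumes `alignment` is a power of two (checked at runtime).
pub fn align_up(offset: u64, alignment: u64) -> u64 {
    assert!(alignment.is_power_of_two(), "alignment must be a power of two");
    (offset + alignment - 1) & !(alignment - 1)
}

/// Number of padding bytes to write before a buffer that starts at `offset`.
pub fn padding_for(offset: u64, alignment: u64) -> u64 {
    align_up(offset, alignment) - offset
}
```

A writer following this approach would emit `padding_for(current_offset, alignment)` zero bytes and then record the aligned offset as the buffer's position.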
Traits
- FieldEncoder - Top-level encoding trait to encode any Arrow array type into one or more pages.
- FieldEncodingStrategy - A trait to pick which kind of field encoding to use for a field
Functions
- default_encoding_strategy
- default_encoding_strategy_with_params - Create an encoding strategy with user-configured compression parameters
- encode_batch - Helper method to encode a batch of data into memory
Type Aliases
- EncodeTask - A task to create a page of data