Trait lance_encoding::encoder::FieldEncoder

pub trait FieldEncoder: Send {
    // Required methods
    fn maybe_encode(&mut self, array: ArrayRef) -> Result<Vec<EncodeTask>>;
    fn flush(&mut self) -> Result<Vec<EncodeTask>>;
    fn num_columns(&self) -> u32;
}

Top-level encoding trait for encoding any Arrow array type into one or more pages.

The field encoder buffers and encodes a single input column, but that column may map to multiple output columns. For example, a list array or struct array will be encoded into multiple columns.

Fields may also emit encoded pages at different rates. For example, given a struct column with three fields (a boolean field, an int32 field, and a 4096-dimension tensor field), the tensor field is likely to emit encoded pages much more frequently than the boolean field.
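To make the difference in page rates concrete, the following is a rough back-of-the-envelope sketch, not anything taken from lance_encoding. The 8 MiB page target, bit-packed booleans, and f32 tensor elements are all assumptions chosen for illustration, and `rows_per_page` and `demo_rows_per_page` are hypothetical helpers.

```rust
// Illustrative arithmetic only. PAGE_BYTES and the per-row widths below are
// assumptions, not values taken from lance_encoding.
const PAGE_BYTES: u64 = 8 * 1024 * 1024; // hypothetical page size target

/// Number of rows that fit in one page for a fixed-width field.
pub fn rows_per_page(bits_per_row: u64) -> u64 {
    PAGE_BYTES * 8 / bits_per_row
}

/// Rows needed to fill one page for each field of the example struct.
pub fn demo_rows_per_page() -> [(&'static str, u64); 3] {
    [
        ("boolean", rows_per_page(1)),        // assumes bit-packed booleans
        ("int32", rows_per_page(32)),         // 4 bytes per row
        ("tensor", rows_per_page(4096 * 32)), // assumes 4096 f32 elements per row
    ]
}
```

Under these assumptions the tensor field fills a page every 512 rows, the int32 field every 2,097,152 rows, and the boolean field every 67,108,864 rows, so the same stream of input batches produces tensor pages far more often.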

Required Methods

fn maybe_encode(&mut self, array: ArrayRef) -> Result<Vec<EncodeTask>>

Buffer the data and, if there is enough data in the buffer to form a page, return an encoding task to encode the data.

This may return more than one task because a single column may be mapped to multiple output columns. For example, if encoding a struct column with three children then up to three tasks may be returned from each call to maybe_encode.

It may also return multiple tasks for a single column if the input array is larger than a single disk page.

It may also return an empty Vec if there is not yet enough data to encode any pages.
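The buffering behavior described above can be sketched with a toy model. This is not the lance_encoding implementation: `ToyFieldEncoder`, `PageTask`, and `BufferingEncoder` are hypothetical stand-ins, a `Vec<i32>` replaces `ArrayRef`, a plain struct replaces `EncodeTask`, and the page size is counted in values rather than bytes.

```rust
// Toy stand-ins only: these are NOT the real lance_encoding or Arrow types.
type Result<T> = std::result::Result<T, String>;

/// Stand-in for `EncodeTask`: the values destined for one page.
#[derive(Debug, PartialEq)]
pub struct PageTask {
    pub column_index: u32,
    pub values: Vec<i32>,
}

/// Mirrors the shape of `FieldEncoder`, with `Vec<i32>` in place of `ArrayRef`.
pub trait ToyFieldEncoder: Send {
    fn maybe_encode(&mut self, array: Vec<i32>) -> Result<Vec<PageTask>>;
    fn flush(&mut self) -> Result<Vec<PageTask>>;
    fn num_columns(&self) -> u32;
}

pub struct BufferingEncoder {
    page_size: usize, // hypothetical: number of values per page
    buffer: Vec<i32>,
}

impl BufferingEncoder {
    pub fn new(page_size: usize) -> Self {
        assert!(page_size > 0, "page_size must be positive");
        Self { page_size, buffer: Vec::new() }
    }
}

impl ToyFieldEncoder for BufferingEncoder {
    fn maybe_encode(&mut self, array: Vec<i32>) -> Result<Vec<PageTask>> {
        self.buffer.extend(array);
        let mut tasks = Vec::new();
        // Emit one task per full page; a large input can yield several.
        while self.buffer.len() >= self.page_size {
            let rest = self.buffer.split_off(self.page_size);
            let page = std::mem::replace(&mut self.buffer, rest);
            tasks.push(PageTask { column_index: 0, values: page });
        }
        // Empty when less than one page of data is buffered.
        Ok(tasks)
    }

    fn flush(&mut self) -> Result<Vec<PageTask>> {
        if self.buffer.is_empty() {
            return Ok(Vec::new());
        }
        // Whatever remains becomes a final, possibly short, page.
        let page = std::mem::take(&mut self.buffer);
        Ok(vec![PageTask { column_index: 0, values: page }])
    }

    fn num_columns(&self) -> u32 {
        1
    }
}
```

With a page size of 4, passing two values yields no tasks, passing seven more yields two full pages, and a final `flush` yields one task holding the single leftover value.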

fn flush(&mut self) -> Result<Vec<EncodeTask>>

Flush any remaining data from the buffers into encoding tasks.

fn num_columns(&self) -> u32

The number of output columns this encoder will create.
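As a sketch of how a nested field could report its column count, the toy model below sums the counts of its children. It is an assumption-laden illustration rather than the library's code: `CountsColumns`, `Leaf`, `StructLike`, and `three_field_struct` are hypothetical names, and the real struct encoder may count columns differently (for example, by adding a column of its own).

```rust
// Toy model of column counting; NOT the lance_encoding implementation.
pub trait CountsColumns {
    fn num_columns(&self) -> u32;
}

/// A primitive field, assumed to map to exactly one output column.
pub struct Leaf;

impl CountsColumns for Leaf {
    fn num_columns(&self) -> u32 {
        1
    }
}

/// A struct-like field, assumed here to contribute only its children's columns.
pub struct StructLike {
    pub children: Vec<Box<dyn CountsColumns>>,
}

impl CountsColumns for StructLike {
    fn num_columns(&self) -> u32 {
        self.children.iter().map(|c| c.num_columns()).sum()
    }
}

/// The three-field struct from the example: boolean, int32, and tensor fields.
pub fn three_field_struct() -> StructLike {
    StructLike {
        children: vec![Box::new(Leaf), Box::new(Leaf), Box::new(Leaf)],
    }
}
```

In this model the three-field struct reports three columns, and nesting it inside another struct alongside one more leaf reports four.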

Implementors