Trait lance_encoding::encoder::FieldEncoder
pub trait FieldEncoder: Send {
    // Required methods
    fn maybe_encode(&mut self, array: ArrayRef) -> Result<Vec<EncodeTask>>;
    fn flush(&mut self) -> Result<Vec<EncodeTask>>;
    fn num_columns(&self) -> u32;
}
Top-level trait for encoding any Arrow array type into one or more pages.
A field encoder buffers and encodes a single input column, but that column may map to multiple output columns. For example, a list array or struct array will be encoded into multiple columns.
Fields may also emit pages at different rates. For example, given a struct column with three fields (a boolean field, an int32 field, and a 4096-dimension tensor field), the tensor field is likely to emit encoded pages much more frequently than the boolean field.
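To make the difference in rates concrete, the following back-of-the-envelope sketch counts how many rows each of the three fields needs before it fills one page. The 8 MiB page size, the 32-bit tensor elements, and the helper names are assumptions made for this illustration only; none of them come from the trait, and compression and validity buffers are ignored.

```rust
/// Hypothetical page size used only for this illustration (8 MiB).
const PAGE_BYTES: u64 = 8 * 1024 * 1024;

/// Rows needed to fill one page, given the uncompressed width of one value in bits.
fn rows_per_page(bits_per_value: u64) -> u64 {
    (PAGE_BYTES * 8) / bits_per_value
}

/// Rows per page for the example struct's fields:
/// (boolean, int32, 4096-dimension tensor).
fn example_rows_per_page() -> (u64, u64, u64) {
    let boolean = rows_per_page(1); // 1 bit per value
    let int32 = rows_per_page(32); // 4 bytes per value
    let tensor = rows_per_page(4096 * 32); // assumes 32-bit elements
    (boolean, int32, tensor)
}
```

Under these assumptions the tensor field fills a page every 512 rows, the int32 field every 2,097,152 rows, and the boolean field every 67,108,864 rows, so for the same input the tensor field emits pages thousands of times more often than the other two.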
Required Methods
fn maybe_encode(&mut self, array: ArrayRef) -> Result<Vec<EncodeTask>>
Buffer the data and, if there is enough data in the buffer to form a page, return an encoding task to encode the data.
This may return more than one task because a single column may be mapped to multiple output columns. For example, if encoding a struct column with three children then up to three tasks may be returned from each call to maybe_encode.
It may also return multiple tasks for a single column if the input array is larger than a single disk page.
It could also return an empty Vec if there is not enough data yet to encode any pages.
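The buffering behavior described above can be sketched with a toy single-column encoder. This is not the Lance implementation: `ToyTask` and `ToyInt32Encoder` are hypothetical stand-ins, a plain `&[i32]` replaces `ArrayRef`, the `Result` wrapper is dropped, and pages are cut by row count, which is an assumption made here to keep the example short.

```rust
/// Hypothetical stand-in for `EncodeTask`: the rows that would go into one page.
#[derive(Debug, PartialEq)]
struct ToyTask {
    rows: Vec<i32>,
}

/// Toy single-column encoder that cuts a page every `rows_per_page` rows.
struct ToyInt32Encoder {
    rows_per_page: usize,
    buffer: Vec<i32>,
}

impl ToyInt32Encoder {
    fn new(rows_per_page: usize) -> Self {
        assert!(rows_per_page > 0, "a page must hold at least one row");
        Self { rows_per_page, buffer: Vec::new() }
    }

    /// Buffer `values` and return one task per full page now available.
    /// Returns an empty Vec when the buffer does not yet hold a full page.
    fn maybe_encode(&mut self, values: &[i32]) -> Vec<ToyTask> {
        self.buffer.extend_from_slice(values);
        let mut tasks = Vec::new();
        while self.buffer.len() >= self.rows_per_page {
            // Keep the tail in the buffer and hand the head off as a page.
            let rest = self.buffer.split_off(self.rows_per_page);
            let page = std::mem::replace(&mut self.buffer, rest);
            tasks.push(ToyTask { rows: page });
        }
        tasks
    }
}
```

With `rows_per_page` set to 4, passing three rows returns no tasks, and passing six more returns two tasks while leaving one row buffered for a later call.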
fn flush(&mut self) -> Result<Vec<EncodeTask>>
Flush any data remaining in the buffers into encoding tasks.
fn num_columns(&self) -> u32
The number of output columns this encoder will create.
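As a rough illustration of how the three methods fit together for a nested field, the sketch below uses a simplified mirror of the trait. Everything in it is hypothetical: `ToyFieldEncoder`, `ToyLeaf`, `ToyStruct`, and `leaf` are invented for this example, row counts stand in for Arrow arrays, and the `Result` wrapper is omitted. In particular, the toy struct encoder counts only its children's columns; a real struct encoder may also emit columns of its own.

```rust
/// Hypothetical stand-in for `EncodeTask`: output column index and rows in the page.
#[derive(Debug, PartialEq)]
struct ToyTask {
    column: u32,
    num_rows: usize,
}

/// Simplified mirror of the three required methods (no `Result`, no Arrow types).
trait ToyFieldEncoder {
    fn maybe_encode(&mut self, num_rows: usize) -> Vec<ToyTask>;
    fn flush(&mut self) -> Vec<ToyTask>;
    fn num_columns(&self) -> u32;
}

/// Leaf encoder: one output column, one page per `rows_per_page` buffered rows.
struct ToyLeaf {
    column: u32,
    rows_per_page: usize,
    buffered: usize,
}

/// Build a boxed leaf encoder; rejects a zero page size.
fn leaf(column: u32, rows_per_page: usize) -> Box<dyn ToyFieldEncoder> {
    assert!(rows_per_page > 0, "a page must hold at least one row");
    Box::new(ToyLeaf { column, rows_per_page, buffered: 0 })
}

impl ToyFieldEncoder for ToyLeaf {
    fn maybe_encode(&mut self, num_rows: usize) -> Vec<ToyTask> {
        self.buffered += num_rows;
        let mut tasks = Vec::new();
        while self.buffered >= self.rows_per_page {
            self.buffered -= self.rows_per_page;
            tasks.push(ToyTask { column: self.column, num_rows: self.rows_per_page });
        }
        tasks
    }

    fn flush(&mut self) -> Vec<ToyTask> {
        // Emit whatever is left as a final, possibly short, page.
        let remaining = std::mem::take(&mut self.buffered);
        if remaining == 0 {
            Vec::new()
        } else {
            vec![ToyTask { column: self.column, num_rows: remaining }]
        }
    }

    fn num_columns(&self) -> u32 {
        1
    }
}

/// Struct-like encoder: forwards every call to its children.
struct ToyStruct {
    children: Vec<Box<dyn ToyFieldEncoder>>,
}

impl ToyFieldEncoder for ToyStruct {
    fn maybe_encode(&mut self, num_rows: usize) -> Vec<ToyTask> {
        self.children.iter_mut().flat_map(|c| c.maybe_encode(num_rows)).collect()
    }

    fn flush(&mut self) -> Vec<ToyTask> {
        self.children.iter_mut().flat_map(|c| c.flush()).collect()
    }

    fn num_columns(&self) -> u32 {
        self.children.iter().map(|c| c.num_columns()).sum()
    }
}
```

A caller would typically invoke `maybe_encode` once per incoming batch and `flush` once at the end. With children built as `leaf(0, 1000)`, `leaf(1, 100)`, and `leaf(2, 10)`, a 25-row batch yields two tasks for column 2 only, and the final `flush` yields one short page per column.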