

Trait EngineData 

pub trait EngineData: AsAny {
    // Required methods
    fn visit_rows(
        &self,
        column_names: &[ColumnName],
        visitor: &mut dyn RowVisitor,
    ) -> DeltaResult<()>;
    fn len(&self) -> usize;
    fn append_columns(
        &self,
        schema: SchemaRef,
        columns: Vec<ArrayData>,
    ) -> DeltaResult<Box<dyn EngineData>>;
    fn apply_selection_vector(
        self: Box<Self>,
        selection_vector: Vec<bool>,
    ) -> DeltaResult<Box<dyn EngineData>>;
    fn has_field(&self, name: &ColumnName) -> bool;

    // Provided method
    fn is_empty(&self) -> bool { ... }
}

Any type that an engine wants to return as “data” needs to implement this trait. The bulk of the work is in the EngineData::visit_rows method. See the docs for that method for more details.

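// Import paths as laid out in the delta_kernel crate; adjust to your version if needed
use delta_kernel::engine_data::{EngineData, GetData, RowVisitor};
use delta_kernel::expressions::{ArrayData, ColumnName};
use delta_kernel::schema::SchemaRef;
use delta_kernel::DeltaResult;

struct MyDataType; // Whatever the engine wants here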
impl MyDataType {
  fn do_extraction<'a>(&self) -> Vec<&'a dyn GetData<'a>> {
     // Actually do the extraction into getters
     todo!()
  }
}

impl EngineData for MyDataType {
  fn visit_rows(&self, leaf_columns: &[ColumnName], visitor: &mut dyn RowVisitor) -> DeltaResult<()> {
    let getters = self.do_extraction(); // do the extraction
    visitor.visit(self.len(), &getters) // call the visitor back with the getters and return its result
  }
  fn len(&self) -> usize {
    todo!() // actually get the len here
  }
  fn append_columns(&self, schema: SchemaRef, columns: Vec<ArrayData>) -> DeltaResult<Box<dyn EngineData>> {
    todo!() // convert `SchemaRef` and `ArrayData` into local representation and append them
  }
  fn apply_selection_vector(self: Box<Self>, selection_vector: Vec<bool>) -> DeltaResult<Box<dyn EngineData>> {
    todo!() // filter out unselected rows and return the new set of data
  }
  fn has_field(&self, name: &ColumnName) -> bool {
    todo!() // determine whether the field exists in the data
  }
}

Required Methods§


fn visit_rows( &self, column_names: &[ColumnName], visitor: &mut dyn RowVisitor, ) -> DeltaResult<()>

Visits a subset of leaf columns in each row of this data, passing a GetData item for each requested column to the visitor’s visit method (along with the number of rows of data to be visited).
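
The callback flow described above can be sketched with std-only stand-ins. Everything in this block (`Getter`, `Visitor`, `Batch`, `SumVisitor`, the `String` error type, and `i64`-only columns) is a hypothetical simplification and not the kernel's `GetData`, `RowVisitor`, or `DeltaResult`; it only illustrates the shape of the interaction: resolve the requested columns to getters, then call the visitor once with the row count.

```rust
// Hypothetical, std-only stand-ins for the kernel's GetData / RowVisitor
// traits. The real traits have richer signatures and more data types.
trait Getter {
    fn get_long(&self, row: usize) -> Option<i64>;
}

trait Visitor {
    fn visit(&mut self, row_count: usize, getters: &[&dyn Getter]) -> Result<(), String>;
}

// A nullable i64 column can hand out its own values.
impl Getter for Vec<Option<i64>> {
    fn get_long(&self, row: usize) -> Option<i64> {
        self.get(row).copied().flatten()
    }
}

// A columnar batch: (column name, values) pairs of equal length.
struct Batch {
    columns: Vec<(String, Vec<Option<i64>>)>,
}

impl Batch {
    fn len(&self) -> usize {
        self.columns.first().map_or(0, |(_, values)| values.len())
    }

    // Resolve each requested column to a getter (in the order requested),
    // then call the visitor back once with the row count and the getters.
    fn visit_rows(&self, column_names: &[&str], visitor: &mut dyn Visitor) -> Result<(), String> {
        let mut getters: Vec<&dyn Getter> = Vec::with_capacity(column_names.len());
        for name in column_names {
            let (_, values) = self
                .columns
                .iter()
                .find(|(n, _)| n.as_str() == *name)
                .ok_or_else(|| format!("no such column: {name}"))?;
            getters.push(values);
        }
        visitor.visit(self.len(), &getters)
    }
}

// Example visitor: sums the non-null values of the first requested column.
struct SumVisitor {
    total: i64,
}

impl Visitor for SumVisitor {
    fn visit(&mut self, row_count: usize, getters: &[&dyn Getter]) -> Result<(), String> {
        let getter = getters
            .first()
            .ok_or_else(|| "expected at least one column".to_string())?;
        for row in 0..row_count {
            if let Some(value) = getter.get_long(row) {
                self.total += value;
            }
        }
        Ok(())
    }
}
```

In this sketch the visitor never sees the batch itself, only getters, which is what lets each engine keep its own in-memory representation.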


fn len(&self) -> usize

Returns the number of items (rows) in this data.


fn append_columns( &self, schema: SchemaRef, columns: Vec<ArrayData>, ) -> DeltaResult<Box<dyn EngineData>>

Append new columns provided by Kernel to the existing data.

This method creates a new EngineData instance that combines the existing columns with the provided new columns. The original data remains unchanged.

§Parameters
  • schema: The schema of the columns being appended (not the entire resulting schema). This schema must describe exactly the columns being added in the columns parameter.
  • columns: The column data to append. Each ArrayData corresponds to one field in the schema.
§Returns

A new EngineData instance containing both the original columns and the appended columns. The schema of the result will contain all original fields followed by the new schema fields.

§Errors

Returns an error if:

  • The number of rows in any appended column doesn’t match the existing data.
  • The number of new columns doesn’t match the number of schema fields.
  • Data type conversion to the engine’s native data types fails.
  • The engine cannot create the combined data structure.
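
The contract above (new instance, original untouched, original fields first, errors on count or row mismatches) can be illustrated with a minimal std-only sketch. `Table`, its `i64`-only columns, the slice-of-names "schema", and the `String` error type are all hypothetical stand-ins for an engine's own types, `SchemaRef`, `ArrayData`, and `DeltaResult`.

```rust
// Hypothetical stand-in for an engine's data type. Columns hold i64 values
// only, and the "schema" of appended columns is just a list of field names.
#[derive(Debug, Clone, PartialEq)]
struct Table {
    fields: Vec<String>,
    columns: Vec<Vec<i64>>,
    num_rows: usize,
}

impl Table {
    // Takes `&self`, so the original table is left unchanged; the result is
    // a new table with the original fields followed by the appended ones.
    fn append_columns(&self, schema: &[&str], columns: Vec<Vec<i64>>) -> Result<Table, String> {
        // The schema must describe exactly the columns being added.
        if schema.len() != columns.len() {
            return Err(format!(
                "schema has {} fields but {} columns were given",
                schema.len(),
                columns.len()
            ));
        }
        // Every appended column must have as many rows as the existing data.
        if let Some(idx) = columns.iter().position(|c| c.len() != self.num_rows) {
            return Err(format!(
                "column {:?} has {} rows, expected {}",
                schema[idx],
                columns[idx].len(),
                self.num_rows
            ));
        }
        let mut fields = self.fields.clone();
        fields.extend(schema.iter().map(|name| name.to_string()));
        let mut all_columns = self.columns.clone();
        all_columns.extend(columns);
        Ok(Table { fields, columns: all_columns, num_rows: self.num_rows })
    }
}
```

A real implementation would also convert each `ArrayData` into the engine's native column type, which is where the data type conversion error listed above could arise.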

fn apply_selection_vector( self: Box<Self>, selection_vector: Vec<bool>, ) -> DeltaResult<Box<dyn EngineData>>

Applies a selection vector to the data and returns data containing only the selected rows. This consumes the EngineData, allowing engines to implement this “in place” if desired.
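
As a rough illustration of the “in place” option, the sketch below filters a boxed value's own storage and hands the same allocation back. `RowData`, `Rows`, and the `String` error type are hypothetical stand-ins, and treating a length mismatch between the selection vector and the data as an error is an assumption made for this sketch, not documented kernel behavior.

```rust
// Hypothetical stand-in trait mirroring the by-value `Box<Self>` receiver.
trait RowData {
    fn len(&self) -> usize;
    fn rows(&self) -> &[i64];
    fn apply_selection_vector(
        self: Box<Self>,
        selection_vector: Vec<bool>,
    ) -> Result<Box<dyn RowData>, String>;
}

struct Rows(Vec<i64>);

impl RowData for Rows {
    fn len(&self) -> usize {
        self.0.len()
    }

    fn rows(&self) -> &[i64] {
        &self.0
    }

    // Because `self` is owned, the rows can be filtered in place instead of
    // being copied into a fresh buffer.
    fn apply_selection_vector(
        mut self: Box<Self>,
        selection_vector: Vec<bool>,
    ) -> Result<Box<dyn RowData>, String> {
        // Assumption for this sketch: lengths must match exactly.
        if selection_vector.len() != self.0.len() {
            return Err(format!(
                "selection vector has {} entries for {} rows",
                selection_vector.len(),
                self.0.len()
            ));
        }
        // `retain` visits elements in order, so the two sequences stay aligned.
        let mut keep = selection_vector.into_iter();
        self.0.retain(|_| keep.next().unwrap_or(false));
        let filtered: Box<dyn RowData> = self;
        Ok(filtered)
    }
}
```

An engine whose representation is immutable could just as well build and return a new value; the by-value receiver only makes the in-place route possible.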


fn has_field(&self, name: &ColumnName) -> bool

Returns true if a field at the given (possibly nested) path exists in this data’s schema.

For a top-level field named "foo", use ColumnName::new(["foo"]). For nested fields, each non-leaf element of the path must be a struct field at that level.
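
The path rule can be sketched as a simple walk over a nested schema. The `Field` enum and the free function `has_field` below are hypothetical, std-only stand-ins (plain `&str` path elements instead of `ColumnName`, and no data types on leaves); returning `false` for an empty path is an assumption of this sketch.

```rust
// Hypothetical stand-in schema: a field is either a leaf or a struct of
// named child fields. These are not the kernel's schema types.
enum Field {
    Leaf,
    Struct(Vec<(String, Field)>),
}

// Walk the path one element at a time. Every element except the last must
// resolve to a struct so the next element can be looked up inside it.
fn has_field(root: &[(String, Field)], path: &[&str]) -> bool {
    let mut current = root;
    for (i, name) in path.iter().enumerate() {
        let Some((_, field)) = current.iter().find(|(n, _)| n.as_str() == *name) else {
            return false; // no field with this name at this level
        };
        let is_last = i + 1 == path.len();
        match field {
            Field::Struct(children) => current = children.as_slice(),
            Field::Leaf if is_last => {}
            Field::Leaf => return false, // cannot descend through a leaf
        }
    }
    // Assumption for this sketch: an empty path names no field.
    !path.is_empty()
}
```

With a schema containing a leaf `foo` and a struct `a` with child `b`, the paths `["foo"]`, `["a"]`, and `["a", "b"]` exist, while `["foo", "b"]` does not because `foo` is not a struct.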

Provided Methods§


fn is_empty(&self) -> bool

Returns true if the data is empty (i.e., has no rows).
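
Since the body of the provided method is elided on this page, the following is only a guess at the usual pattern: a default method expressed in terms of the required `len`. `HasLen` and `Fixed` are hypothetical names used for illustration.

```rust
// Hypothetical stand-in trait: one required method, one provided method.
trait HasLen {
    fn len(&self) -> usize;

    // Provided method: implementors get this for free but may override it.
    // Assumed here to be defined in terms of `len`.
    fn is_empty(&self) -> bool {
        self.len() == 0
    }
}

// An implementor only has to supply `len`.
struct Fixed(usize);

impl HasLen for Fixed {
    fn len(&self) -> usize {
        self.0
    }
}
```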

Trait Implementations§


impl EngineDataArrowExt for Box<dyn EngineData>

Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.

Implementors§


impl EngineData for ArrowEngineData

Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.