pub trait EngineData: AsAny {
    // Required methods
    fn visit_rows(
        &self,
        column_names: &[ColumnName],
        visitor: &mut dyn RowVisitor,
    ) -> DeltaResult<()>;
    fn len(&self) -> usize;
    fn append_columns(
        &self,
        schema: SchemaRef,
        columns: Vec<ArrayData>,
    ) -> DeltaResult<Box<dyn EngineData>>;
    fn apply_selection_vector(
        self: Box<Self>,
        selection_vector: Vec<bool>,
    ) -> DeltaResult<Box<dyn EngineData>>;
    fn has_field(&self, name: &ColumnName) -> bool;
    // Provided method
    fn is_empty(&self) -> bool { ... }
}
Any type that an engine wants to return as “data” needs to implement this trait. The bulk of the
work is in the EngineData::visit_rows method. See the docs for that method for more details.
struct MyDataType; // Whatever the engine wants here
impl MyDataType {
    fn do_extraction<'a>(&self) -> Vec<&'a dyn GetData<'a>> {
        // Actually do the extraction into getters
        todo!()
    }
}
impl EngineData for MyDataType {
    fn visit_rows(&self, leaf_columns: &[ColumnName], visitor: &mut dyn RowVisitor) -> DeltaResult<()> {
        let getters = self.do_extraction(); // do the extraction
        visitor.visit(self.len(), &getters)?; // call the visitor back with the getters, propagating any error
        Ok(())
    }
    fn len(&self) -> usize {
        todo!() // actually get the len here
    }
    fn append_columns(&self, schema: SchemaRef, columns: Vec<ArrayData>) -> DeltaResult<Box<dyn EngineData>> {
        todo!() // convert `SchemaRef` and `ArrayData` into local representation and append them
    }
    fn apply_selection_vector(self: Box<Self>, selection_vector: Vec<bool>) -> DeltaResult<Box<dyn EngineData>> {
        todo!() // filter out unselected rows and return the new set of data
    }
    fn has_field(&self, name: &ColumnName) -> bool {
        todo!() // determine whether the field exists in the data
    }
}
Required Methods
fn visit_rows(
    &self,
    column_names: &[ColumnName],
    visitor: &mut dyn RowVisitor,
) -> DeltaResult<()>
Visits a subset of leaf columns in each row of this data, passing a GetData item for each
requested column to the visitor’s visit method (along with the number of rows of data to
be visited).
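The callback flow described above can be sketched with simplified stand-ins. Everything in this block is hypothetical and exists only for illustration: `SimpleGetData`, `SimpleRowVisitor`, `Columns`, and `SumVisitor` are not kernel types, and the real `GetData` and `RowVisitor` traits are richer. The sketch assumes all columns hold nullable 64-bit integers and have the same length.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a per-column getter (not the kernel's `GetData`).
trait SimpleGetData {
    /// Returns the value at `row`, or `None` if it is null or out of range.
    fn get_long(&self, row: usize) -> Option<i64>;
}

impl SimpleGetData for Vec<Option<i64>> {
    fn get_long(&self, row: usize) -> Option<i64> {
        self.get(row).copied().flatten()
    }
}

/// Hypothetical stand-in for a row visitor (not the kernel's `RowVisitor`).
trait SimpleRowVisitor {
    fn visit(&mut self, row_count: usize, getters: &[&dyn SimpleGetData]) -> Result<(), String>;
}

/// Toy columnar data: column name -> values. Columns are assumed to be equally long.
struct Columns {
    columns: HashMap<String, Vec<Option<i64>>>,
    len: usize,
}

impl Columns {
    /// Builds one getter per requested column, then calls the visitor back once
    /// with the row count and the getters, in the order the columns were requested.
    fn visit_rows(
        &self,
        column_names: &[&str],
        visitor: &mut dyn SimpleRowVisitor,
    ) -> Result<(), String> {
        let mut getters: Vec<&dyn SimpleGetData> = Vec::with_capacity(column_names.len());
        for name in column_names {
            let column = self
                .columns
                .get(*name)
                .ok_or_else(|| format!("missing column: {name}"))?;
            getters.push(column);
        }
        visitor.visit(self.len, &getters)
    }
}

/// Example visitor: sums the non-null values of the first requested column.
struct SumVisitor {
    total: i64,
}

impl SimpleRowVisitor for SumVisitor {
    fn visit(&mut self, row_count: usize, getters: &[&dyn SimpleGetData]) -> Result<(), String> {
        let getter = getters.first().ok_or("expected at least one getter")?;
        for row in 0..row_count {
            if let Some(value) = getter.get_long(row) {
                self.total += value;
            }
        }
        Ok(())
    }
}
```

In this sketch the data type decides how getters are produced, while the visitor decides what to do with each row, which is the division of labor the trait method implies.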
fn append_columns(
    &self,
    schema: SchemaRef,
    columns: Vec<ArrayData>,
) -> DeltaResult<Box<dyn EngineData>>
Append new columns provided by Kernel to the existing data.
This method creates a new EngineData instance that combines the existing columns
with the provided new columns. The original data remains unchanged.
Parameters
- schema: The schema of the columns being appended (not the entire resulting schema). This schema must describe exactly the columns being added in the columns parameter.
- columns: The column data to append. Each ArrayData corresponds to one field in the schema.
Returns
A new EngineData instance containing both the original columns and the appended columns.
The schema of the result will contain all original fields followed by the new schema fields.
Errors
Returns an error if:
- The number of rows in any appended column doesn’t match the existing data.
- The number of new columns doesn’t match the number of schema fields.
- Data type conversion to the engine’s native data types fails.
- The engine cannot create the combined data structure.
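The contract above (validate, then return a new value and leave the original untouched) might look like the following in a toy engine. `Batch` and its fields are hypothetical; a real engine would convert `SchemaRef` and `ArrayData` into its own representation, which this sketch replaces with plain field names and `Vec<i64>` columns.

```rust
/// Hypothetical, simplified batch: parallel vectors of field names and columns.
#[derive(Debug, Clone, PartialEq)]
struct Batch {
    field_names: Vec<String>,
    columns: Vec<Vec<i64>>,
    num_rows: usize,
}

impl Batch {
    /// Returns a new batch with `new_columns` appended after the existing ones.
    /// `new_fields` names only the appended columns, mirroring the `schema`
    /// parameter. `self` is left unchanged.
    fn append_columns(
        &self,
        new_fields: Vec<String>,
        new_columns: Vec<Vec<i64>>,
    ) -> Result<Batch, String> {
        // The number of new columns must match the number of schema fields.
        if new_fields.len() != new_columns.len() {
            return Err(format!(
                "{} fields but {} columns",
                new_fields.len(),
                new_columns.len()
            ));
        }
        // Every appended column must have as many rows as the existing data.
        if let Some(bad) = new_columns.iter().find(|c| c.len() != self.num_rows) {
            return Err(format!(
                "column has {} rows, expected {}",
                bad.len(),
                self.num_rows
            ));
        }
        // Copy the original, then add the new fields after the existing ones.
        let mut result = self.clone();
        result.field_names.extend(new_fields);
        result.columns.extend(new_columns);
        Ok(result)
    }
}
```

Cloning the original is the simplest way to keep it unchanged in a sketch; an engine with reference-counted columns could likely share the existing column buffers instead.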
fn apply_selection_vector(
    self: Box<Self>,
    selection_vector: Vec<bool>,
) -> DeltaResult<Box<dyn EngineData>>
Applies a selection vector to the data and returns data that contains only the selected rows. This consumes the EngineData, allowing engines to perform the filtering in place if desired.
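Because the method takes `self: Box<Self>`, an implementation can reuse its own storage instead of allocating a copy. A minimal sketch of that idea follows; `Rows` is a hypothetical container, and rejecting a selection vector whose length differs from the row count is an assumption of this sketch, not documented kernel behavior.

```rust
/// Hypothetical row container standing in for an engine's data type.
struct Rows {
    values: Vec<i64>,
}

impl Rows {
    /// Consumes the boxed data and keeps only the rows whose selection entry is `true`.
    /// The filtering happens in place, so no second buffer is allocated.
    fn apply_selection_vector(
        mut self: Box<Self>,
        selection_vector: Vec<bool>,
    ) -> Result<Box<Rows>, String> {
        // Assumption of this sketch: the vector must cover every row exactly.
        if selection_vector.len() != self.values.len() {
            return Err(format!(
                "selection vector has {} entries for {} rows",
                selection_vector.len(),
                self.values.len()
            ));
        }
        // `Vec::retain` visits elements in order, so the two sequences stay aligned.
        let mut keep = selection_vector.into_iter();
        self.values.retain(|_| keep.next().unwrap_or(false));
        Ok(self)
    }
}
```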
fn has_field(&self, name: &ColumnName) -> bool
Returns true if a field at the given (possibly nested) path exists in this data’s schema.
For a top-level field named "foo", use ColumnName::new(["foo"]). For nested fields,
each non-leaf element of the path must be a struct field at that level.
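The path rule can be illustrated with a small recursive lookup. The `Field` enum and the free function below are hypothetical simplifications: a real implementation would walk the engine's own schema type and take a `ColumnName` rather than a slice of string slices.

```rust
/// Hypothetical minimal schema node: a leaf, or a struct with named children.
enum Field {
    Leaf,
    Struct(Vec<(String, Field)>),
}

/// Returns true if `path` names a field reachable from `fields`.
/// Every non-leaf element of the path must resolve to a struct field.
fn has_field(fields: &[(String, Field)], path: &[&str]) -> bool {
    let Some((first, rest)) = path.split_first() else {
        return false; // an empty path names no field
    };
    let Some((_, field)) = fields.iter().find(|(name, _)| name.as_str() == *first) else {
        return false; // no field with this name at the current level
    };
    match (rest.is_empty(), field) {
        (true, _) => true, // the whole path has been matched
        (false, Field::Struct(children)) => has_field(children, rest),
        (false, Field::Leaf) => false, // cannot descend into a non-struct field
    }
}
```

With a schema holding a top-level `id` leaf and an `add` struct that contains `path`, the lookups `["id"]` and `["add", "path"]` succeed, while `["id", "path"]` fails because `id` is not a struct.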
Provided Methods
fn is_empty(&self) -> bool
Trait Implementations
impl EngineDataArrowExt for Box<dyn EngineData>
Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.
fn try_into_record_batch(self) -> DeltaResult<RecordBatch>
Implementors
impl EngineData for ArrowEngineData
Available on (crate features default-engine-native-tls or default-engine-rustls or arrow-conversion) and crate feature default-engine-base only.