Trait SchemaAdapter

pub trait SchemaAdapter: Send + Sync {
    // Required methods
    fn map_column_index(
        &self,
        index: usize,
        file_schema: &Schema,
    ) -> Option<usize>;
    fn map_schema(
        &self,
        file_schema: &Schema,
    ) -> Result<(Arc<dyn SchemaMapper>, Vec<usize>)>;
}

Creates SchemaMappers that map file-level RecordBatches to a table schema, which may itself have been obtained by merging multiple file-level schemas.

This is useful for implementing schema evolution in partitioned datasets.

See DefaultSchemaAdapterFactory for more details and examples.

Required Methods

fn map_column_index(&self, index: usize, file_schema: &Schema) -> Option<usize>

Maps a column index in the table schema to a column index in a particular file schema.

This is used while reading a file to push down projections, by mapping projected column indexes from the table schema to the file schema.

Panics if index is not in range for the table schema.
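The index mapping described above can be illustrated with a minimal, standard-library-only sketch. It assumes columns are matched by name and that None means the file has no such column; neither assumption is stated on this page. `SimpleSchema` and `NameBasedAdapter` are hypothetical stand-ins for `Schema` and a `SchemaAdapter` implementation, not DataFusion types.

```rust
/// Hypothetical stand-in for a schema: an ordered list of column names.
struct SimpleSchema {
    fields: Vec<String>,
}

impl SimpleSchema {
    fn new(names: &[&str]) -> Self {
        Self {
            fields: names.iter().map(|n| n.to_string()).collect(),
        }
    }
}

/// Hypothetical adapter that matches table columns to file columns by name.
struct NameBasedAdapter {
    table_schema: SimpleSchema,
}

impl NameBasedAdapter {
    /// Maps a table-schema column index to the index of the same-named
    /// column in `file_schema`, or `None` if the file lacks that column.
    fn map_column_index(&self, index: usize, file_schema: &SimpleSchema) -> Option<usize> {
        // Slice indexing panics if `index` is out of range for the table
        // schema, mirroring the documented panic condition.
        let name = &self.table_schema.fields[index];
        file_schema.fields.iter().position(|f| f == name)
    }
}
```

With a table schema of `[a, b, c]` and a file schema of `[c, a]`, this sketch maps table index 0 to file index 1, table index 2 to file index 0, and table index 1 to None.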

fn map_schema(&self, file_schema: &Schema) -> Result<(Arc<dyn SchemaMapper>, Vec<usize>)>

Creates a mapping for casting columns from the file schema to the table schema.

This is used after reading a record batch. The returned SchemaMapper:

  1. Maps columns to their expected column indexes
  2. Handles missing values (e.g. fills in nulls or a default value) for columns that are in the table schema but not in the file schema
  3. Handles different types: if a column in the file schema has a different type than in the table schema, the mapper resolves the difference (e.g. by casting to the appropriate type)

Returns:

  • a SchemaMapper
  • an ordered list of columns to project from the file
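The three responsibilities and the two return values can be sketched with plain standard-library types. This is an illustration under stated assumptions, not the DataFusion implementation: `DataType`, `Column`, `Fields`, and `SketchMapper` are hypothetical simplified stand-ins for Arrow data types, arrays, `Schema`, and `SchemaMapper`; columns are assumed to be matched by name; missing columns are filled with nulls; only the Int32 to Int64 widening cast is handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int32,
    Int64,
}

/// Hypothetical in-memory column; stands in for an Arrow array.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Int32(Vec<Option<i32>>),
    Int64(Vec<Option<i64>>),
}

/// Hypothetical schema: ordered (name, type) pairs.
type Fields = Vec<(String, DataType)>;

/// Hypothetical mapper: for each table column, the position of its source
/// among the projected file columns (`None` if the file lacks the column).
#[derive(Debug)]
struct SketchMapper {
    table_types: Vec<DataType>,
    field_mappings: Vec<Option<usize>>,
}

/// Builds the mapper and the ordered list of file columns to project.
fn map_schema(table: &Fields, file: &Fields) -> (SketchMapper, Vec<usize>) {
    let mut projection = Vec::new();
    let mut field_mappings = vec![None; table.len()];
    for (file_idx, (name, _)) in file.iter().enumerate() {
        // Only file columns that also appear in the table schema are read.
        if let Some(table_idx) = table.iter().position(|(n, _)| n == name) {
            field_mappings[table_idx] = Some(projection.len());
            projection.push(file_idx);
        }
    }
    let table_types = table.iter().map(|(_, t)| *t).collect();
    (SketchMapper { table_types, field_mappings }, projection)
}

/// Resolves a type difference; only widening is supported in this sketch.
fn cast(col: &Column, to: DataType) -> Result<Column, String> {
    match (col, to) {
        (Column::Int32(v), DataType::Int64) => {
            Ok(Column::Int64(v.iter().map(|&x| x.map(i64::from)).collect()))
        }
        (Column::Int64(_), DataType::Int32) => {
            Err("unsupported cast: Int64 to Int32".to_string())
        }
        // Types already agree: pass the column through unchanged.
        _ => Ok(col.clone()),
    }
}

/// Builds an all-null column of the requested type.
fn null_column(ty: DataType, rows: usize) -> Column {
    match ty {
        DataType::Int32 => Column::Int32(vec![None; rows]),
        DataType::Int64 => Column::Int64(vec![None; rows]),
    }
}

impl SketchMapper {
    /// Reorders, null-fills, and casts the projected file columns so the
    /// result lines up with the table schema.
    fn map_batch(&self, projected: &[Column], rows: usize) -> Result<Vec<Column>, String> {
        self.table_types
            .iter()
            .zip(&self.field_mappings)
            .map(|(&ty, &mapping)| match mapping {
                Some(i) => cast(&projected[i], ty),
                None => Ok(null_column(ty, rows)),
            })
            .collect()
    }
}
```

In this sketch the projection is ordered by file column position, and each mapping entry indexes into the projected columns rather than into the full file schema, so the reader only needs to decode the columns that the table schema actually uses.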

Implementors