pub trait SchemaAdapter: Send + Sync {
    // Required methods
    fn map_column_index(
        &self,
        index: usize,
        file_schema: &Schema,
    ) -> Option<usize>;
    fn map_schema(
        &self,
        file_schema: &Schema,
    ) -> Result<(Arc<dyn SchemaMapper>, Vec<usize>)>;
}
Creates SchemaMappers to map file-level RecordBatches to a table schema, which may itself have been obtained by merging multiple file-level schemas.
This is useful for implementing schema evolution in partitioned datasets.
See DefaultSchemaAdapterFactory for more details and examples.
Required Methods
fn map_column_index(&self, index: usize, file_schema: &Schema) -> Option<usize>
Maps a column index in the table schema to a column index in a particular file schema.
This is used while reading a file to push down projections: projected column indexes are mapped from the table schema to the file schema.
Panics if index is not in range for the table schema.
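The lookup described above can be sketched without any Arrow types. This is only an illustration, not the crate's implementation: it assumes columns are matched by name, reduces both schemas to slices of column names, and uses a free function with a hypothetical signature in place of the trait method.

```rust
/// Sketch of map_column_index with schemas reduced to lists of column names
/// (an assumption made for illustration; the real method takes a &Schema).
///
/// Panics if `index` is out of range for `table_fields`, mirroring the
/// documented contract.
fn map_column_index(
    table_fields: &[&str],
    index: usize,
    file_fields: &[&str],
) -> Option<usize> {
    // Indexing panics when `index` is not a valid table column.
    let name = table_fields[index];
    // Find the position of the same-named column in the file, if it exists.
    file_fields.iter().position(|f| *f == name)
}
```

With a table schema of a, b, c and a file schema of c, a, table column 2 (c) maps to file column 0, while table column 1 (b) has no counterpart in the file and yields None.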
fn map_schema(&self, file_schema: &Schema) -> Result<(Arc<dyn SchemaMapper>, Vec<usize>)>
Creates a mapping for casting columns from the file schema to the table schema.
This is used after reading a record batch. The returned SchemaMapper:
- Maps columns to the expected column indexes
- Handles missing values (e.g. fills nulls or a default value) for columns in the table schema that are not in the file schema
- Handles different types: if a column in the file schema has a different type than in table_schema, the mapper will resolve this difference (e.g. by casting to the appropriate type)
Returns:
- a SchemaMapper
- an ordered list of columns to project from the file
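
To make the two return values concrete, here is a minimal sketch using only the standard library. It is not the crate's implementation: schemas are reduced to lists of column names, NameMapper and Columns are hypothetical stand-ins for SchemaMapper and RecordBatch, columns are assumed to be matched by name, and type casting is left out entirely.

```rust
/// Hypothetical stand-in for a record batch: one Vec per column, None = null.
type Columns = Vec<Vec<Option<i64>>>;

/// Hypothetical stand-in for a SchemaMapper: for each table column, the
/// position of its source column in the projected file batch, if any.
struct NameMapper {
    field_mappings: Vec<Option<usize>>,
}

impl NameMapper {
    /// Rearranges projected file columns into table order, filling columns
    /// that the file lacks with nulls.
    fn map_batch(&self, file_columns: &Columns, num_rows: usize) -> Columns {
        self.field_mappings
            .iter()
            .map(|mapping| match mapping {
                Some(i) => file_columns[*i].clone(),
                None => vec![None; num_rows],
            })
            .collect()
    }
}

/// Sketch of map_schema: returns the mapper plus the ordered list of file
/// column indexes to project when reading the file.
fn map_schema(table_fields: &[&str], file_fields: &[&str]) -> (NameMapper, Vec<usize>) {
    let mut field_mappings = vec![None; table_fields.len()];
    let mut projection = Vec::new();
    for (file_idx, name) in file_fields.iter().enumerate() {
        // Only file columns that also appear in the table schema are read.
        if let Some(table_idx) = table_fields.iter().position(|t| t == name) {
            field_mappings[table_idx] = Some(projection.len());
            projection.push(file_idx);
        }
    }
    (NameMapper { field_mappings }, projection)
}
```

For a table schema of a, b, c and a file schema of c, x, a, the projection is [0, 2] (read c and a, skip x). The mapper then places the projected columns in table order and produces an all-null column for b.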