pub struct DefaultSchemaAdapterFactory;
Default SchemaAdapterFactory for mapping schemas.
This can be used to adapt file-level record batches to a table schema and implement schema evolution.
Given an input file schema and a table schema, this factory returns SchemaAdapters that return SchemaMappers that:
- Reorder columns
- Cast columns to the correct type
- Fill missing columns with nulls
§Errors:
- If a column in the table schema is non-nullable but missing from the file schema, the returned mapper still tries to fill it with nulls, which results in a schema error.
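The rules above can be sketched with a small standard-library model. This is not DataFusion's implementation: ToyType, ToyField, ColumnPlan, field and plan_mapping are hypothetical names invented for illustration. As a simplification, the sketch rejects a missing non-nullable column while planning, whereas the documentation above says the real mapper fails when it tries to fill that column with nulls.

```rust
// Toy model of schema mapping. All names here are hypothetical, not DataFusion APIs.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyType {
    Int32,
    Float64,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct ToyField {
    name: &'static str,
    data_type: ToyType,
    nullable: bool,
}

/// Small constructor helper to keep the examples short.
fn field(name: &'static str, data_type: ToyType, nullable: bool) -> ToyField {
    ToyField { name, data_type, nullable }
}

/// How one output (table) column is produced from the file columns.
#[derive(Debug, PartialEq)]
enum ColumnPlan {
    /// Reuse the file column at this index unchanged (reordering only).
    Take(usize),
    /// Reuse the file column at this index, cast to the table's type.
    Cast(usize, ToyType),
    /// The column is absent from the file: emit an all-null column.
    FillNull,
}

/// Builds one plan entry per table field, in table order.
fn plan_mapping(table: &[ToyField], file: &[ToyField]) -> Result<Vec<ColumnPlan>, String> {
    table
        .iter()
        .map(|t| match file.iter().position(|f| f.name == t.name) {
            Some(i) if file[i].data_type == t.data_type => Ok(ColumnPlan::Take(i)),
            Some(i) => Ok(ColumnPlan::Cast(i, t.data_type)),
            None if t.nullable => Ok(ColumnPlan::FillNull),
            None => Err(format!(
                "non-nullable column '{}' is missing from the file schema",
                t.name
            )),
        })
        .collect()
}
```

For the schemas in the illustration below (file columns "c": Float64 and "b": Utf8, table columns "a", "b", "c"), this model plans FillNull for "a", Take(1) for "b" and Cast(0, Utf8) for "c". Declaring "a" as non-nullable makes plan_mapping return an error instead.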
§Illustration of Schema Mapping
┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─                  ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
 ┌───────┐   ┌───────┐ │                  ┌───────┐   ┌───────┐   ┌───────┐ │
││  1.0  │   │ "foo" │                   ││ NULL  │   │ "foo" │   │ "1.0" │
 ├───────┤   ├───────┤ │ Schema mapping   ├───────┤   ├───────┤   ├───────┤ │
││  2.0  │   │ "bar" │                   ││ NULL  │   │ "bar" │   │ "2.0" │
 └───────┘   └───────┘ │────────────────▶ └───────┘   └───────┘   └───────┘ │
│                                        │
 column "c"  column "b"│                  column "a"  column "b"  column "c"│
│ Float64      Utf8                      │  Int32       Utf8        Utf8
 ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘                  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ┘
   Input Record Batch                             Output Record Batch
  Schema {                                        Schema {
    "c": Float64,                                   "a": Int32,
    "b": Utf8,                                      "b": Utf8,
  }                                                 "c": Utf8,
                                                  }
§Example of using the DefaultSchemaAdapterFactory to map RecordBatches
Note: SchemaMapping also supports mapping partial batches, which is used as part of predicate pushdown.
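// Imports for this example. The crate paths below are assumptions and may
// differ between DataFusion versions.
use std::sync::Arc;
use arrow::datatypes::{DataType, Field, Schema};
use datafusion_common::record_batch;
use datafusion_datasource::schema_adapter::DefaultSchemaAdapterFactory;

// Table has fields "a", "b" and "c"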
let table_schema = Schema::new(vec![
Field::new("a", DataType::Int32, true),
Field::new("b", DataType::Utf8, true),
Field::new("c", DataType::Utf8, true),
]);
// Create an adapter to map batches from the file schema to the table schema
let adapter = DefaultSchemaAdapterFactory::from_schema(Arc::new(table_schema));
// The file schema has fields "c" and "b", but "b" is stored as a 'Float64'
// instead of 'Utf8'
let file_schema = Schema::new(vec![
Field::new("c", DataType::Utf8, true),
Field::new("b", DataType::Float64, true),
]);
// Get a mapping from the file schema to the table schema
let (mapper, _indices) = adapter.map_schema(&file_schema).unwrap();
let file_batch = record_batch!(
("c", Utf8, vec!["foo", "bar"]),
("b", Float64, vec![1.0, 2.0])
).unwrap();
let mapped_batch = mapper.map_batch(file_batch).unwrap();
// the mapped batch has the correct schema and the "b" column has been cast to Utf8
let expected_batch = record_batch!(
("a", Int32, vec![None, None]), // missing column filled with nulls
("b", Utf8, vec!["1.0", "2.0"]), // b was cast to string and order was changed
("c", Utf8, vec!["foo", "bar"])
).unwrap();
assert_eq!(mapped_batch, expected_batch);
Implementations§
impl DefaultSchemaAdapterFactory
pub fn from_schema(table_schema: SchemaRef) -> Box<dyn SchemaAdapter>
Create a new SchemaAdapter for mapping batches from a file schema to a table schema.
This is a convenience for DefaultSchemaAdapterFactory::create with the same schema for both the projected table schema and the table schema.
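The relationship between from_schema and create can be sketched with a standard-library stand-in. Everything prefixed with Toy is hypothetical and only mirrors the shape of the signatures on this page; the real types live in DataFusion and carry much more state.

```rust
use std::sync::Arc;

// Hypothetical stand-in for SchemaRef: a shared list of column names.
type ToySchemaRef = Arc<Vec<String>>;

// Hypothetical stand-in for the adapter returned by the factory.
struct ToyAdapter {
    projected_table_schema: ToySchemaRef,
}

// Mirrors the shape of SchemaAdapterFactory::create shown on this page.
trait ToyAdapterFactory {
    fn create(
        &self,
        projected_table_schema: ToySchemaRef,
        table_schema: ToySchemaRef,
    ) -> ToyAdapter;
}

#[derive(Debug, Default, Clone)]
struct ToyDefaultFactory;

impl ToyAdapterFactory for ToyDefaultFactory {
    // As in the signature on this page, the second argument is unused.
    fn create(
        &self,
        projected_table_schema: ToySchemaRef,
        _table_schema: ToySchemaRef,
    ) -> ToyAdapter {
        ToyAdapter { projected_table_schema }
    }
}

impl ToyDefaultFactory {
    // Convenience constructor: the same schema is passed for both arguments.
    fn from_schema(table_schema: ToySchemaRef) -> ToyAdapter {
        ToyDefaultFactory.create(Arc::clone(&table_schema), table_schema)
    }
}
```

In this sketch, calling ToyDefaultFactory::from_schema(schema) is equivalent to calling create with two handles to the same schema, which is the behavior the description above attributes to from_schema.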
Trait Implementations§
impl Clone for DefaultSchemaAdapterFactory
fn clone(&self) -> DefaultSchemaAdapterFactory
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for DefaultSchemaAdapterFactory
impl Default for DefaultSchemaAdapterFactory
fn default() -> DefaultSchemaAdapterFactory
Returns the “default value” for a type.
impl SchemaAdapterFactory for DefaultSchemaAdapterFactory
fn create(
    &self,
    projected_table_schema: SchemaRef,
    _table_schema: SchemaRef,
) -> Box<dyn SchemaAdapter>
Create a SchemaAdapter
Auto Trait Implementations§
impl Freeze for DefaultSchemaAdapterFactory
impl RefUnwindSafe for DefaultSchemaAdapterFactory
impl Send for DefaultSchemaAdapterFactory
impl Sync for DefaultSchemaAdapterFactory
impl Unpin for DefaultSchemaAdapterFactory
impl UnwindSafe for DefaultSchemaAdapterFactory
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.