pub struct TableSchema { /* private fields */ }
Helper to hold table schema information for partitioned data sources.
When reading partitioned data (such as Hive-style partitioning), a table’s schema consists of two parts:
- File schema: The schema of the actual data files on disk
- Partition columns: Columns that are encoded in the directory structure, not stored in the files themselves
§Example: Partitioned Table
Consider a table with the following directory structure:
/data/date=2025-10-10/region=us-west/data.parquet
/data/date=2025-10-11/region=us-east/data.parquet
In this case:
- File schema: The schema of the data.parquet files (e.g., [user_id, amount])
- Partition columns: [date, region], extracted from the directory path
- Table schema: The full schema combining both (e.g., [user_id, amount, date, region])
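To make the directory-to-column mapping concrete, here is a minimal sketch of how partition values could be pulled out of a Hive-style path. The function name `parse_partition_values` is hypothetical and is not part of DataFusion; it only illustrates the `key=value` convention described above, using the standard library alone.

```rust
/// Extract `key=value` partition pairs from the directory components of a path.
/// Hypothetical helper for illustration; not a DataFusion API.
fn parse_partition_values(path: &str) -> Vec<(String, String)> {
    let mut segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    // Drop the final component (the file name) so that a file whose name
    // happens to contain `=` is not misread as a partition directory.
    segments.pop();
    segments
        .into_iter()
        // Keep only directory components of the form `key=value`.
        .filter_map(|segment| segment.split_once('='))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

Applied to the first path in the listing above, this yields the pairs `date` / `2025-10-10` and `region` / `us-west`, which correspond to the partition columns [date, region].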
§When to Use
Use TableSchema when:
- Reading partitioned data sources (Parquet, CSV, etc. with Hive-style partitioning)
- You need to efficiently access different schema representations without reconstructing them
- You want to avoid repeatedly concatenating file and partition schemas
For non-partitioned data or when working with a single schema representation,
working directly with Arrow’s Schema or SchemaRef is simpler.
§Performance
This struct pre-computes and caches the full table schema, allowing cheap references to any representation without repeated allocations or reconstructions.
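The caching idea can be sketched with a simplified model. Everything below is an assumption made for illustration: `CachedSchema` and `FieldName` are hypothetical stand-ins (plain column names instead of Arrow fields), and this is not DataFusion's actual implementation. It only shows the pattern of concatenating once at construction and handing out references afterwards.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow field: just a column name.
type FieldName = String;

/// Simplified model of the caching pattern; not DataFusion's real struct.
#[derive(Clone, Debug)]
struct CachedSchema {
    file_fields: Arc<Vec<FieldName>>,
    partition_fields: Vec<FieldName>,
    /// Computed once in `new` and shared behind an `Arc` afterwards.
    table_fields: Arc<Vec<FieldName>>,
}

impl CachedSchema {
    fn new(file_fields: Arc<Vec<FieldName>>, partition_fields: Vec<FieldName>) -> Self {
        // Concatenate once, up front: file columns first, then partition columns.
        let combined: Vec<FieldName> = file_fields
            .iter()
            .chain(partition_fields.iter())
            .cloned()
            .collect();
        Self {
            file_fields,
            partition_fields,
            table_fields: Arc::new(combined),
        }
    }

    /// The file-only view, returned by reference.
    fn file_fields(&self) -> &Arc<Vec<FieldName>> {
        &self.file_fields
    }

    /// The partition columns, returned by reference.
    fn partition_fields(&self) -> &Vec<FieldName> {
        &self.partition_fields
    }

    /// The combined view; no concatenation happens on access.
    fn table_fields(&self) -> &Arc<Vec<FieldName>> {
        &self.table_fields
    }
}
```

In this model, a caller that needs ownership of the combined view can use `Arc::clone`, which bumps a reference count rather than copying the field list.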
Implementations§
impl TableSchema
pub fn new(file_schema: SchemaRef, table_partition_cols: Vec<FieldRef>) -> Self
Create a new TableSchema from a file schema and partition columns.
The table schema is automatically computed by appending the partition columns to the file schema.
You should prefer calling this method over
chaining TableSchema::from_file_schema and TableSchema::with_table_partition_cols
if you have both the file schema and partition columns available at construction time
since it avoids re-computing the table schema.
§Arguments
- file_schema: Schema of the data files (without partition columns)
- table_partition_cols: Partition columns to append to each row
§Example
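use std::sync::Arc;
use arrow::datatypes::{DataType, Field, Schema};
// TableSchema is assumed to be in scope, imported from the DataFusion crate that defines it.

let file_schema = Arc::new(Schema::new(vec![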
Field::new("user_id", DataType::Int64, false),
Field::new("amount", DataType::Float64, false),
]));
let partition_cols = vec![
Arc::new(Field::new("date", DataType::Utf8, false)),
Arc::new(Field::new("region", DataType::Utf8, false)),
];
let table_schema = TableSchema::new(file_schema, partition_cols);
// Table schema will have 4 columns: user_id, amount, date, region
assert_eq!(table_schema.table_schema().fields().len(), 4);
pub fn from_file_schema(file_schema: SchemaRef) -> Self
Create a new TableSchema with no partition columns.
You should prefer calling TableSchema::new if you have partition columns at
construction time since it avoids re-computing the table schema.
pub fn with_table_partition_cols(self, partition_cols: Vec<FieldRef>) -> Self
Add partition columns to an existing TableSchema, returning a new instance.
You should prefer calling TableSchema::new instead of chaining TableSchema::from_file_schema
into TableSchema::with_table_partition_cols if you have partition columns at construction time
since it avoids re-computing the table schema.
pub fn file_schema(&self) -> &SchemaRef
Get the file schema (without partition columns).
This is the schema of the actual data files on disk.
pub fn table_partition_cols(&self) -> &Vec<FieldRef>
Get the table partition columns.
These are the columns derived from the directory structure that will be appended to each row during query execution.
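As a rough illustration of what "appended" means here, the sketch below models a batch as a list of named columns and adds one constant-valued column per partition key. This is an assumption-laden simplification: `Column` and `append_partition_columns` are hypothetical names, values are plain strings, and the real engine operates on Arrow record batches rather than this structure.

```rust
/// Hypothetical, simplified column: a name paired with its string values.
type Column = (String, Vec<String>);

/// Append one constant-valued column per partition key to a file's columns.
/// Illustration only; not a DataFusion API.
fn append_partition_columns(
    mut file_columns: Vec<Column>,
    partition_values: &[(&str, &str)],
) -> Vec<Column> {
    // All columns in a batch share one row count; an empty batch has zero rows.
    let num_rows = file_columns.first().map_or(0, |(_, values)| values.len());
    for (name, value) in partition_values {
        // Every row read from the same file carries the same partition value.
        file_columns.push((name.to_string(), vec![value.to_string(); num_rows]));
    }
    file_columns
}
```

Because every row in a file shares the same partition values, the appended columns are constant within that file, and their order follows the order of the partition columns.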
pub fn table_schema(&self) -> &SchemaRef
Get the full table schema (file schema + partition columns).
This is the complete schema that will be seen by queries, combining both the columns from the files and the partition columns.
Trait Implementations§
impl Clone for TableSchema
fn clone(&self) -> TableSchema
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for TableSchema
impl RefUnwindSafe for TableSchema
impl Send for TableSchema
impl Sync for TableSchema
impl Unpin for TableSchema
impl UnwindSafe for TableSchema
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.