Struct TableSchema

pub struct TableSchema { /* private fields */ }

Helper to hold table schema information for partitioned data sources.

When reading partitioned data (such as Hive-style partitioning), a table’s schema consists of two parts:

File schema: The schema of the actual data files on disk
Partition columns: Columns that are encoded in the directory structure, not stored in the files themselves

§Example: Partitioned Table

Consider a table with the following directory structure:

/data/date=2025-10-10/region=us-west/data.parquet
/data/date=2025-10-11/region=us-east/data.parquet

In this case:

File schema: The schema of data.parquet files (e.g., [user_id, amount])
Partition columns: [date, region] extracted from the directory path
Table schema: The full schema combining both (e.g., [user_id, amount, date, region])
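The relationship between the directory layout and the partition columns can be sketched with plain string handling. This is only an illustration of the Hive-style `key=value` convention: `parse_hive_partitions` and `table_columns` are hypothetical helpers, not part of `TableSchema`, and a real reader may handle details (such as escaping) that this sketch ignores.

```rust
/// Extract `key=value` partition pairs from the directory part of a
/// Hive-style path. Hypothetical helper for illustration only.
fn parse_hive_partitions(path: &str) -> Vec<(String, String)> {
    // Drop the file name so that only directory segments are inspected.
    let dir = path.rsplit_once('/').map_or("", |(dir, _file)| dir);
    dir.split('/')
        .filter_map(|segment| segment.split_once('='))
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}

/// Column names of the full table: file columns first, then the
/// partition column names found in the path.
fn table_columns(file_columns: &[&str], path: &str) -> Vec<String> {
    let mut columns: Vec<String> = file_columns.iter().map(|c| c.to_string()).collect();
    columns.extend(parse_hive_partitions(path).into_iter().map(|(key, _value)| key));
    columns
}
```

Calling `table_columns(&["user_id", "amount"], "/data/date=2025-10-10/region=us-west/data.parquet")` yields the column names in table-schema order, with the partition columns after the file columns.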

§When to Use

Use TableSchema when:

Reading partitioned data sources (Parquet, CSV, etc. with Hive-style partitioning)
You need to efficiently access different schema representations without reconstructing them
You want to avoid repeatedly concatenating file and partition schemas

For non-partitioned data or when working with a single schema representation, working directly with Arrow’s Schema or SchemaRef is simpler.

§Performance

This struct pre-computes and caches the full table schema, allowing cheap references to any representation without repeated allocations or reconstructions.
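The difference between rebuilding the combined schema on demand and caching it once can be sketched with the standard library alone. Here a list of column names stands in for an Arrow schema; `concat_columns` and `CachedColumns` are hypothetical names used only for this illustration and do not reflect the actual implementation.

```rust
use std::sync::Arc;

/// Naive approach: concatenate on every call, allocating a new Vec each time.
fn concat_columns(file: &[String], partition: &[String]) -> Vec<String> {
    file.iter().chain(partition).cloned().collect()
}

/// Cached approach: concatenate once at construction and hand out
/// references to the shared result afterwards.
struct CachedColumns {
    table: Arc<Vec<String>>,
}

impl CachedColumns {
    fn new(file: &[String], partition: &[String]) -> Self {
        Self { table: Arc::new(concat_columns(file, partition)) }
    }

    /// Borrow the pre-computed list; no allocation happens here.
    fn table(&self) -> &Arc<Vec<String>> {
        &self.table
    }
}
```

In this model every call to `table()` returns a reference to the same allocation, whereas every call to `concat_columns` builds a fresh vector.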

Implementations§

impl TableSchema

pub fn new(file_schema: SchemaRef, table_partition_cols: Vec<FieldRef>) -> Self

Create a new TableSchema from a file schema and partition columns.

The table schema is automatically computed by appending the partition columns to the file schema.

If both the file schema and the partition columns are available at construction time, prefer this method over chaining TableSchema::from_file_schema and TableSchema::with_table_partition_cols, since it avoids re-computing the table schema.

§Arguments

file_schema - Schema of the data files (without partition columns)
table_partition_cols - Partition columns to append to each row

§Example

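use std::sync::Arc;
use arrow::datatypes::{DataType, Field, Schema};
// Assumed import path; adjust it to wherever `TableSchema` is exported in your version.
use datafusion_datasource::TableSchema;

// Schema of the data files on disk (no partition columns).
let file_schema = Arc::new(Schema::new(vec![
    Field::new("user_id", DataType::Int64, false),
    Field::new("amount", DataType::Float64, false),
]));

// Columns encoded in the directory structure.
let partition_cols = vec![
    Arc::new(Field::new("date", DataType::Utf8, false)),
    Arc::new(Field::new("region", DataType::Utf8, false)),
];

let table_schema = TableSchema::new(file_schema, partition_cols);

// Table schema will have 4 columns: user_id, amount, date, region
assert_eq!(table_schema.table_schema().fields().len(), 4);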

pub fn from_file_schema(file_schema: SchemaRef) -> Self

Create a new TableSchema with no partition columns.

You should prefer calling TableSchema::new if you have partition columns at construction time since it avoids re-computing the table schema.

pub fn with_table_partition_cols(self, partition_cols: Vec<FieldRef>) -> Self

Add partition columns to an existing TableSchema, returning a new instance.

You should prefer calling TableSchema::new instead of chaining TableSchema::from_file_schema into TableSchema::with_table_partition_cols if you have partition columns at construction time since it avoids re-computing the table schema.
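The two construction routes can be compared on a simplified model. `MiniTableSchema` below is a hypothetical stand-in that uses column names instead of Arrow schemas; it mirrors the three constructor signatures documented here but makes no claim about how the real type is implemented.

```rust
use std::sync::Arc;

/// Hypothetical, simplified model of a table schema (column names only).
#[derive(Clone, Debug, PartialEq)]
struct MiniTableSchema {
    file_cols: Arc<Vec<String>>,
    partition_cols: Vec<String>,
    table_cols: Arc<Vec<String>>,
}

impl MiniTableSchema {
    /// Build the combined column list once, directly.
    fn new(file_cols: Arc<Vec<String>>, partition_cols: Vec<String>) -> Self {
        let mut all: Vec<String> = (*file_cols).clone();
        all.extend(partition_cols.iter().cloned());
        Self { file_cols, partition_cols, table_cols: Arc::new(all) }
    }

    /// No partition columns: the table columns are the file columns.
    fn from_file_schema(file_cols: Arc<Vec<String>>) -> Self {
        Self {
            table_cols: Arc::clone(&file_cols),
            file_cols,
            partition_cols: Vec::new(),
        }
    }

    /// Consume `self` and rebuild the combined list with the new
    /// partition columns; the previous combined list is discarded.
    fn with_table_partition_cols(self, partition_cols: Vec<String>) -> Self {
        Self::new(self.file_cols, partition_cols)
    }
}
```

In this model, `MiniTableSchema::new(file, parts)` and `MiniTableSchema::from_file_schema(file).with_table_partition_cols(parts)` compare equal, but the chained route sets up a combined list that is immediately thrown away and rebuilt.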

pub fn file_schema(&self) -> &SchemaRef

Get the file schema (without partition columns).

This is the schema of the actual data files on disk.

pub fn table_partition_cols(&self) -> &Vec<FieldRef>

Get the table partition columns.

These are the columns derived from the directory structure that will be appended to each row during query execution.

pub fn table_schema(&self) -> &SchemaRef

Get the full table schema (file schema + partition columns).

This is the complete schema that will be seen by queries, combining both the columns from the files and the partition columns.
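Because the partition columns are appended after the file columns, their positions in the full table schema follow directly from the two lengths. The helpers below are hypothetical and use only the standard library; they assume the append-at-the-end ordering described for `TableSchema::new`.

```rust
use std::ops::Range;

/// Indices of the partition columns within the full table schema,
/// assuming they are appended after all file columns.
fn partition_column_indices(num_file_cols: usize, num_partition_cols: usize) -> Range<usize> {
    num_file_cols..num_file_cols + num_partition_cols
}

/// Split a row laid out in table-schema order into
/// (values read from the file, values derived from the path).
/// Panics if `num_file_cols` exceeds the row length.
fn split_row<T>(row: &[T], num_file_cols: usize) -> (&[T], &[T]) {
    row.split_at(num_file_cols)
}
```

For the example table with file columns [user_id, amount] and partition columns [date, region], `partition_column_indices(2, 2)` covers positions 2 and 3 of the table schema.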

Trait Implementations§

impl Clone for TableSchema

fn clone(&self) -> TableSchema

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for TableSchema

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

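Auto Trait Implementations§

impl Freeze for TableSchema

impl RefUnwindSafe for TableSchema

impl Send for TableSchema

impl Sync for TableSchema

impl Unpin for TableSchema

impl UnwindSafe for TableSchema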

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

impl<T> ErasedDestructor for T
where T: 'static,
