pub struct SuperTable {
    pub batches: Vec<Arc<Table>>,
    pub schema: Vec<Arc<Field>>,
    pub n_rows: usize,
    pub name: String,
}
§SuperTable
Higher-order container representing a sequence of Table batches with consistent schema.
§Overview
- Each batch is a Table (record batch) with identical column metadata.
- Stored as Vec<Arc<Table>>, preserving order and schema consistency.
- Row counts per batch may vary, but are consistent across all Table columns.
- When exported via Arrow FFI, the batches are viewed as a single logical table.
- Useful for open-ended streams, partitioned datasets, or other scenarios where batches are processed independently.
§Fields
- batches: ordered collection of Table batches.
- schema: cached schema from the first batch for fast access.
- n_rows: total row count across all batches.
- name: super table name.
§Use cases
- Streaming and mini-batch processing.
- Reading multiple Arrow IPC/memory-mapped files as one dataset.
- Parallel or windowed in-memory analytics.
- Incremental table construction where batches arrive over time.
Fields§
batches: Vec<Arc<Table>>
schema: Vec<Arc<Field>>
n_rows: usize
name: String
Implementations§
impl SuperTable
pub fn new(name: String) -> SuperTable
Creates a new empty SuperTable with the specified name.
pub fn from_batches(
    batches: Vec<Arc<Table>>,
    name_override: Option<String>,
) -> SuperTable
Builds from a collection of Table batches.
Panics if the column count or field metadata is inconsistent across batches.
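The consistency requirement can be illustrated with a minimal stand-in (not the crate's actual internals): field metadata is modelled as plain (name, dtype) pairs, and each batch's schema must match the first batch exactly.

```rust
// Minimal sketch of the schema-consistency check behind `from_batches`.
// `Field` is simplified to a (name, dtype) pair; the real crate compares
// full field metadata including nullability.
fn schemas_consistent(batches: &[Vec<(&str, &str)>]) -> bool {
    match batches.split_first() {
        None => true, // no batches: trivially consistent
        Some((first, rest)) => rest.iter().all(|b| b == first),
    }
}

fn main() {
    let a = vec![("id", "i64"), ("name", "utf8")];
    let b = vec![("id", "i64"), ("name", "utf8")];
    let c = vec![("id", "i32"), ("name", "utf8")]; // dtype mismatch
    assert!(schemas_consistent(&[a.clone(), b.clone()]));
    assert!(!schemas_consistent(&[a, b, c]));
    println!("schema checks passed");
}
```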
pub fn insert_rows(
    &mut self,
    index: usize,
    other: impl Into<SuperTable>,
) -> Result<(), MinarrowError>
Inserts rows from another SuperTable (or Table) at the specified index.
This is an O(n) operation where n is the number of rows in the batch containing the insertion point.
§Arguments
- index - Global row position before which to insert (0 = prepend, n_rows = append)
- other - SuperTable or Table to insert (via Into<SuperTable>)
§Requirements
- Schema (column names, types, nullability) must match
- index must be <= self.n_rows
§Strategy
Finds the batch containing the insertion point, splits it at that position, then inserts other’s batches in between the split halves. This redistributes rows across batches while preserving chunked structure.
§Errors
- IndexError if index > n_rows
- Schema mismatch if field metadata doesn’t match
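The split-and-splice strategy can be sketched on plain row batches, with Vec<i64> standing in for a Table batch (an illustration of the approach, not the crate's code):

```rust
// Sketch of the insert strategy: locate the batch containing the global
// insertion point, split it there if the point falls mid-batch, then
// splice the incoming batches between the halves. Assumes `index` is
// already validated (the real method returns IndexError otherwise).
fn insert_batches(batches: &mut Vec<Vec<i64>>, index: usize, other: Vec<Vec<i64>>) {
    let mut remaining = index;
    let mut pos = batches.len(); // insertion position in the batch list
    for (i, b) in batches.iter().enumerate() {
        if remaining <= b.len() {
            pos = i;
            break;
        }
        remaining -= b.len();
    }
    if pos < batches.len() && remaining > 0 && remaining < batches[pos].len() {
        // Insertion point falls inside a batch: split it in two.
        let tail = batches[pos].split_off(remaining);
        batches.insert(pos + 1, tail);
        pos += 1;
    } else if pos < batches.len() && remaining == batches[pos].len() {
        pos += 1; // insertion lands exactly on a batch boundary
    }
    for b in other {
        batches.insert(pos, b);
        pos += 1;
    }
}

fn main() {
    let mut batches = vec![vec![1, 2, 3], vec![4, 5, 6]];
    insert_batches(&mut batches, 1, vec![vec![10, 11]]);
    // Batch [1, 2, 3] was split at row 1; the new batch sits in between.
    assert_eq!(batches, vec![vec![1], vec![10, 11], vec![2, 3], vec![4, 5, 6]]);
    println!("insert ok");
}
```

Only the batch containing the insertion point is touched, which is why the cost is O(n) in that batch's rows rather than in the whole table.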
pub fn n_cols(&self) -> usize
pub fn cols(&self) -> Vec<Arc<Field>>
Returns the columns of the SuperTable.
Assumes that all inner tables have the same fields.
pub fn n_rows(&self) -> usize
pub fn n_batches(&self) -> usize
pub fn len(&self) -> usize
pub fn is_empty(&self) -> bool
pub fn schema(&self) -> &[Arc<Field>]
pub fn batches(&self) -> &[Arc<Table>]
pub fn batch(&self, idx: usize) -> Option<&Arc<Table>>
pub fn view(&self, offset: usize, len: usize) -> SuperTableV
pub fn from_views(slices: &[TableV], name: String) -> SuperTable
pub fn rechunk(
    &mut self,
    strategy: RechunkStrategy,
) -> Result<(), MinarrowError>
Rechunks the table according to the specified strategy.
Redistributes rows across batches using an efficient incremental approach that avoids full materialization:
- Count(n): Creates batches of n rows (last batch may be smaller)
- Auto: Uses a default size of 8192 rows
- Memory(bytes): Targets a specific memory size per batch
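The Count(n) path can be sketched on plain row batches (Vec<i64> standing in for a Table batch); rows are drained incrementally into fixed-size chunks rather than materialised into one large buffer first:

```rust
// Sketch of Count(n) rechunking: stream rows from the old batches into
// new fixed-size chunks. Only one output chunk is under construction at
// a time, mirroring the incremental approach described above.
fn rechunk_count(batches: Vec<Vec<i64>>, n: usize) -> Vec<Vec<i64>> {
    assert!(n > 0, "Count(0) is invalid"); // real method returns IndexError
    let mut out: Vec<Vec<i64>> = Vec::new();
    let mut current = Vec::with_capacity(n);
    for batch in batches {
        for row in batch {
            current.push(row);
            if current.len() == n {
                out.push(std::mem::replace(&mut current, Vec::with_capacity(n)));
            }
        }
    }
    if !current.is_empty() {
        out.push(current); // last batch may be smaller
    }
    out
}

fn main() {
    let chunks = rechunk_count(vec![vec![1, 2, 3], vec![4, 5]], 2);
    assert_eq!(chunks, vec![vec![1, 2], vec![3, 4], vec![5]]);
    println!("rechunk ok");
}
```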
§Arguments
- strategy - The rechunking strategy to use
§Errors
- Returns IndexError if Count(0) is specified
- Returns IndexError if memory-based calculation results in 0 chunk size
§Example
// Rechunk into 1024-row batches
table.rechunk(RechunkStrategy::Count(1024))?;
// Rechunk with default size
table.rechunk(RechunkStrategy::Auto)?;
// Target 64KB per batch
table.rechunk(RechunkStrategy::Memory(65536))?;
pub fn rechunk_to(
    &mut self,
    up_to_row: usize,
    strategy: RechunkStrategy,
) -> Result<(), MinarrowError>
Rechunks only the first up_to_row rows, leaving the rest untouched.
This is useful for streaming scenarios where new data is being appended and you want to rechunk stable data while leaving recent additions alone.
§Arguments
- up_to_row - Rechunk only rows before this index
- strategy - The rechunking strategy to use
§Errors
- Returns IndexError if up_to_row is greater than total row count
- Returns the same errors as rechunk() for invalid strategies
§Example
// Rechunk first 1000 rows, leave the rest untouched
table.rechunk_to(1000, RechunkStrategy::Count(512))?;
Trait Implementations§
impl AsRef<SuperTable> for PyTable
fn as_ref(&self) -> &SuperTable
impl Clone for SuperTable
fn clone(&self) -> SuperTable
fn clone_from(&mut self, source: &Self)
impl Concatenate for SuperTable
fn concat(self, other: SuperTable) -> Result<SuperTable, MinarrowError>
Concatenates two SuperTables by appending all batches from other to self.
§Requirements
- Both SuperTables must have the same schema (column names and types)
§Returns
A new SuperTable containing all batches from self followed by all batches from other
§Errors
- IncompatibleTypeError if schemas don’t match
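The shape of this operation can be sketched with a simplified stand-in type (the `Super` struct and string error below are illustrations, not the crate's API): batch handles are appended without copying any row data.

```rust
// Simplified stand-in for a SuperTable: a schema plus a list of batches.
#[derive(Debug, PartialEq)]
struct Super {
    schema: Vec<String>,
    batches: Vec<Vec<i64>>,
}

// Sketch of Concatenate: verify both sides share a schema, then append
// the batches from `other`. Cost is O(number of batches); row data is
// never copied.
fn concat(mut a: Super, b: Super) -> Result<Super, String> {
    if a.schema != b.schema {
        return Err("IncompatibleTypeError: schemas don't match".to_string());
    }
    a.batches.extend(b.batches);
    Ok(a)
}

fn main() {
    let a = Super { schema: vec!["id".to_string()], batches: vec![vec![1, 2]] };
    let b = Super { schema: vec!["id".to_string()], batches: vec![vec![3]] };
    let merged = concat(a, b).unwrap();
    assert_eq!(merged.batches, vec![vec![1, 2], vec![3]]);
    println!("concat ok");
}
```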
impl Consolidate for SuperTable
fn consolidate(self) -> Table
Consolidates all batches into a single contiguous Table.
Materialises all rows from all batches into one table. Use this when you need contiguous memory for operations or APIs that require single buffers.
Uses self.name for the resulting table. Rename afterwards if needed.
When the arena feature is enabled, all column buffers are written
into a single allocation then sliced into typed views, reducing
allocation count from O(columns) to O(1). The resulting buffers
are SharedBuffer-backed; mutations trigger copy-on-write.
Without the arena feature, falls back to per-column concat.
§Panics
Panics if the SuperTable is empty.
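The single-allocation idea behind the arena path can be sketched on plain row batches (Vec<i64> standing in for one column's data across batches; the real implementation does this per column with typed buffers):

```rust
// Sketch of Consolidate: flatten every batch into one contiguous buffer.
// Panics on empty input, matching the documented behaviour.
fn consolidate(batches: Vec<Vec<i64>>) -> Vec<i64> {
    assert!(!batches.is_empty(), "cannot consolidate an empty SuperTable");
    // Size the destination up front so only one allocation is made,
    // analogous to the arena path's single backing allocation.
    let total: usize = batches.iter().map(|b| b.len()).sum();
    let mut out = Vec::with_capacity(total);
    for b in batches {
        out.extend(b);
    }
    out
}

fn main() {
    let flat = consolidate(vec![vec![1, 2], vec![3], vec![4, 5]]);
    assert_eq!(flat, vec![1, 2, 3, 4, 5]);
    println!("consolidate ok");
}
```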
impl Debug for SuperTable
impl Default for SuperTable
fn default() -> SuperTable
impl Display for SuperTable
impl From<PyTable> for SuperTable
impl From<SuperTable> for PyTable
fn from(table: SuperTable) -> Self
impl From<SuperTableV> for SuperTable
Available on crate feature views only.