Trait TableReader

pub trait TableReader: Send + Sync {
    // Required methods
    fn name(&self) -> &str;
    fn supported_formats(&self) -> &[DataSourceFormat];
    // (shown in #[async_trait] form; rustdoc expands this into the
    // explicit 'lifeN lifetimes and a Pin<Box<dyn Future>> return type)
    async fn register_table(
        &self,
        ctx: &SessionContext,
        table_name: &str,
        table_info: &TableInfo,
        schema: SchemaRef,
        storage_options: &HashMap<String, String>,
    ) -> CatalogResult<()>;
}

Reads table data in a specific format and registers it into a DataFusion SessionContext.

Analogous to Presto’s ConnectorPageSourceProvider: the trait is decoupled from catalog metadata, so format readers are reusable across any catalog.

Extensibility

Implement this trait to add support for new data formats:

  • Parquet (provided)
  • Delta Lake (provided, behind delta feature)
  • CSV (future)
  • Iceberg (future)
  • ORC (future)
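As a sketch of what an implementation looks like, here is a hypothetical CSV reader. The stand-in types (TableInfo, SessionContext, SchemaRef, CatalogResult, DataSourceFormat) are simplified placeholders so the example compiles on its own; the real crate supplies these, and CsvTableReader and its registration call are illustrative, not part of the crate:

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

// Stand-ins for the crate's types so the sketch is self-contained;
// in real code these come from the catalog crate and DataFusion.
#[derive(Debug, PartialEq)]
pub enum DataSourceFormat { Csv }
pub struct TableInfo { pub storage_location: String }
pub struct SessionContext;
pub type SchemaRef = Arc<()>; // placeholder for arrow's SchemaRef
pub type CatalogResult<T> = Result<T, String>;

// The trait with its five 'lifeN lifetimes collapsed to one 'a for brevity.
pub trait TableReader: Send + Sync {
    fn name(&self) -> &str;
    fn supported_formats(&self) -> &[DataSourceFormat];
    fn register_table<'a>(
        &'a self,
        ctx: &'a SessionContext,
        table_name: &'a str,
        table_info: &'a TableInfo,
        schema: SchemaRef,
        storage_options: &'a HashMap<String, String>,
    ) -> Pin<Box<dyn Future<Output = CatalogResult<()>> + Send + 'a>>;
}

/// Hypothetical CSV reader, for illustration only.
pub struct CsvTableReader;

impl TableReader for CsvTableReader {
    fn name(&self) -> &str {
        "csv"
    }

    fn supported_formats(&self) -> &[DataSourceFormat] {
        &[DataSourceFormat::Csv]
    }

    fn register_table<'a>(
        &'a self,
        _ctx: &'a SessionContext,
        table_name: &'a str,
        table_info: &'a TableInfo,
        _schema: SchemaRef,
        _storage_options: &'a HashMap<String, String>,
    ) -> Pin<Box<dyn Future<Output = CatalogResult<()>> + Send + 'a>> {
        Box::pin(async move {
            // A real implementation would call something like
            // ctx.register_csv(table_name, &table_info.storage_location, opts).await
            // and map any error into CatalogResult.
            let _ = (table_name, &table_info.storage_location);
            Ok(())
        })
    }
}
```

The boxed-future signature here is the shape the #[async_trait] attribute generates, which is why rustdoc renders the method with explicit lifetimes.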

Required Methods

fn name(&self) -> &str

Human-readable name of this reader (e.g., “parquet”, “delta”).

fn supported_formats(&self) -> &[DataSourceFormat]

The data format(s) this reader can handle.

async fn register_table(
    &self,
    ctx: &SessionContext,
    table_name: &str,
    table_info: &TableInfo,
    schema: SchemaRef,
    storage_options: &HashMap<String, String>,
) -> CatalogResult<()>

Register a table into a DataFusion SessionContext using its storage location.

The reader should read (or reference) the data at table_info.storage_location and register it as a DataFusion TableProvider so it can be queried via SQL.

Arguments
  • ctx - The DataFusion session context to register the table in.
  • table_name - The name to register the table under (already lowercased).
  • table_info - Full table metadata from the catalog, including storage_location.
  • schema - Arrow schema derived from the table’s column definitions.
  • storage_options - Key-value pairs for cloud storage credentials (e.g., azure_storage_account_name, aws_access_key_id, etc.).
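On the caller side, a catalog would typically select the reader whose supported_formats matches the table's format before invoking register_table. A minimal sketch of that dispatch, trimmed to the two synchronous methods (the enum variants, ParquetReader, and reader_for are hypothetical names for illustration):

```rust
// Trimmed-down trait: the real one also has register_table.
#[derive(Debug, PartialEq)]
pub enum DataSourceFormat {
    Parquet,
    Delta,
}

pub trait TableReader: Send + Sync {
    fn name(&self) -> &str;
    fn supported_formats(&self) -> &[DataSourceFormat];
}

/// Illustrative Parquet reader.
pub struct ParquetReader;

impl TableReader for ParquetReader {
    fn name(&self) -> &str {
        "parquet"
    }
    fn supported_formats(&self) -> &[DataSourceFormat] {
        &[DataSourceFormat::Parquet]
    }
}

/// Pick the first registered reader that can handle `format`.
pub fn reader_for<'a>(
    readers: &'a [Box<dyn TableReader>],
    format: &DataSourceFormat,
) -> Option<&'a dyn TableReader> {
    readers
        .iter()
        .map(|r| r.as_ref())
        .find(|r| r.supported_formats().contains(format))
}
```

Keeping format selection outside the trait is what lets one reader serve many catalogs: the catalog only needs the format from its metadata, not any format-specific logic.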

Implementors