Module datafusion::datasource


DataFusion data sources: TableProvider and ListingTable

Re-exports§

Modules§

  • This module contains code for reading Avro data into RecordBatches
  • CteWorkTable implementation used for recursive queries
  • Default TableSource implementation used in DataFusion physical plans
  • EmptyTable useful for testing.
  • Module containing helper methods for the various file formats. See write.rs for write-related helper methods.
  • A table that uses a function to generate data
  • A table that uses the ObjectStore listing capability to get the list of files to process.
  • Factory for creating ListingTables with default options
  • MemTable for querying Vec<RecordBatch> by DataFusion.
  • ObjectStoreRegistry holds all the object stores at Runtime with a scheme for each store. This allows the user to extend DataFusion with different storage systems such as S3 or HDFS and query data inside these systems.
  • Execution plans that read file formats
  • Data source traits
  • TableProvider for stream sources, such as FIFO files
  • A simplified TableProvider for streaming partitioned datasets
  • View data source which uses a LogicalPlan as its input.
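
The ObjectStoreRegistry entry above describes a lookup from URL scheme to store. The following is a minimal, std-only sketch of that idea and not DataFusion's actual API: the `Store` trait, `NamedStore`, `SchemeRegistry`, and its `register`/`resolve` methods are all hypothetical names introduced for illustration, and keying on the scheme alone is a simplifying assumption.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an object store (NOT DataFusion's trait).
pub trait Store: Send + Sync {
    fn name(&self) -> &str;
}

/// Hypothetical store that only carries a label, for demonstration.
pub struct NamedStore(pub &'static str);

impl Store for NamedStore {
    fn name(&self) -> &str {
        self.0
    }
}

/// Hypothetical registry: one store per URL scheme (assumption: the
/// scheme alone is the key).
#[derive(Default)]
pub struct SchemeRegistry {
    stores: HashMap<String, Arc<dyn Store>>,
}

impl SchemeRegistry {
    /// Registers `store` for `scheme`, returning the store it replaced, if any.
    pub fn register(
        &mut self,
        scheme: &str,
        store: Arc<dyn Store>,
    ) -> Option<Arc<dyn Store>> {
        self.stores.insert(scheme.to_ascii_lowercase(), store)
    }

    /// Looks up the store for a URL such as "s3://bucket/key".
    /// Returns None if the URL has no "://" or the scheme is unregistered.
    pub fn resolve(&self, url: &str) -> Option<Arc<dyn Store>> {
        let (scheme, _rest) = url.split_once("://")?;
        self.stores.get(&scheme.to_ascii_lowercase()).cloned()
    }
}
```

Under these assumptions, supporting a new storage system amounts to registering one more scheme; code that resolves URLs does not need to change.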

Enums§

  • Indicates the type of this table for metadata/catalog purposes.

Functions§

  • Get all files as well as the file-level summary statistics (no statistics for partition columns). If the optional limit is provided, includes only as many files as are needed to read up to limit rows. collect_stats is passed down from the configuration parameter on ListingTable. If it is false, we only construct bare statistics and skip a potentially expensive call to multiunzip for constructing file-level summary statistics.
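
The limit and collect_stats behavior described above can be sketched in plain Rust. This is an illustrative model under stated assumptions, not the library's implementation: `FileMeta`, `Summary`, and `select_files` are hypothetical names, only row counts are tracked, selection is assumed to stop once the known row total reaches the limit, and `collect_stats` is assumed to affect only whether the summary is filled in.

```rust
/// Hypothetical per-file metadata; `num_rows` is None when unknown.
#[derive(Debug, Clone, PartialEq)]
pub struct FileMeta {
    pub path: String,
    pub num_rows: Option<usize>,
}

/// Hypothetical summary; `num_rows` stays None for "bare" statistics.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Summary {
    pub num_rows: Option<usize>,
}

/// Selects files in order, stopping early once the running row total is
/// known and has reached `limit` (assumption of this sketch).
pub fn select_files(
    files: impl IntoIterator<Item = FileMeta>,
    limit: Option<usize>,
    collect_stats: bool,
) -> (Vec<FileMeta>, Summary) {
    let mut selected = Vec::new();
    // The running total becomes None as soon as any file's count is unknown.
    let mut total: Option<usize> = Some(0);

    for file in files {
        total = match (total, file.num_rows) {
            (Some(t), Some(n)) => Some(t + n),
            _ => None,
        };
        selected.push(file);

        // Early exit is only safe when the total is actually known.
        if let (Some(limit), Some(t)) = (limit, total) {
            if t >= limit {
                break;
            }
        }
    }

    // With collect_stats disabled, return bare (empty) statistics.
    let summary = if collect_stats {
        Summary { num_rows: total }
    } else {
        Summary::default()
    };
    (selected, summary)
}
```

In this model, a file with an unknown row count disables early termination for the rest of the scan, because the function can no longer prove that enough rows have been gathered.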