Struct datafusion::datasource::listing::ListingTable
pub struct ListingTable { /* private fields */ }
Reads data from one or more files via an ObjectStore, for example from local files or from objects in AWS S3. Implements TableProvider, a DataFusion data source.
Features
- Merges schemas if the files have compatible but not identical schemas
- Hive-style partitioning support, where a path such as /files/date=1/1/2022/data.parquet is injected as a date column
- Projection pushdown for formats that support it, such as Parquet
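To make the Hive-style partitioning feature concrete, here is a minimal, standard-library-only sketch of how `key=value` directory segments in a path can be turned into partition columns. It is an illustration under stated assumptions, not DataFusion's implementation: the function name `parse_hive_partitions` and the sample path are hypothetical, and all values are kept as strings.

```rust
/// Extracts `key=value` directory segments from a path, in order.
///
/// Illustrative only: `parse_hive_partitions` is a hypothetical helper,
/// not part of DataFusion.
pub fn parse_hive_partitions(path: &str) -> Vec<(String, String)> {
    // Split into non-empty segments, e.g. ["files", "year=2022", "data.parquet"].
    let mut segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();

    // The final segment is the file name, not a partition directory.
    segments.pop();

    segments
        .into_iter()
        // Keep only directories of the form `key=value`.
        .filter_map(|segment| segment.split_once('='))
        // Ignore malformed segments with an empty key, such as `=value`.
        .filter(|(key, _)| !key.is_empty())
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

For a hypothetical path such as `/files/year=2022/month=01/data.parquet`, this yields the pairs `year = 2022` and `month = 01`, which a table provider could then expose as extra columns alongside the columns stored in the file.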
Example
Here is an example of reading a directory of parquet files using a ListingTable:
use std::sync::Arc;

use datafusion::datasource::file_format::parquet::ParquetFormat;
use datafusion::datasource::listing::{
    ListingOptions, ListingTable, ListingTableConfig, ListingTableUrl,
};
use datafusion::error::Result;
use datafusion::prelude::SessionContext;

// Assumes the tokio runtime is available to drive the async calls.
#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();
    let session_state = ctx.state();
    let table_path = "/path/to/parquet";
    // Parse the path
    let table_path = ListingTableUrl::parse(table_path)?;
    // Create default parquet options
    let file_format = ParquetFormat::new();
    let listing_options = ListingOptions::new(Arc::new(file_format))
        .with_file_extension(".parquet");
    // Resolve the schema
    let resolved_schema = listing_options
        .infer_schema(&session_state, &table_path)
        .await?;
    let config = ListingTableConfig::new(table_path)
        .with_listing_options(listing_options)
        .with_schema(resolved_schema);
    // Create a new TableProvider
    let provider = Arc::new(ListingTable::try_new(config)?);
    // This provider can now be read as a dataframe:
    let df = ctx.read_table(provider.clone());
    // or registered as a named table:
    ctx.register_table("my_table", provider);
    Ok(())
}
Implementations
impl ListingTable
pub fn try_new(config: ListingTableConfig) -> Result<Self>
Creates a new ListingTable that lists the file system to find the files to scan. See ListingTable for an example.
Takes a ListingTableConfig as input, which requires an ObjectStore and a table_path. ListingOptions and SchemaRef are optional. If they are not provided, the file type is inferred from the file suffix.
If the schema is provided, it must be resolved before creating the table and should contain the fields of the file without the table partitioning columns.
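The suffix-based inference mentioned above can be pictured with a small, standard-library-only sketch. This is an assumption-laden illustration rather than DataFusion's code: the `GuessedFormat` enum, the `guess_format` function, and the particular set of extensions are all hypothetical.

```rust
/// File formats this sketch knows about (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuessedFormat {
    Parquet,
    Csv,
    Json,
    Avro,
}

/// Guesses a file format from the suffix of the last path segment.
/// Returns `None` when there is no extension or it is not recognized.
pub fn guess_format(path: &str) -> Option<GuessedFormat> {
    // Look only at the file name so that dots in directory names are ignored.
    let file_name = path.rsplit('/').next()?;
    let (_, extension) = file_name.rsplit_once('.')?;

    // Compare case-insensitively so `.CSV` and `.csv` are treated alike.
    match extension.to_ascii_lowercase().as_str() {
        "parquet" => Some(GuessedFormat::Parquet),
        "csv" => Some(GuessedFormat::Csv),
        "json" => Some(GuessedFormat::Json),
        "avro" => Some(GuessedFormat::Avro),
        _ => None,
    }
}
```

A caller that cannot rely on file suffixes (for example, files with no extension) would get `None` here, which corresponds to the case where options need to be supplied explicitly instead of inferred.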
pub fn with_definition(self, defintion: Option<String>) -> Self
Specify the SQL definition for this table, if any
pub fn table_paths(&self) -> &Vec<ListingTableUrl>
Returns a reference to the table paths.
pub fn options(&self) -> &ListingOptions
Returns a reference to the listing options.
Trait Implementations
impl TableProvider for ListingTable
fn as_any(&self) -> &dyn Any
Returns the table provider as Any so that it can be downcast to a specific implementation.
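The downcasting pattern that as_any enables can be shown with a standard-library-only sketch. The `Provider` trait and the `FileBacked` and `InMemory` types below are hypothetical stand-ins, not DataFusion types; with the real API the same idea would apply to a `dyn TableProvider` and `ListingTable`.

```rust
use std::any::Any;

/// Hypothetical stand-in for a provider trait that exposes `as_any`.
pub trait Provider {
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical provider backed by a list of file paths.
pub struct FileBacked {
    pub paths: Vec<String>,
}

/// Hypothetical provider with no files behind it.
pub struct InMemory;

impl Provider for FileBacked {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Provider for InMemory {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the file paths if the provider is a `FileBacked`, else `None`.
pub fn file_paths(provider: &dyn Provider) -> Option<&[String]> {
    provider
        .as_any()
        // `downcast_ref` succeeds only when the concrete type matches.
        .downcast_ref::<FileBacked>()
        .map(|file_backed| file_backed.paths.as_slice())
}
```

Code that holds only a trait object can use this to reach implementation-specific accessors when the concrete type matches, and fall back gracefully when it does not.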