Struct datafusion::datasource::listing::ListingOptions
pub struct ListingOptions {
pub file_extension: String,
pub format: Arc<dyn FileFormat>,
pub table_partition_cols: Vec<String>,
pub collect_stat: bool,
pub target_partitions: usize,
}
Options for creating a ListingTable
Fields
file_extension: String
A suffix on which files should be filtered (leave empty to keep all files on the path)
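The suffix filter described above can be pictured with a small standalone sketch. The helper name `filter_by_extension` is hypothetical and is not part of DataFusion; it only illustrates why an empty suffix keeps every file (every string ends with the empty string).

```rust
/// Hypothetical helper, not a DataFusion API: keeps only the paths that
/// end with `file_extension`. An empty suffix keeps every path, because
/// every string ends with "".
fn filter_by_extension<'a>(paths: &[&'a str], file_extension: &str) -> Vec<&'a str> {
    paths
        .iter()
        .copied()
        .filter(|p| p.ends_with(file_extension))
        .collect()
}
```

With `".parquet"` as the suffix, a listing of `a.parquet`, `b.csv` and `c.parquet` would keep only the two Parquet files; with `""` it would keep all three.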
format: Arc<dyn FileFormat>
The file format
table_partition_cols: Vec<String>
The expected partition column names in the folder structure. For example, vec!["a", "b"] means that the first two levels of partitioning are expected to be named “a” and “b”:
- If there is a third level of partitioning, it will be ignored.
- Files that don’t follow this partitioning will be ignored.
Note that only DEFAULT_PARTITION_COLUMN_DATATYPE is currently supported for the column type.
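As an illustration of the rules above, here is a minimal standalone sketch of how partition values might be read from a file path. It assumes a `name=value` directory layout (for example `a=2021/b=07/part-0.parquet`); that layout, the function name `parse_partition_values`, and the sample paths are assumptions made for the example and are not taken from this page.

```rust
/// Hypothetical helper, not a DataFusion API.
/// `relative_path` is the file path below the table root.
/// Returns the value of each expected partition column, in order, or
/// `None` when a leading segment is not `name=value` with the expected name.
fn parse_partition_values<'a>(
    relative_path: &'a str,
    partition_cols: &[&str],
) -> Option<Vec<&'a str>> {
    let mut segments = relative_path.split('/');
    let mut values = Vec::with_capacity(partition_cols.len());
    for expected in partition_cols {
        // A path with fewer levels than expected columns does not match.
        let segment = segments.next()?;
        match segment.split_once('=') {
            Some((name, value)) if name == *expected => values.push(value),
            _ => return None,
        }
    }
    // Any deeper levels (a third partition level, the file name) are ignored.
    Some(values)
}
```

Under these assumptions, `a=2021/b=07/c=01/part-0.parquet` with columns `["a", "b"]` yields the values `2021` and `07` and ignores the `c` level, while `a=2021/part-0.parquet` yields `None` and the file would be skipped.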
collect_stat: bool
Set to true to try to guess statistics from the files. This can add a lot of overhead, as it will usually require files to be opened and at least partially parsed.
target_partitions: usize
Group files so that the number of partitions does not exceed this limit
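One simple way to respect such a limit is to split the file list into evenly sized chunks. The sketch below shows that idea only; the function name `group_files` and the ceiling-division chunking strategy are assumptions for illustration, and the grouping DataFusion actually performs may differ.

```rust
/// Hypothetical helper, not a DataFusion API: splits `files` into at most
/// `target_partitions` groups of nearly equal size.
fn group_files(files: Vec<String>, target_partitions: usize) -> Vec<Vec<String>> {
    if files.is_empty() || target_partitions == 0 {
        return Vec::new();
    }
    // Ceiling division: the smallest chunk size that needs no more than
    // `target_partitions` chunks to cover every file.
    let chunk_size = (files.len() + target_partitions - 1) / target_partitions;
    files.chunks(chunk_size).map(|chunk| chunk.to_vec()).collect()
}
```

With five files and a limit of two, this produces one group of three files and one group of two. With a limit larger than the number of files, each file ends up in its own group.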
Implementations
Creates an options instance with the given format.
Default values:
- no file extension filter
- no input partition to discover
- one target partition
- no stat collection
pub async fn infer_schema<'a>(
&'a self,
object_store: Arc<dyn ObjectStore>,
path: &'a str
) -> Result<SchemaRef>
Infer the schema of the files at the given path on the provided object store. The inferred schema does not include the partitioning columns.
This method is not called by the table itself; it is meant to be called before the table is created. This way, when creating the logical plan, we can decide to resolve the schema locally or ask a remote service to do it (e.g. a scheduler).