pub struct DataFusionConfig {
pub pushdown_filters: bool,
pub reorder_filters: bool,
pub list_files_cache: bool,
pub list_files_cache_mb: usize,
pub list_files_cache_ttl_secs: u64,
}
DataFusion backend performance tuning ([datafusion] block).
Every knob is off / stock by default, so the backend behaves exactly
like DataFusion out of the box unless you opt in. These mainly help lazy
(lazy = true) parquet datasets, especially on object storage. Ignored by
the DuckDB backend.
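As a rough illustration of the opt-in model, the sketch below mirrors the struct locally and writes a `Default` impl by hand. The field names and default values come from the field docs on this page. Whether the real type implements `Default` is not shown here, so the impl and the `tuned_example` helper are assumptions made for the example.

```rust
// Local mirror of the documented struct, for illustration only.
#[derive(Clone, Debug, PartialEq)]
pub struct DataFusionConfig {
    pub pushdown_filters: bool,
    pub reorder_filters: bool,
    pub list_files_cache: bool,
    pub list_files_cache_mb: usize,
    pub list_files_cache_ttl_secs: u64,
}

// Assumption: a hand-written Default that matches the documented defaults.
impl Default for DataFusionConfig {
    fn default() -> Self {
        Self {
            pushdown_filters: false,
            reorder_filters: false,
            list_files_cache: false,
            list_files_cache_mb: 64,
            list_files_cache_ttl_secs: 60,
        }
    }
}

// Hypothetical helper: opt in to filter pushdown and the listing cache,
// leaving every other knob at its documented default.
pub fn tuned_example() -> DataFusionConfig {
    DataFusionConfig {
        pushdown_filters: true,
        list_files_cache: true,
        ..DataFusionConfig::default()
    }
}
```

Struct-update syntax keeps the opt-in explicit: only the knobs named in the literal differ from the defaults.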
Fields
pushdown_filters: bool
Push row-level filters down into the parquet decoder so rows that fail
a predicate are never materialised (in addition to the row-group /
page-index pruning that always happens). DataFusion default is false
because for some workloads the extra per-row evaluation is not worth
it; turn it on for selective filters over large row groups.
reorder_filters: bool
Let the parquet scan reorder pushed-down predicates by estimated
selectivity. Only has an effect together with pushdown_filters.
DataFusion default is false.
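The dependency between the two filter flags can be sketched as a small helper that turns them into DataFusion's string-keyed parquet options. The keys `datafusion.execution.parquet.pushdown_filters` and `datafusion.execution.parquet.reorder_filters` are DataFusion's own option names. The helper `parquet_overrides` is hypothetical, and how this crate actually applies the flags is not shown on this page.

```rust
/// Hypothetical helper: translate the two flags into DataFusion option
/// key/value pairs. The reorder key is only emitted when pushdown is
/// enabled, because reordering has no effect on its own.
pub fn parquet_overrides(
    pushdown_filters: bool,
    reorder_filters: bool,
) -> Vec<(&'static str, bool)> {
    let mut out = Vec::new();
    if pushdown_filters {
        out.push(("datafusion.execution.parquet.pushdown_filters", true));
        if reorder_filters {
            out.push(("datafusion.execution.parquet.reorder_filters", true));
        }
    }
    out
}
```

Pairs like these could then be fed to DataFusion's `SessionConfig::set_bool` when building a session. Returning an empty list for the all-off case leaves DataFusion's stock settings untouched.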
list_files_cache: bool
Cache object-store file listings on the shared runtime so repeated
lazy queries reuse LIST results instead of re-listing the source
prefix every time, which is the dominant per-query cost on S3. Default false.
list_files_cache_mb: usize
Memory budget for the file-listing cache, in MiB. Only used when
list_files_cache = true. Default 64.
list_files_cache_ttl_secs: u64
How long a cached listing stays valid, in seconds. Bounds how long it
takes for newly written files to become visible without an explicit
reload. 0 means no expiry (infinite). Default 60.
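The units and the special meaning of 0 for the two cache knobs can be made concrete with a minimal sketch. Both function names are hypothetical and only illustrate one way to interpret the documented values; they are not taken from this crate.

```rust
use std::time::Duration;

/// Convert the MiB budget into bytes, saturating instead of overflowing.
pub fn cache_budget_bytes(list_files_cache_mb: usize) -> usize {
    list_files_cache_mb.saturating_mul(1024 * 1024)
}

/// Interpret the TTL: 0 means "never expire" (None); any other value is
/// a finite lifetime in seconds.
pub fn cache_ttl(list_files_cache_ttl_secs: u64) -> Option<Duration> {
    if list_files_cache_ttl_secs == 0 {
        None
    } else {
        Some(Duration::from_secs(list_files_cache_ttl_secs))
    }
}
```

Modelling "no expiry" as `None` rather than as a very large duration keeps the infinite case distinct from any finite TTL.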
Trait Implementations
impl Clone for DataFusionConfig
fn clone(&self) -> DataFusionConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.