Struct datafusion::datasource::physical_plan::parquet::ParquetExec
pub struct ParquetExec { /* private fields */ }
Execution plan for scanning one or more Parquet partitions
Implementations
impl ParquetExec
pub fn new(
    base_config: FileScanConfig,
    predicate: Option<Arc<dyn PhysicalExpr>>,
    metadata_size_hint: Option<usize>
) -> Self
Creates a new Parquet reader execution plan from the provided file list and schema.
pub fn base_config(&self) -> &FileScanConfig
Returns a reference to the base configuration.
pub fn predicate(&self) -> Option<&Arc<dyn PhysicalExpr>>
Returns the optional predicate.
pub fn pruning_predicate(&self) -> Option<&Arc<PruningPredicate>>
Returns an optional reference to this Parquet scan's pruning predicate.
pub fn with_parquet_file_reader_factory(
    self,
    parquet_file_reader_factory: Arc<dyn ParquetFileReaderFactory>
) -> Self
Sets an optional user-defined Parquet file reader factory.
ParquetFileReaderFactory complements TableProvider: it enables users to provide a custom implementation for data access operations.
If a custom ParquetFileReaderFactory is provided, data access operations are routed to this factory instead of ObjectStore.
pub fn with_pushdown_filters(self, pushdown_filters: bool) -> Self
pub fn with_reorder_filters(self, reorder_filters: bool) -> Self
If true, the RowFilter made by pushdown_filters may try to minimize the cost of filter evaluation by reordering the predicate Exprs. If false, the predicates are applied in the same order as specified in the query. Defaults to false.
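The effect of the flag can be illustrated with a small standalone sketch. It does not use DataFusion types: FilterCandidate, its required_bytes cost metric, and plan_filter_order are hypothetical names introduced only for this illustration, and the cost heuristic DataFusion actually applies is not described in this section.

```rust
/// Hypothetical stand-in for a pushed-down predicate; not a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct FilterCandidate {
    /// Human-readable form of the predicate.
    expr: String,
    /// Assumed cost metric: bytes that must be decoded to evaluate the predicate.
    required_bytes: u64,
}

/// Returns the predicates in the order in which they would be evaluated.
///
/// With `reorder_filters == false` the query order is preserved.
/// With `reorder_filters == true` cheaper predicates are moved to the front.
fn plan_filter_order(
    mut filters: Vec<FilterCandidate>,
    reorder_filters: bool,
) -> Vec<FilterCandidate> {
    if reorder_filters {
        // `sort_by_key` is stable, so predicates with equal cost keep their query order.
        filters.sort_by_key(|f| f.required_bytes);
    }
    filters
}
```

Evaluating cheap predicates first means that expensive ones only run on rows that survived the earlier checks, which is the kind of saving the flag is meant to allow.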
pub fn with_enable_page_index(self, enable_page_index: bool) -> Self
If enabled, the reader will read the page index. This is used to optimise filter pushdown via RowSelector and RowFilter by eliminating unnecessary IO and decoding.
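To show why page-level statistics save IO and decoding, here is a minimal standalone sketch. PageStats, RowRun, and select_pages are hypothetical stand-ins rather than types from the parquet or datafusion crates, and the predicate is assumed to be a simple inclusive range check on one integer column.

```rust
/// Hypothetical per-page statistics, as a page index might record them.
struct PageStats {
    min: i64,
    max: i64,
    row_count: usize,
}

/// Stand-in for a run of consecutive rows to read or to skip.
#[derive(Debug, PartialEq)]
struct RowRun {
    row_count: usize,
    skip: bool,
}

/// Builds a row selection for the predicate `lo <= value <= hi`.
///
/// Pages whose [min, max] range cannot overlap [lo, hi] are skipped without
/// being decoded; adjacent pages with the same decision are merged into one run.
fn select_pages(pages: &[PageStats], lo: i64, hi: i64) -> Vec<RowRun> {
    let mut runs: Vec<RowRun> = Vec::new();
    for page in pages {
        // No row in this page can match if the ranges do not overlap.
        let skip = page.max < lo || page.min > hi;
        let merged = match runs.last_mut() {
            Some(last) if last.skip == skip => {
                last.row_count += page.row_count;
                true
            }
            _ => false,
        };
        if !merged {
            runs.push(RowRun { row_count: page.row_count, skip });
        }
    }
    runs
}
```

Rows in the selected runs still have to be checked against the predicate itself; the statistics only rule out pages that cannot contain a match.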
pub fn get_repartitioned(
    &self,
    target_partitions: usize,
    repartition_file_min_size: usize
) -> Self
Redistributes files across partitions according to their size.
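One way to picture size-based redistribution is the standalone sketch below. It is an assumption-laden illustration, not DataFusion's implementation: FileRange and repartition_by_size are hypothetical names, files are modelled as (path, size in bytes) pairs, and the policy shown (split the total byte count evenly into contiguous ranges, and leave the layout alone when the total is below the minimum size) is only one plausible reading of the two parameters.

```rust
/// Hypothetical byte range of one file assigned to a partition.
#[derive(Debug, Clone, PartialEq)]
struct FileRange {
    path: String,
    start: u64,
    end: u64,
}

/// Splits `files` (path, size in bytes) into at most `target_partitions`
/// groups of roughly equal total size.
fn repartition_by_size(
    files: &[(String, u64)],
    target_partitions: usize,
    repartition_file_min_size: u64,
) -> Vec<Vec<FileRange>> {
    let total: u64 = files.iter().map(|(_, size)| *size).sum();
    if target_partitions == 0 || total == 0 || total < repartition_file_min_size {
        // Assumed fallback: too little data to be worth splitting,
        // so keep one whole file per partition.
        return files
            .iter()
            .map(|(path, size)| vec![FileRange { path: path.clone(), start: 0, end: *size }])
            .collect();
    }
    let n = target_partitions as u64;
    // Bytes per partition, rounded up so that at most `n` partitions are produced.
    let target = (total + n - 1) / n;
    let mut partitions: Vec<Vec<FileRange>> = vec![Vec::new()];
    let mut filled = 0u64;
    for (path, size) in files {
        let mut offset = 0u64;
        while offset < *size {
            if filled == target {
                // Current partition is full; start a new one.
                partitions.push(Vec::new());
                filled = 0;
            }
            let take = (target - filled).min(*size - offset);
            partitions.last_mut().unwrap().push(FileRange {
                path: path.clone(),
                start: offset,
                end: offset + take,
            });
            offset += take;
            filled += take;
        }
    }
    partitions
}
```

In this sketch a single large file can be spread over several partitions as byte ranges, while several small files can share one partition.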
Trait Implementations
impl Clone for ParquetExec
fn clone(&self) -> ParquetExec
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ParquetExec
impl ExecutionPlan for ParquetExec
fn output_partitioning(&self) -> Partitioning
Gets the output partitioning of this plan.
fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>
fn output_ordering(&self) -> Option<&[PhysicalSortExpr]>
If the output is sorted, returns Some(keys) with the description of how it was sorted.
fn ordering_equivalence_properties(&self) -> OrderingEquivalenceProperties
fn with_new_children(
    self: Arc<Self>,
    _: Vec<Arc<dyn ExecutionPlan>>
) -> Result<Arc<dyn ExecutionPlan>>
fn execute(
    &self,
    partition_index: usize,
    ctx: Arc<TaskContext>
) -> Result<SendableRecordBatchStream>
fn metrics(&self) -> Option<MetricsSet>
fn statistics(&self) -> Statistics
Returns the statistics for this ExecutionPlan node.
fn unbounded_output(&self, _children: &[bool]) -> Result<bool>
fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_ordering(&self) -> Vec<Option<Vec<PhysicalSortRequirement>>>
fn maintains_input_order(&self) -> Vec<bool>
Returns false if this operator's implementation may reorder rows within or between partitions.
fn benefits_from_input_partitioning(&self) -> bool
Returns true if this operator would benefit from partitioning its input (and thus from more parallelism). For operators that do very little work, the overhead of extra parallelism may outweigh any benefits.