Trait datafusion::physical_plan::ExecutionPlan
pub trait ExecutionPlan: Debug + Send + Sync {
    // Required methods
    fn as_any(&self) -> &dyn Any;
    fn schema(&self) -> SchemaRef;
    fn output_partitioning(&self) -> Partitioning;
    fn output_ordering(&self) -> Option<&[PhysicalSortExpr]>;
    fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn ExecutionPlan>>
    ) -> Result<Arc<dyn ExecutionPlan>>;
    fn execute(
        &self,
        partition: usize,
        context: Arc<TaskContext>
    ) -> Result<SendableRecordBatchStream>;
    fn statistics(&self) -> Statistics;

    // Provided methods
    fn unbounded_output(&self, _children: &[bool]) -> Result<bool> { ... }
    fn required_input_distribution(&self) -> Vec<Distribution> { ... }
    fn required_input_ordering(
        &self
    ) -> Vec<Option<Vec<PhysicalSortRequirement>>> { ... }
    fn maintains_input_order(&self) -> Vec<bool> { ... }
    fn benefits_from_input_partitioning(&self) -> bool { ... }
    fn equivalence_properties(&self) -> EquivalenceProperties { ... }
    fn ordering_equivalence_properties(&self) -> OrderingEquivalenceProperties { ... }
    fn metrics(&self) -> Option<MetricsSet> { ... }
    fn fmt_as(&self, _t: DisplayFormatType, f: &mut Formatter<'_>) -> Result { ... }
}
ExecutionPlan represents a node in the DataFusion Physical Plan.
Each ExecutionPlan is partition-aware and is responsible for creating the actual async SendableRecordBatchStreams of RecordBatch that incrementally compute the operator's output from its input partition.
An ExecutionPlan can be displayed in a simplified form using the return value from displayable, in addition to the (normally quite verbose) Debug output.
Required Methods
fn as_any(&self) -> &dyn Any
Returns the execution plan as Any so that it can be downcast to a specific implementation.
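The downcasting pattern that as_any enables can be sketched with the standard library alone. This is a minimal illustration, not DataFusion code: PlanNode, FilterNode, ScanNode, and filter_predicate are hypothetical stand-ins for ExecutionPlan and its implementors.

```rust
use std::any::Any;

// Hypothetical stand-in for a plan-node trait; NOT DataFusion's ExecutionPlan.
trait PlanNode {
    // Expose the node as `Any` so callers can recover the concrete type.
    fn as_any(&self) -> &dyn Any;
}

// Hypothetical concrete node that carries a predicate.
struct FilterNode {
    predicate: String,
}

// Hypothetical concrete node with no extra state.
struct ScanNode;

impl PlanNode for FilterNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl PlanNode for ScanNode {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Returns the predicate if `node` is a FilterNode, otherwise None.
fn filter_predicate(node: &dyn PlanNode) -> Option<&str> {
    node.as_any()
        .downcast_ref::<FilterNode>()
        .map(|f| f.predicate.as_str())
}
```

A caller holding only a trait object can then branch on the concrete type: filter_predicate yields Some for a FilterNode and None for any other node.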
fn output_partitioning(&self) -> Partitioning
Specifies the output partitioning scheme of this plan.
fn output_ordering(&self) -> Option<&[PhysicalSortExpr]>
If the output of this operator within each partition is sorted, returns Some(keys) describing how it was sorted.
For example, Sort (obviously) produces sorted output, as does SortPreservingMergeStream. Less obviously, Projection produces sorted output if its input was sorted, because it does not reorder the input rows.
It is safe to return None here if your operator does not have any particular output order.
fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>
Get a list of child execution plans that provide the input for this plan. The returned list will be empty for leaf nodes, will contain a single value for unary nodes, or two values for binary nodes (such as joins).
fn with_new_children(self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>) -> Result<Arc<dyn ExecutionPlan>>
Returns a new plan where all children were replaced by new plans.
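How children and with_new_children work together when a plan tree is rewritten can be sketched as follows. This is a simplified model under stated assumptions: Node, Leaf, and Unary are hypothetical types, and the error type is a plain String rather than DataFusion's Result.

```rust
use std::sync::Arc;

// Hypothetical, simplified node trait; NOT DataFusion's ExecutionPlan.
trait Node {
    fn name(&self) -> String;
    // Inputs of this node: empty for leaves, one entry for unary nodes.
    fn children(&self) -> Vec<Arc<dyn Node>>;
    // Build a node of the same kind over replacement children.
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String>;
}

struct Leaf {
    table: String,
}

struct Unary {
    input: Arc<dyn Node>,
}

impl Node for Leaf {
    fn name(&self) -> String {
        format!("Leaf({})", self.table)
    }
    fn children(&self) -> Vec<Arc<dyn Node>> {
        vec![]
    }
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String> {
        // A leaf has no inputs, so it can hand back itself unchanged.
        if children.is_empty() {
            Ok(self)
        } else {
            Err("a leaf node takes no children".to_string())
        }
    }
}

impl Node for Unary {
    fn name(&self) -> String {
        format!("Unary({})", self.input.name())
    }
    fn children(&self) -> Vec<Arc<dyn Node>> {
        vec![Arc::clone(&self.input)]
    }
    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Node>>,
    ) -> Result<Arc<dyn Node>, String> {
        if children.len() != 1 {
            return Err("a unary node takes exactly one child".to_string());
        }
        // The original node is left untouched; a fresh node wraps the new input.
        let input = children.remove(0);
        Ok(Arc::new(Unary { input }))
    }
}
```

Because the receiver is an Arc, the old plan stays valid for anyone else holding a reference to it, which is one plausible reason for taking self: Arc<Self> instead of &mut self.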
fn execute(&self, partition: usize, context: Arc<TaskContext>) -> Result<SendableRecordBatchStream>
Creates an async stream (SendableRecordBatchStream) that yields the RecordBatches for the given partition.
fn statistics(&self) -> Statistics
Returns the global output statistics for this ExecutionPlan node.
Provided Methods
fn unbounded_output(&self, _children: &[bool]) -> Result<bool>
Specifies whether this plan generates an infinite stream of records. If the plan does not support pipelining, but its input(s) are infinite, returns an error to indicate this.
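The two behaviours described here might look like the sketch below. Both functions are hypothetical free functions rather than trait methods, the error type is a String, and the pairing with a streaming operator versus a pipeline-breaking one is an assumption for illustration only.

```rust
// Hypothetical helper for an operator that can process its input incrementally:
// its output is unbounded exactly when at least one input is unbounded.
fn streaming_unbounded_output(children: &[bool]) -> Result<bool, String> {
    Ok(children.iter().any(|&unbounded| unbounded))
}

// Hypothetical helper for an operator that must consume all of its input
// before emitting anything (it does not support pipelining).
fn blocking_unbounded_output(children: &[bool]) -> Result<bool, String> {
    if children.iter().any(|&unbounded| unbounded) {
        // It could never finish over an infinite input, so report an error.
        Err("operator cannot run over an unbounded input".to_string())
    } else {
        Ok(false)
    }
}
```

Each entry of children states whether the corresponding child plan is unbounded, so a planner can propagate the flag bottom-up and reject an impossible plan before executing it.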
fn required_input_distribution(&self) -> Vec<Distribution>
Specifies the data distribution requirements for all the children of this operator. By default it is Distribution::UnspecifiedDistribution for each child.
fn required_input_ordering(&self) -> Vec<Option<Vec<PhysicalSortRequirement>>>
Specifies the ordering requirements for all of the children. For each child, this is the local ordering requirement within each partition rather than the global ordering.
NOTE that checking !is_empty() does not check for a required input ordering. Instead, the correct check is that at least one entry must be Some.
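The difference between the two checks in the note can be made concrete with a small sketch. SortRequirement is a hypothetical stand-in for PhysicalSortRequirement (modelled here as a String), and has_required_input_ordering is an illustrative helper, not a DataFusion function.

```rust
// Hypothetical stand-in for PhysicalSortRequirement.
type SortRequirement = String;

// True only if at least one child actually has an ordering requirement.
fn has_required_input_ordering(reqs: &[Option<Vec<SortRequirement>>]) -> bool {
    reqs.iter().any(|req| req.is_some())
}
```

For a binary operator with no requirements the list is [None, None]: it is not empty, yet no ordering is required, which is why !is_empty() gives the wrong answer.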
fn maintains_input_order(&self) -> Vec<bool>
Returns false if this operator's implementation may reorder rows within or between partitions.
For example, Projection, Filter, and Limit maintain the order of inputs: they may transform values (Projection) or not produce the same number of rows that went in (Filter and Limit), but the rows that are produced come out in the same relative order as they went in.
DataFusion uses this metadata to apply certain optimizations such as automatically repartitioning correctly.
The default implementation returns false for each child.
WARNING: if you override this default, you MUST ensure that the operator maintains the ordering invariant, or else DataFusion may produce incorrect results.
fn benefits_from_input_partitioning(&self) -> bool
Returns true if this operator would benefit from partitioning its input (and thus from more parallelism). For operators that do very little work, the overhead of extra parallelism may outweigh any benefits.
The default implementation returns true unless this operator has signalled it requires a single child input partition.
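The default rule described above can be sketched as a check over the required input distributions. The Distribution enum below is a simplified, hypothetical stand-in for DataFusion's type, and the function is a free-standing illustration rather than the library's actual default implementation.

```rust
// Simplified, hypothetical stand-in for DataFusion's Distribution enum.
#[derive(Debug, Clone, PartialEq)]
enum Distribution {
    Unspecified,
    SinglePartition,
    HashPartitioned,
}

// Benefits from partitioning unless some child must arrive as a single partition.
fn benefits_from_input_partitioning(required: &[Distribution]) -> bool {
    !required
        .iter()
        .any(|dist| matches!(dist, Distribution::SinglePartition))
}
```

An operator that declares SinglePartition for any child gains nothing from a repartitioned input, so an optimizer following this rule would skip adding parallelism beneath it.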
fn equivalence_properties(&self) -> EquivalenceProperties
Get the EquivalenceProperties within the plan.
fn ordering_equivalence_properties(&self) -> OrderingEquivalenceProperties
Get the OrderingEquivalenceProperties within the plan.
fn metrics(&self) -> Option<MetricsSet>
Return a snapshot of the set of Metrics for this ExecutionPlan.
While the values of the metrics in the returned MetricsSets may change as execution progresses, the specific metrics will not.
Once self.execute() has returned (technically the future is resolved) for all available partitions, the set of metrics should be complete. If this function is called prior to execute(), new metrics may appear in subsequent calls.
fn fmt_as(&self, _t: DisplayFormatType, f: &mut Formatter<'_>) -> Result
Format this ExecutionPlan to f in the specified type.
Should not include a newline.
Note that this function prints a placeholder by default to preserve backwards compatibility.