Trait datafusion::physical_plan::ExecutionPlan
pub trait ExecutionPlan: Debug + Send + Sync {
fn as_any(&self) -> &dyn Any;
fn schema(&self) -> SchemaRef;
fn output_partitioning(&self) -> Partitioning;
fn output_ordering(&self) -> Option<&[PhysicalSortExpr]>;
fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>;
fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>
) -> Result<Arc<dyn ExecutionPlan>>;
fn execute(
&self,
partition: usize,
context: Arc<TaskContext>
) -> Result<SendableRecordBatchStream>;
fn statistics(&self) -> Statistics;
fn required_child_distribution(&self) -> Distribution { ... }
fn relies_on_input_order(&self) -> bool { ... }
fn maintains_input_order(&self) -> bool { ... }
fn benefits_from_input_partitioning(&self) -> bool { ... }
fn metrics(&self) -> Option<MetricsSet> { ... }
fn fmt_as(&self, _t: DisplayFormatType, f: &mut Formatter<'_>) -> Result { ... }
}
ExecutionPlan represents nodes in the DataFusion Physical Plan.
Each ExecutionPlan is partition-aware and is responsible for creating the actual async SendableRecordBatchStreams of RecordBatch that incrementally compute the operator’s output from its input partition.
An ExecutionPlan can be displayed in a simplified form using the return value from displayable, in addition to the (normally quite verbose) Debug output.
Required Methods
fn as_any(&self) -> &dyn Any
Returns the execution plan as Any so that it can be downcast to a specific implementation.
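The downcasting pattern that as_any enables can be sketched with the standard library alone. This is a minimal illustration, not DataFusion code: the `Plan` trait, `LimitPlan`, `ScanPlan`, and `limit_of` are hypothetical stand-ins that model only the `as_any` method.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

// Hypothetical stand-in for the real trait; only `as_any` is modeled.
trait Plan: Debug + Send + Sync {
    fn as_any(&self) -> &dyn Any;
}

// Two hypothetical operators.
#[derive(Debug)]
struct LimitPlan {
    limit: usize,
}

#[derive(Debug)]
struct ScanPlan;

impl Plan for LimitPlan {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl Plan for ScanPlan {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// Returns the limit if `plan` is a `LimitPlan`, otherwise `None`.
fn limit_of(plan: &Arc<dyn Plan>) -> Option<usize> {
    plan.as_any().downcast_ref::<LimitPlan>().map(|p| p.limit)
}

fn main() {
    let limit: Arc<dyn Plan> = Arc::new(LimitPlan { limit: 10 });
    let scan: Arc<dyn Plan> = Arc::new(ScanPlan);
    println!("{:?} {:?}", limit_of(&limit), limit_of(&scan));
}
```

Code that only holds an `Arc<dyn Plan>` can use this to recover the concrete type when it needs operator-specific fields, and gets `None` for every other operator.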
fn output_partitioning(&self) -> Partitioning
Specifies the output partitioning scheme of this plan.
fn output_ordering(&self) -> Option<&[PhysicalSortExpr]>
If the output of this operator is sorted, returns Some(keys) with the description of how it was sorted.
For example, Sort (obviously) produces sorted output, as does SortPreservingMergeStream. Less obviously, Projection produces sorted output if its input was sorted, as it does not reorder the input rows.
It is safe to return None here if your operator does not have any particular output order.
fn children(&self) -> Vec<Arc<dyn ExecutionPlan>>
Get a list of child execution plans that provide the input for this plan. The returned list will be empty for leaf nodes, will contain a single value for unary nodes, or two values for binary nodes (such as joins).
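As a rough illustration of how a children-style accessor supports generic tree traversal, the sketch below counts the nodes in a small plan. The `Plan` trait and the `Scan`, `Filter`, and `Join` types are hypothetical stand-ins written for this example and are not DataFusion's operators.

```rust
use std::sync::Arc;

// Hypothetical stand-in trait; only `children` is modeled.
trait Plan {
    fn children(&self) -> Vec<Arc<dyn Plan>>;
}

// Leaf node: no inputs.
struct Scan;

// Unary node: one input.
struct Filter {
    input: Arc<dyn Plan>,
}

// Binary node: two inputs.
struct Join {
    left: Arc<dyn Plan>,
    right: Arc<dyn Plan>,
}

impl Plan for Scan {
    fn children(&self) -> Vec<Arc<dyn Plan>> {
        vec![]
    }
}

impl Plan for Filter {
    fn children(&self) -> Vec<Arc<dyn Plan>> {
        vec![self.input.clone()]
    }
}

impl Plan for Join {
    fn children(&self) -> Vec<Arc<dyn Plan>> {
        vec![self.left.clone(), self.right.clone()]
    }
}

// Counts this node plus all of its descendants.
fn count_nodes(plan: &Arc<dyn Plan>) -> usize {
    1 + plan
        .children()
        .iter()
        .map(|child| count_nodes(child))
        .sum::<usize>()
}

fn main() {
    // Join(Filter(Scan), Scan)
    let plan: Arc<dyn Plan> = Arc::new(Join {
        left: Arc::new(Filter { input: Arc::new(Scan) }),
        right: Arc::new(Scan),
    });
    println!("nodes: {}", count_nodes(&plan));
}
```

The traversal never needs to know the concrete operator types, because the length of the returned list is the only thing that distinguishes leaf, unary, and binary nodes.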
fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>
) -> Result<Arc<dyn ExecutionPlan>>
Returns a new plan where all children were replaced by new plans.
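One way to picture the `self: Arc<Self>` signature is the toy model below, in which a node is rebuilt around new inputs while the original plan stays untouched. Everything here is an assumption made for illustration: the `Plan` trait, the `name` and `describe` helpers, the `String` error type, and the `Scan` and `Filter` types are hypothetical and do not come from DataFusion.

```rust
use std::sync::Arc;

// Hypothetical stand-in trait; `name` exists only so results can be printed.
trait Plan {
    fn name(&self) -> String;
    fn children(&self) -> Vec<Arc<dyn Plan>>;
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String>;
}

struct Scan {
    table: String,
}

struct Filter {
    predicate: String,
    input: Arc<dyn Plan>,
}

impl Plan for Scan {
    fn name(&self) -> String {
        format!("Scan({})", self.table)
    }
    fn children(&self) -> Vec<Arc<dyn Plan>> {
        vec![]
    }
    fn with_new_children(
        self: Arc<Self>,
        children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if !children.is_empty() {
            return Err("Scan takes no children".to_string());
        }
        // A leaf has nothing to replace, so it can return itself.
        Ok(self)
    }
}

impl Plan for Filter {
    fn name(&self) -> String {
        format!("Filter({})", self.predicate)
    }
    fn children(&self) -> Vec<Arc<dyn Plan>> {
        vec![self.input.clone()]
    }
    fn with_new_children(
        self: Arc<Self>,
        mut children: Vec<Arc<dyn Plan>>,
    ) -> Result<Arc<dyn Plan>, String> {
        if children.len() != 1 {
            return Err("Filter takes exactly one child".to_string());
        }
        // Build a new node; the node behind `self` is not mutated.
        Ok(Arc::new(Filter {
            predicate: self.predicate.clone(),
            input: children.remove(0),
        }))
    }
}

// Renders a plan as "Parent <- [children]" for inspection.
fn describe(plan: &Arc<dyn Plan>) -> String {
    let kids: Vec<String> = plan.children().iter().map(|c| describe(c)).collect();
    if kids.is_empty() {
        plan.name()
    } else {
        format!("{} <- [{}]", plan.name(), kids.join(", "))
    }
}

fn main() {
    let original: Arc<dyn Plan> = Arc::new(Filter {
        predicate: "x > 1".to_string(),
        input: Arc::new(Scan { table: "a".to_string() }),
    });
    let new_input: Arc<dyn Plan> = Arc::new(Scan { table: "b".to_string() });
    // Clone the Arc first, because the method consumes its receiver.
    let rewritten = original.clone().with_new_children(vec![new_input]).unwrap();
    println!("{}", describe(&original));
    println!("{}", describe(&rewritten));
}
```

Because the receiver is an `Arc`, a leaf can hand itself back without copying, and a caller that still needs the old plan clones the `Arc` (a cheap reference-count bump) before calling.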
fn execute(
&self,
partition: usize,
context: Arc<TaskContext>
) -> Result<SendableRecordBatchStream>
Creates a stream of RecordBatches for the specified partition.
fn statistics(&self) -> Statistics
Returns the global output statistics for this ExecutionPlan node.
Provided Methods
fn required_child_distribution(&self) -> Distribution
Specifies the data distribution requirements of all the children for this operator.
fn relies_on_input_order(&self) -> bool
Returns true if this operator relies on its inputs being produced in a certain order (for example, that they are sorted a particular way) for correctness.
If true is returned, DataFusion will not apply certain optimizations which might reorder the inputs (such as repartitioning to increase concurrency).
The default implementation returns true.
WARNING: if you override this default and return false, your operator cannot rely on DataFusion preserving the input order, as it likely will not.
fn maintains_input_order(&self) -> bool
Returns false if this operator’s implementation may reorder rows within or between partitions.
For example, Projection, Filter, and Limit maintain the order of inputs: they may transform values (Projection) or not produce the same number of rows that went in (Filter and Limit), but the rows that are produced come out in the same order in which they went in.
DataFusion uses this metadata to apply certain optimizations, such as automatically repartitioning, correctly.
The default implementation returns false.
WARNING: if you override this default, you MUST ensure that the operator maintains the ordering invariant, or else DataFusion may produce incorrect results.
fn benefits_from_input_partitioning(&self) -> bool
Returns true if this operator would benefit from partitioning its input (and thus from more parallelism). For operators that do very little work, the overhead of extra parallelism may outweigh any benefits.
The default implementation returns true unless this operator has signalled that it requires a single child input partition.
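The default rule described above can be modeled as a provided trait method that consults the operator's child distribution requirement. The sketch below is an assumption-laden toy: the two-variant `Distribution` enum, the default of `UnspecifiedDistribution`, and the `Filter`, `GlobalLimit`, and `CheapProjection` types are hypothetical and chosen only to show the shape of the logic.

```rust
// Hypothetical, simplified stand-in for the distribution requirement.
#[derive(Debug, PartialEq)]
enum Distribution {
    UnspecifiedDistribution,
    SinglePartition,
}

// Hypothetical stand-in trait with two provided methods.
trait Plan {
    // Assumed default for this sketch: no particular requirement.
    fn required_child_distribution(&self) -> Distribution {
        Distribution::UnspecifiedDistribution
    }

    // True unless the operator requires a single child input partition.
    fn benefits_from_input_partitioning(&self) -> bool {
        !matches!(
            self.required_child_distribution(),
            Distribution::SinglePartition
        )
    }
}

// Uses both defaults.
struct Filter;
impl Plan for Filter {}

// Requires all input in one partition, so the default answer flips.
struct GlobalLimit;
impl Plan for GlobalLimit {
    fn required_child_distribution(&self) -> Distribution {
        Distribution::SinglePartition
    }
}

// Does so little work that it opts out explicitly.
struct CheapProjection;
impl Plan for CheapProjection {
    fn benefits_from_input_partitioning(&self) -> bool {
        false
    }
}

fn main() {
    println!("Filter: {}", Filter.benefits_from_input_partitioning());
    println!("GlobalLimit: {}", GlobalLimit.benefits_from_input_partitioning());
    println!("CheapProjection: {}", CheapProjection.benefits_from_input_partitioning());
}
```

An operator therefore has two ways to decline extra parallelism: declare a single-partition requirement and let the default follow from it, or override the method directly when the work per row is too small to justify repartitioning.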
fn metrics(&self) -> Option<MetricsSet>
Return a snapshot of the set of Metrics for this ExecutionPlan.
While the values of the metrics in the returned MetricsSet may change as execution progresses, the specific metrics will not.
Once self.execute() has returned (technically, once the future is resolved) for all available partitions, the set of metrics should be complete. If this function is called prior to execute(), new metrics may appear in subsequent calls.
fn fmt_as(&self, _t: DisplayFormatType, f: &mut Formatter<'_>) -> Result
Format this ExecutionPlan to f in the specified type.
Should not include a newline.
Note: this function prints a placeholder by default to preserve backwards compatibility.
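A small standalone sketch of this formatting hook, using only `std::fmt`, is shown below. The `Plan` trait, the single-variant `DisplayFormatType` enum, the `OneLine` wrapper, the placeholder text, and the `Limit` and `Opaque` types are all hypothetical stand-ins, so the exact strings should not be read as DataFusion's output.

```rust
use std::fmt;

// Hypothetical, single-variant stand-in for the format selector.
#[derive(Clone, Copy)]
enum DisplayFormatType {
    Default,
}

// Hypothetical stand-in trait; the default prints a placeholder.
trait Plan {
    fn fmt_as(&self, _t: DisplayFormatType, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Illustrative placeholder text, written without a newline.
        write!(f, "Plan(PlaceHolder)")
    }
}

// Operator that overrides the hook with a one-line description.
struct Limit {
    n: usize,
}

impl Plan for Limit {
    fn fmt_as(&self, _t: DisplayFormatType, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "Limit: n={}", self.n)
    }
}

// Operator that keeps the default placeholder.
struct Opaque;
impl Plan for Opaque {}

// Adapter that lets any plan be used with `{}` formatting.
struct OneLine<'a>(&'a dyn Plan);

impl fmt::Display for OneLine<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        self.0.fmt_as(DisplayFormatType::Default, f)
    }
}

fn main() {
    println!("{}", OneLine(&Limit { n: 5 }));
    println!("{}", OneLine(&Opaque));
}
```

Leaving the newline to the caller lets a wrapper decide how to lay out each node, for example by adding indentation per tree level.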