pub struct NetworkCoalesceExec { /* private fields */ }Expand description
ExecutionPlan that coalesces partitions from multiple tasks into a one or more task without performing any repartition, and maintaining the same partitioning scheme.
This is the equivalent of a [CoalescePartitionsExec] but coalescing tasks across the network between distributed stages.
┌───────────────────────────┐ ■
│ NetworkCoalesceExec │ │
│ (task 1) │ │
└┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┘ Stage N+1
│1││2││3││4││5││6││7││8││9│ │
└─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘ │
▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ■
┌──┬──┬───────────────────────┴──┴──┘ │ │ │ └──┴──┴──────────────────────┬──┬──┐
│ │ │ │ │ │ │ │ │ ■
┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ │
│1││2││3│ │4││5││6│ │7││8││9│ │
┌┴─┴┴─┴┴─┴──────────────────┐ ┌─────────┴─┴┴─┴┴─┴─────────┐ ┌──────────────────┴─┴┴─┴┴─┴┐ Stage N
│ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │
│ (task 1) │ │ (task 2) │ │ (task 3) │ │
└───────────────────────────┘ └───────────────────────────┘ └───────────────────────────┘ ■The communication between two stages across a NetworkCoalesceExec has two implications:
- Stage N+1 may have one or more tasks. Each consumer task reads a contiguous group of upstream tasks from Stage N.
- Output partitioning for Stage N+1 is sized based on the maximum upstream-group size. When groups are uneven, consumer tasks with smaller groups return empty streams for the “extra” partitions.
┌───────────────────────────┐ ┌───────────────────────────┐ ■
│ NetworkCoalesceExec │ │ NetworkCoalesceExec │ │
│ (task 1) │ │ (task 2) │ │
└┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬─────────┘ └┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬─────────┘ Stage N+1
│1││2││3││4││5││6│ │7││8││9││_││_││_│ │
└─┘└─┘└─┘└─┘└─┘└─┘ └─┘└─┘└─┘└─┘└─┘└─┘ │
▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ■
┌──┬──┬────────────┴──┴──┘ └──┴──┴─────┬──┬──┐ └──┴──┴────────────────┬──┬──┐
│ │ │ │ │ │ │ │ │ ■
┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ │
│1││2││3│ │4││5││6│ │7││8││9│ │
┌┴─┴┴─┴┴─┴──────────────────┐ ┌─────────┴─┴┴─┴┴─┴─────────┐ ┌──────────────────┴─┴┴─┴┴─┴┐ Stage N
│ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │
│ (task 1) │ │ (task 2) │ │ (task 3) │ │
└───────────────────────────┘ └───────────────────────────┘ └───────────────────────────┘ ■This node has two variants.
- Pending: acts as a placeholder for the distributed optimization step to mark it as ready.
- Ready: runs within a distributed stage and queries the next input stage over the network using Arrow Flight.
Implementations§
Source§impl NetworkCoalesceExec
impl NetworkCoalesceExec
Sourcepub fn try_new(
input: Arc<dyn ExecutionPlan>,
query_id: Uuid,
num: usize,
task_count: usize,
input_task_count: usize,
) -> Result<Self>
pub fn try_new( input: Arc<dyn ExecutionPlan>, query_id: Uuid, num: usize, task_count: usize, input_task_count: usize, ) -> Result<Self>
Builds a new NetworkCoalesceExec in “Pending” state.
Typically, this node should be placed right after nodes that coalesce all the input partitions into one, for example:
- [CoalescePartitionsExec]
- [SortPreservingMergeExec]
Trait Implementations§
Source§impl Clone for NetworkCoalesceExec
impl Clone for NetworkCoalesceExec
Source§fn clone(&self) -> NetworkCoalesceExec
fn clone(&self) -> NetworkCoalesceExec
Returns a duplicate of the value. Read more
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from
source. Read moreSource§impl Debug for NetworkCoalesceExec
impl Debug for NetworkCoalesceExec
Source§impl DisplayAs for NetworkCoalesceExec
impl DisplayAs for NetworkCoalesceExec
Source§impl ExecutionPlan for NetworkCoalesceExec
impl ExecutionPlan for NetworkCoalesceExec
Source§fn as_any(&self) -> &dyn Any
fn as_any(&self) -> &dyn Any
Returns the execution plan as
Any so that it can be
downcast to a specific implementation.Source§fn properties(&self) -> &Arc<PlanProperties>
fn properties(&self) -> &Arc<PlanProperties>
Return properties of the output of the
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
Get a list of children
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
Returns a new
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
Source§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Return a snapshot of the set of
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
Short name for the ExecutionPlan, such as ‘DataSourceExec’.
Like
name but can be called without an instance.Source§fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
Returns an error if this individual node does not conform to its invariants.
These invariants are typically only checked in debug mode. Read more
Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
Specifies the data distribution requirements for all the
children for this
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
Specifies the ordering required for all of the children of this
ExecutionPlan. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
Returns
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
Specifies whether the
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn reset_state(
self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn reset_state( self: Arc<Self>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
Reset any internal state within this
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
If supported, attempt to increase the partitioning of this
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn partition_statistics(
&self,
partition: Option<usize>,
) -> Result<Statistics, DataFusionError>
fn partition_statistics( &self, partition: Option<usize>, ) -> Result<Statistics, DataFusionError>
Returns statistics for a specific partition of this
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
Returns a fetching variant of this
ExecutionPlan node, if it supports
fetch limits. Returns None otherwise.Source§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
Gets the fetch count for the operator,
None means there is no fetch.Source§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Gets the effect on cardinality, if known
Source§fn try_swapping_with_projection(
&self,
_projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_swapping_with_projection( &self, _projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
Attempts to push down the given projection into the input of this
ExecutionPlan. Read moreSource§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription, DataFusionError>
Collect filters that this node can push down to its children.
Filters that are being pushed down from parents are passed in,
and the node may generate additional filters to push down.
For example, given the plan FilterExec -> HashJoinExec -> DataSourceExec,
what will happen is that we recurse down the plan calling
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
Handle the result of a child pushdown. Read more
Source§fn with_new_state(
&self,
_state: Arc<dyn Any + Send + Sync>,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_new_state( &self, _state: Arc<dyn Any + Send + Sync>, ) -> Option<Arc<dyn ExecutionPlan>>
Injects arbitrary run-time state into this execution plan, returning a new plan
instance that incorporates that state if it is relevant to the concrete
node implementation. Read more
Source§fn try_pushdown_sort(
&self,
_order: &[PhysicalSortExpr],
) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_pushdown_sort( &self, _order: &[PhysicalSortExpr], ) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
Try to push down sort ordering requirements to this node. Read more
Source§fn with_preserve_order(
&self,
_preserve_order: bool,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_preserve_order( &self, _preserve_order: bool, ) -> Option<Arc<dyn ExecutionPlan>>
Returns a variant of this
ExecutionPlan that is aware of order-sensitivity. Read moreSource§impl NetworkBoundary for NetworkCoalesceExec
impl NetworkBoundary for NetworkCoalesceExec
Source§fn input_stage(&self) -> &Stage
fn input_stage(&self) -> &Stage
Returns the assigned input Stage, if any.
Source§fn with_input_stage(&self, input_stage: Stage) -> Result<Arc<dyn ExecutionPlan>>
fn with_input_stage(&self, input_stage: Stage) -> Result<Arc<dyn ExecutionPlan>>
Called when a Stage is correctly formed. The NetworkBoundary can use this
information to perform any internal transformations necessary for distributed execution. Read more
Auto Trait Implementations§
impl Freeze for NetworkCoalesceExec
impl !RefUnwindSafe for NetworkCoalesceExec
impl Send for NetworkCoalesceExec
impl Sync for NetworkCoalesceExec
impl Unpin for NetworkCoalesceExec
impl UnsafeUnpin for NetworkCoalesceExec
impl !UnwindSafe for NetworkCoalesceExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value. Read more
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
Source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
Wrap the input message
T in a tonic::Request