pub struct NetworkCoalesceExec { /* private fields */ }Expand description
ExecutionPlan that coalesces partitions from multiple tasks into a one or more task without performing any repartition, and maintaining the same partitioning scheme.
This is the equivalent of a [CoalescePartitionsExec] but coalescing tasks across the network between distributed stages.
┌───────────────────────────┐ ■
│ NetworkCoalesceExec │ │
│ (task 1) │ │
└┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬┘ Stage N+1
│1││2││3││4││5││6││7││8││9│ │
└─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘└─┘ │
▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ■
┌──┬──┬───────────────────────┴──┴──┘ │ │ │ └──┴──┴──────────────────────┬──┬──┐
│ │ │ │ │ │ │ │ │ ■
┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ │
│1││2││3│ │4││5││6│ │7││8││9│ │
┌┴─┴┴─┴┴─┴──────────────────┐ ┌─────────┴─┴┴─┴┴─┴─────────┐ ┌──────────────────┴─┴┴─┴┴─┴┐ Stage N
│ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │
│ (task 1) │ │ (task 2) │ │ (task 3) │ │
└───────────────────────────┘ └───────────────────────────┘ └───────────────────────────┘ ■The communication between two stages across a NetworkCoalesceExec has two implications:
- Stage N+1 may have one or more tasks. Each consumer task reads a contiguous group of upstream tasks from Stage N.
- Output partitioning for Stage N+1 is sized based on the maximum upstream-group size. When groups are uneven, consumer tasks with smaller groups return empty streams for the “extra” partitions.
┌───────────────────────────┐ ┌───────────────────────────┐ ■
│ NetworkCoalesceExec │ │ NetworkCoalesceExec │ │
│ (task 1) │ │ (task 2) │ │
└┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬─────────┘ └┬─┬┬─┬┬─┬┬─┬┬─┬┬─┬─────────┘ Stage N+1
│1││2││3││4││5││6│ │7││8││9││_││_││_│ │
└─┘└─┘└─┘└─┘└─┘└─┘ └─┘└─┘└─┘└─┘└─┘└─┘ │
▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ▲ ■
┌──┬──┬────────────┴──┴──┘ └──┴──┴─────┬──┬──┐ └──┴──┴────────────────┬──┬──┐
│ │ │ │ │ │ │ │ │ ■
┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ ┌─┐┌─┐┌─┐ │
│1││2││3│ │4││5││6│ │7││8││9│ │
┌┴─┴┴─┴┴─┴──────────────────┐ ┌─────────┴─┴┴─┴┴─┴─────────┐ ┌──────────────────┴─┴┴─┴┴─┴┐ Stage N
│ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │ Arc<dyn ExecutionPlan> │ │
│ (task 1) │ │ (task 2) │ │ (task 3) │ │
└───────────────────────────┘ └───────────────────────────┘ └───────────────────────────┘ ■This node has two variants.
- Pending: acts as a placeholder for the distributed optimization step to mark it as ready.
- Ready: runs within a distributed stage and queries the next input stage over the network using Arrow Flight.
Implementations§
Source§impl NetworkCoalesceExec
impl NetworkCoalesceExec
Sourcepub fn try_new(
input: Arc<dyn ExecutionPlan>,
producer_tasks: usize,
consumer_tasks: usize,
) -> Result<Self>
pub fn try_new( input: Arc<dyn ExecutionPlan>, producer_tasks: usize, consumer_tasks: usize, ) -> Result<Self>
Creates a new NetworkCoalesceExec fed by the provided input plan.
The input plan will be remotely executed in producer_tasks tasks, while the
NetworkCoalesceExec will be executed in consumer_tasks tasks in the stage above.
Typically, this node should be placed right after nodes that coalesce all the input partitions into one, for example:
- [CoalescePartitionsExec]
- [SortPreservingMergeExec]
§Warning
The caller must ensure that the provided consumer_tasks count matches the producer_tasks
of the network boundary immediately above.
Trait Implementations§
Source§impl Clone for NetworkCoalesceExec
impl Clone for NetworkCoalesceExec
Source§fn clone(&self) -> NetworkCoalesceExec
fn clone(&self) -> NetworkCoalesceExec
1.0.0 (const: unstable) · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read moreSource§impl Debug for NetworkCoalesceExec
impl Debug for NetworkCoalesceExec
Source§impl DisplayAs for NetworkCoalesceExec
impl DisplayAs for NetworkCoalesceExec
Source§impl ExecutionPlan for NetworkCoalesceExec
impl ExecutionPlan for NetworkCoalesceExec
Source§fn properties(&self) -> &Arc<PlanProperties> ⓘ
fn properties(&self) -> &Arc<PlanProperties> ⓘ
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
Source§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.Source§fn downcast_delegate(&self) -> Option<&(dyn ExecutionPlan + 'static)>
fn downcast_delegate(&self) -> Option<&(dyn ExecutionPlan + 'static)>
ExecutionPlan downcast identity. Read moreSource§fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
ExecutionPlan. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn reset_state(
self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn reset_state( self: Arc<Self>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn partition_statistics(
&self,
partition: Option<usize>,
) -> Result<Arc<Statistics>, DataFusionError>
fn partition_statistics( &self, partition: Option<usize>, ) -> Result<Arc<Statistics>, DataFusionError>
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan node, if it supports
fetch limits. Returns None otherwise. Read moreSource§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
None means there is no fetch.Source§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Source§fn try_swapping_with_projection(
&self,
_projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_swapping_with_projection( &self, _projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan. Read moreSource§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
Source§fn with_new_state(
&self,
_state: Arc<dyn Any + Send + Sync>,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_new_state( &self, _state: Arc<dyn Any + Send + Sync>, ) -> Option<Arc<dyn ExecutionPlan>>
Source§fn try_pushdown_sort(
&self,
_order: &[PhysicalSortExpr],
) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_pushdown_sort( &self, _order: &[PhysicalSortExpr], ) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
Source§fn with_preserve_order(
&self,
_preserve_order: bool,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_preserve_order( &self, _preserve_order: bool, ) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan that is aware of order-sensitivity. Read moreSource§impl NetworkBoundary for NetworkCoalesceExec
impl NetworkBoundary for NetworkCoalesceExec
Source§fn input_stage(&self) -> &Stage
fn input_stage(&self) -> &Stage
Source§fn with_input_stage(&self, input_stage: Stage) -> Result<Arc<dyn ExecutionPlan>>
fn with_input_stage(&self, input_stage: Stage) -> Result<Arc<dyn ExecutionPlan>>
Source§fn producer_head(&self, _consumer_task_count: usize) -> ProducerHead
fn producer_head(&self, _consumer_task_count: usize) -> ProducerHead
Auto Trait Implementations§
impl !RefUnwindSafe for NetworkCoalesceExec
impl !UnwindSafe for NetworkCoalesceExec
impl Freeze for NetworkCoalesceExec
impl Send for NetworkCoalesceExec
impl Sync for NetworkCoalesceExec
impl Unpin for NetworkCoalesceExec
impl UnsafeUnpin for NetworkCoalesceExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
Source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
T in a tonic::Request