pub struct BroadcastExec { /* private fields */ }Expand description
ExecutionPlan that scales up partitions for network broadcasting.
This plan takes N input partitions and exposes N*M output partitions,
where M is the number of consumer tasks. Each virtual partition i
returns the cached result of input partition i % N.
This allows each consumer task to fetch a unique set of partition numbers, the virtual partitions, while all receiving the same data via the actual partitions. This structure maintains the invariant that each partition is executed exactly once by the framework.
Broadcast is used in a 1 to many context, like this:
┌────────────────────────┐ ┌────────────────────────┐ ┌────────────────────────┐ ■
│ NetworkBroadcastExec │ │ NetworkBroadcastExec │ ... │ NetworkBroadcastExec │ │
│ (task 1) │ │ (task 2) │ │ (task M) │ Stage N+1
└┬─┬─────┬───┬───────────┘ └───────┬─┬─────┬────┬───┘ └─────┬──────┬─────┬────┬┘ │
│0│ │N-1│ │N│ │2N-1│ │(M-1)N│ │MN-1│ │
└▲┘ ... └▲──┘ └▲┘ ... └──▲─┘ └───▲──┘ ... └──▲─┘ ■
│ │ Populates │ │ │ │
│ └────Cache Index ───┐ Cache Hit Cache Hit ┌──Cache Hit────┘ │
│ N-1 │ Index 0 Index N-1 │ │
└────Populates ─────┐ │ │ │ │ ┌───Cache Hit──┘
Cache Index 0 │ │ │ │ │ │
┌┴┐ ... ┌┴──┐ ┌┴┐ ... ┌──┴─┐ ... ┌───┴──┐ ... ┌───┴┐ ■
│0│ │N-1│ │N│ │2N-1│ │(M-1)N│ │MN-1│ │
┌┴─┴─────┴───┴──────┴─┴─────┴────┴───────────────────┴──────┴─────┴────┴┐ │
│ BroadcastExec │ │
│ ┌───────────────────────────┐ │ │
│ │ Batch Cache │ │ Stage N
│ │┌─────────┐ ┌─────────┐│ │ │
│ ││ index 0 │ ... │index N-1││ │ │
│ │└─────────┘ └─────────┘│ │ │
│ └───────────────────────────┘ │ │
└───────────────────────────┬─┬──────────┬───┬──────────────────────────┘ ■
│0│ │N-1│
└▲┘ ... └─▲─┘
│ │
┌──┘ └──┐
│ │ ■
┌┴┐ ... ┌──┴┐ │
│0│ │N-1│ Stage N-1
┌┴─┴───────────────┴───┴┐ │
│Arc<dyn ExecutionPlan> │ │
└───────────────────────┘ ■Notice that the first consumer task, [NetworkBroadcastExec] task 1, triggers the execution of the operator below the [BroadCastExec] and populates each cache index with the respective partition. Subsequent consumer tasks, rather than executing the same partitions, read the data from the cache for each partition.
Implementations§
Source§impl BroadcastExec
impl BroadcastExec
pub fn new(input: Arc<dyn ExecutionPlan>, consumer_task_count: usize) -> Self
pub fn input_partition_count(&self) -> usize
pub fn consumer_task_count(&self) -> usize
Trait Implementations§
Source§impl Debug for BroadcastExec
impl Debug for BroadcastExec
Source§impl DisplayAs for BroadcastExec
impl DisplayAs for BroadcastExec
Source§impl ExecutionPlan for BroadcastExec
impl ExecutionPlan for BroadcastExec
Source§fn as_any(&self) -> &dyn Any
fn as_any(&self) -> &dyn Any
Any so that it can be
downcast to a specific implementation.Source§fn properties(&self) -> &Arc<PlanProperties>
fn properties(&self) -> &Arc<PlanProperties>
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
Source§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.Source§fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
ExecutionPlan. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn reset_state(
self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn reset_state( self: Arc<Self>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn partition_statistics(
&self,
partition: Option<usize>,
) -> Result<Statistics, DataFusionError>
fn partition_statistics( &self, partition: Option<usize>, ) -> Result<Statistics, DataFusionError>
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan node, if it supports
fetch limits. Returns None otherwise.Source§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
None means there is no fetch.Source§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Source§fn try_swapping_with_projection(
&self,
_projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_swapping_with_projection( &self, _projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan. Read moreSource§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
Source§fn with_new_state(
&self,
_state: Arc<dyn Any + Send + Sync>,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_new_state( &self, _state: Arc<dyn Any + Send + Sync>, ) -> Option<Arc<dyn ExecutionPlan>>
Source§fn try_pushdown_sort(
&self,
_order: &[PhysicalSortExpr],
) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
fn try_pushdown_sort( &self, _order: &[PhysicalSortExpr], ) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
Source§fn with_preserve_order(
&self,
_preserve_order: bool,
) -> Option<Arc<dyn ExecutionPlan>>
fn with_preserve_order( &self, _preserve_order: bool, ) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan that is aware of order-sensitivity. Read moreAuto Trait Implementations§
impl Freeze for BroadcastExec
impl !RefUnwindSafe for BroadcastExec
impl Send for BroadcastExec
impl Sync for BroadcastExec
impl Unpin for BroadcastExec
impl UnsafeUnpin for BroadcastExec
impl !UnwindSafe for BroadcastExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> Instrument for T
impl<T> Instrument for T
Source§fn instrument(self, span: Span) -> Instrumented<Self>
fn instrument(self, span: Span) -> Instrumented<Self>
Source§fn in_current_span(self) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> IntoRequest<T> for T
impl<T> IntoRequest<T> for T
Source§fn into_request(self) -> Request<T>
fn into_request(self) -> Request<T>
T in a tonic::Request