pub struct DistributedLeafExec { /* private fields */ }
Represents a leaf node ready to be distributed across N tasks, where the variant of the node
belonging to each task is stored in a Vec of N positions.
When this plan is sent over the wire to a remote worker, only the variant belonging to that worker's task is sent.
This ExecutionPlan implementation is typically returned by crate::TaskEstimator::scale_up_leaf_node, which the distributed planner calls to scale up a leaf node for distribution. The process typically looks like this:
- The distributed planner calls crate::TaskEstimator::scale_up_leaf_node, providing a leaf node and the number of tasks across which it should be distributed:
┌──────────────┐
│DataSourceExec│ + 3 tasks
└──────────────┘
- The crate::TaskEstimator implementation, either user provided or a default one, returns a DistributedLeafExec adhering to this task count:
┌────────────────────────────────────────────────┐
│              DistributedLeafExec               │
│                                                │
│┌──────────────┐┌──────────────┐┌──────────────┐│
││DataSourceExec││DataSourceExec││DataSourceExec││
││  for task 0  ││  for task 1  ││  for task 2  ││
│└──────────────┘└──────────────┘└──────────────┘│
└────────────────────────────────────────────────┘
- The crate::DistributedExec node, upon being executed, will send the different variants of the leaf node to the respective workers, instead of sending the full DistributedLeafExec:
┌──────────────────┐┌──────────────────┐┌──────────────────┐
│     Worker 0     ││     Worker 1     ││     Worker 2     │
│                  ││                  ││                  │
│       ...        ││       ...        ││       ...        │
│                  ││                  ││                  │
│ ┌──────────────┐ ││ ┌──────────────┐ ││ ┌──────────────┐ │
│ │   SomeExec   │ ││ │   SomeExec   │ ││ │   SomeExec   │ │
│ │              │ ││ │              │ ││ │              │ │
│ └──────────────┘ ││ └──────────────┘ ││ └──────────────┘ │
│ ┌──────────────┐ ││ ┌──────────────┐ ││ ┌──────────────┐ │
│ │DataSourceExec│ ││ │DataSourceExec│ ││ │DataSourceExec│ │
│ │  for task 0  │ ││ │  for task 1  │ ││ │  for task 2  │ │
│ └──────────────┘ ││ └──────────────┘ ││ └──────────────┘ │
└──────────────────┘└──────────────────┘└──────────────────┘
This way, the different workers get to execute different versions of the same plan, each handling its own range of non-overlapping data.
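The three steps above can be modeled with a small standalone sketch. It does not use the real crate: `Scan`, `LeafPerTask`, `scale_up` and `plan_for_task` are hypothetical stand-ins, and splitting a row range into contiguous chunks is only one assumed way of producing non-overlapping variants.

```rust
use std::ops::Range;

/// Hypothetical stand-in for a leaf plan: a scan over a range of row indices.
#[derive(Debug, Clone, PartialEq)]
struct Scan {
    rows: Range<usize>,
}

/// Hypothetical stand-in for DistributedLeafExec: the original leaf plus
/// one variant per task, stored in task order.
#[derive(Debug)]
struct LeafPerTask {
    original: Scan,
    variants: Vec<Scan>,
}

/// Step 2: split `leaf` into `tasks` contiguous, non-overlapping variants.
/// (Assumed strategy; a real estimator is free to split differently.)
fn scale_up(leaf: &Scan, tasks: usize) -> LeafPerTask {
    let len = leaf.rows.end - leaf.rows.start;
    // Ceiling division, guarding against a task count of zero.
    let chunk = if tasks == 0 { 0 } else { (len + tasks - 1) / tasks };
    let variants = (0..tasks)
        .map(|i| {
            let start = (leaf.rows.start + i * chunk).min(leaf.rows.end);
            let end = (start + chunk).min(leaf.rows.end);
            Scan { rows: start..end }
        })
        .collect();
    LeafPerTask {
        original: leaf.clone(),
        variants,
    }
}

/// Step 3: a worker only receives the variant for its own task,
/// never the full set of variants.
fn plan_for_task(node: &LeafPerTask, task: usize) -> Option<&Scan> {
    node.variants.get(task)
}
```

With a leaf covering rows `0..10` and 3 tasks, this sketch yields the variants `0..4`, `4..8` and `8..10`, and `plan_for_task(&node, 1)` returns only the `4..8` scan.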
Implementations
impl DistributedLeafExec
pub fn try_new(
    original: Arc<dyn ExecutionPlan>,
    variants: impl IntoIterator<Item = Arc<dyn ExecutionPlan>>,
) -> Result<Self>
Builds a new DistributedLeafExec based on the provided original plan and its per-task variants. Provided variants must expose the same partition count as the original plan.
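The partition-count requirement can be illustrated with a standalone sketch. `PlanSketch` and `LeafSketch` are hypothetical stand-ins, the `String` error type is an assumption, and the real constructor may perform checks that are not documented here.

```rust
/// Hypothetical stand-in for an execution plan that only tracks its
/// output partition count.
#[derive(Debug, Clone, PartialEq)]
struct PlanSketch {
    partitions: usize,
}

/// Hypothetical stand-in for DistributedLeafExec.
#[derive(Debug)]
struct LeafSketch {
    original: PlanSketch,
    variants: Vec<PlanSketch>,
}

impl LeafSketch {
    /// Mirrors the documented rule: every variant must expose the same
    /// partition count as the original plan. The error type is assumed.
    fn try_new(
        original: PlanSketch,
        variants: impl IntoIterator<Item = PlanSketch>,
    ) -> Result<Self, String> {
        let variants: Vec<PlanSketch> = variants.into_iter().collect();
        for (task, variant) in variants.iter().enumerate() {
            if variant.partitions != original.partitions {
                return Err(format!(
                    "variant for task {task} has {} partitions, expected {}",
                    variant.partitions, original.partitions
                ));
            }
        }
        Ok(Self { original, variants })
    }
}
```

In this sketch, passing a 4-partition original with a 2-partition variant at index 1 returns an error naming task 1, while variants that all have 4 partitions are accepted.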
pub fn original(&self) -> &Arc<dyn ExecutionPlan>
The plan this leaf was built from (the leaf passed to
crate::TaskEstimator::scale_up_leaf_node). Useful for recognising which DistributedLeafExec
you are looking at — e.g. by downcasting it to your own leaf type — before inspecting its
DistributedLeafExec::variants.
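A minimal sketch of recognising a leaf by downcasting, using only the standard library. `LeafPlan`, `MyLeaf`, `OtherLeaf` and `my_leaf_table` are hypothetical names, and the `as_any` accessor is an assumed pattern; the downcasting API exposed by the real ExecutionPlan trait may differ.

```rust
use std::any::Any;
use std::sync::Arc;

/// Hypothetical stand-in for the plan trait, exposing an `as_any` accessor
/// so that callers can downcast to a concrete type.
trait LeafPlan: Send + Sync {
    fn as_any(&self) -> &dyn Any;
}

/// A user-defined leaf type we want to recognise.
struct MyLeaf {
    table: String,
}

impl LeafPlan for MyLeaf {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Some unrelated leaf type.
struct OtherLeaf;

impl LeafPlan for OtherLeaf {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Returns the table name if `original` is a `MyLeaf`, and `None` for any
/// other leaf type, so unrelated leaves can be skipped.
fn my_leaf_table(original: &Arc<dyn LeafPlan>) -> Option<&str> {
    original
        .as_any()
        .downcast_ref::<MyLeaf>()
        .map(|leaf| leaf.table.as_str())
}
```

A caller would run a check like this on the value returned by original() and only then look at the per-task variants.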
pub fn variants(&self) -> &[Arc<dyn ExecutionPlan>]
The per-task variants, in task order: variants()[i] is the plan sent to task i. Useful
for inspecting per-task information (e.g. data locality) when routing tasks to workers via
crate::TaskEstimator::route_tasks.
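As a rough illustration of locality-aware routing over variants in task order, consider the sketch below. The signature of crate::TaskEstimator::route_tasks is not shown on this page, so `VariantInfo`, `route_by_locality` and the round-robin fallback are all assumptions made for the example.

```rust
/// Hypothetical per-task information extracted from a variant, for example
/// the host that stores the files the variant scans.
struct VariantInfo {
    preferred_host: Option<String>,
}

/// Returns, for each task index, the index of the worker that should run it.
/// A task goes to the worker matching its preferred host when one exists,
/// and otherwise falls back to round-robin by task index.
fn route_by_locality(variants: &[VariantInfo], workers: &[String]) -> Vec<usize> {
    if workers.is_empty() {
        // Nothing to route to; avoid a modulo by zero below.
        return Vec::new();
    }
    variants
        .iter()
        .enumerate()
        .map(|(task, variant)| {
            variant
                .preferred_host
                .as_ref()
                .and_then(|host| workers.iter().position(|w| w == host))
                .unwrap_or(task % workers.len())
        })
        .collect()
}
```

Because variants()[i] corresponds to task i, the position of each entry in the returned vector is the task index, which keeps the routing decision aligned with the plan each task will actually receive.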
Trait Implementations
impl Debug for DistributedLeafExec
impl DisplayAs for DistributedLeafExec
impl ExecutionPlan for DistributedLeafExec
fn properties(&self) -> &Arc<PlanProperties>
Return properties of the output of the ExecutionPlan, such as output ordering(s), partitioning information etc.
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
Get a list of children ExecutionPlans that act as inputs to this plan. The returned list will be empty for leaf nodes such as scans, will contain a single value for unary nodes, or two values for binary nodes (such as joins).
fn with_new_children(
    self: Arc<Self>,
    _children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
Returns a new ExecutionPlan where all existing children were replaced by the children, in order.
fn execute(
    &self,
    partition: usize,
    context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn metrics(&self) -> Option<MetricsSet>
Return a snapshot of the set of Metrics for this ExecutionPlan. If no Metrics are available, return None.
fn partition_statistics(
    &self,
    partition: Option<usize>,
) -> Result<Arc<Statistics>>
Returns statistics for a specific partition of this ExecutionPlan node. If statistics are not available, should return Statistics::new_unknown (the default), not an error. If partition is None, it returns statistics for the entire plan.
fn static_name() -> &'static str
where
    Self: Sized,
Like name, but can be called without an instance.
fn downcast_delegate(&self) -> Option<&(dyn ExecutionPlan + 'static)>
ExecutionPlan downcast identity.
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn required_input_distribution(&self) -> Vec<Distribution>
Specifies the data distribution requirements for all the children for this ExecutionPlan. By default it’s Distribution::UnspecifiedDistribution for each child.
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
Specifies the ordering required for all of the children of this ExecutionPlan.
fn maintains_input_order(&self) -> Vec<bool>
Returns false if this ExecutionPlan’s implementation may reorder rows within or between partitions.
fn benefits_from_input_partitioning(&self) -> Vec<bool>
Specifies whether the ExecutionPlan benefits from increased parallelization at its input for each child.
fn reset_state(
    self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
Reset any internal state within this ExecutionPlan.
fn repartitioned(
    &self,
    _target_partitions: usize,
    _config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
If supported, attempt to increase the partitioning of this ExecutionPlan to produce target_partitions partitions.
fn supports_limit_pushdown(&self) -> bool
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
Returns a fetching variant of this ExecutionPlan node, if it supports fetch limits. Returns None otherwise.
fn fetch(&self) -> Option<usize>
Gets the fetch count for the operator. None means there is no fetch.
fn cardinality_effect(&self) -> CardinalityEffect
fn try_swapping_with_projection(
    &self,
    _projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
Attempts to push down the given projection into the input of this ExecutionPlan.
fn gather_filters_for_pushdown(
    &self,
    _phase: FilterPushdownPhase,
    parent_filters: Vec<Arc<dyn PhysicalExpr>>,
    _config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
Collect filters that this node can push down to its children.
fn handle_child_pushdown_result(
    &self,
    _phase: FilterPushdownPhase,
    child_pushdown_result: ChildPushdownResult,
    _config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn with_new_state(
    &self,
    _state: Arc<dyn Any + Send + Sync>,
) -> Option<Arc<dyn ExecutionPlan>>
fn try_pushdown_sort(
    &self,
    _order: &[PhysicalSortExpr],
) -> Result<SortOrderPushdownResult<Arc<dyn ExecutionPlan>>, DataFusionError>
fn with_preserve_order(
    &self,
    _preserve_order: bool,
) -> Option<Arc<dyn ExecutionPlan>>
Returns a variant of this ExecutionPlan that is aware of order-sensitivity.
Auto Trait Implementations
impl !RefUnwindSafe for DistributedLeafExec
impl !UnwindSafe for DistributedLeafExec
impl Freeze for DistributedLeafExec
impl Send for DistributedLeafExec
impl Sync for DistributedLeafExec
impl Unpin for DistributedLeafExec
impl UnsafeUnpin for DistributedLeafExec
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> IntoRequest<T> for T
fn into_request(self) -> Request<T>
Wrap the input message T in a tonic::Request