pub struct NestedLoopJoinExec { /* private fields */ }Expand description
NestedLoopJoinExec is build-probe join operator, whose main task is to
perform joins without any equijoin conditions in ON clause.
Execution consists of following phases:
§1. Build phase
Collecting build-side data in memory, by polling all available data from build-side input. Due to the absence of equijoin conditions, it’s not possible to partition build-side data across multiple threads of the operator, so build-side is always collected in a single batch shared across all threads. The operator always considers LEFT input as build-side input, so it’s crucial to adjust smaller input to be the LEFT one. Normally this selection is handled by physical optimizer.
§2. Probe phase
Sequentially polling batches from the probe-side input and processing them according to the following logic:
- apply join filter (
ONclause) to Cartesian product of probe batch and build side data – filter evaluation is executed once per build-side data row - update shared bitmap of joined (“visited”) build-side row indices, if required – allows to produce unmatched build-side data in case of e.g. LEFT/FULL JOIN after probing phase completed
- perform join index alignment is required – depending on
JoinType - produce output join batch
Probing phase is executed in parallel, according to probe-side input partitioning – one thread per partition. After probe input is exhausted, each thread ATTEMPTS to produce unmatched build-side data.
§3. Producing unmatched build-side data
Producing unmatched build-side data as an output batch, after probe input is exhausted. This step is also executed in parallel (once per probe input partition), and to avoid duplicate output of unmatched data (due to shared nature build-side data), each thread “reports” about probe phase completion (which means that “visited” bitmap won’t be updated anymore), and only the last thread, reporting about completion, will return output.
Implementations§
source§impl NestedLoopJoinExec
impl NestedLoopJoinExec
sourcepub fn try_new(
left: Arc<dyn ExecutionPlan>,
right: Arc<dyn ExecutionPlan>,
filter: Option<JoinFilter>,
join_type: &JoinType,
) -> Result<Self>
pub fn try_new( left: Arc<dyn ExecutionPlan>, right: Arc<dyn ExecutionPlan>, filter: Option<JoinFilter>, join_type: &JoinType, ) -> Result<Self>
Try to create a new NestedLoopJoinExec
sourcepub fn left(&self) -> &Arc<dyn ExecutionPlan>
pub fn left(&self) -> &Arc<dyn ExecutionPlan>
left side
sourcepub fn right(&self) -> &Arc<dyn ExecutionPlan>
pub fn right(&self) -> &Arc<dyn ExecutionPlan>
right side
sourcepub fn filter(&self) -> Option<&JoinFilter>
pub fn filter(&self) -> Option<&JoinFilter>
Filters applied before join output
Trait Implementations§
source§impl Debug for NestedLoopJoinExec
impl Debug for NestedLoopJoinExec
source§impl DisplayAs for NestedLoopJoinExec
impl DisplayAs for NestedLoopJoinExec
source§impl ExecutionPlan for NestedLoopJoinExec
impl ExecutionPlan for NestedLoopJoinExec
source§fn name(&self) -> &'static str
fn name(&self) -> &'static str
source§fn as_any(&self) -> &dyn Any
fn as_any(&self) -> &dyn Any
Any so that it can be
downcast to a specific implementation.source§fn properties(&self) -> &PlanProperties
fn properties(&self) -> &PlanProperties
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moresource§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,source§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan where all existing children were replaced
by the children, in ordersource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
source§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moresource§fn statistics(&self) -> Result<Statistics>
fn statistics(&self) -> Result<Statistics>
ExecutionPlan node. If statistics are not
available, should return Statistics::new_unknown (the default), not
an error.source§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.source§fn required_input_ordering(&self) -> Vec<Option<LexRequirement>>
fn required_input_ordering(&self) -> Vec<Option<LexRequirement>>
ExecutionPlan. Read moresource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moresource§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moresource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>>
ExecutionPlan to
produce target_partitions partitions. Read moresource§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
source§fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan node, if it supports
fetch limits. Returns None otherwise.Auto Trait Implementations§
impl !Freeze for NestedLoopJoinExec
impl !RefUnwindSafe for NestedLoopJoinExec
impl Send for NestedLoopJoinExec
impl Sync for NestedLoopJoinExec
impl Unpin for NestedLoopJoinExec
impl !UnwindSafe for NestedLoopJoinExec
Blanket Implementations§
source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
source§impl<T> IntoEither for T
impl<T> IntoEither for T
source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moresource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read more