pub struct SpatialJoinExec {
pub left: Arc<dyn ExecutionPlan>,
pub right: Arc<dyn ExecutionPlan>,
pub on: SpatialPredicate,
pub filter: Option<JoinFilter>,
pub join_type: JoinType,
pub metrics: ExecutionPlanMetricsSet,
/* private fields */
}Expand description
Physical execution plan for performing spatial joins between two tables. It uses a spatial index to speed up the join operation.
§Algorithm Overview
The spatial join execution follows a hash-join-like pattern:
- Build Phase: The left (smaller) table geometries are indexed using a spatial index
- Probe Phase: Each geometry from the right table is used to query the spatial index
- Refinement: Candidate pairs from the index are refined using exact spatial predicates
- Output: Matching pairs are combined according to the specified join type
Fields§
§left: Arc<dyn ExecutionPlan>left (build) side which gets hashed
right: Arc<dyn ExecutionPlan>right (probe) side which are filtered by the hash table
on: SpatialPredicatePrimary spatial join condition (the expression in the ON clause of the join)
filter: Option<JoinFilter>Additional filters which are applied while finding matching rows. It could contain part of the ON clause, or expressions in the WHERE clause.
join_type: JoinTypeHow the join is performed (OUTER, INNER, etc)
metrics: ExecutionPlanMetricsSetMetrics for tracking execution statistics (public for wrapper implementations)
Implementations§
Source§impl SpatialJoinExec
impl SpatialJoinExec
Sourcepub fn try_new(
left: Arc<dyn ExecutionPlan>,
right: Arc<dyn ExecutionPlan>,
on: SpatialPredicate,
filter: Option<JoinFilter>,
join_type: &JoinType,
projection: Option<Vec<usize>>,
options: &SpatialJoinOptions,
) -> Result<Self>
pub fn try_new( left: Arc<dyn ExecutionPlan>, right: Arc<dyn ExecutionPlan>, on: SpatialPredicate, filter: Option<JoinFilter>, join_type: &JoinType, projection: Option<Vec<usize>>, options: &SpatialJoinOptions, ) -> Result<Self>
Try to create a new SpatialJoinExec
Sourcepub fn try_new_internal(
left: Arc<dyn ExecutionPlan>,
right: Arc<dyn ExecutionPlan>,
on: SpatialPredicate,
filter: Option<JoinFilter>,
join_type: &JoinType,
projection: Option<Vec<usize>>,
seed: u64,
) -> Result<Self>
pub fn try_new_internal( left: Arc<dyn ExecutionPlan>, right: Arc<dyn ExecutionPlan>, on: SpatialPredicate, filter: Option<JoinFilter>, join_type: &JoinType, projection: Option<Vec<usize>>, seed: u64, ) -> Result<Self>
Create a new SpatialJoinExec with additional options
Sourcepub fn contains_projection(&self) -> bool
pub fn contains_projection(&self) -> bool
Does this join has a projection on the joined columns
Sourcepub fn swap_inputs(&self) -> Result<Arc<dyn ExecutionPlan>>
pub fn swap_inputs(&self) -> Result<Arc<dyn ExecutionPlan>>
Returns a new ExecutionPlan that runs NestedLoopsJoins with the left
and right inputs swapped.
§Notes:
This function should be called BEFORE inserting any repartitioning
operators on the join’s children. Check [super::HashJoinExec::swap_inputs]
for more details.
pub fn with_projection(&self, projection: Option<Vec<usize>>) -> Result<Self>
Trait Implementations§
Source§impl Debug for SpatialJoinExec
impl Debug for SpatialJoinExec
Source§impl DisplayAs for SpatialJoinExec
impl DisplayAs for SpatialJoinExec
Source§impl EmbeddedProjection for SpatialJoinExec
impl EmbeddedProjection for SpatialJoinExec
Source§impl ExecutionPlan for SpatialJoinExec
impl ExecutionPlan for SpatialJoinExec
Source§fn try_swapping_with_projection(
&self,
projection: &ProjectionExec,
) -> Result<Option<Arc<dyn ExecutionPlan>>>
fn try_swapping_with_projection( &self, projection: &ProjectionExec, ) -> Result<Option<Arc<dyn ExecutionPlan>>>
Tries to push projection down through SpatialJoinExec. If possible, performs the
pushdown and returns a new SpatialJoinExec as the top plan which has projections
as its children. Otherwise, returns None.
Source§fn as_any(&self) -> &dyn Any
fn as_any(&self) -> &dyn Any
Any so that it can be
downcast to a specific implementation.Source§fn properties(&self) -> &PlanProperties
fn properties(&self) -> &PlanProperties
ExecutionPlan, such as output
ordering(s), partitioning information etc. Read moreSource§fn maintains_input_order(&self) -> Vec<bool>
fn maintains_input_order(&self) -> Vec<bool>
false if this ExecutionPlan’s implementation may reorder
rows within or between partitions. Read moreSource§fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
fn children(&self) -> Vec<&Arc<dyn ExecutionPlan>>
ExecutionPlans that act as inputs to this plan.
The returned list will be empty for leaf nodes such as scans, will contain
a single value for unary nodes, or two values for binary nodes (such as
joins).Source§fn with_new_children(
self: Arc<Self>,
children: Vec<Arc<dyn ExecutionPlan>>,
) -> Result<Arc<dyn ExecutionPlan>>
fn with_new_children( self: Arc<Self>, children: Vec<Arc<dyn ExecutionPlan>>, ) -> Result<Arc<dyn ExecutionPlan>>
ExecutionPlan where all existing children were replaced
by the children, in orderSource§fn metrics(&self) -> Option<MetricsSet>
fn metrics(&self) -> Option<MetricsSet>
Metrics for this
ExecutionPlan. If no Metrics are available, return None. Read moreSource§fn execute(
&self,
partition: usize,
context: Arc<TaskContext>,
) -> Result<SendableRecordBatchStream>
fn execute( &self, partition: usize, context: Arc<TaskContext>, ) -> Result<SendableRecordBatchStream>
Source§fn static_name() -> &'static strwhere
Self: Sized,
fn static_name() -> &'static strwhere
Self: Sized,
name but can be called without an instance.Source§fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
fn check_invariants(&self, check: InvariantLevel) -> Result<(), DataFusionError>
Source§fn required_input_distribution(&self) -> Vec<Distribution>
fn required_input_distribution(&self) -> Vec<Distribution>
ExecutionPlan, By default it’s [Distribution::UnspecifiedDistribution] for each child,Source§fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
fn required_input_ordering(&self) -> Vec<Option<OrderingRequirements>>
ExecutionPlan. Read moreSource§fn benefits_from_input_partitioning(&self) -> Vec<bool>
fn benefits_from_input_partitioning(&self) -> Vec<bool>
ExecutionPlan benefits from increased
parallelization at its input for each child. Read moreSource§fn reset_state(
self: Arc<Self>,
) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
fn reset_state( self: Arc<Self>, ) -> Result<Arc<dyn ExecutionPlan>, DataFusionError>
ExecutionPlan. Read moreSource§fn repartitioned(
&self,
_target_partitions: usize,
_config: &ConfigOptions,
) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
fn repartitioned( &self, _target_partitions: usize, _config: &ConfigOptions, ) -> Result<Option<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan to
produce target_partitions partitions. Read moreSource§fn statistics(&self) -> Result<Statistics, DataFusionError>
fn statistics(&self) -> Result<Statistics, DataFusionError>
partition_statistics method insteadExecutionPlan node. If statistics are not
available, should return Statistics::new_unknown (the default), not
an error. Read moreSource§fn partition_statistics(
&self,
partition: Option<usize>,
) -> Result<Statistics, DataFusionError>
fn partition_statistics( &self, partition: Option<usize>, ) -> Result<Statistics, DataFusionError>
ExecutionPlan node.
If statistics are not available, should return Statistics::new_unknown
(the default), not an error.
If partition is None, it returns statistics for the entire plan.Source§fn supports_limit_pushdown(&self) -> bool
fn supports_limit_pushdown(&self) -> bool
Source§fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
fn with_fetch(&self, _limit: Option<usize>) -> Option<Arc<dyn ExecutionPlan>>
ExecutionPlan node, if it supports
fetch limits. Returns None otherwise.Source§fn fetch(&self) -> Option<usize>
fn fetch(&self) -> Option<usize>
None means there is no fetch.Source§fn cardinality_effect(&self) -> CardinalityEffect
fn cardinality_effect(&self) -> CardinalityEffect
Source§fn gather_filters_for_pushdown(
&self,
_phase: FilterPushdownPhase,
parent_filters: Vec<Arc<dyn PhysicalExpr>>,
_config: &ConfigOptions,
) -> Result<FilterDescription, DataFusionError>
fn gather_filters_for_pushdown( &self, _phase: FilterPushdownPhase, parent_filters: Vec<Arc<dyn PhysicalExpr>>, _config: &ConfigOptions, ) -> Result<FilterDescription, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown: Read moreSource§fn handle_child_pushdown_result(
&self,
_phase: FilterPushdownPhase,
child_pushdown_result: ChildPushdownResult,
_config: &ConfigOptions,
) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
fn handle_child_pushdown_result( &self, _phase: FilterPushdownPhase, child_pushdown_result: ChildPushdownResult, _config: &ConfigOptions, ) -> Result<FilterPushdownPropagation<Arc<dyn ExecutionPlan>>, DataFusionError>
ExecutionPlan::gather_filters_for_pushdown.
It allows the current node to process the results of filter pushdown from
its children, deciding whether to absorb filters, modify the plan, or pass
filters back up to its parent. Read moreAuto Trait Implementations§
impl Freeze for SpatialJoinExec
impl !RefUnwindSafe for SpatialJoinExec
impl Send for SpatialJoinExec
impl Sync for SpatialJoinExec
impl Unpin for SpatialJoinExec
impl UnsafeUnpin for SpatialJoinExec
impl !UnwindSafe for SpatialJoinExec
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read more