pub struct DynamicFilterPhysicalExpr { /* private fields */ }
A dynamic PhysicalExpr that can be updated by anyone with a reference to it.
Any ExecutionPlan that uses this expression and holds a reference to it internally should probably also
implement ExecutionPlan::reset_state to remain compatible with recursive queries and other situations where
the same ExecutionPlan is reused with different data.
Implementations
impl DynamicFilterPhysicalExpr
pub fn new(
    children: Vec<Arc<dyn PhysicalExpr>>,
    inner: Arc<dyn PhysicalExpr>,
) -> Self
Create a new DynamicFilterPhysicalExpr
from an initial expression and a list of children.
The list of children is provided separately because
the initial expression may not have the same children.
For example, if the initial expression is just true
it will not reference any columns, but we may know that
we are going to replace this expression with a real one
that does reference certain columns.
In this case you must pass in the columns that will be
used in the final expression as children to this function
since DataFusion is generally not compatible with dynamic
children in expressions.
To determine the children you can:
- Use collect_columns to collect the columns from the expression.
- Use existing information, such as the sort columns in a SortExec.
Generally, the important point is that the leaf children that reference columns do not change, since those are used to determine which columns need to be read or projected when evaluating the expression.
Any ExecutionPlan that uses this expression and holds a reference to it internally should probably also
implement ExecutionPlan::reset_state to remain compatible with recursive queries and other situations where
the same ExecutionPlan is reused with different data.
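The relationship between the fixed child list and the replaceable inner expression can be illustrated with a small std-only model. This is a sketch under stated assumptions, not DataFusion's implementation: `Expr`, `DynamicFilter`, `referenced_columns`, and `placeholder_for` are hypothetical stand-ins (for `PhysicalExpr`, `DynamicFilterPhysicalExpr`, and `collect_columns` respectively), and children are modeled as plain column names.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a physical expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(bool),
    Column(String),
    Gt(Box<Expr>, i64),
    And(Box<Expr>, Box<Expr>),
}

/// Collects the column names referenced by an expression.
/// Plays the role that `collect_columns` plays in the text above.
fn referenced_columns(expr: &Expr) -> BTreeSet<String> {
    match expr {
        Expr::Literal(_) => BTreeSet::new(),
        Expr::Column(name) => BTreeSet::from([name.clone()]),
        Expr::Gt(inner, _) => referenced_columns(inner),
        Expr::And(left, right) => {
            let mut cols = referenced_columns(left);
            cols.extend(referenced_columns(right));
            cols
        }
    }
}

/// Hypothetical model of a dynamic filter: the children are declared once,
/// while the inner expression starts out as a placeholder.
struct DynamicFilter {
    children: BTreeSet<String>,
    inner: Expr,
}

impl DynamicFilter {
    fn new(children: BTreeSet<String>, inner: Expr) -> Self {
        Self { children, inner }
    }
}

/// Builds a placeholder filter (`true`) whose children are taken from the
/// expression we expect to install later.
fn placeholder_for(expected_final: &Expr) -> DynamicFilter {
    DynamicFilter::new(referenced_columns(expected_final), Expr::Literal(true))
}
```

In this model the placeholder `true` references no columns itself, yet the filter already advertises the columns that the eventual expression will need.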
pub fn current(&self) -> Result<Arc<dyn PhysicalExpr>>
Get the current expression.
This will return the current expression with any children
remapped to match calls to PhysicalExpr::with_new_children.
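One way to picture the remapping is a positional substitution: each original child that appears in the inner expression is replaced by the child at the same index in the new list. The following is a simplified sketch assuming that model; `Expr`, `DynamicFilter`, and `remap` are hypothetical names, and the real method returns a `Result` rather than a bare value.

```rust
/// Hypothetical expression type: a column reference carrying a schema index,
/// or a comparison against a constant.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { name: String, index: usize },
    Gt(Box<Expr>, i64),
}

struct DynamicFilter {
    /// Children as passed to the constructor; the inner expression refers to these.
    original_children: Vec<Expr>,
    /// Children installed by a `with_new_children`-style rewrite, if any.
    remapped_children: Option<Vec<Expr>>,
    inner: Expr,
}

impl DynamicFilter {
    fn new(children: Vec<Expr>, inner: Expr) -> Self {
        Self { original_children: children, remapped_children: None, inner }
    }

    /// Returns a copy of the filter that presents different children.
    fn with_new_children(&self, children: Vec<Expr>) -> Self {
        assert_eq!(children.len(), self.original_children.len());
        Self {
            original_children: self.original_children.clone(),
            remapped_children: Some(children),
            inner: self.inner.clone(),
        }
    }

    /// Returns the inner expression with original children swapped for the
    /// remapped ones (position by position).
    fn current(&self) -> Expr {
        match &self.remapped_children {
            None => self.inner.clone(),
            Some(new_children) => remap(&self.inner, &self.original_children, new_children),
        }
    }
}

/// Recursively replaces any sub-expression equal to `from[i]` with `to[i]`.
fn remap(expr: &Expr, from: &[Expr], to: &[Expr]) -> Expr {
    if let Some(pos) = from.iter().position(|child| child == expr) {
        return to[pos].clone();
    }
    match expr {
        Expr::Column { .. } => expr.clone(),
        Expr::Gt(inner, value) => Expr::Gt(Box::new(remap(inner, from, to)), *value),
    }
}
```

For example, if a projection moves column `a` from index 0 to index 3, a filter built around `a@0 > 10` would report `a@3 > 10` from `current` after the rewrite.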
pub fn update(&self, new_expr: Arc<dyn PhysicalExpr>) -> Result<()>
Update the current expression and notify all waiters.
Any children of this expression must be a subset of the original children passed to the constructor.
This should be called, for example:
- When we’ve computed the build side’s hash table in a HashJoinExec
- After every batch is processed if we update the TopK heap in a SortExec using a TopK approach.
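The update contract (replace the expression, keep the children a subset of those declared, bump a generation) can be sketched with the standard library alone. This is an illustrative model, not the crate's code: `Expr`, `Inner`, `DynamicFilter`, and the `expr` helper are hypothetical, expressions are reduced to SQL-like text plus the columns they reference, and returning an error for undeclared columns is a choice made by this sketch.

```rust
use std::sync::RwLock;

/// Hypothetical expression: SQL-like text plus the columns it references.
#[derive(Debug, Clone, PartialEq)]
struct Expr {
    sql: String,
    columns: Vec<String>,
}

/// Convenience constructor used only by this sketch.
fn expr(sql: &str, columns: &[&str]) -> Expr {
    Expr {
        sql: sql.to_string(),
        columns: columns.iter().map(|c| c.to_string()).collect(),
    }
}

struct Inner {
    generation: u64,
    expr: Expr,
}

struct DynamicFilter {
    /// Columns declared at construction time; never changes afterwards.
    children: Vec<String>,
    inner: RwLock<Inner>,
}

impl DynamicFilter {
    fn new(children: Vec<String>, initial: Expr) -> Self {
        Self { children, inner: RwLock::new(Inner { generation: 1, expr: initial }) }
    }

    /// Installs a new expression if it only references declared children.
    fn update(&self, new_expr: Expr) -> Result<(), String> {
        if let Some(missing) = new_expr.columns.iter().find(|c| !self.children.contains(*c)) {
            return Err(format!("column {missing} was not declared as a child"));
        }
        let mut inner = self.inner.write().map_err(|e| e.to_string())?;
        inner.expr = new_expr;
        inner.generation += 1;
        Ok(())
    }

    fn generation(&self) -> u64 {
        self.inner.read().unwrap().generation
    }

    fn current(&self) -> Expr {
        self.inner.read().unwrap().expr.clone()
    }
}
```

A TopK-style producer would call `update` with a tighter threshold (say `a < 10`, then `a < 5`) as its heap fills, and each successful call advances the generation by one.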
pub fn mark_complete(&self)
Mark this dynamic filter as complete and broadcast to all waiters.
This signals that all expected updates have been received.
Waiters using Self::wait_complete will be notified.
pub async fn wait_update(&self)
Wait asynchronously for any update to this filter.
This method will return when Self::update is called and the generation increases.
It does not guarantee that the filter is complete.
pub async fn wait_complete(self: &Arc<Self>)
Wait asynchronously until this dynamic filter is marked as complete.
This method returns immediately if the filter is already complete or if the filter
is not being used by any consumers.
Otherwise, it waits until Self::mark_complete is called.
Unlike Self::wait_update, this method guarantees that when it returns,
the filter is fully complete with no more updates expected.
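The difference between waiting for any update and waiting for completion can be shown with a blocking analogue built on `Mutex` and `Condvar`. The real methods are `async` and their internals are not described here, so treat this purely as a sketch: `FilterSignal`, `State`, and `run_producer_and_consumer` are hypothetical, `wait_update` takes the last seen generation as an explicit argument, and the "return immediately if unused" behaviour of `wait_complete` is not modeled.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

#[derive(Default)]
struct State {
    generation: u64,
    complete: bool,
}

/// Hypothetical blocking model of the update/complete notifications.
#[derive(Default)]
struct FilterSignal {
    state: Mutex<State>,
    changed: Condvar,
}

impl FilterSignal {
    /// Records one update and wakes every waiter.
    fn update(&self) {
        let mut state = self.state.lock().unwrap();
        state.generation += 1;
        self.changed.notify_all();
    }

    /// Signals that no further updates will arrive.
    fn mark_complete(&self) {
        let mut state = self.state.lock().unwrap();
        state.complete = true;
        self.changed.notify_all();
    }

    /// Blocks until the generation exceeds `seen`; the filter may still change later.
    fn wait_update(&self, seen: u64) -> u64 {
        let guard = self.state.lock().unwrap();
        let guard = self.changed.wait_while(guard, |s| s.generation <= seen).unwrap();
        guard.generation
    }

    /// Blocks until `mark_complete` has been called; returns the final generation.
    fn wait_complete(&self) -> u64 {
        let guard = self.state.lock().unwrap();
        let guard = self.changed.wait_while(guard, |s| !s.complete).unwrap();
        guard.generation
    }
}

/// Spawns a consumer that waits for completion while the caller produces updates.
fn run_producer_and_consumer(updates: u64) -> u64 {
    let signal = Arc::new(FilterSignal::default());
    let consumer = {
        let signal = Arc::clone(&signal);
        thread::spawn(move || signal.wait_complete())
    };
    for _ in 0..updates {
        signal.update();
    }
    signal.mark_complete();
    consumer.join().unwrap()
}
```

Because the consumer only wakes for good once the completion flag is set, the generation it observes is the final one, whereas a `wait_update` caller may see an intermediate state.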
pub fn is_used(self: &Arc<Self>) -> bool
Check if this dynamic filter is being actively used by any consumers.
Returns true if there are references beyond the producer (e.g., the HashJoinExec
that created the filter). This is useful to avoid computing expensive filter
expressions when no consumer will actually use them.
Note: We check the inner Arc’s strong_count, not the outer Arc’s count, because
when filters are transformed (e.g., via reassign_expr_columns during filter pushdown),
new outer Arc instances are created via with_new_children(), but they all share the
same inner Arc<RwLock<Inner>>. This is what allows filter updates to propagate to
consumers even after transformation.
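The note about counting references on the inner `Arc` can be reproduced in miniature. In this sketch (hypothetical `DynamicFilter` and `Inner` types, with children reduced to column names) every transformed copy gets a fresh outer value but clones the same inner `Arc<RwLock<Inner>>`, so `Arc::strong_count` on the inner pointer reveals whether anyone besides the producer holds the filter.

```rust
use std::sync::{Arc, RwLock};

struct Inner {
    generation: u64,
}

/// Hypothetical model: the outer value is rebuilt on transformation,
/// the inner state is shared.
struct DynamicFilter {
    children: Vec<String>,
    inner: Arc<RwLock<Inner>>,
}

impl DynamicFilter {
    fn new(children: Vec<String>) -> Self {
        Self { children, inner: Arc::new(RwLock::new(Inner { generation: 0 })) }
    }

    /// Creates a new outer value that shares the same inner state.
    fn with_new_children(&self, children: Vec<String>) -> Self {
        Self { children, inner: Arc::clone(&self.inner) }
    }

    fn children(&self) -> &[String] {
        &self.children
    }

    /// True if some holder other than this one shares the inner state.
    fn is_used(&self) -> bool {
        Arc::strong_count(&self.inner) > 1
    }

    fn update(&self) {
        self.inner.write().unwrap().generation += 1;
    }

    fn generation(&self) -> u64 {
        self.inner.read().unwrap().generation
    }
}
```

With this layout, an update made through the producer's handle is visible through a consumer's transformed copy, and dropping the last consumer makes `is_used` return `false` again.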
Trait Implementations
impl Debug for DynamicFilterPhysicalExpr
impl Display for DynamicFilterPhysicalExpr
impl Hash for DynamicFilterPhysicalExpr
impl PhysicalExpr for DynamicFilterPhysicalExpr
fn as_any(&self) -> &dyn Any
Returns the physical expression as Any so that it can be downcast to a specific implementation.
fn children(&self) -> Vec<&Arc<dyn PhysicalExpr>>
fn with_new_children(
    self: Arc<Self>,
    children: Vec<Arc<dyn PhysicalExpr>>,
) -> Result<Arc<dyn PhysicalExpr>>
fn data_type(&self, input_schema: &Schema) -> Result<DataType>
fn nullable(&self, input_schema: &Schema) -> Result<bool>
fn evaluate(&self, batch: &RecordBatch) -> Result<ColumnarValue>
fn fmt_sql(&self, f: &mut Formatter<'_>) -> Result
Format this PhysicalExpr in nice human readable “SQL” format.
fn snapshot(&self) -> Result<Option<Arc<dyn PhysicalExpr>>>
Take a snapshot of this PhysicalExpr, if it is dynamic.
fn snapshot_generation(&self) -> u64
Returns the generation of this PhysicalExpr for snapshotting purposes.
The generation is an arbitrary u64 that can be used to track changes
in the state of the PhysicalExpr over time without having to do an exhaustive comparison.
This is useful to avoid unnecessary computation or serialization if there are no changes to the expression.
This matters most for dynamic expressions that may change over time, where the generation allows cheap checks for changes.
Static expressions that do not change over time should return 0, as does the default implementation.
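A typical use of such a counter is to skip work when nothing has changed. The sketch below assumes a hypothetical `DynamicExpr` that exposes its generation and a hypothetical `SerializedCache` that stores a byte encoding of the expression; neither type is part of the crate, and the "serialization" is just the UTF-8 bytes of a SQL-like string.

```rust
/// Hypothetical dynamic expression exposing a generation counter.
struct DynamicExpr {
    generation: u64,
    sql: String,
}

impl DynamicExpr {
    fn new(sql: &str) -> Self {
        Self { generation: 1, sql: sql.to_string() }
    }

    /// Replaces the expression text and bumps the generation.
    fn update(&mut self, sql: &str) {
        self.sql = sql.to_string();
        self.generation += 1;
    }
}

/// Caches a serialized form and rebuilds it only when the generation changes.
struct SerializedCache {
    seen_generation: Option<u64>,
    bytes: Vec<u8>,
    rebuilds: usize,
}

impl SerializedCache {
    fn new() -> Self {
        Self { seen_generation: None, bytes: Vec::new(), rebuilds: 0 }
    }

    /// Returns the cached bytes, re-serializing only if the expression's
    /// generation differs from the one seen last time.
    fn get(&mut self, expr: &DynamicExpr) -> &[u8] {
        if self.seen_generation != Some(expr.generation) {
            self.bytes = expr.sql.as_bytes().to_vec();
            self.seen_generation = Some(expr.generation);
            self.rebuilds += 1;
        }
        &self.bytes
    }
}
```

Comparing one `u64` replaces a structural comparison of two expression trees, which is the saving the text describes; a static expression that always reports generation 0 would be serialized exactly once.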
You should not call this method directly as it does not handle recursion.
Instead use snapshot_generation to handle recursion and capture the
full state of the PhysicalExpr.
fn return_field(
    &self,
    input_schema: &Schema,
) -> Result<Arc<Field>, DataFusionError>
fn evaluate_selection(
    &self,
    batch: &RecordBatch,
    selection: &BooleanArray,
) -> Result<ColumnarValue, DataFusionError>
fn evaluate_bounds(
    &self,
    _children: &[&Interval],
) -> Result<Interval, DataFusionError>
fn propagate_constraints(
    &self,
    _interval: &Interval,
    _children: &[&Interval],
) -> Result<Option<Vec<Interval>>, DataFusionError>
fn evaluate_statistics(
    &self,
    children: &[&Distribution],
) -> Result<Distribution, DataFusionError>
fn propagate_statistics(
    &self,
    parent: &Distribution,
    children: &[&Distribution],
) -> Result<Option<Vec<Distribution>>, DataFusionError>
fn get_properties(
    &self,
    _children: &[ExprProperties],
) -> Result<ExprProperties, DataFusionError>
Calculates the properties of this PhysicalExpr based on its children’s properties (i.e. order and range), recursively aggregating the information from its children. In cases where the PhysicalExpr has no children (e.g., Literal or Column), these properties should be specified externally, as the function defaults to unknown properties.
fn is_volatile_node(&self) -> bool
impl Eq for DynamicFilterPhysicalExpr
Auto Trait Implementations
impl Freeze for DynamicFilterPhysicalExpr
impl !RefUnwindSafe for DynamicFilterPhysicalExpr
impl Send for DynamicFilterPhysicalExpr
impl Sync for DynamicFilterPhysicalExpr
impl Unpin for DynamicFilterPhysicalExpr
impl !UnwindSafe for DynamicFilterPhysicalExpr
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<Q, K> Equivalent<K> for Q
fn equivalent(&self, key: &K) -> bool
Compare self to key and return true if they are equal.
impl<Q, K> Equivalent<K> for Q
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.