Re-export of the datafusion_physical_optimizer crate.
Modules
- aggregate_statistics - Utilizing exact statistics from sources to avoid scanning data
- coalesce_batches - CoalesceBatches optimizer that groups rows together into bigger batches to avoid the overhead of small batches
- combine_partial_final_agg - CombinePartialFinalAggregate optimizer rule checks adjacent Partial and Final AggregateExecs and tries to combine them if necessary
- enforce_distribution - EnforceDistribution optimizer rule inspects the physical plan with respect to distribution requirements and adds RepartitionExecs to satisfy them when necessary. If increasing parallelism is beneficial (and also desirable according to the configuration), this rule increases partition counts in the physical plan.
- enforce_sorting - EnforceSorting optimizer rule inspects the physical plan with respect to local sorting requirements and does the following:
- filter_pushdown
- join_selection - The JoinSelection rule tries to modify a given plan so that it can accommodate infinite sources and utilize statistical information (if there is any) to obtain more performant plans. To achieve the first goal, it tries to transform a non-runnable query (with the given infinite sources) into a runnable query by replacing pipeline-breaking join operations with pipeline-friendly ones. To achieve the second goal, it selects the proper PartitionMode and the build side using the available statistics for hash joins.
- limit_pushdown - LimitPushdown pushes LIMIT down through ExecutionPlans to reduce data transfer as much as possible.
- limited_distinct_aggregation - A special-case optimizer rule that pushes limit into a grouped aggregation which has no aggregate expressions or sorting requirements
- optimizer - Physical optimizer traits
- output_requirements - The GlobalOrderRequire optimizer rule either:
- projection_pushdown - This file implements the ProjectionPushdown physical optimization rule. The function remove_unnecessary_projections tries to push down all projections one by one if the operator below is amenable to this. If a projection reaches a source, it can even disappear from the plan entirely.
- pruning - PruningPredicate to apply filter Expr to prune “containers” based on statistics (e.g. Parquet Row Groups)
- sanity_checker - The SanityCheckPlan rule ensures that a given plan can accommodate its infinite sources, if there are any. It will reject non-runnable query plans that use pipeline-breaking operators on infinite input(s). In addition, it will check if all order and distribution requirements of a plan are satisfied by its children.
- topk_aggregation - An optimizer rule that detects aggregate operations that could use a limited bucket count
- update_aggr_exprs - An optimizer rule that checks ordering requirements of aggregate expressions and modifies the expressions to work more efficiently if possible.
- utils
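
To illustrate the idea behind the pruning entry above (skipping “containers” whose statistics prove that a filter cannot match), here is a minimal, standalone sketch. It does not use the DataFusion API: `ColumnStats`, `GreaterThan`, and `prune` are hypothetical names invented for this example, and the filter is restricted to the single form `column > threshold` over integer min/max statistics.

```rust
/// Hypothetical min/max statistics for one container (e.g. a row group).
/// This is an illustrative type, not part of DataFusion.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical filter of the form `column > threshold`.
#[derive(Debug, Clone, Copy)]
pub struct GreaterThan {
    pub threshold: i64,
}

impl GreaterThan {
    /// Returns `false` only when the statistics prove that no row in the
    /// container can satisfy the filter; `true` means "might match".
    pub fn might_match(&self, stats: &ColumnStats) -> bool {
        stats.max > self.threshold
    }
}

/// Returns one flag per container: `true` if it must still be scanned,
/// `false` if it can be skipped based on statistics alone.
pub fn prune(filter: &GreaterThan, containers: &[ColumnStats]) -> Vec<bool> {
    containers.iter().map(|s| filter.might_match(s)).collect()
}
```

With a threshold of 50 and containers whose value ranges are 0..=10, 11..=50, and 60..=90, only the last container is kept. The check is conservative: a `true` result does not guarantee matching rows, it only means the statistics cannot rule them out.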
Traits
- PhysicalOptimizerRule - PhysicalOptimizerRule transforms one ExecutionPlan into another which computes the same results, but in a potentially more efficient way.
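
The shape of such a rule (plan in, equivalent plan out) can be sketched with a toy model. The code below is an assumption-laden illustration, not the crate's actual trait: `Plan`, `Rule`, `PushLimitIntoScan`, and `run_rules` are hypothetical names, the plan is a small enum rather than an ExecutionPlan tree, and the real trait's signature differs. The example rule moves a limit below a projection and folds it into the scan, loosely in the spirit of the limit_pushdown entry.

```rust
/// Toy physical plan; a hypothetical stand-in for an ExecutionPlan tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Scan { table: String, fetch: Option<usize> },
    Projection { columns: Vec<String>, input: Box<Plan> },
    Limit { fetch: usize, input: Box<Plan> },
}

/// Simplified, hypothetical rule trait: rewrite a plan into an
/// equivalent plan that is potentially cheaper to execute.
pub trait Rule {
    fn name(&self) -> &str;
    fn optimize(&self, plan: Plan) -> Plan;
}

/// Example rule: push a limit through projections and into the scan.
pub struct PushLimitIntoScan;

impl Rule for PushLimitIntoScan {
    fn name(&self) -> &str {
        "push_limit_into_scan"
    }

    fn optimize(&self, plan: Plan) -> Plan {
        match plan {
            Plan::Limit { fetch, input } => match self.optimize(*input) {
                // A limit directly above a scan folds into the scan,
                // keeping the smaller of the two row limits.
                Plan::Scan { table, fetch: existing } => Plan::Scan {
                    table,
                    fetch: Some(existing.map_or(fetch, |e| e.min(fetch))),
                },
                // A projection does not change the row count, so the
                // limit can move below it and keep being pushed down.
                Plan::Projection { columns, input } => Plan::Projection {
                    columns,
                    input: Box::new(self.optimize(Plan::Limit { fetch, input })),
                },
                // Anything else keeps the limit where it was.
                other => Plan::Limit { fetch, input: Box::new(other) },
            },
            Plan::Projection { columns, input } => Plan::Projection {
                columns,
                input: Box::new(self.optimize(*input)),
            },
            scan @ Plan::Scan { .. } => scan,
        }
    }
}

/// Applies each rule in order, feeding the output of one into the next.
pub fn run_rules(rules: &[Box<dyn Rule>], plan: Plan) -> Plan {
    rules.iter().fold(plan, |p, r| r.optimize(p))
}
```

Applied to a limit of 10 over a projection over a scan of table `t`, the rule returns the projection over a scan with `fetch: Some(10)`: the result set is unchanged, but the scan can stop after ten rows.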