Module datafusion::physical_optimizer::pruning

This module contains code to prune “containers” of row groups based on statistics before execution. Pruning can significantly improve performance because the plan never has to be evaluated on a skipped container (e.g. an entire file).

For example, it is used to prune (skip) row groups while reading Parquet files when the predicate, checked against the row group's statistics, shows that no row in the row group can match.
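The idea can be illustrated with a minimal, standalone sketch. It does not use this module's API: `ColumnStats`, `may_match_gt`, and `may_match_eq` are hypothetical names introduced only for illustration, and the sketch assumes a single integer column with known min/max values per container.

```rust
/// Hypothetical min/max statistics for one column in one container
/// (for example, one Parquet row group). Not a DataFusion type.
#[derive(Debug, Clone, Copy)]
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// Returns `true` if the container might hold a row where `col > threshold`.
/// If even the largest value is not above the threshold, no row can match
/// and the container can be skipped.
pub fn may_match_gt(stats: ColumnStats, threshold: i64) -> bool {
    stats.max > threshold
}

/// Returns `true` if the container might hold a row where `col = value`.
/// A value outside the [min, max] range cannot occur in the container.
pub fn may_match_eq(stats: ColumnStats, value: i64) -> bool {
    stats.min <= value && value <= stats.max
}
```

With statistics `min = 1, max = 10`, the predicate `col > 10` rules the container out, while `col > 9` does not. A `true` result only means the container cannot be skipped; the rows still have to be filtered during execution.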

This code is currently specific to Parquet; making it generic is tracked in https://github.com/apache/arrow-datafusion/issues/363.

Structs

PruningPredicate

Evaluates filter expressions on statistics in order to prune data containers (e.g. Parquet row groups)

Traits

PruningStatistics

Interface to pass statistics information to PruningPredicate
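As a rough illustration of how a statistics interface and a pruning step can fit together, the sketch below uses a simplified stand-in. `ContainerStats`, `RowGroupStats`, and `prune_gt` are hypothetical names that do not reproduce the signatures of `PruningStatistics` or `PruningPredicate`; the sketch assumes one integer column and the fixed predicate `col > threshold`.

```rust
/// Hypothetical, simplified statistics interface for a set of containers.
/// `None` means the statistic is unknown for that container.
pub trait ContainerStats {
    fn num_containers(&self) -> usize;
    fn min_value(&self, container: usize) -> Option<i64>;
    fn max_value(&self, container: usize) -> Option<i64>;
}

/// Hypothetical statistics source: one optional (min, max) pair per row group.
pub struct RowGroupStats {
    pub ranges: Vec<Option<(i64, i64)>>,
}

impl ContainerStats for RowGroupStats {
    fn num_containers(&self) -> usize {
        self.ranges.len()
    }
    fn min_value(&self, container: usize) -> Option<i64> {
        self.ranges[container].map(|(min, _)| min)
    }
    fn max_value(&self, container: usize) -> Option<i64> {
        self.ranges[container].map(|(_, max)| max)
    }
}

/// Returns one flag per container for the predicate `col > threshold`:
/// `true` = keep (might match), `false` = safe to skip.
pub fn prune_gt<S: ContainerStats>(stats: &S, threshold: i64) -> Vec<bool> {
    (0..stats.num_containers())
        .map(|i| match stats.max_value(i) {
            // Known maximum: keep only if some value could exceed the threshold.
            Some(max) => max > threshold,
            // Unknown statistics: keep the container to stay correct.
            None => true,
        })
        .collect()
}
```

For three row groups with ranges `(0, 5)`, `(6, 20)`, and unknown, `prune_gt(&stats, 10)` keeps the second and third and skips the first. Treating missing statistics as "keep" is the conservative choice in this sketch: pruning may only discard containers that provably cannot match.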