This module contains code to prune “containers” of row groups based on statistics prior to execution. Pruning can significantly improve performance because the plan never has to be evaluated on containers (e.g. an entire file) that cannot contain matching rows.

For example, DataFusion uses this code to prune (skip) row groups while reading Parquet files when the predicate and the row group's statistics show that nothing in the row group can match.

This code can also be used by other systems to prune other entities (e.g. entire files) if the statistics are known via some other source (e.g. a catalog).
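The idea of statistics-based pruning can be illustrated with a minimal, self-contained sketch. It is not this module's API: the names `ContainerStats`, `Predicate`, and `keep_mask` are hypothetical and exist only for illustration, and the sketch assumes a single `i64` column with known minimum and maximum values per container. A container is kept whenever the statistics cannot rule out a match, so pruning is conservative.

```rust
/// Hypothetical min/max statistics for one container
/// (e.g. a row group or a whole file). Illustration only.
#[derive(Debug, Clone, Copy)]
struct ContainerStats {
    min: i64,
    max: i64,
}

/// Hypothetical predicates on a single column `x`.
#[derive(Debug, Clone, Copy)]
enum Predicate {
    /// `x > value`
    Gt(i64),
    /// `x = value`
    Eq(i64),
}

/// Returns one flag per container: `true` means the container might
/// contain matching rows and must be read, `false` means the statistics
/// prove that no row can match, so the container can be skipped.
fn keep_mask(stats: &[ContainerStats], predicate: &Predicate) -> Vec<bool> {
    stats
        .iter()
        .map(|s| match *predicate {
            // Some row can exceed `v` only if the maximum does.
            Predicate::Gt(v) => s.max > v,
            // `v` can appear only if it lies within [min, max].
            Predicate::Eq(v) => s.min <= v && v <= s.max,
        })
        .collect()
}
```

With containers covering the ranges [0, 10], [20, 30], and [5, 25], the predicate `x > 15` lets the first container be skipped, while `x = 12` lets the first two be skipped. A `true` flag never guarantees a match; it only means the container cannot be ruled out.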

Structs

  • Evaluates filter expressions on statistics in order to prune data containers (e.g. a Parquet row group)

Traits