Expand description
§DataFusion Index Provider
This crate provides a comprehensive framework for adding index-based scanning capabilities
to DataFusion datafusion::datasource::TableProviders. It enables efficient query execution
by leveraging secondary indexes to reduce I/O and improve query performance through a
sophisticated two-phase execution model.
§Architecture Overview
The crate implements a two-phase execution model:
- Index Phase: Scan one or more indexes to identify primary key values matching the query filters
- Fetch Phase: Use the primary key values to fetch complete records from the underlying storage
This approach is particularly effective for selective queries where indexes can significantly reduce the number of rows that need to be fetched from primary storage.
§Core Components
§Index Management
physical_plan::Index: Trait representing a physical index that can be scanned to retrieve primary key valuesprovider::IndexedTableProvider: Extension of DataFusion’sTableProviderwith index discoverytypes::IndexFilter: Enum representing filter operations that can be pushed down to indexes
§Execution Engine
physical_plan::exec::fetch::RecordFetchExec: Top-level execution plan orchestrating the two phasesphysical_plan::exec::index::IndexScanExec: Execution plan for scanning a single indexphysical_plan::fetcher::RecordFetcher: Trait for fetching complete records using primary key values
§Query Capabilities
The system’s query capabilities are defined by the trait implementations:
§Index Trait Capabilities
Each physical_plan::Index implementation defines its query capabilities through:
supports_predicate(&self, predicate: &Expr) -> Result<bool>: Determines if the index can handle a specific predicate- Default implementation: Supports any predicate that references the index’s column name
- Custom implementations: Can implement sophisticated predicate analysis for complex index types
§IndexedTableProvider Filter Analysis
The provider::IndexedTableProvider trait provides comprehensive filter analysis through
build_index_filter():
§Supported Expression Types
- Simple predicates: Any expression that an index’s
supports_predicate()method accepts - AND operations:
Expr::BinaryExprwithOperator::And- createsIndexFilter::And - OR operations:
Expr::BinaryExprwithOperator::Or- createsIndexFilter::Or - Nested expressions: Recursive traversal supports arbitrarily nested AND/OR combinations
§Filter Processing Rules
- Recursive traversal: Expressions are recursively analyzed to build
types::IndexFiltertrees - All-or-nothing: If any part of an AND/OR expression cannot be indexed, the entire expression is rejected
- Index matching: Each leaf predicate must match exactly one index via
supports_predicate() - Automatic grouping: Multiple root-level indexable filters are automatically wrapped in
IndexFilter::And
§Supported Query Patterns
Based on the trait design, the system supports:
§Basic Index Predicates
- Any predicate supported by the index implementation
- Equality conditions:
indexed_column = 'value' - Inequality conditions:
indexed_column > 100 - Range queries:
indexed_column BETWEEN 10 AND 50
§Logical Combinations
- AND across different indexes:
col_a = 1 AND col_b = 2 - OR across different indexes:
col_a = 1 OR col_b = 2 - Complex nested combinations:
(col_a = 1 AND col_b = 2) OR (col_a = 3 AND col_b = 4)
§Multi-level Nesting
- Arbitrarily deep nesting supported:
((col_a = 1 OR col_a = 2) AND col_b = 3) OR (col_c = 4 AND col_d = 5)
§Execution Plan Generation
The physical_plan::exec::fetch::RecordFetchExec generates
optimized execution plans based on the types::IndexFilter structure:
§IndexFilter::Single - Direct Index Scan
For simple conditions on a single indexed column:
RecordFetchExec
└── IndexScanExec (target_index)§IndexFilter::And - Index Intersection
For conjunctive conditions across multiple indexes, the system builds a left-deep tree of joins to intersect primary key values:
RecordFetchExec
└── Projection(PK columns)
└── HashJoin/SortMergeJoin (INNER on PK columns)
├── Projection(PK columns)
│ └── HashJoin/SortMergeJoin (INNER on PK columns)
│ ├── IndexScanExec (col_a_index)
│ └── IndexScanExec (col_b_index)
└── IndexScanExec (col_c_index)§IndexFilter::Or - Union with Deduplication
For disjunctive conditions, the system uses UnionExec followed by AggregateExec
for automatic primary key deduplication:
RecordFetchExec
└── AggregateExec (GROUP BY PK columns)
└── UnionExec
├── IndexScanExec (col_a)
├── IndexScanExec (col_b)
└── IndexScanExec (col_c)This approach ensures that overlapping results from different indexes are automatically deduplicated before record fetching.
§Implementation Guide
- Implement the Index Trait: Create indexes that can scan and return primary key values
- Implement the RecordFetcher Trait: Define how to fetch complete records using primary key values
- Implement IndexedTableProvider: Expose available indexes and filter analysis capabilities
- Update TableProvider Implementation: Integrate index-based execution into your scan method
§Performance Characteristics
§Execution Plan Efficiency
- Single index: Direct scan with minimal overhead
- AND operations: Intersection cost scales with number of indexes and intermediate result sizes
- OR operations: Union cost includes deduplication overhead but prevents duplicate fetches
§Memory Usage
- Index results: Streamed through execution pipeline to minimize memory footprint
- Join operations: Hash joins require memory proportional to smaller index result set
- Deduplication: OR operations require memory to store unique primary key values during aggregation
§Bounded Execution
- Single partition requirement:
physical_plan::exec::fetch::RecordFetchExecrequires single partition input for correct result merging - Incremental emission: Results are emitted incrementally as they become available
- Bounded memory: Execution is bounded with predictable memory usage patterns
Re-exports§
pub use physical_plan::fetcher::RecordFetcher;pub use physical_plan::Index;pub use provider::IndexedTableProvider;pub use types::IndexFilter;pub use types::IndexFilters;pub use types::UnionMode;
Modules§
- physical_
plan - Physical plan traits and operators for index-based scanning.
- provider
- Index-aware
datafusion::catalog::TableProviderextension trait. - types
- Core types for representing index filter structures. Core type definitions for representing index-based filter operations.