1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
//! Sort Pushdown Optimization
//!
//! This optimizer attempts to push sort requirements down through the execution plan
//! tree to data sources that can natively handle them (e.g., by scanning files in
//! reverse order).
//!
//! ## How it works
//!
//! 1. Detects `SortExec` nodes in the plan
//! 2. Calls `try_pushdown_sort()` on the input to recursively push the sort requirement
//! 3. Each node type defines its own pushdown behavior:
//! - **Transparent nodes** (CoalesceBatchesExec, RepartitionExec, etc.) delegate to
//! their children and wrap the result
//! - **Data sources** (DataSourceExec) check if they can optimize for the ordering
//! - **Blocking nodes** return `Unsupported` to stop pushdown
//! 4. Based on the result:
//! - `Exact`: Remove the Sort operator (data source guarantees perfect ordering)
//! - `Inexact`: Keep Sort but use optimized input (enables early termination for TopK)
//! - `Unsupported`: No change
//!
//! ## Current capabilities (Phase 1)
//!
//! - Reverse scan optimization: when required sort is the reverse of the data source's
//! natural ordering, enable reverse scanning (reading row groups in reverse order)
//! - Supports prefix matching: if data has ordering [A DESC, B ASC] and query needs
//! [A ASC], reversing gives [A ASC, B DESC] which satisfies the requirement
//!
//! TODO Issue: <https://github.com/apache/datafusion/issues/19329>
//! ## Future enhancements (Phase 2),
//!
//! - File reordering based on statistics
//! - Return `Exact` when files are known to be perfectly sorted
//! - Complete Sort elimination when ordering is guaranteed
use cratePhysicalOptimizerRule;
use Result;
use ConfigOptions;
use ;
use ExecutionPlan;
use SortOrderPushdownResult;
use SortExec;
use Arc;
/// A PhysicalOptimizerRule that attempts to push down sort requirements to data sources.
///
/// See module-level documentation for details.
;