use crate::WorkUnit;
use datafusion::common::Result;
use datafusion::execution::TaskContext;
use datafusion::physical_plan::metrics::ExecutionPlanMetricsSet;
use futures::stream::BoxStream;
use std::fmt::Debug;
use std::sync::Arc;

/// Extension point for building user-defined work unit streams consumed by a
/// [`crate::WorkUnitFeed`] embedded in a leaf [`datafusion::physical_plan::ExecutionPlan`].
///
/// Implement this trait on a type that knows how to produce the per-partition stream of
/// work items (e.g. file addresses, external queries, key ranges) that the leaf plan needs
/// at runtime. Then wrap the implementation with [`crate::WorkUnitFeed::new`] and store
/// the resulting [`crate::WorkUnitFeed`] as a field of your [`ExecutionPlan`] node.
///
/// In a distributed context the provider is only invoked on the **coordinating** stage
/// that initiates the query. The work units it produces are serialized and streamed over
/// the network to the workers, which expose the same typed stream to the leaf plan as if
/// it were running locally.
///
/// See [`WorkUnitFeedProvider::feed`] for the per-call contract.
/// Provides contextual information about where a [`WorkUnitFeedProvider`] is being executed.
/// In a distributed query, the provider may run on the coordinating stage, or it may run
/// purely locally because the query did not need any remote execution.