Crate datafusion_physical_plan
source ·Expand description
Traits for physical query plan, supporting parallel execution for partitioned relations.
Re-exports§
pub use crate::display::DefaultDisplay;pub use crate::display::DisplayAs;pub use crate::display::DisplayFormatType;pub use crate::display::VerboseDisplay;pub use crate::metrics::Metric;pub use crate::stream::EmptyRecordBatchStream;
Modules§
- Aggregates functionalities
- Defines the ANALYZE operator
- CoalesceBatchesExec combines small batches into larger batches for more efficient use of vectorized processing by upstream operators.
- Defines the merge plan for executing partitions in parallel and then merging the results into a single partition
- Defines common code used in execution plans
- Implementation of physical plan display. See
crate::displayablefor examples of how to format - EmptyRelation with produce_one_row=false execution plan
- Defines the EXPLAIN operator
- Defines physical expressions that can evaluated at runtime during query execution
- FilterExec evaluates a boolean predicate against all input batches to determine which rows to include in its output batches.
- Declaration of built-in (scalar) functions. This module contains built-in functions’ enumeration and metadata.
- Functionality used both on logical and physical plans
- Execution plan for writing data to
DataSinks - DataFusion Join implementations
- Defines the LIMIT plan
- Execution plan for reading in-memory batches of data
- Metrics for recording information about execution
- EmptyRelation produce_one_row=true execution plan
- Defines the projection execution plan. A projection determines which columns or expressions are returned from a query. The SQL statement
SELECT a, b, a+b FROM t1is an example of a projection on tablet1where the expressionsa,b, anda+bare the projection expressions.SELECTwithoutFROMwill only evaluate expressions. - Defines the recursive query plan
- This file implements the
RepartitionExecoperator, which maps N input partitions to M output partitions based on a partitioning scheme, optionally maintaining the order of the input rows in the output. - Sort functionalities
- Stream wrappers for physical operators
- Generic plans for deferred execution:
StreamingTableExecandPartitionStream - Utilities for testing datafusion-physical-plan
- This module provides common traits for visiting or rewriting tree nodes easily.
- This module contains functions and structs supporting user-defined aggregate functions.
- UDF support
- The Union operator combines multiple inputs with the same schema
- Defines the unnest column plan for unnesting values in a column that contains a list type, conceptually is like joining each row with all the values in the list column.
- Values execution plan
- Physical expressions for window functions
- Defines the work table query plan
Macros§
- The
handle_async_statemacro adapts thehandle_statemacro for use in asynchronous operations, particularly when dealing withPollresults within async traits likeEagerJoinStream. It polls the asynchronous state-changing function usingpoll_unpinand then passes the result tohandle_statefor further processing. - The
handle_statemacro is designed to process the result of a state-changing operation, encountered e.g. in implementations ofEagerJoinStream. It operates on aStatefulStreamResultby matching its variants and executing corresponding actions. This macro is used to streamline code that deals with state transitions, reducing boilerplate and improving readability. - Macro wraps Err(
$ERR) to add backtrace feature
Structs§
- Statistics for a column within a relation
- Statistics for a relation Fields are optional and can be inexact because the sources sometimes provide approximate estimates for performance reasons and the transformations output are not always predictable.
- Global TopK
Enums§
- Represents the result of evaluating an expression: either a single
ScalarValueor anArrayRef. - How data is distributed amongst partitions. See
Partitioningfor more details. - Specifies how the input to an aggregation or window operator is ordered relative to their
GROUP BYorPARTITION BYexpressions. - Output partitioning supported by
ExecutionPlans.
Traits§
- Tracks an aggregate function’s state.
- An aggregate expression that:
- Represent nodes in the DataFusion Physical Plan.
- Trait that implements the Visitor pattern for a depth first walk of
ExecutionPlannodes.pre_visitis called before any children are visited, and thenpost_visitis called after all children have been visited. PhysicalExprevaluate DataFusion expressions such asA + 1, orCAST(c1 AS int).- Trait for types that stream arrow::record_batch::RecordBatch
- Common trait for window function implementations
Functions§
- Visit all children of this plan, according to the order defined on
ExecutionPlanVisitor. - Execute the ExecutionPlan and collect the results in memory
- Execute the ExecutionPlan and collect the results in memory
- Return a wrapper around an
ExecutionPlanwhich can be displayed in various easier to understand ways. - Execute the ExecutionPlan and return a single stream of
RecordBatches. - Execute the ExecutionPlan and return a vec with one stream per output partition
- Utility function yielding a string representation of the given
ExecutionPlan. - Indicate whether a data exchange is needed for the input of
plan, which will be very helpful especially for the distributed engine to judge whether need to deal with shuffling. Currently there are 3 kinds of execution plan which needs data exchange 1. RepartitionExec for changing the partition number between twoExecutionPlans 2. CoalescePartitionsExec for collapsing all of the partitions into one without ordering guarantee 3. SortPreservingMergeExec for collapsing all of the sorted partitions into one with ordering guarantee - Applies an optional projection to a
SchemaRef, returning the projected schema - Recursively calls
pre_visitandpost_visitfor this node and all of its children, as described onExecutionPlanVisitor - Returns a copy of this plan if we change any child according to the pointer comparison. The size of
childrenmust be equal to the size ofExecutionPlan::children().
Type Aliases§
- Trait for a [
Stream] ofRecordBatches