Crate datafusion_physical_plan

source ·
Expand description

Traits for physical query plan, supporting parallel execution for partitioned relations.

Re-exports§

Modules§

  • Aggregates functionalities
  • Defines the ANALYZE operator
  • CoalesceBatchesExec combines small batches into larger batches for more efficient use of vectorized processing by upstream operators.
  • Defines the merge plan for executing partitions in parallel and then merging the results into a single partition
  • Defines common code used in execution plans
  • Implementation of physical plan display. See crate::displayable for examples of how to format
  • EmptyRelation with produce_one_row=false execution plan
  • Defines the EXPLAIN operator
  • Defines physical expressions that can evaluated at runtime during query execution
  • FilterExec evaluates a boolean predicate against all input batches to determine which rows to include in its output batches.
  • Deprecated module. Add new feature in scalar_function.rs
  • Functionality used both on logical and physical plans
  • Execution plan for writing data to DataSinks
  • DataFusion Join implementations
  • Defines the LIMIT plan
  • Execution plan for reading in-memory batches of data
  • Metrics for recording information about execution
  • EmptyRelation produce_one_row=true execution plan
  • Defines the projection execution plan. A projection determines which columns or expressions are returned from a query. The SQL statement SELECT a, b, a+b FROM t1 is an example of a projection on table t1 where the expressions a, b, and a+b are the projection expressions. SELECT without FROM will only evaluate expressions.
  • Defines the recursive query plan
  • This file implements the RepartitionExec operator, which maps N input partitions to M output partitions based on a partitioning scheme, optionally maintaining the order of the input rows in the output.
  • Sort functionalities
  • Stream wrappers for physical operators
  • Generic plans for deferred execution: StreamingTableExec and PartitionStream
  • Utilities for testing datafusion-physical-plan
  • This module provides common traits for visiting or rewriting tree nodes easily.
  • The Union operator combines multiple inputs with the same schema
  • Define a plan for unnesting values in columns that contain a list type.
  • Values execution plan
  • Physical expressions for window functions
  • Defines the work table query plan

Macros§

  • The handle_async_state macro adapts the handle_state macro for use in asynchronous operations, particularly when dealing with Poll results within async traits like EagerJoinStream. It polls the asynchronous state-changing function using poll_unpin and then passes the result to handle_state for further processing.
  • The handle_state macro is designed to process the result of a state-changing operation, encountered e.g. in implementations of EagerJoinStream. It operates on a StatefulStreamResult by matching its variants and executing corresponding actions. This macro is used to streamline code that deals with state transitions, reducing boilerplate and improving readability.
  • Macro wraps Err($ERR) to add backtrace feature

Structs§

  • Statistics for a column within a relation
  • Stores certain, often expensive to compute, plan properties used in query optimization.
  • Statistics for a relation Fields are optional and can be inexact because the sources sometimes provide approximate estimates for performance reasons and the transformations output are not always predictable.
  • Global TopK

Enums§

  • The result of evaluating an expression.
  • How data is distributed amongst partitions. See Partitioning for more details.
  • Describes the execution mode of an operator’s resulting stream with respect to its size and behavior. There are three possible execution modes: Bounded, Unbounded and PipelineBreaking.
  • Specifies how the input to an aggregation or window operator is ordered relative to their GROUP BY or PARTITION BY expressions.
  • Output partitioning supported by ExecutionPlans.

Traits§

Functions§

  • Visit all children of this plan, according to the order defined on ExecutionPlanVisitor.
  • Execute the ExecutionPlan and collect the results in memory
  • Execute the ExecutionPlan and collect the results in memory
  • Return a wrapper around an ExecutionPlan which can be displayed in various easier to understand ways.
  • Execute the ExecutionPlan and return a single stream of RecordBatches.
  • Execute the ExecutionPlan and return a vec with one stream per output partition
  • Utility function yielding a string representation of the given ExecutionPlan.
  • Indicate whether a data exchange is needed for the input of plan, which will be very helpful especially for the distributed engine to judge whether need to deal with shuffling. Currently there are 3 kinds of execution plan which needs data exchange 1. RepartitionExec for changing the partition number between two ExecutionPlans 2. CoalescePartitionsExec for collapsing all of the partitions into one without ordering guarantee 3. SortPreservingMergeExec for collapsing all of the sorted partitions into one with ordering guarantee
  • Applies an optional projection to a SchemaRef, returning the projected schema
  • Recursively calls pre_visit and post_visit for this node and all of its children, as described on ExecutionPlanVisitor
  • Returns a copy of this plan if we change any child according to the pointer comparison. The size of children must be equal to the size of ExecutionPlan::children().

Type Aliases§