Crate datafusion_expr

DataFusion is an extensible query execution framework that uses Apache Arrow as its in-memory format.

This crate is a submodule of DataFusion that provides types representing logical query plans (LogicalPlan) and logical expressions (Expr) as well as utilities for working with these types.

The expr_fn module contains functions for creating expressions.

Re-exports

Modules

Structs

  • Aggregates its input based on a set of grouping and aggregate expressions (e.g. SUM).
  • Logical representation of a user-defined aggregate function (UDAF). A UDAF differs from a UDF in that it is stateful across batches.
  • Creates a catalog (aka “Database”).
  • Creates a schema.
  • Creates an external table.
  • Creates an in memory table.
  • Creates a view.
  • Applies a cross join to two logical plans.
  • Describes the schema of a table.
  • Removes duplicate rows from the input
  • The operator that modifies the content of a database (adapted from substrait WriteRel)
  • Drops a table.
  • Drops a view.
  • Produces no rows: An empty relation with an empty schema
  • Produces a relation with string representations of various parts of the plan
  • Extension operator defined outside of DataFusion
  • Filters rows from its input that do not match an expression (essentially a WHERE clause with a predicate expression).
  • Join two logical plans on one or more join columns
  • Produces the first n tuples from its input and discards the rest.
  • Evaluates an arbitrary list of expressions (essentially a SELECT with an expression list) on its input.
  • Repartition the plan based on a partitioning scheme.
  • Logical representation of a UDF.
  • Sets a variable’s value in ConfigOptions.
  • The Signature of a function defines its supported input types as well as its volatility.
  • Sorts its input according to a list of sort expressions.
  • Represents some sort of execution plan, in String form
  • Subquery
  • Aliased subquery
  • Produces rows from a table provider by reference or from the context
  • Union multiple inputs
  • Unnest a column that contains a nested list type.
  • Values expression. See Postgres VALUES documentation for more details.
  • Windows its input based on a set of window specs and window functions (e.g. SUM or RANK).
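
The UDAF entry above notes that an aggregate function keeps state across batches. The following is a minimal, self-contained sketch of that idea. It does not use this crate's types: `ToyAccumulator`, `SumAccumulator`, and `sum_over_batches` are hypothetical names chosen for illustration, and plain `i64` slices stand in for Arrow arrays.

```rust
/// Hypothetical, simplified stand-in for a stateful aggregate accumulator.
/// (Assumption: real accumulators operate on Arrow arrays and return results
/// that can fail; both are omitted here.)
trait ToyAccumulator {
    /// Fold one batch of input values into the running state.
    fn update_batch(&mut self, values: &[i64]);
    /// Export the intermediate state so it can be combined elsewhere.
    fn state(&self) -> i64;
    /// Combine state exported by another accumulator (e.g. another partition).
    fn merge(&mut self, other_state: i64);
    /// Produce the final aggregate value.
    fn evaluate(&self) -> i64;
}

/// Running sum: the state survives between calls to `update_batch`.
#[derive(Default)]
struct SumAccumulator {
    sum: i64,
}

impl ToyAccumulator for SumAccumulator {
    fn update_batch(&mut self, values: &[i64]) {
        self.sum += values.iter().sum::<i64>();
    }

    fn state(&self) -> i64 {
        self.sum
    }

    fn merge(&mut self, other_state: i64) {
        self.sum += other_state;
    }

    fn evaluate(&self) -> i64 {
        self.sum
    }
}

/// Feed several batches through one accumulator and return the final value.
fn sum_over_batches(batches: &[Vec<i64>]) -> i64 {
    let mut acc = SumAccumulator::default();
    for batch in batches {
        acc.update_batch(batch);
    }
    acc.evaluate()
}
```

A stateless scalar function, by contrast, would compute its output from each batch alone, with nothing carried over between calls.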

Enums

  • Enum of all built-in scalar functions
  • Represents the result of evaluating an expression: either a single ScalarValue or an ArrayRef.
  • Join constraint
  • Join type
  • A LogicalPlan represents the different types of relational operators (such as Projection, Filter, etc) and can be created by the SQL query planner and the DataFrame API.
  • Operators applied to expressions
  • Logical partitioning schemes supported by the repartition operator.
  • Represents which type of plan, when storing multiple for use in EXPLAIN plans
  • Indicates whether and how a filter expression can be handled by a TableProvider for table scans.
  • Indicates the type of this table for metadata/catalog purposes.
  • A function’s type signature, which defines the function’s supported argument types.
  • A function’s volatility, which defines the function’s eligibility for certain optimizations.
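
The filter push-down entry above says a table provider reports whether and how it can handle a filter. The sketch below shows one way a planner might act on such a report. The `PushDown` enum and both helper functions are hypothetical and defined locally; the three-level split (unsupported, inexact, exact) is an assumption about how such support is commonly modelled, not a transcription of this crate's API.

```rust
/// Hypothetical model of how well a table source can apply a filter itself.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PushDown {
    /// The source cannot use the filter at all.
    Unsupported,
    /// The source can use the filter to skip data, but may still return
    /// rows that do not match.
    Inexact,
    /// The source guarantees that every returned row matches the filter.
    Exact,
}

/// Should the filter expression be handed to the scan?
fn pass_to_scan(support: PushDown) -> bool {
    support != PushDown::Unsupported
}

/// Must the plan still keep a Filter node above the scan?
fn needs_residual_filter(support: PushDown) -> bool {
    match support {
        // The scan ignored the filter, so the plan must apply it.
        PushDown::Unsupported => true,
        // The scan may let non-matching rows through, so re-check them.
        PushDown::Inexact => true,
        // The scan already did all the work.
        PushDown::Exact => false,
    }
}
```

Under this model, only an exact guarantee lets the planner drop the filter from the plan; the inexact case is a pure optimization that reduces how much data the remaining filter has to inspect.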

Statics

  • Types currently supported by the nullif function. The order of these types corresponds to the order in which coercion applies, so it should run from least informative to most informative.

Traits

  • An accumulator represents a stateful object that lives throughout the evaluation of multiple rows and generically accumulates values.
  • Trait for converting a type to a Literal expression.
  • Trait that implements the Visitor pattern for a depth first walk of LogicalPlan nodes. pre_visit is called before any children are visited, and then post_visit is called after all children have been visited. To use, define a struct that implements this trait and then invoke LogicalPlan::accept.
  • The TableSource trait is used during logical query planning and optimizations and provides access to schema information and filter push-down capabilities. This trait provides a subset of the functionality of the TableProvider trait in the core datafusion crate. The TableProvider trait provides additional capabilities needed for physical query execution (such as the ability to perform a scan). The reason for having two separate traits is to avoid having the logical plan code be dependent on the DataFusion execution engine. Other projects may want to use DataFusion’s logical plans and have their own execution engine.
  • Trait for converting a type to a literal timestamp
  • Trait for something that can be formatted as a stringified plan
  • This defines the interface for LogicalPlan nodes that can be used to extend DataFusion with custom relational operators.
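
The visitor trait described above calls pre_visit before a node's children and post_visit after them. The following self-contained sketch illustrates that depth-first order on a toy plan tree. `ToyPlan`, `ToyPlanVisitor`, `TraceVisitor`, and `trace` are hypothetical names defined here; the sketch omits the error handling and early termination a real implementation would likely have.

```rust
/// Hypothetical, tiny plan tree standing in for a logical plan.
enum ToyPlan {
    Scan,
    Filter(Box<ToyPlan>),
    Join(Box<ToyPlan>, Box<ToyPlan>),
}

/// Hypothetical visitor: one hook before the children, one after.
trait ToyPlanVisitor {
    fn pre_visit(&mut self, plan: &ToyPlan);
    fn post_visit(&mut self, plan: &ToyPlan);
}

impl ToyPlan {
    fn name(&self) -> &'static str {
        match self {
            ToyPlan::Scan => "Scan",
            ToyPlan::Filter(_) => "Filter",
            ToyPlan::Join(_, _) => "Join",
        }
    }

    fn children(&self) -> Vec<&ToyPlan> {
        match self {
            ToyPlan::Scan => vec![],
            ToyPlan::Filter(input) => vec![&**input],
            ToyPlan::Join(left, right) => vec![&**left, &**right],
        }
    }

    /// Depth-first walk: pre_visit, then every child, then post_visit.
    fn accept<V: ToyPlanVisitor>(&self, visitor: &mut V) {
        visitor.pre_visit(self);
        for child in self.children() {
            child.accept(visitor);
        }
        visitor.post_visit(self);
    }
}

/// Records the order in which the hooks fire.
#[derive(Default)]
struct TraceVisitor {
    events: Vec<String>,
}

impl ToyPlanVisitor for TraceVisitor {
    fn pre_visit(&mut self, plan: &ToyPlan) {
        self.events.push(format!("pre {}", plan.name()));
    }

    fn post_visit(&mut self, plan: &ToyPlan) {
        self.events.push(format!("post {}", plan.name()));
    }
}

/// Walk a plan and return the recorded hook order.
fn trace(plan: &ToyPlan) -> Vec<String> {
    let mut visitor = TraceVisitor::default();
    plan.accept(&mut visitor);
    visitor.events
}
```

For a Filter over a Join of two Scans, the recorded events start with the Filter's pre hook and end with its post hook, with each child's pair of hooks nested in between.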

Functions