Crate datafusion
DataFusion is an extensible query execution framework that uses Apache Arrow as the memory model.
DataFusion supports both SQL and a Table/DataFrame-style API for building logical query plans. It also provides a query optimizer and an execution engine that can run queries in parallel, using threads, against partitioned data sources (CSV and Parquet).
DataFusion currently supports simple projection, selection, and aggregate queries.
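To make the idea of threaded execution over partitioned data concrete, here is a minimal sketch that uses only the Rust standard library. It is not DataFusion's execution engine or API: the `Row` type and the `scan_partition` and `parallel_avg` functions are hypothetical names invented for illustration, and in-memory vectors stand in for CSV or Parquet partitions. Each partition is processed on its own thread with a selection, a projection, and a partial aggregate, and the partial results are then merged.

```rust
use std::thread;

/// Hypothetical row type for illustration: (city, temperature).
type Row = (String, f64);

/// Processes one partition: selects rows above `min_temp`, projects the
/// temperature column, and returns a partial aggregate as (sum, count).
fn scan_partition(rows: &[Row], min_temp: f64) -> (f64, usize) {
    rows.iter()
        .filter(|(_, temp)| *temp > min_temp) // selection
        .map(|(_, temp)| *temp) // projection
        .fold((0.0, 0), |(sum, count), temp| (sum + temp, count + 1)) // partial aggregate
}

/// Spawns one thread per partition, then merges the partial aggregates
/// into a single average. Returns `None` if no row passes the filter.
fn parallel_avg(partitions: Vec<Vec<Row>>, min_temp: f64) -> Option<f64> {
    let handles: Vec<_> = partitions
        .into_iter()
        .map(|partition| thread::spawn(move || scan_partition(&partition, min_temp)))
        .collect();

    let (sum, count) = handles
        .into_iter()
        .map(|handle| handle.join().expect("partition thread panicked"))
        .fold((0.0, 0usize), |(s, c), (ps, pc)| (s + ps, c + pc));

    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```

Calling `parallel_avg` with a `Vec` of partitions and a threshold plays the role of a query such as "average temperature where temperature exceeds the threshold". Returning partial (sum, count) pairs rather than per-partition averages keeps the merged result correct when partitions differ in size.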
Modules
datasource | DataFusion data sources |
error | DataFusion error types |
execution | DataFusion query execution |
logicalplan | This module provides a logical query plan enum that can describe queries. Logical query plans can be created from a SQL statement or built programmatically via the Table API. |
optimizer | This module contains a query optimizer that applies simple rewrite rules to a logical plan, such as "Projection Push Down" and "Type Coercion". |
sql | This module provides a SQL parser that translates SQL queries into an abstract syntax tree (AST), and a SQL query planner that creates a logical plan from the AST. |
table | Table API for building a logical query plan. This is similar to the Table API in Ibis and the DataFrame API in Apache Spark. |
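The relationship between a logical plan enum and an optimizer rule such as "Projection Push Down" can be sketched with a toy example. The `Plan` enum and the `push_down_projection` function below are hypothetical and deliberately tiny. They are not the types exported by the `logicalplan` or `optimizer` modules, and the sketch assumes that a scan returns its projected columns in the order listed.

```rust
/// Toy logical plan for illustration only; NOT DataFusion's actual enum.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Reads a table. `projection` lists the column indices to load;
    /// `None` means every column is read.
    Scan {
        table: String,
        projection: Option<Vec<usize>>,
    },
    /// Keeps only the given column indices of its input.
    Projection { columns: Vec<usize>, input: Box<Plan> },
}

/// Simplified "projection push down": when a projection sits directly on
/// a scan that reads all columns, fold the column list into the scan so
/// that unused columns are never loaded. Other plans are left unchanged,
/// apart from recursing into a projection's input.
fn push_down_projection(plan: Plan) -> Plan {
    match plan {
        Plan::Projection { columns, input } => match *input {
            Plan::Scan {
                table,
                projection: None,
            } => Plan::Scan {
                table,
                projection: Some(columns),
            },
            other => Plan::Projection {
                columns,
                input: Box::new(push_down_projection(other)),
            },
        },
        other => other,
    }
}
```

Because the rule takes a plan and returns a plan, several such rules can be applied one after another, which is one common way to structure a rule-based optimizer.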