DataFusion is an extensible query execution framework that uses Apache Arrow as the memory model.
DataFusion supports both SQL and a Table/DataFrame-style API for building logical query plans. It also provides a query optimizer and an execution engine that can run queries in parallel, using threads, against partitioned data sources (CSV and Parquet).
DataFusion currently supports simple projection, selection, and aggregate queries.
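As a rough illustration of threaded execution over partitioned data, the sketch below runs one thread per partition and then merges the partial results. It uses only the Rust standard library. The `Partition` type and the `parallel_sum` function are hypothetical names chosen for this example and are not part of DataFusion's API. A real engine would operate on Arrow record batches rather than plain vectors.

```rust
use std::thread;

/// Hypothetical stand-in for one partition of a data source
/// (for example, one CSV or Parquet file). Not a DataFusion type.
type Partition = Vec<i64>;

/// Sums each partition on its own thread, then merges the partial sums.
fn parallel_sum(partitions: Vec<Partition>) -> i64 {
    // Spawn one worker per partition; `collect` forces all threads to start
    // before any of them is joined.
    let handles: Vec<thread::JoinHandle<i64>> = partitions
        .into_iter()
        .map(|partition| thread::spawn(move || partition.iter().sum::<i64>()))
        .collect();

    // Wait for every worker and combine the partial aggregates.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("partition thread panicked"))
        .sum()
}
```

The partial-aggregate-then-merge shape is what makes an aggregate query parallelizable: each partition can be processed without coordinating with the others until the final step.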
DataFusion data sources
DataFusion error types
DataFusion query execution
This module provides a logical query plan enum that can describe queries. Logical query plans can be created from a SQL statement or built programmatically via the Table API.
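To make the idea of a plan enum concrete, here is a minimal sketch of how a query could be described as a tree of enum variants. The `Expr` and `LogicalPlan` definitions below are simplified assumptions written for this example. They are not the actual definitions from this module, whose variants and fields may differ.

```rust
/// Hypothetical expression type; the variant names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    LiteralInt(i64),
    Gt(Box<Expr>, Box<Expr>),
}

/// Hypothetical logical plan: each operator owns its input plan.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    Scan { table: String },
    Selection { predicate: Expr, input: Box<LogicalPlan> },
    Projection { columns: Vec<String>, input: Box<LogicalPlan> },
}

/// Formats an expression as SQL-like text.
fn fmt_expr(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::LiteralInt(value) => value.to_string(),
        Expr::Gt(left, right) => format!("{} > {}", fmt_expr(left), fmt_expr(right)),
    }
}

/// Renders a plan as an indented tree, one operator per line.
fn explain(plan: &LogicalPlan, depth: usize) -> String {
    let pad = "  ".repeat(depth);
    match plan {
        LogicalPlan::Scan { table } => format!("{}Scan: {}\n", pad, table),
        LogicalPlan::Selection { predicate, input } => format!(
            "{}Selection: {}\n{}",
            pad,
            fmt_expr(predicate),
            explain(input, depth + 1)
        ),
        LogicalPlan::Projection { columns, input } => format!(
            "{}Projection: {}\n{}",
            pad,
            columns.join(", "),
            explain(input, depth + 1)
        ),
    }
}

/// Builds, by hand, a plan for `SELECT name FROM people WHERE age > 21`.
fn example_plan() -> LogicalPlan {
    LogicalPlan::Projection {
        columns: vec!["name".to_string()],
        input: Box::new(LogicalPlan::Selection {
            predicate: Expr::Gt(
                Box::new(Expr::Column("age".to_string())),
                Box::new(Expr::LiteralInt(21)),
            ),
            input: Box::new(LogicalPlan::Scan { table: "people".to_string() }),
        }),
    }
}
```

The same tree could equally be produced by a SQL planner or by a programmatic builder, which is why both front ends can share one optimizer and execution engine.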
This module contains a query optimizer that applies simple rewrite rules to a logical plan, such as "Projection Push Down" and "Type Coercion".
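The general shape of a projection push down rule can be sketched as a function from one plan to another. The `Plan` enum and `push_down_projection` function below are hypothetical and greatly simplified. They only illustrate the idea of telling a scan which columns it needs to read, and do not reproduce the rule as implemented in this module.

```rust
/// Hypothetical, simplified plan type for illustrating one optimizer rule.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Table scan; `projection` lists the columns to read (`None` reads all).
    Scan { table: String, projection: Option<Vec<String>> },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Rewrites `Projection(Scan)` so that the scan reads only the projected columns.
fn push_down_projection(plan: Plan) -> Plan {
    match plan {
        Plan::Projection { columns, input } => match *input {
            // The scan has no column list yet: push the projected columns into it.
            Plan::Scan { table, projection: None } => Plan::Projection {
                columns: columns.clone(),
                input: Box::new(Plan::Scan { table, projection: Some(columns) }),
            },
            // Otherwise keep this projection and recurse into its input.
            other => Plan::Projection {
                columns,
                input: Box::new(push_down_projection(other)),
            },
        },
        // A bare scan is left unchanged.
        scan => scan,
    }
}
```

With columnar formats such as Parquet, a rule of this kind can avoid reading columns that the query never uses.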
This module provides a SQL parser that translates SQL queries into an abstract syntax tree (AST), and a SQL query planner that creates a logical plan from the AST.
Table API for building a logical query plan. This is similar to the Table API in Ibis and the DataFrame API in Apache Spark.
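A builder in this style usually wraps the plan constructed so far and returns a new wrapper from each method call, so that calls can be chained. The sketch below shows that pattern with a hypothetical `Table` struct and method names (`scan`, `filter`, `select`, `to_logical_plan`) invented for this example. They are assumptions and should not be read as this crate's actual method signatures. Predicates are kept as plain strings to keep the example short.

```rust
/// Hypothetical, simplified plan type; predicates are plain strings for brevity.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Selection { predicate: String, input: Box<Plan> },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Hypothetical table handle wrapping the logical plan built so far.
#[derive(Debug, Clone, PartialEq)]
struct Table {
    plan: Plan,
}

impl Table {
    /// Starts a plan from a named table.
    fn scan(name: &str) -> Self {
        Table { plan: Plan::Scan { table: name.to_string() } }
    }

    /// Wraps the current plan in a selection (filter) node.
    fn filter(self, predicate: &str) -> Self {
        Table {
            plan: Plan::Selection {
                predicate: predicate.to_string(),
                input: Box::new(self.plan),
            },
        }
    }

    /// Wraps the current plan in a projection node.
    fn select(self, columns: &[&str]) -> Self {
        Table {
            plan: Plan::Projection {
                columns: columns.iter().map(|c| c.to_string()).collect(),
                input: Box::new(self.plan),
            },
        }
    }

    /// Returns the logical plan that has been built.
    fn to_logical_plan(self) -> Plan {
        self.plan
    }
}
```

Under these assumptions, `Table::scan("people").filter("age > 21").select(&["name"])` builds a projection over a selection over a scan. No data is read at this point, since each call only adds a node to the plan.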