Crate datafusion

DataFusion is an extensible query execution framework that uses Apache Arrow as the memory model.

DataFusion supports both SQL and a Table/DataFrame-style API for building logical query plans. It also provides a query optimizer and an execution engine that can run queries in parallel, using threads, against partitioned data sources (CSV and Parquet).
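To illustrate the idea of thread-based parallel execution over partitions, here is a minimal sketch that uses only the Rust standard library. It is not DataFusion's execution engine: the `Partition` alias and the `parallel_sum` function are hypothetical names introduced for this example, and each partition is assumed to be a single in-memory column of `i64` values.

```rust
use std::thread;

/// Hypothetical stand-in for one partition of a data source:
/// a single in-memory column of i64 values.
type Partition = Vec<i64>;

/// Computes a SUM aggregate by summing each partition on its own thread
/// and then combining the partial sums on the calling thread.
fn parallel_sum(partitions: Vec<Partition>) -> i64 {
    // Spawn one worker per partition. Collecting the handles first
    // ensures every worker has started before any of them is joined.
    let handles: Vec<thread::JoinHandle<i64>> = partitions
        .into_iter()
        .map(|partition| thread::spawn(move || partition.iter().sum::<i64>()))
        .collect();

    // Merge the per-partition partial aggregates.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("partition worker panicked"))
        .sum()
}
```

Each worker owns its partition, so no locking is needed; only the small partial results cross thread boundaries.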

DataFusion currently supports simple projection, selection, and aggregate queries.

Modules

datasource

DataFusion data sources

error

DataFusion error types

execution

DataFusion query execution

logicalplan

This module provides a logical query plan enum that can describe queries. Logical query plans can be created from a SQL statement or built programmatically via the Table API.
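As a rough illustration of what a logical query plan enum can look like, the following sketch models a plan as a tree of operators. The `Expr` and `PlanNode` types, their variants, and the `describe` helper are assumptions made for this example; they are not the definitions used by the `logicalplan` module.

```rust
/// Hypothetical, heavily simplified expression type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    LiteralInt(i64),
    Gt(Box<Expr>, Box<Expr>),
}

/// Hypothetical logical plan: each operator owns its input plan.
#[derive(Debug, Clone, PartialEq)]
enum PlanNode {
    Scan { table: String, columns: Vec<String> },
    Selection { predicate: Expr, input: Box<PlanNode> },
    Projection { columns: Vec<String>, input: Box<PlanNode> },
}

/// Renders an expression in SQL-like infix form.
fn fmt_expr(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::LiteralInt(value) => value.to_string(),
        Expr::Gt(left, right) => format!("{} > {}", fmt_expr(left), fmt_expr(right)),
    }
}

/// Renders a plan from the outermost operator down to the scan.
fn describe(plan: &PlanNode) -> String {
    match plan {
        PlanNode::Scan { table, columns } => {
            format!("Scan: {} [{}]", table, columns.join(", "))
        }
        PlanNode::Selection { predicate, input } => {
            format!("Selection: {} <- {}", fmt_expr(predicate), describe(input))
        }
        PlanNode::Projection { columns, input } => {
            format!("Projection: {} <- {}", columns.join(", "), describe(input))
        }
    }
}
```

Under these assumptions, a query such as `SELECT name FROM people WHERE age > 21` would be represented as a `Projection` over a `Selection` over a `Scan`.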

optimizer

This module contains a query optimizer that operates on a logical plan and applies some simple rules to it, such as "Projection Push Down" and "Type Coercion".
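The general idea behind a "Projection Push Down" rule can be sketched as follows: when a projection sits directly on top of a scan, the scan is narrowed so that it reads only the columns the projection needs. The `Plan` type and the `push_down_projection` function below are hypothetical and cover only this one case; the actual optimizer rule is not shown here and may differ.

```rust
/// Hypothetical two-operator plan, just enough to show the rewrite.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, columns: Vec<String> },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Rewrites `Projection(Scan)` so the scan reads only the projected columns.
/// Any other plan shape is returned unchanged or recursed into.
fn push_down_projection(plan: Plan) -> Plan {
    match plan {
        Plan::Projection { columns, input } => match *input {
            Plan::Scan { table, columns: available } => {
                // Keep only the scan columns that the projection refers to.
                let needed: Vec<String> = available
                    .into_iter()
                    .filter(|column| columns.contains(column))
                    .collect();
                Plan::Projection {
                    columns,
                    input: Box::new(Plan::Scan { table, columns: needed }),
                }
            }
            // Nested projection: keep this one and optimize below it.
            other => Plan::Projection {
                columns,
                input: Box::new(push_down_projection(other)),
            },
        },
        other => other,
    }
}
```

For columnar sources such as Parquet, narrowing the scan in this way avoids reading columns that the query never uses.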

sql

This module provides a SQL parser that translates SQL queries into an abstract syntax tree (AST), and a SQL query planner that creates a logical plan from the AST.
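The two stages described here, parsing SQL into an AST and then planning from the AST, can be sketched with a toy example. It handles only statements of the form `SELECT a, b FROM t`; the `SelectAst` and `Plan` types and the `parse_select` and `plan_select` functions are hypothetical and are not part of the `sql` module.

```rust
/// Hypothetical AST for a `SELECT <columns> FROM <table>` statement.
#[derive(Debug, PartialEq)]
struct SelectAst {
    columns: Vec<String>,
    table: String,
}

/// Hypothetical logical plan produced by the planner stage.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan { table: String },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Stage 1: parse SQL text into an AST. Returns `None` for anything
/// other than `SELECT col[, col...] FROM table`.
fn parse_select(sql: &str) -> Option<SelectAst> {
    let tokens: Vec<&str> = sql.split_whitespace().collect();
    let from_pos = tokens.iter().position(|t| t.eq_ignore_ascii_case("FROM"))?;
    // Expect SELECT first, at least one column, and exactly one table name.
    if from_pos < 2
        || tokens.len() != from_pos + 2
        || !tokens[0].eq_ignore_ascii_case("SELECT")
    {
        return None;
    }
    let columns: Vec<String> = tokens[1..from_pos]
        .join(" ")
        .split(',')
        .map(|column| column.trim().to_string())
        .collect();
    if columns.iter().any(|column| column.is_empty()) {
        return None;
    }
    Some(SelectAst { columns, table: tokens[from_pos + 1].to_string() })
}

/// Stage 2: turn the AST into a logical plan.
fn plan_select(ast: SelectAst) -> Plan {
    Plan::Projection {
        columns: ast.columns,
        input: Box::new(Plan::Scan { table: ast.table }),
    }
}
```

Keeping the AST separate from the plan means the parser only has to understand syntax, while the planner decides which operators the query needs.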

table

Table API for building a logical query plan. This is similar to the Table API in Ibis and the DataFrame API in Apache Spark.
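A DataFrame-style API typically builds a plan through chained method calls, with each call wrapping the plan built so far in a new operator. The sketch below shows that pattern with hypothetical `Table` and `Plan` types; the method names `scan`, `filter`, `select`, and `to_plan` are assumptions chosen for this example and are not taken from the `table` module.

```rust
/// Hypothetical logical plan; predicates are kept as plain strings for brevity.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String },
    Selection { predicate: String, input: Box<Plan> },
    Projection { columns: Vec<String>, input: Box<Plan> },
}

/// Hypothetical fluent builder that accumulates a logical plan.
#[derive(Debug, Clone)]
struct Table {
    plan: Plan,
}

impl Table {
    /// Starts a plan that reads from the named table.
    fn scan(name: &str) -> Self {
        Table { plan: Plan::Scan { table: name.to_string() } }
    }

    /// Wraps the current plan in a selection (filter) operator.
    fn filter(self, predicate: &str) -> Self {
        Table {
            plan: Plan::Selection {
                predicate: predicate.to_string(),
                input: Box::new(self.plan),
            },
        }
    }

    /// Wraps the current plan in a projection operator.
    fn select(self, columns: &[&str]) -> Self {
        Table {
            plan: Plan::Projection {
                columns: columns.iter().map(|c| c.to_string()).collect(),
                input: Box::new(self.plan),
            },
        }
    }

    /// Returns the accumulated logical plan.
    fn to_plan(self) -> Plan {
        self.plan
    }
}
```

With these definitions, `Table::scan("people").filter("age > 21").select(&["name"]).to_plan()` yields a projection over a selection over a scan. Nothing is executed while the plan is being built.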