DataFusion Table Providers
Note: This is not an official Apache Software Foundation project.
The goal of this repo is to extend the capabilities of DataFusion to support additional data sources via implementations of the TableProvider trait.
Many of the table providers in this repo are for querying data from other database systems. Those providers also integrate with the datafusion-federation crate to allow for more efficient query execution, such as pushing down joins between multiple tables from the same database system, or efficiently implementing TopK style queries (SELECT * FROM table ORDER BY foo LIMIT 10).
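To make the benefit of TopK push-down concrete, here is a toy model in plain Rust. It is only a sketch: `RemoteTable`, `scan_all`, and `scan_top_k` are hypothetical names invented for illustration and are not part of DataFusion or datafusion-federation. It compares how many rows cross the wire when `ORDER BY foo LIMIT k` is evaluated locally versus on the remote side.

```rust
/// Toy stand-in for a table living in another database system.
/// Each element represents the value of column `foo` for one row.
struct RemoteTable {
    rows: Vec<i64>,
}

impl RemoteTable {
    /// No push-down: every row is sent to the caller.
    fn scan_all(&self) -> Vec<i64> {
        self.rows.clone()
    }

    /// Push-down: the remote side sorts and truncates,
    /// so at most `k` rows are sent to the caller.
    fn scan_top_k(&self, k: usize) -> Vec<i64> {
        let mut sorted = self.rows.clone();
        sorted.sort();
        sorted.truncate(k);
        sorted
    }
}

/// Evaluates `ORDER BY foo LIMIT k` locally.
/// Returns the result and the number of rows transferred.
fn top_k_without_pushdown(table: &RemoteTable, k: usize) -> (Vec<i64>, usize) {
    let mut fetched = table.scan_all();
    let transferred = fetched.len();
    fetched.sort();
    fetched.truncate(k);
    (fetched, transferred)
}

/// Lets the remote side evaluate `ORDER BY foo LIMIT k`.
/// Returns the result and the number of rows transferred.
fn top_k_with_pushdown(table: &RemoteTable, k: usize) -> (Vec<i64>, usize) {
    let fetched = table.scan_top_k(k);
    let transferred = fetched.len();
    (fetched, transferred)
}
```

Both functions return the same rows, but for a 1000-row table and `k = 10` the first transfers 1000 rows and the second only 10.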
To use these table providers with efficient federation push-down, add the datafusion-federation crate and create a DataFusion SessionContext using the Federation optimizer rule and query planner with:
use datafusion::prelude::SessionContext;
let state = datafusion_federation::default_session_state();
let ctx = SessionContext::with_state(state);
// Register the specific table providers into ctx
// queries will now automatically be federated
Table Providers
- PostgreSQL
- MySQL
- SQLite
- ClickHouse
- DuckDB
- Flight SQL
- ODBC
Development
During development, and especially before opening a PR, it is recommended to run:
This verifies that all features and all crates compile without building binaries. It’s much faster than cargo build and avoids issues with native/shared library dependencies and heavy compilation.
Examples (in Rust)
Run the included examples to see how to use the table providers:
DuckDB
# Read from a table in a DuckDB file
# Create an external table backed by DuckDB directly in DataFusion
# Use the result of a DuckDB function call as the source of a table
SQLite
# Run from repo folder
Postgres
The Postgres example needs a running Postgres server. The following command starts one in a Docker container that the example can use:
# Wait for the Postgres server to start
# Create a table in the Postgres server and insert some data
# Run from repo folder
ClickHouse
The ClickHouse example needs a running ClickHouse server. The following command starts one in a Docker container that the example can use:
# 2. Wait for readiness
until | ; do
done
# 3. Create tables and a parameterized view
# Run from repo folder
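The "wait for the server to start" steps in the database examples can also be done from Rust. The following is a minimal sketch using only the standard library; `wait_for_port` is a hypothetical helper, not something provided by this repo. It polls a TCP address until a connection succeeds, on the assumption that an open port is a good enough readiness signal (a server may accept connections slightly before it accepts queries).

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Polls `addr` until a TCP connection succeeds or `attempts` run out.
/// Returns `true` as soon as one connection attempt succeeds.
fn wait_for_port(addr: SocketAddr, attempts: u32, delay: Duration) -> bool {
    for attempt in 0..attempts {
        // Each attempt gives up after one second so a filtered port cannot hang the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_secs(1)).is_ok() {
            return true;
        }
        // Skip the pause after the final failed attempt.
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    false
}
```

Call it with whichever host and port your container publishes, for example `wait_for_port(addr, 30, Duration::from_secs(1))`, and only run the example once it returns `true`.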
MySQL
The MySQL example needs a running MySQL server. The following command starts one in a Docker container that the example can use:
# Wait for the MySQL server to start
# Create a table in the MySQL server and insert some data
# Run from repo folder
Flight SQL
# or
# cargo install --locked --git https://github.com/roapi/roapi --branch main --bins roapi
&
# Run from repo folder
ODBC
# or
# brew install unixodbc && brew install sqliteodbc
ARM Mac
Please see https://github.com/pacman82/odbc-api#os-x-arm--mac-m1 for reference.
Steps:
- Install unixodbc and sqliteodbc by running brew install unixodbc sqliteodbc.
- Find the local sqliteodbc driver path by running brew info sqliteodbc. The path might look like /opt/homebrew/Cellar/sqliteodbc/0.99991.
- Set up the ODBC config file at ~/.odbcinst.ini with your local sqliteodbc path. Example config file:
[SQLite3]
Description = SQLite3 ODBC Driver
Driver = /opt/homebrew/Cellar/sqliteodbc/0.99991/lib/libsqlite3odbc.dylib
- Test the configuration by running odbcinst -q -d -n SQLite3. If the path is printed out correctly, then you are all set.
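If you would rather generate the config entry than type it by hand, the steps above can be sketched as a small Rust helper. `sqlite3_odbcinst_entry` is a hypothetical function written for this illustration; it assumes the driver library sits at `lib/libsqlite3odbc.dylib` under the directory reported by brew info sqliteodbc, as in the example config above.

```rust
/// Builds an odbcinst.ini entry for the SQLite3 ODBC driver.
/// `install_dir` is the sqliteodbc directory reported by Homebrew,
/// e.g. /opt/homebrew/Cellar/sqliteodbc/0.99991.
fn sqlite3_odbcinst_entry(install_dir: &str) -> String {
    // Tolerate a trailing slash so the Driver path has no doubled separator.
    let dir = install_dir.trim_end_matches('/');
    format!(
        "[SQLite3]\nDescription = SQLite3 ODBC Driver\nDriver = {}/lib/libsqlite3odbc.dylib\n",
        dir
    )
}
```

Write the returned string to ~/.odbcinst.ini, then run the odbcinst check to confirm the driver path is picked up.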
Examples (in Python)
- Start a Python venv
- Enter into venv
- Inside the python/ folder, run maturin develop.
- Inside the python/examples/ folder, run the corresponding test using python3 [file_name].