stately-arrow
Arrow-based data connectivity and query execution over HTTP APIs.
Overview
This crate provides a flexible abstraction layer over DataFusion for building data query services with support for multiple backend connectors. It's designed to be mounted as an axum router and pairs with the @statelyjs/arrow frontend plugin.
Features
- Multi-Backend Support - Connect to object stores (S3, GCS, Azure) and databases (ClickHouse)
- Streaming Queries - Execute SQL with Arrow IPC streaming responses
- Connector Registry - Manage and register data source connectors
- DataFusion Integration - Leverage DataFusion's query engine with URL tables
Install
Add stately-arrow to your Cargo.toml:
Feature Flags
| Feature | Description |
|---|---|
| object-store | S3, GCS, Azure, and local filesystem backends |
| database | Base database connector types |
| clickhouse | ClickHouse database backend (implies database) |
| registry | Generic registry implementation with stately integration |
| strum | AsRefStr derives for enum types |
Quick Start
use std::sync::Arc;
use axum::Router;
use ;
async
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /connectors | List available connectors |
| POST | /connectors | Get details for multiple connectors |
| GET | /connectors/{id} | Get connector details (tables/files) |
| GET | /register | List registered connections |
| GET | /register/{id} | Register a connector with DataFusion |
| GET | /catalogs | List DataFusion catalogs |
| POST | /query | Execute SQL query (streaming Arrow IPC) |
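
To make the routing table concrete, here is a minimal std-only sketch that maps a method and path to the endpoints listed above. The `Endpoint` enum and `match_route` function are hypothetical names used only for illustration; they are not part of stately-arrow, which serves these routes through an axum router.

```rust
/// Hypothetical labels for the endpoints in the table above.
#[derive(Debug, PartialEq, Eq)]
pub enum Endpoint {
    ListConnectors,
    ConnectorDetailsMany,
    ConnectorDetails(String),
    ListRegistered,
    RegisterConnector(String),
    ListCatalogs,
    ExecuteQuery,
}

/// Match a method and path against the documented routes.
/// Returns `None` for anything not listed in the table.
pub fn match_route(method: &str, path: &str) -> Option<Endpoint> {
    // Split "/connectors/abc" into ["connectors", "abc"], skipping empty segments.
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    match (method, segments.as_slice()) {
        ("GET", ["connectors"]) => Some(Endpoint::ListConnectors),
        ("POST", ["connectors"]) => Some(Endpoint::ConnectorDetailsMany),
        ("GET", ["connectors", id]) => Some(Endpoint::ConnectorDetails(id.to_string())),
        ("GET", ["register"]) => Some(Endpoint::ListRegistered),
        ("GET", ["register", id]) => Some(Endpoint::RegisterConnector(id.to_string())),
        ("GET", ["catalogs"]) => Some(Endpoint::ListCatalogs),
        ("POST", ["query"]) => Some(Endpoint::ExecuteQuery),
        _ => None,
    }
}
```

Note that the same path can mean different things depending on the method: `/connectors` lists connectors on GET and returns details for several connectors on POST.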
Execute Query
The response is an Arrow IPC stream (application/vnd.apache.arrow.stream).
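
For readers consuming the stream without an Arrow library, the following std-only sketch shows how the start of each encapsulated message is framed according to the Arrow columnar format specification: a 4-byte continuation marker (`0xFFFFFFFF`) followed by a 32-bit little-endian metadata length, where a length of zero signals end of stream. The names `MessagePrefix`, `PrefixError`, and `read_prefix` are hypothetical and not part of stately-arrow; in practice a client would most likely hand the bytes to an Arrow IPC stream reader instead.

```rust
/// Continuation marker that starts every encapsulated message in the
/// Arrow IPC streaming format.
const CONTINUATION: u32 = 0xFFFF_FFFF;

/// What the first 8 bytes of an encapsulated message indicate.
#[derive(Debug, PartialEq, Eq)]
pub enum MessagePrefix {
    /// A message follows; its metadata block is `len` bytes long.
    Metadata { len: u32 },
    /// Continuation marker followed by a zero length: end of stream.
    EndOfStream,
}

/// Reasons the prefix could not be read.
#[derive(Debug, PartialEq, Eq)]
pub enum PrefixError {
    /// Fewer than 8 bytes were available.
    TooShort,
    /// The first 4 bytes were not the continuation marker.
    MissingContinuation,
}

/// Inspect the first 8 bytes of `buf` (hypothetical helper).
/// This does not parse the metadata or the message body.
pub fn read_prefix(buf: &[u8]) -> Result<MessagePrefix, PrefixError> {
    if buf.len() < 8 {
        return Err(PrefixError::TooShort);
    }
    let marker = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if marker != CONTINUATION {
        return Err(PrefixError::MissingContinuation);
    }
    let len = u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]);
    if len == 0 {
        Ok(MessagePrefix::EndOfStream)
    } else {
        Ok(MessagePrefix::Metadata { len })
    }
}
```

The body length is stored inside the metadata itself, so skipping to the next message requires decoding that metadata; the sketch deliberately stops at the prefix.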
List Connector Contents
# List databases/paths
# Search within a database/path
Core Abstractions
Backend
Implement Backend to create a new data source connector:
use async_trait::async_trait;
use ;
use datafusion::prelude::SessionContext;
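
As a rough illustration of the shape of a backend abstraction, here is a simplified, synchronous, std-only sketch. Everything in it (`SimpleBackend`, `InMemoryBackend`, `kind`, `list`) is a hypothetical stand-in and not the crate's actual `Backend` trait, which appears to be async (the snippet above imports `async_trait`) and works with DataFusion's `SessionContext`. Consult the crate documentation for the real method signatures.

```rust
/// Hypothetical, simplified stand-in for a backend trait.
/// NOT the real stately-arrow `Backend` trait.
pub trait SimpleBackend {
    /// Short label for the kind of data source.
    fn kind(&self) -> &'static str;

    /// List available tables or paths, optionally filtered by a search term.
    fn list(&self, search: Option<&str>) -> Vec<String>;
}

/// Toy backend that serves a fixed list of table names.
pub struct InMemoryBackend {
    tables: Vec<String>,
}

impl InMemoryBackend {
    pub fn new(tables: &[&str]) -> Self {
        Self {
            tables: tables.iter().map(|t| t.to_string()).collect(),
        }
    }
}

impl SimpleBackend for InMemoryBackend {
    fn kind(&self) -> &'static str {
        "in-memory"
    }

    fn list(&self, search: Option<&str>) -> Vec<String> {
        self.tables
            .iter()
            // With no search term every table matches; otherwise use a substring match.
            .filter(|t| search.map_or(true, |s| t.contains(s)))
            .cloned()
            .collect()
    }
}
```

The optional search argument mirrors the "list" and "search within" operations described under List Connector Contents.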
ConnectorRegistry
Implement ConnectorRegistry to manage your connectors:
use async_trait::async_trait;
use ;
'Generic' Registry
Refer to the generic registry implementation for a complete example of how to implement ConnectorRegistry. generic::Registry is provided as a convenience for using the backends that stately-arrow ships out of the box. It can also serve as a starting point if you want to add custom Backends alongside them.
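
To show the general idea of a registry that mixes built-in and custom sources, here is a minimal std-only sketch. All names (`Source`, `StaticSource`, `SimpleRegistry`, and its methods) are hypothetical and do not reflect the actual `ConnectorRegistry` trait or `generic::Registry`; the sketch only models the "list connectors" and "register a connector by id" behavior that the API endpoints describe.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a data source held by the registry.
pub trait Source {
    fn describe(&self) -> String;
}

/// Toy source that only carries a label.
pub struct StaticSource(pub &'static str);

impl Source for StaticSource {
    fn describe(&self) -> String {
        self.0.to_string()
    }
}

/// Hypothetical registry: known connectors plus the ids registered so far.
#[derive(Default)]
pub struct SimpleRegistry {
    connectors: BTreeMap<String, Box<dyn Source>>,
    registered: Vec<String>,
}

impl SimpleRegistry {
    /// Make a connector known to the registry under `id`.
    pub fn add(&mut self, id: &str, source: Box<dyn Source>) {
        self.connectors.insert(id.to_string(), source);
    }

    /// Ids of all known connectors, in sorted order.
    pub fn list(&self) -> Vec<&str> {
        self.connectors.keys().map(|k| k.as_str()).collect()
    }

    /// Mark a connector as registered and return its description.
    pub fn register(&mut self, id: &str) -> Result<String, String> {
        let source = self
            .connectors
            .get(id)
            .ok_or_else(|| format!("unknown connector: {id}"))?;
        let description = source.describe();
        // Registering the same id twice is treated as a no-op.
        if !self.registered.iter().any(|r| r.as_str() == id) {
            self.registered.push(id.to_string());
        }
        Ok(description)
    }

    /// Ids registered so far, in registration order.
    pub fn registered(&self) -> &[String] {
        &self.registered
    }
}
```

Storing sources as boxed trait objects is one way to let custom sources sit next to built-in ones in a single map; the real crate may organize this differently.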
QuerySession
Implement QuerySession for custom DataFusion session behavior:
use async_trait::async_trait;
use ;
use ;
Built-in Backends
Object Store
Connect to S3, GCS, Azure, or local filesystem:
use ;
let config = Config ;
Credentials are resolved from environment variables:
- AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION
- GCP: GOOGLE_SERVICE_ACCOUNT or application default credentials
- Azure: AZURE_STORAGE_ACCOUNT_NAME, AZURE_STORAGE_ACCOUNT_KEY
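
Because credentials come from the environment, it can help to check for them at startup. The sketch below is a std-only example of such a check; `missing_vars` and `missing_from_env` are hypothetical helpers, not part of stately-arrow, and which variables are strictly required depends on your deployment (for example, GCP can fall back to application default credentials, so it is left out here).

```rust
/// Environment variables listed above for AWS.
pub const AWS_VARS: [&str; 3] = [
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
];

/// Environment variables listed above for Azure.
pub const AZURE_VARS: [&str; 2] = [
    "AZURE_STORAGE_ACCOUNT_NAME",
    "AZURE_STORAGE_ACCOUNT_KEY",
];

/// Return the names in `required` for which `lookup` yields no value
/// or an empty value. The lookup is injected so the check is testable.
pub fn missing_vars<F>(required: &[&'static str], lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    required
        .iter()
        .copied()
        .filter(|name| lookup(*name).map_or(true, |value| value.is_empty()))
        .collect()
}

/// Convenience wrapper that checks the real process environment.
pub fn missing_from_env(required: &[&'static str]) -> Vec<&'static str> {
    missing_vars(required, |name| std::env::var(name).ok())
}
```

A service could call `missing_from_env(&AWS_VARS)` before constructing an S3-backed connector and log any names returned, rather than failing later on the first query.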
ClickHouse
Connect to ClickHouse databases (requires the clickhouse feature):
use ;
use ClickHouseConfig;
let config = Config ;
ClickHouse support uses clickhouse-datafusion under the hood to connect ClickHouse and DataFusion.
Generic Registry
Use the built-in generic registry with stately state (requires the registry feature):
use ;
// Implement Connectors trait on your state type
// Create registry from state
let registry = new;
OpenAPI
Generate the OpenAPI spec for frontend codegen:
The spec includes conditional schemas based on enabled features.
Module Structure
stately-arrow/
├── api.rs # Router factory
├── api/
│ ├── handlers.rs # HTTP handlers
│ ├── ipc.rs # Arrow IPC streaming
│ └── openapi.rs # OpenAPI documentation
├── backend.rs # Backend trait + metadata types
├── context.rs # QueryContext + QuerySession
├── database.rs # Database connector types
├── database/
│ └── clickhouse.rs # ClickHouse backend
├── error.rs # Error types
├── object_store.rs # Object store backend
├── registry.rs # ConnectorRegistry trait + generic impl
├── request.rs # Request DTOs
├── response.rs # Response DTOs
└── state.rs # QueryState extractor
License
Licensed under the Apache License, Version 2.0. See LICENSE for details.