Connector Arrow

A database client for many databases, exposing an interface that produces Apache Arrow data.

Documentation

Inspired by ConnectorX, but focused on being a Rust library rather than a Python library.

To be more specific, this crate:

  • does not support multiple destinations, only Arrow,
  • does not include parallelism, but allows downstream crates to implement it themselves,
  • does not include connection pooling, but allows downstream crates to implement it themselves,
  • uses minimal dependencies (it even disables default features).

API features

  • Querying that returns Vec<RecordBatch> (see the sketch after this list)
  • Record batch streaming
  • Query parameters
  • Writing to the data store
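
As a sketch of the querying API: the example below assumes the src_sqlite feature is enabled and that the crate exposes a top-level query function returning Vec<RecordBatch>. The exact entry point and error type may differ between versions, so treat this as an illustration rather than a reference.

```rust
use arrow::record_batch::RecordBatch;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // rusqlite is the driver behind the src_sqlite feature.
    let mut conn = rusqlite::Connection::open_in_memory()?;

    // Assumed entry point: run a query and collect the entire result
    // as a vector of Arrow record batches.
    let batches: Vec<RecordBatch> = connector_arrow::query(&mut conn, "SELECT 1 AS a")?;

    for batch in &batches {
        println!("rows: {}, columns: {}", batch.num_rows(), batch.num_columns());
    }
    Ok(())
}
```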

Sources

None of the sources are enabled by default; use the src_* features to enable them (see the Cargo.toml example after this list):

  • SQLite (src_sqlite, using rusqlite)
  • DuckDB (src_duckdb)
  • PostgreSQL (src_postgres)
  • Redshift (through postgres protocol, untested)
  • MySQL
  • MariaDB (through mysql protocol)
  • ClickHouse (through mysql protocol)
  • SQL Server
  • Azure SQL Database (through mssql protocol)
  • Oracle
  • BigQuery
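
For example, a Cargo.toml sketch enabling the SQLite and PostgreSQL sources (the feature names are taken from the list above; adjust the version to the one you use):

```toml
[dependencies]
# No src_* feature is enabled by default; list only the sources you need.
connector_arrow = { version = "0.1.0", features = ["src_sqlite", "src_postgres"] }
```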

Types

When converting from non-Arrow data sources (everything except DuckDB), only a subset of all possible Arrow types is produced. The types that can currently be produced are:

  • Null
  • Boolean
  • Int8
  • Int16
  • Int32
  • Int64
  • UInt8
  • UInt16
  • UInt32
  • UInt64
  • Float16
  • Float32
  • Float64
  • Timestamp
  • Date32
  • Date64
  • Time32
  • Time64
  • Duration
  • Interval
  • Binary
  • FixedSizeBinary
  • LargeBinary
  • Utf8
  • LargeUtf8
  • List
  • FixedSizeList
  • LargeList
  • Struct
  • Union
  • Dictionary
  • Decimal128
  • Decimal256
  • Map
  • RunEndEncoded

This restriction mostly stems from the non-trivial mapping of Arrow types onto native Rust types.
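
To check which Arrow types a particular source actually produces, the schema of the returned batches can be inspected with the standard arrow-rs API (a sketch; the batches are assumed to come from a query like the one shown earlier):

```rust
use arrow::record_batch::RecordBatch;

// Print the name and concrete Arrow type of every column in the result.
fn print_schema(batches: &[RecordBatch]) {
    if let Some(batch) = batches.first() {
        for field in batch.schema().fields() {
            // data_type() reports the Arrow type chosen during conversion,
            // e.g. Int64 for a 64-bit integer column.
            println!("{}: {:?}", field.name(), field.data_type());
        }
    }
}
```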