Connector Arrow
A database client for many databases, exposing an interface that produces Apache Arrow data.
Inspired by ConnectorX, with a focus on being a Rust library rather than a Python library.
To be more specific, this crate:
- does not support multiple destinations, only Arrow,
- does not include parallelism, but allows downstream crates to implement it themselves (see the sketch after this list),
- does not include connection pooling, but allows downstream crates to implement it themselves,
- uses minimal dependencies (it even disables their default features).
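As a rough illustration of the last two points, a downstream crate can get parallelism by opening one connection per worker thread, with no pooling or internal threading required from this crate. In the sketch below, run_in_parallel, open, and run are hypothetical names standing in for whatever source-specific connect and query calls the downstream crate wraps:

```rust
use std::thread;

// Hypothetical sketch: `open` and `run` stand in for a downstream crate's
// source-specific "connect" and "query" calls; this crate itself stays
// single-threaded and unpooled.
fn run_in_parallel<C, B: Send>(
    queries: &[String],
    open: impl Fn() -> C + Sync,
    run: impl Fn(&mut C, &str) -> B + Sync,
) -> Vec<B> {
    thread::scope(|scope| {
        let handles: Vec<_> = queries
            .iter()
            .map(|query| {
                let (open, run) = (&open, &run);
                scope.spawn(move || {
                    // each worker owns its own connection, so no pool is needed
                    let mut conn = open();
                    run(&mut conn, query.as_str())
                })
            })
            .collect();
        // join in order, propagating worker panics
        handles.into_iter().map(|handle| handle.join().unwrap()).collect()
    })
}
```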
API features
- Querying that returns Vec<RecordBatch> (see the sketch after this list)
- Record batch streaming
- Query parameters
- Writing to the data store
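For orientation, here is a sketch of what a one-shot query might look like with the SQLite source. The type, trait, and method names below (SQLiteConnection::new, the Connector and Statement traits, query, start, and collecting the reader into Vec<RecordBatch>) are assumptions made for illustration, not a verified API; consult the crate documentation for the real signatures.

```rust
// Hypothetical sketch only: type, trait, and method names are assumed, not verified.
// Assumes the src_sqlite feature plus the rusqlite and arrow crates.
use arrow::record_batch::RecordBatch;
use connector_arrow::api::{Connector, Statement};
use connector_arrow::sqlite::SQLiteConnection;

fn query_sqlite() -> Result<(), Box<dyn std::error::Error>> {
    // open a plain rusqlite connection, then wrap it for connector_arrow
    let conn = rusqlite::Connection::open_in_memory()?;
    let mut conn = SQLiteConnection::new(conn);

    // prepare the query and start reading results, here without parameters
    let mut stmt = conn.query("SELECT 1 AS a")?;
    let reader = stmt.start([])?;

    // the reader is assumed to yield record batches one by one, so results
    // can be streamed batch by batch or collected into a Vec<RecordBatch>
    let batches: Vec<RecordBatch> = reader.collect::<Result<_, _>>()?;
    println!("received {} record batches", batches.len());
    Ok(())
}
```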
Sources
None of the sources are enabled by default; use the src_ feature flags to enable them (see the Cargo.toml example after this list):
- SQLite (src_sqlite, using rusqlite)
- DuckDB (src_duckdb)
- PostgreSQL (src_postgres)
- Redshift (through the postgres protocol, untested)
- MySQL
- MariaDB (through mysql protocol)
- ClickHouse (through mysql protocol)
- SQL Server
- Azure SQL Database (through mssql protocol)
- Oracle
- BigQuery
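For example, a downstream Cargo.toml that pulls in the SQLite and PostgreSQL sources might look roughly like this (the wildcard version is a placeholder; pin a real version in practice):

```toml
[dependencies]
# sources are opt-in: only the listed src_ features are compiled in
connector_arrow = { version = "*", features = ["src_sqlite", "src_postgres"] }
```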
Types
When converting from non-Arrow data sources (everything except DuckDB), only a subset of all possible Arrow types is produced. Here is a list of the types it is currently possible to produce:
- Null
- Boolean
- Int8
- Int16
- Int32
- Int64
- UInt8
- UInt16
- UInt32
- UInt64
- Float16
- Float32
- Float64
- Timestamp
- Date32
- Date64
- Time32
- Time64
- Duration
- Interval
- Binary
- FixedSizeBinary
- LargeBinary
- Utf8
- LargeUtf8
- List
- FixedSizeList
- LargeList
- Struct
- Union
- Dictionary
- Decimal128
- Decimal256
- Map
- RunEndEncoded
This restriction mostly has to do with the non-trivial mapping of Arrow types onto Rust native types.
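A quick way to see which of these types a given source actually emitted for a query is to inspect the schema of the returned batches. The sketch below uses only the arrow crate's public API and assumes the batches were obtained as in the query example above:

```rust
use arrow::record_batch::RecordBatch;

// Print the name and Arrow data type of every column in the first batch;
// all batches of a single query share the same schema.
fn print_column_types(batches: &[RecordBatch]) {
    if let Some(batch) = batches.first() {
        let schema = batch.schema();
        for field in schema.fields().iter() {
            println!("{}: {:?}", field.name(), field.data_type());
        }
    }
}
```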