typed-arrow
Compile‑time Arrow schemas for Rust.
typed-arrow provides a strongly typed, fully compile-time way to define Arrow columns and schemas in Rust.
It maps Rust types directly to arrow-rs typed builders/arrays and arrow_schema::DataType, without any runtime DataType switching, enabling fast, monomorphized column construction and ergonomic row-based APIs.
Why compile-time Arrow?
- Performance: monomorphized builders/arrays with zero dynamic dispatch; avoids runtime DataType matching.
- Safety: column types, names, and nullability live in the type system; mismatches fail at compile time.
- Interop: uses arrow-array/arrow-schema types directly; no bespoke runtime layer to learn.
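To make the performance and safety points concrete, the following std-only sketch contrasts the two styles. DataType, DynValue, DynBuilder, and TypedBuilder are hypothetical stand-ins invented for this illustration; they are not the arrow-rs or typed-arrow types.

```rust
// Hypothetical stand-ins for illustration only; not arrow-rs or typed-arrow types.

/// Runtime-typed style: the column type is a value carried at run time.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, PartialEq)]
enum DynValue {
    Int64(i64),
    Utf8(String),
}

struct DynBuilder {
    data_type: DataType,
    values: Vec<DynValue>,
}

impl DynBuilder {
    fn new(data_type: DataType) -> Self {
        Self { data_type, values: Vec::new() }
    }

    /// Every append checks the value against the runtime type;
    /// a mismatch is only reported when the program runs.
    fn append(&mut self, v: DynValue) -> Result<(), String> {
        let matches_type = matches!(
            (self.data_type, &v),
            (DataType::Int64, DynValue::Int64(_)) | (DataType::Utf8, DynValue::Utf8(_))
        );
        if matches_type {
            self.values.push(v);
            Ok(())
        } else {
            Err(format!("expected {:?}, got {:?}", self.data_type, v))
        }
    }
}

/// Compile-time style: the column type is a type parameter, so the compiler
/// generates one specialized `append` per `T` and rejects mismatched values.
struct TypedBuilder<T> {
    values: Vec<T>,
}

impl<T> TypedBuilder<T> {
    fn new() -> Self {
        Self { values: Vec::new() }
    }

    fn append(&mut self, v: T) {
        self.values.push(v);
    }

    fn finish(self) -> Vec<T> {
        self.values
    }
}
```

With the typed variant, appending a string to a TypedBuilder of i64 does not compile, whereas the runtime-typed variant accepts the call and returns an error.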
Quick Start
Add to your Cargo.toml (derives enabled by default):

    [dependencies]
    typed-arrow = { version = "0.x" }

When working in this repository/workspace:

    [dependencies]
    typed-arrow = { path = "." }
Examples
Run the included examples to see end-to-end usage:
- 01_primitives: derive Record, inspect DataType, build primitives
- 02_lists: List<T> and List<Option<T>>
- 03_dictionary: Dictionary<K, String>
- 04_timestamps: Timestamp<U> units
- 04b_timestamps_tz: TimestampTz<U, Z> with Utc and custom markers
- 05_structs: nested structs → StructArray
- 06_rows_flat: row-based building for flat records
- 07_rows_nested: row-based building with #[record(nested)]
- 08_record_batch: compile-time schema + RecordBatch
- 09_duration_interval: Duration and Interval types
- 10_union: Dense Union as a Record column (with attributes)
- 11_map: Map (incl. Option<V> values), also as a Record column
Run:
Core Concepts
- Record: implemented by the derive macro for structs with named fields.
- ColAt<I>: per-column associated items Rust, ColumnBuilder, ColumnArray, NULLABLE, NAME, and data_type().
- ArrowBinding: compile-time mapping from a Rust value type to its Arrow builder, array, and DataType.
- BuildRows: derive generates <Type>Builders and <Type>Arrays with append_row(s) and finish.
- SchemaMeta: derive provides fields() and schema(); the arrays structs provide into_record_batch().
- AppendStruct and StructMeta: enable nested struct fields and StructArray building.
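As a rough illustration of how a per-column trait indexed by a const generic can carry name, nullability, and data type at compile time, here is a std-only sketch. The trait and item names mirror the concepts above, but the definitions are assumptions made for this example: the real typed-arrow traits also carry builder and array types and may differ in detail. The Person struct and the describe helper are hypothetical.

```rust
// Hypothetical, std-only model of the concepts; not typed-arrow's real definitions.

#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    Utf8,
}

/// Maps a Rust value type to its (stand-in) Arrow data type.
trait ArrowBinding {
    fn data_type() -> DataType;
}

impl ArrowBinding for i64 {
    fn data_type() -> DataType {
        DataType::Int64
    }
}

impl ArrowBinding for String {
    fn data_type() -> DataType {
        DataType::Utf8
    }
}

/// Per-column metadata, indexed by the column position `I`.
trait ColAt<const I: usize> {
    type Rust: ArrowBinding;
    const NAME: &'static str;
    const NULLABLE: bool;

    fn data_type() -> DataType {
        <Self::Rust as ArrowBinding>::data_type()
    }
}

#[allow(dead_code)]
struct Person {
    id: i64,
    name: Option<String>,
}

// Impls of the kind a derive macro could plausibly generate for `Person`.
impl ColAt<0> for Person {
    type Rust = i64;
    const NAME: &'static str = "id";
    const NULLABLE: bool = false;
}

impl ColAt<1> for Person {
    type Rust = String;
    const NAME: &'static str = "name";
    const NULLABLE: bool = true; // the field is an Option
}

/// Describes column `I` of record `R` as (name, data type, nullable).
fn describe<R, const I: usize>() -> (&'static str, DataType, bool)
where
    R: ColAt<I>,
{
    (
        <R as ColAt<I>>::NAME,
        <R as ColAt<I>>::data_type(),
        <R as ColAt<I>>::NULLABLE,
    )
}
```

In this model, asking for describe::<Person, 2>() fails to compile because no ColAt<2> impl exists, which is the kind of mismatch the type system can catch before the program runs.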
Metadata (Compile-time)
- Schema-level: annotate with #[schema_metadata(k = "owner", v = "data")].
- Field-level: annotate with #[metadata(k = "pii", v = "email")].
- You can repeat attributes to add multiple pairs; later duplicates win.
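The "later duplicates win" rule behaves like inserting pairs into a map in declaration order. The sketch below models only that rule with a hypothetical helper, collect_metadata; it does not show how the derive macro is implemented.

```rust
use std::collections::HashMap;

/// Hypothetical helper: folds key/value pairs in declaration order.
/// Inserting an existing key overwrites it, so the last pair for a key wins.
fn collect_metadata(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    let mut out = HashMap::new();
    for (k, v) in pairs {
        out.insert(k.to_string(), v.to_string());
    }
    out
}
```

Given the pairs ("owner", "data"), ("pii", "email"), ("owner", "platform"), the resulting map holds two entries and "owner" maps to "platform".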
Nested Type Wrappers
- Lists: List<T> (non-null items), List<Option<T>> (nullable items). Use Option<List<_>> for list-level nulls.
- Dictionary: dictionary-encoded values with integral keys (i8/i16/i32/i64/u8/u16/u32/u64):
  - Dictionary<K, String> (Utf8)
  - Dictionary<K, Vec<u8>> (Binary)
  - Dictionary<K, T> for primitives T ∈ { i8, i16, i32, i64, u8, u16, u32, u64, f32, f64 }
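For readers new to dictionary encoding, the idea is to store each distinct value once and refer to it by an integer key. The following generic sketch shows the technique with a key type K chosen by the caller; dictionary_encode is a hypothetical function written for this example and is not typed-arrow's builder.

```rust
use std::collections::HashMap;

/// Hypothetical helper: encodes `values` as (keys, dictionary).
/// Each distinct string is stored once in the dictionary and referenced
/// by its index, converted to the caller-chosen key type `K`.
/// Returns `None` if the dictionary grows beyond what `K` can index.
fn dictionary_encode<K>(values: &[&str]) -> Option<(Vec<K>, Vec<String>)>
where
    K: TryFrom<usize> + Copy,
{
    let mut index: HashMap<&str, K> = HashMap::new();
    let mut dict: Vec<String> = Vec::new();
    let mut keys: Vec<K> = Vec::with_capacity(values.len());

    for &v in values {
        // Reuse the existing key when the value was seen before.
        if let Some(&k) = index.get(v) {
            keys.push(k);
            continue;
        }
        // Otherwise assign the next dictionary slot to this value.
        let k = K::try_from(dict.len()).ok()?;
        dict.push(v.to_string());
        index.insert(v, k);
        keys.push(k);
    }
    Some((keys, dict))
}
```

A narrow key type such as i8 keeps the key column small but limits the number of distinct values, which is the trade-off behind choosing K.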
Arrow DataType Coverage
Supported (arrow-rs v56):
- Primitives: Int8/16/32/64, UInt8/16/32/64, Float16/32/64, Boolean
- Strings/Binary: Utf8, LargeUtf8, Binary, LargeBinary, FixedSizeBinary (via [u8; N])
- Temporal: Timestamp (with/without TZ; s/ms/us/ns), Date32/64, Time32(s/ms), Time64(us/ns), Duration(s/ms/us/ns), Interval(YearMonth/DayTime/MonthDayNano)
- Decimal: Decimal128, Decimal256 (const generic precision/scale)
- Nested: List (including nullable items), LargeList, FixedSizeList (nullable/non-null items), Struct, Map (Vec<(K,V)>; use Option<V> for nullable values), OrderedMap (BTreeMap<K,V>) with keys_sorted = true
- Union: Dense and Sparse (via #[derive(Union)] on enums)
- Dictionary: keys = all integral types; values = Utf8 (String), LargeUtf8, Binary (Vec<u8>), LargeBinary, FixedSizeBinary ([u8; N]), primitives (i*, u*, f32, f64)
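The difference between the two map representations comes down to ordering: a Vec of pairs keeps insertion order, while a BTreeMap always iterates in key order, which is why a sorted-keys flag can be set for it. The std-only sketch below demonstrates that property; keys_are_sorted and btree_entries are hypothetical helpers for this example, not typed-arrow APIs.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: true when entry keys are in non-decreasing order.
fn keys_are_sorted<K: Ord, V>(entries: &[(K, V)]) -> bool {
    entries.windows(2).all(|w| w[0].0 <= w[1].0)
}

/// Hypothetical helper: flattens a BTreeMap into entries.
/// BTreeMap iteration is ordered by key, so the output is always sorted.
fn btree_entries<K: Ord + Clone, V: Clone>(m: &BTreeMap<K, V>) -> Vec<(K, V)> {
    m.iter().map(|(k, v)| (k.clone(), v.clone())).collect()
}
```

A Vec such as [("b", 1), ("a", 2)] is not key-sorted, but collecting the same pairs into a BTreeMap and flattening it yields [("a", 2), ("b", 1)].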
Missing:
- BinaryView, Utf8View
- ListView, LargeListView
- RunEndEncoded