typed-arrow-dyn
typed-arrow-dyn is the runtime half of the typed-arrow story. Where the main crate gives you a fully compile-time schema, this crate builds Arrow arrays and RecordBatches from schemas that are only known at runtime.
What It Provides
DynSchema: a thinArc<Schema>wrapper that feeds the unifiedSchemaLiketrait.DynBuilders: one builder per field, created directly from the runtime schema and monomorphized per Arrow logical type.DynRowandDynCell: ergonomics for appending rows where every cell is either a value orNone.DynRowViews,DynRowView,DynCellRef, and friends (DynStructView,DynListView, …): zero-copy views overRecordBatchdata with the same logical model as the owned builders.DynProjection: reusable column/field projections that work for row iteration and hand the same Parquet projection mask to readers.DynError/DynViewError: structured diagnostics for arity, type mismatches, builder failures, view errors, and deferred nullability violations.validate_nullability: a post-build walk that enforces field and item nullability, returning precise paths such asperson.address.street[].
Everything is designed to mirror the infallible typed path: builder allocation happens up front, appends stream through with minimal branching, and nullability is validated once via try_finish_into_batch.
Quick Start
use Arc;
use ;
use ;
For ad hoc debugging you can still call finish_into_batch(), which will panic if Arrow rejects the arrays. Production code should stick with try_finish_into_batch() to get DynError::Nullability or DynError::Builder with context.
Zero-Copy Views & Projection
Sometimes you just need to read a RecordBatch with a runtime schema. The view module exposes zero-copy row iterators that give you borrowed DynCellRef<'_> handles and rich nested views:
use Arc;
use ;
use ;
Views cover all Arrow logical types via the same wrappers the owned path understands (DynStructView, DynListView, DynFixedSizeListView, DynMapView, DynUnionView). Every accessor returns DynViewError with the exact dotted/indexed path (orders[3].items[0].sku) so you can bubble rich diagnostics to callers.
DynProjection lets you describe projected schemas once and reuse them across readers:
DynProjection::from_schema(source, projection)matches by field name (including nested children) and records the selection paths.DynProjection::from_indices(schema, indices)is a quick column slice.DynRowView::project(&projection)lazily remaps columns without copying buffers.DynProjection::project_row_view(&schema, batch, row)anditer_batch_views(...).project(projection)give you the same projected iterator.DynProjection::to_parquet_mask()returns theparquet::arrow::ProjectionMaskso Parquet readers only decode the needed leaf columns.
Borrowed values can be converted to owned DynCell instances via DynCellRef::to_owned() or DynCellRef::into_owned(), which makes it easy to bridge inspected data back into the dynamic builders if you need to rewrite batches.
Rows & Cells
DynRow(Vec<Option<DynCell>>)lines up with the schema width. PassingNoneat the top level appends a null to the entire column.DynCellenumerates every value shape the factory understands: booleans, signed/unsigned integers, floating point, UTF-8 strings, binary blobs, dictionary payloads, and the nested variants:Struct(Vec<Option<DynCell>>)—one entry per child field.List(Vec<Option<DynCell>>), reused forListandLargeList.FixedSizeList(Vec<Option<DynCell>>)—length must match the field’s declared width.Map(Vec<(DynCell, Option<DynCell>)>)—each entry is a(key, value)pair; keys must be non-null and values obey the schema’s nullability.Union { type_id, value }—selects a variant by Arrow tag; helpersDynCell::union_value(tag, cell)andDynCell::union_null(tag)keep construction tidy.
- Dictionary columns accept the payload type (
Str,Bin, or primitive variants); the key handling stays inside the builder.
DynRow::append_into_with_fields performs a lightweight type check before mutating builders, so arity/type mistakes fail fast without leaving partially-written columns.
Dynamic Builders
DynBuilders::new(schema, capacity) constructs one concrete builder per field by calling new_dyn_builder with the logical type. The factory is the only place that matches on arrow_schema::DataType; every builder is stored behind the DynColumnBuilder trait object with methods:
High-level users rarely call the trait directly—the unified facade hands out DynBuilders and keeps the append API aligned with the typed path (append_option_row, append_rows, etc.).
Error Model
DynError keeps the dynamic path predictable without drowning you in variants. Appends return Result<(), DynError>, capturing whether the row shape matches the schema, the value fits the Arrow type, or the builder rejected the insert. Finishing returns Result<RecordBatch, DynError> and adds nullability validation so callers know exactly why Arrow construction failed—no panics, just structured context you can surface to users or logs.
Nullability Enforcement
Dynamic builders defer nullability checks until the batch is sealed. validate_nullability(schema, arrays, union_null_rows) walks the resulting arrays—using the provided union row metadata—and enforces:
- Non-nullable columns have no null slots.
- Struct children obey their own nullability only where the parent is valid.
- List, LargeList, and FixedSizeList items respect child nullability.
- Map columns reject null keys and enforce the value field’s nullability.
- Dense and sparse union variants enforce their field nullability with precise row context.
Violations bubble up as DynError::Nullability with col, path, and index for precise diagnostics, allowing the unified facade to report user-friendly messages instead of panicking.
Integration Points
DynSchemasatisfiestyped_arrow_unified::SchemaLikefor runtime cases, so you can switch between typed and dynamic implementations behind a single API.DynBuildersimplements the unifiedBuildersLikecontract; typed builders use a zero-costNoError, while dynamic builders returnResult.- Lower-level consumers can call
new_dyn_builder(data_type)to embed dynamic columns into custom pipelines without adopting the whole facade.
Supported Data Types
The factory builds the following Arrow logical types (Arrow RS v56):
- Null, Boolean
- Int8/16/32/64, UInt8/16/32/64, Float32/64
- Date32/64, Timestamp (all units, optional timezone), Duration (all units), Time32 (Second/Millisecond), Time64 (Microsecond/Nanosecond)
- Utf8, LargeUtf8, Binary, LargeBinary, FixedSizeBinary
- Dictionary with the above strings/binary types or primitive values
- Struct, List, LargeList, FixedSizeList (including nested combinations)
- Map/OrderedMap (keys non-null, value nullability configurable)
- Union (dense and sparse)
Unsupported types currently fall back to a NullBuilder. Extend new_dyn_builder as Arrow gains new logical types.
Examples & Tests
cargo run -p typed-arrow-dyn --example nested_struct_listshows nested structs and lists.cargo test -p typed-arrow-dynexercises dictionaries, deep nesting, and nullability validation.
License
This crate shares the repository license; see LICENSE.