PROST! Apache Arrow Support
prost-arrow provides a derivable trait that generates Apache Arrow
array builders for any protobuf types generated using
prost.
Usage
This crate provides the ToArrow trait and a proc-macro to derive it. The trait
must be derived on all messages, so we add it as a type_attribute with the
catch-all path ".". The generated impls depend on the prost-arrow
crate as well as a few arrow crates.
You will need to add the following dependencies to your Cargo.toml:
arrow-array
arrow-buffer
arrow-schema
prost-arrow
In your build script:
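// prost
prost_build::Config::new()
    .type_attribute(".", "#[derive(::prost_arrow::ToArrow)]")
    .compile_protos(&[/* .proto files */], &[/* include dirs */])
    .unwrap();
// tonic
tonic_build::configure()
    .type_attribute(".", "#[derive(::prost_arrow::ToArrow)]")
    .compile(&[/* .proto files */], &[/* include dirs */])
    .unwrap();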
Finally, to access the array builder for a generated prost type, we use
prost_arrow::new_builder<T> for some prost-generated type T that has the
ToArrow trait derived. The returned builder implements the base
arrow_array::builder::ArrayBuilder trait, and also has append_value and
append_option methods that accept our prost type T.
// required trait imports
use arrow_array::builder::ArrayBuilder;
use ;
// Rectangle is a prost-generated struct that has ToArrow derived.
let mut builder = prost_arrow::new_builder::<Rectangle>();
builder.append_value(/* a Rectangle value */);
The builder can be used just like any other arrow builder implementation type,
so the finish or finish_cloned methods can be used to finalize the arrow
array (in our case, a struct array).
// finish the array builder to get an ArrayRef
let arr = builder.finish();
// downcast the array into StructArray
let struct_arr = arr.as_any().downcast_ref::<StructArray>().unwrap();
// convert to RecordBatch if desired
let record_batch: RecordBatch = struct_arr.into();
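To make the builder semantics above concrete, here is a minimal std-only sketch of what a struct-array builder conceptually does: each field gets its own column, a validity mask records which rows are null, finish hands back the columns and resets the builder, and finish_cloned takes a snapshot without resetting. The names Point, PointBuilder and PointColumns are hypothetical and exist only for illustration; this is not the code that prost-arrow generates, and real arrow builders use buffers and bitmaps rather than plain vectors.

```rust
// Toy model of a struct-array builder using only the standard library.
// Point, PointBuilder and PointColumns are hypothetical illustration types,
// not part of prost-arrow or the arrow crates.

#[derive(Debug, Clone, Default, PartialEq)]
struct Point {
    latitude: i32,
    longitude: i32,
}

/// Finished columnar data: one vector per field plus a validity mask.
#[derive(Debug, Clone, PartialEq)]
struct PointColumns {
    latitude: Vec<i32>,
    longitude: Vec<i32>,
    validity: Vec<bool>,
}

#[derive(Default)]
struct PointBuilder {
    latitude: Vec<i32>,
    longitude: Vec<i32>,
    validity: Vec<bool>,
}

impl PointBuilder {
    /// Appends a non-null row.
    fn append_value(&mut self, value: Point) {
        self.append_option(Some(value));
    }

    /// Appends a row that may be null. A null row still occupies a slot
    /// in every field column (filled with a default) so columns stay aligned.
    fn append_option(&mut self, value: Option<Point>) {
        self.validity.push(value.is_some());
        let row = value.unwrap_or_default();
        self.latitude.push(row.latitude);
        self.longitude.push(row.longitude);
    }

    /// Number of rows appended since the last reset.
    fn len(&self) -> usize {
        self.validity.len()
    }

    /// Returns the accumulated columns and resets the builder.
    fn finish(&mut self) -> PointColumns {
        PointColumns {
            latitude: std::mem::take(&mut self.latitude),
            longitude: std::mem::take(&mut self.longitude),
            validity: std::mem::take(&mut self.validity),
        }
    }

    /// Returns a snapshot of the columns without resetting the builder.
    fn finish_cloned(&self) -> PointColumns {
        PointColumns {
            latitude: self.latitude.clone(),
            longitude: self.longitude.clone(),
            validity: self.validity.clone(),
        }
    }
}

fn main() {
    let mut builder = PointBuilder::default();
    builder.append_value(Point { latitude: 1, longitude: 2 });
    builder.append_option(None);
    builder.append_value(Point { latitude: 3, longitude: 4 });

    // finish_cloned leaves the builder untouched.
    let snapshot = builder.finish_cloned();
    assert_eq!(builder.len(), 3);

    // finish yields the same data and empties the builder.
    let columns = builder.finish();
    assert_eq!(columns, snapshot);
    assert_eq!(builder.len(), 0);
    println!("{columns:?}");
}
```

The column-per-field layout is also why a finished struct array converts naturally into a RecordBatch: each field column becomes one column of the batch.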
Completeness
| feature | supported |
|---|---|
| primitive types | ✅ |
| repeated fields | ✅ |
| optional fields (via optional) | ✅ |
| optional fields (via wrapper types) | 🚧 |
| well-known types (e.g. timestamp) | 🚧 |
| oneof fields | 🚧 |
| map fields | 🚧 |
| nested messages | ✅ |
| recursive/cyclic messages | ❌ |