ptars
Fast conversion between Protocol Buffers and Apache Arrow in Rust.
ptars converts directly between the protobuf wire format and Arrow columnar arrays.
No intermediate DynamicMessage objects are created.
Serialized bytes are parsed straight into Arrow builders, and Arrow arrays are encoded directly to protobuf wire format.
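To make the idea of parsing bytes straight into column builders concrete, here is a minimal, std-only sketch. It is not ptars's implementation: the `Columns` struct is a hypothetical stand-in for a pair of Arrow builders, and the two-field schema (field 1 as a varint `id`, field 2 as a string `name`) is assumed purely for illustration.

```rust
/// Read a base-128 varint from `buf` starting at `*pos`, advancing `*pos`.
fn read_varint(buf: &[u8], pos: &mut usize) -> Option<u64> {
    let mut value = 0u64;
    let mut shift = 0u32;
    while shift < 64 {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            return Some(value);
        }
        shift += 7;
    }
    None // more than 10 bytes: malformed varint
}

/// Hypothetical stand-in for Arrow builders: one growable column per field.
#[derive(Debug, Default, PartialEq)]
struct Columns {
    id: Vec<Option<i64>>,      // field 1, wire type 0 (varint)
    name: Vec<Option<String>>, // field 2, wire type 2 (length-delimited)
}

/// Decode one serialized message and append exactly one row to `cols`.
/// Nothing is appended if the bytes are malformed.
fn append_row(buf: &[u8], cols: &mut Columns) -> Option<()> {
    let (mut id, mut name) = (None, None);
    let mut pos = 0;
    while pos < buf.len() {
        let tag = read_varint(buf, &mut pos)?;
        let (field, wire_type) = (tag >> 3, tag & 7);
        match (field, wire_type) {
            (1, 0) => id = Some(read_varint(buf, &mut pos)? as i64),
            (2, 2) => {
                let len = read_varint(buf, &mut pos)? as usize;
                let end = pos.checked_add(len)?;
                let bytes = buf.get(pos..end)?;
                name = Some(String::from_utf8(bytes.to_vec()).ok()?);
                pos = end;
            }
            // Skip fields this sketch does not know about.
            (_, 0) => {
                read_varint(buf, &mut pos)?;
            }
            (_, 1) => pos = pos.checked_add(8)?,
            (_, 2) => {
                let len = read_varint(buf, &mut pos)? as usize;
                pos = pos.checked_add(len)?;
            }
            (_, 5) => pos = pos.checked_add(4)?,
            _ => return None, // groups and invalid wire types are not handled
        }
    }
    if pos != buf.len() {
        return None; // a skipped field ran past the end of the buffer
    }
    cols.id.push(id);
    cols.name.push(name);
    Some(())
}
```

Calling `append_row` once per serialized message fills the columns row by row; a field that is absent from a message becomes a null (`None`) in its column. A real Arrow string builder would copy the bytes into a shared buffer rather than allocating a `String` per value as this sketch does.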
Features
- Convert serialized protobuf messages to Arrow RecordBatch
- Convert Arrow RecordBatch back to serialized protobuf messages
- Direct wire format encoding/decoding, with no per-row object allocation
- Support for nested messages, repeated fields, and maps
- Special handling for well-known types:
  - google.protobuf.Timestamp → timestamp[ns]
  - google.type.Date → date32
  - google.type.TimeOfDay → time64[ns]
  - Wrapper types (DoubleValue, Int32Value, etc.) → nullable primitives
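The well-known type mappings above come down to simple arithmetic on the message fields. The sketch below shows one way to compute the Arrow values; the function names are hypothetical and not part of the ptars API, and the date conversion assumes a fully specified date (google.type.Date also allows zero for an unspecified year, month, or day, which is not handled here).

```rust
/// google.protobuf.Timestamp (seconds, nanos) -> nanoseconds since the Unix epoch,
/// the unit of Arrow timestamp[ns]. Returns None if the value overflows i64.
fn timestamp_to_ns(seconds: i64, nanos: i32) -> Option<i64> {
    seconds
        .checked_mul(1_000_000_000)?
        .checked_add(i64::from(nanos))
}

/// google.type.Date (year, month, day) -> days since 1970-01-01 (Arrow date32).
/// Uses the days-from-civil algorithm for the proleptic Gregorian calendar.
fn date_to_days(year: i32, month: u32, day: u32) -> i32 {
    // Treat January and February as months 10 and 11 of the previous year,
    // so that the leap day falls at the end of the shifted year.
    let y = i64::from(if month <= 2 { year - 1 } else { year });
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, in [0, 399]
    let mp = (i64::from(month) + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + i64::from(day) - 1; // day of shifted year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    (era * 146_097 + doe - 719_468) as i32
}

/// google.type.TimeOfDay (hours, minutes, seconds, nanos) -> nanoseconds
/// since midnight, the unit of Arrow time64[ns].
fn time_of_day_to_ns(hours: i32, minutes: i32, seconds: i32, nanos: i32) -> i64 {
    let total_seconds = (i64::from(hours) * 60 + i64::from(minutes)) * 60 + i64::from(seconds);
    total_seconds * 1_000_000_000 + i64::from(nanos)
}
```

Because protobuf keeps a Timestamp's nanos field non-negative even for instants before the epoch, `timestamp_to_ns(-1, 999_999_999)` is one nanosecond before 1970-01-01T00:00:00Z.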
Usage
Add to your Cargo.toml:
[dependencies]
ptars = "0.0.9"
prost-reflect = "0.16"
Converting Protobuf to Arrow
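use ptars::messages_to_record_batch;
use prost_reflect::{DescriptorPool, DynamicMessage, Value};

// Load your protobuf descriptor.
// `descriptor_bytes` is a placeholder for an encoded FileDescriptorSet.
let pool = DescriptorPool::decode(descriptor_bytes).unwrap();
// "my.package.MyMessage" is a placeholder for your fully qualified message name.
let message_descriptor = pool.get_message_by_name("my.package.MyMessage").unwrap();

// Create some messages (the field names "id" and "name" are placeholders)
let mut msg = DynamicMessage::new(message_descriptor.clone());
msg.set_field_by_name("id", Value::I32(42));
msg.set_field_by_name("name", Value::String("example".to_string()));
let messages = vec![msg];

// Convert to Arrow RecordBatch
// (argument list assumed; check the ptars API documentation)
let record_batch = messages_to_record_batch(&messages, &message_descriptor);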
Converting Binary Array to Arrow
If you have serialized protobuf messages in an Arrow BinaryArray:
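// Module paths and the argument list below are assumed; check the ptars API documentation.
use ptars::binary_array_to_record_batch_direct;
use ptars::PtarsConfig;
use arrow::array::BinaryArray;

let binary_array: BinaryArray = /* your serialized messages */;
let config = PtarsConfig::default();
let record_batch =
    binary_array_to_record_batch_direct(&binary_array, &message_descriptor, &config).unwrap();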
Converting Arrow back to Protobuf
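use ptars::record_batch_to_array;
use prost_reflect::DynamicMessage;

// Convert RecordBatch to a BinaryArray of serialized messages
// (argument list assumed; check the ptars API documentation)
let binary_array = record_batch_to_array(&record_batch, &message_descriptor);

// Decode individual messages
for i in 0..binary_array.len() {
    // One way to decode a row, using prost_reflect (loop body assumed)
    let bytes = binary_array.value(i);
    let msg = DynamicMessage::decode(message_descriptor.clone(), bytes).unwrap();
}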
License
Apache-2.0