ptars 0.0.17

Fast conversion from protobuf to Apache Arrow and back

ptars

Fast conversion between Protocol Buffers and Apache Arrow in Rust.

ptars converts directly between the protobuf wire format and Arrow columnar arrays. No intermediate DynamicMessage objects are created. Serialized bytes are parsed straight into Arrow builders, and Arrow arrays are encoded directly to protobuf wire format.
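
To make "parsed straight into Arrow builders" concrete, here is a minimal sketch of reading one varint field out of several serialized messages and collecting it as a nullable column. It is not ptars's implementation. The function names (`read_varint`, `find_varint_field`, `decode_column`) are hypothetical, a plain `Vec<Option<u64>>` stands in for an Arrow builder, and only the standard protobuf wire types 0, 1, 2, and 5 are handled.

```rust
/// Read a base-128 varint from the front of `buf`.
/// Returns the decoded value and the number of bytes consumed.
fn read_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    // A 64-bit varint occupies at most 10 bytes.
    for (i, &byte) in buf.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}

/// Scan one serialized message and return the last value of varint field
/// `target`, or `None` if the field is absent. No message object is built.
fn find_varint_field(mut msg: &[u8], target: u64) -> Result<Option<u64>, &'static str> {
    let mut found = None;
    while !msg.is_empty() {
        let (tag, n) = read_varint(msg).ok_or("bad tag")?;
        msg = &msg[n..];
        let (field, wire_type) = (tag >> 3, tag & 0x7);
        match wire_type {
            // Varint: decode it, keep it only if it is the field we want.
            0 => {
                let (v, n) = read_varint(msg).ok_or("bad varint")?;
                msg = &msg[n..];
                if field == target {
                    found = Some(v);
                }
            }
            // 64-bit fixed: skip 8 bytes.
            1 => msg = msg.get(8..).ok_or("truncated fixed64")?,
            // Length-delimited: skip the length prefix plus payload.
            2 => {
                let (len, n) = read_varint(msg).ok_or("bad length")?;
                let end = n.checked_add(len as usize).ok_or("bad length")?;
                msg = msg.get(end..).ok_or("truncated payload")?;
            }
            // 32-bit fixed: skip 4 bytes.
            5 => msg = msg.get(4..).ok_or("truncated fixed32")?,
            _ => return Err("unsupported wire type"),
        }
    }
    Ok(found)
}

/// Build one nullable column from many serialized rows.
/// The Vec is a stand-in for an Arrow array builder.
fn decode_column(rows: &[&[u8]], target: u64) -> Result<Vec<Option<u64>>, &'static str> {
    rows.iter().map(|row| find_varint_field(row, target)).collect()
}
```

Calling `decode_column(&rows, 1)` on three rows, where the third is an empty message, would yield two values followed by a null. The only allocation is the output column itself.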

Features

  • Convert serialized protobuf messages to Arrow RecordBatch
  • Convert Arrow RecordBatch back to serialized protobuf messages
  • Direct wire format encoding/decoding — no per-row object allocation
  • Support for nested messages, repeated fields, and maps
  • Special handling for well-known types:
    • google.protobuf.Timestamp → timestamp[ns]
    • google.type.Date → date32
    • google.type.TimeOfDay → time64[ns]
    • Wrapper types (DoubleValue, Int32Value, etc.) → nullable primitives
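
The well-known-type mappings above come down to simple arithmetic on the message fields. The sketch below shows the conversions implied by the Arrow type definitions (nanoseconds since the Unix epoch, days since the Unix epoch, nanoseconds since midnight). The function names are hypothetical and ptars may handle edge cases differently. The date conversion assumes year, month, and day are all set to valid values.

```rust
/// google.protobuf.Timestamp { seconds, nanos } -> timestamp[ns].
/// Returns None if the result does not fit in an i64.
fn timestamp_to_nanos(seconds: i64, nanos: i32) -> Option<i64> {
    seconds
        .checked_mul(1_000_000_000)?
        .checked_add(i64::from(nanos))
}

/// google.type.TimeOfDay { hours, minutes, seconds, nanos } -> time64[ns],
/// i.e. nanoseconds since midnight.
fn time_of_day_to_nanos(hours: i32, minutes: i32, seconds: i32, nanos: i32) -> i64 {
    let secs = i64::from(hours) * 3600 + i64::from(minutes) * 60 + i64::from(seconds);
    secs * 1_000_000_000 + i64::from(nanos)
}

/// google.type.Date { year, month, day } -> date32, i.e. days since
/// 1970-01-01 in the proleptic Gregorian calendar.
/// Assumes month is in 1..=12 and day is valid for that month.
fn date_to_days(year: i32, month: i32, day: i32) -> i32 {
    // Treat March as the first month so the leap day falls at year end.
    let y = i64::from(if month <= 2 { year - 1 } else { year });
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = i64::from((month + 9) % 12); // March = 0
    let day_of_year = (153 * shifted_month + 2) / 5 + i64::from(day) - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    (era * 146_097 + day_of_era - 719_468) as i32
}
```

Wrapper types need no arithmetic: an absent wrapper message becomes a null, and a present one contributes its inner `value`.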

Usage

Add to your Cargo.toml:

[dependencies]
ptars = "0.0.17"
prost-reflect = "0.16"

Converting Protobuf to Arrow

use ptars::messages_to_record_batch;
use prost_reflect::{DescriptorPool, DynamicMessage};

// Load your protobuf descriptor
let pool = DescriptorPool::decode(include_bytes!("descriptor.bin").as_ref()).unwrap();
let message_descriptor = pool.get_message_by_name("my.package.MyMessage").unwrap();

// Create some messages
let mut msg = DynamicMessage::new(message_descriptor.clone());
msg.set_field_by_name("id", prost_reflect::Value::I32(42));
msg.set_field_by_name("name", prost_reflect::Value::String("example".into()));

let messages = vec![msg];

// Convert to Arrow RecordBatch
let record_batch = messages_to_record_batch(&messages, &message_descriptor);

Converting Binary Array to Arrow

If you have serialized protobuf messages in an Arrow BinaryArray:

use ptars::binary_array_to_record_batch_direct;
use ptars::PtarsConfig;
use arrow_array::BinaryArray;

let binary_array: BinaryArray = /* your serialized messages */;
let config = PtarsConfig::default();
let record_batch = binary_array_to_record_batch_direct(&binary_array, &message_descriptor, &config).unwrap();

Converting Arrow back to Protobuf

use ptars::record_batch_to_array;
use prost_reflect::DynamicMessage;

// Convert RecordBatch to a BinaryArray of serialized messages
let binary_array = record_batch_to_array(&record_batch, &message_descriptor);

// Decode individual messages
for i in 0..binary_array.len() {
    let msg = DynamicMessage::decode(message_descriptor.clone(), binary_array.value(i)).unwrap();
}

License

Apache-2.0