Module arrow_bridge

DataTable <-> DataFrame Arrow IPC bridge.

Shape’s DataTable uses the Arrow columnar format internally, and Python’s pandas/polars/pyarrow ecosystem also speaks Arrow IPC. This module provides zero-copy (or near-zero-copy) transfer between the two:

  1. Shape -> Python: Serialize a DataTable as Arrow IPC bytes, pass them through the ABI, and reconstruct them as a pyarrow.RecordBatch on the Python side.

  2. Python -> Shape: The Python function returns a RecordBatch serialized as Arrow IPC, which we deserialize back into a DataTable.

This avoids the overhead of element-wise msgpack serialization for large tabular data.

Functions

datatable_to_python_ipc
Convert Shape DataTable (Arrow IPC bytes) to a format suitable for Python consumption.
python_ipc_to_datatable
Convert Python DataFrame (Arrow IPC bytes) back to Shape DataTable format.