arrow-pyarrow 56.0.0

Pass Arrow objects from and to PyArrow, using Arrow's C Data Interface and pyo3.

For underlying implementation, see the [ffi] module.

One can use these to write Python functions that take and return PyArrow objects, with automatic conversion to corresponding arrow-rs types.

#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
    let array = array.0; // Extract from PyArrowType wrapper
    let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
    let array: &Int32Array = array.as_any().downcast_ref()
        .ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
    let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
    Ok(PyArrowType(array.into_data()))
}

pyarrow type	arrow-rs type
`pyarrow.DataType`	[DataType]
`pyarrow.Field`	[Field]
`pyarrow.Schema`	[Schema]
`pyarrow.Array`	[ArrayData]
`pyarrow.RecordBatch`	[RecordBatch]
`pyarrow.RecordBatchReader`	[ArrowArrayStreamReader] / `Box<dyn RecordBatchReader + Send>` (1)

(1) pyarrow.RecordBatchReader can be imported as [ArrowArrayStreamReader]. Either [ArrowArrayStreamReader] or Box<dyn RecordBatchReader + Send> can be exported as pyarrow.RecordBatchReader. (Box<dyn RecordBatchReader + Send> is typically easier to create.)

PyArrow has the notion of chunked arrays and tables, but arrow-rs doesn't have these same concepts. A chunked table is instead represented with Vec<RecordBatch>. A pyarrow.Table can be imported to Rust by calling pyarrow.Table.to_reader() and then importing the reader as a [ArrowArrayStreamReader].