Crate arrow_pyarrow


Pass Arrow objects from and to PyArrow, using Arrow’s C Data Interface and pyo3.

For the underlying implementation, see the ffi module.

One can use these to write Python functions that take and return PyArrow objects, with automatic conversion to corresponding arrow-rs types.

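// Imports assume the arrow-array, arrow-data, and pyo3 crates are dependencies.
use std::sync::Arc;
use arrow_array::{make_array, Array, Int32Array};
use arrow_data::ArrayData;
use arrow_pyarrow::PyArrowType;
use pyo3::exceptions::PyValueError;
use pyo3::prelude::*;

#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
    let array = array.0; // Extract from PyArrowType wrapper
    let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
    let array: &Int32Array = array.as_any().downcast_ref()
        .ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
    let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
    Ok(PyArrowType(array.into_data()))
}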

| pyarrow type | arrow-rs type |
|---|---|
| pyarrow.DataType | DataType |
| pyarrow.Field | Field |
| pyarrow.Schema | Schema |
| pyarrow.Array | ArrayData |
| pyarrow.RecordBatch | RecordBatch |
| pyarrow.RecordBatchReader | ArrowArrayStreamReader / Box<dyn RecordBatchReader + Send> (1) |
| pyarrow.Table | Table (2) |

(1) pyarrow.RecordBatchReader can be imported as ArrowArrayStreamReader. Either ArrowArrayStreamReader or Box<dyn RecordBatchReader + Send> can be exported as pyarrow.RecordBatchReader. (Box<dyn RecordBatchReader + Send> is typically easier to create.)

(2) Although arrow-rs offers Table, a convenience wrapper for pyarrow.Table that internally holds Vec<RecordBatch>, it is meant primarily for use cases where you already have Vec<RecordBatch> on the Rust side and want to export that in bulk as a pyarrow.Table. In general, it is recommended to use streaming approaches instead of dealing with data in bulk. For example, a pyarrow.Table (or any other object that implements the ArrayStream PyCapsule interface) can be imported to Rust through PyArrowType<ArrowArrayStreamReader> instead of forcing eager reading into Vec<RecordBatch>.
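
To make the streaming recommendation in note (2) concrete, here is a minimal sketch in plain Rust that contrasts consuming batches one at a time with collecting them all first. It does not use arrow_pyarrow: `Batch`, `total_rows_streaming`, and `total_rows_eager` are hypothetical stand-ins, with `Batch` playing the role of RecordBatch and a generic iterator playing the role of a record batch reader.

```rust
/// Hypothetical stand-in for a record batch (not the arrow-rs type).
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub values: Vec<i32>,
}

/// Streaming: pull one batch at a time and keep only a running total,
/// so at most one batch is held in memory by this function.
pub fn total_rows_streaming<I>(reader: I) -> Result<usize, String>
where
    I: Iterator<Item = Result<Batch, String>>,
{
    let mut total = 0;
    for batch in reader {
        // Stop at the first failed batch and propagate its error.
        total += batch?.values.len();
    }
    Ok(total)
}

/// Eager: materialize every batch into a Vec before doing any work,
/// so memory use grows with the size of the whole dataset.
pub fn total_rows_eager<I>(reader: I) -> Result<usize, String>
where
    I: Iterator<Item = Result<Batch, String>>,
{
    let batches: Vec<Batch> = reader.collect::<Result<Vec<_>, _>>()?;
    Ok(batches.iter().map(|b| b.values.len()).sum())
}
```

Both functions return the same result. The difference is that the streaming version never holds more than one batch, which is the property that makes a reader-based import preferable to a Vec<RecordBatch> for large inputs.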

Structs

ArrowException
A Rust type representing an exception defined in Python code.
PyArrowType
A newtype wrapper for types implementing FromPyArrow or IntoPyArrow.
Table
This is a convenience wrapper around Vec<RecordBatch> that tries to simplify conversion from and to pyarrow.Table.

Traits

FromPyArrow
Trait for converting Python objects to arrow-rs types.
IntoPyArrow
Convert an arrow-rs type into a PyArrow object.
ToPyArrow
Create a new PyArrow object from an arrow-rs type.
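
One way to read the difference between IntoPyArrow and ToPyArrow is as a consuming conversion versus a borrowing one. The sketch below illustrates that pattern in plain Rust with hypothetical stand-in names (`Exported`, `ToExported`, `IntoExported`, `Column`, `Stream`). It is not the definition of the arrow_pyarrow traits, and it assumes, for illustration only, that every borrowing conversion also provides the consuming one.

```rust
/// Hypothetical stand-in for the object handed to the other side.
#[derive(Debug, PartialEq)]
pub struct Exported(pub String);

/// Borrowing conversion: the Rust value stays usable afterwards.
pub trait ToExported {
    fn to_exported(&self) -> Exported;
}

/// Consuming conversion: takes ownership of the Rust value.
pub trait IntoExported {
    fn into_exported(self) -> Exported;
}

/// Anything exportable by reference is also exportable by value.
impl<T: ToExported> IntoExported for T {
    fn into_exported(self) -> Exported {
        self.to_exported()
    }
}

/// A value that can be exported repeatedly without being used up.
pub struct Column(pub Vec<i32>);

impl ToExported for Column {
    fn to_exported(&self) -> Exported {
        Exported(format!("{:?}", self.0))
    }
}

/// A one-shot source: reading it drains it, so only the consuming
/// conversion makes sense for this type.
pub struct Stream(pub Box<dyn Iterator<Item = i32>>);

impl IntoExported for Stream {
    fn into_exported(self) -> Exported {
        let items: Vec<i32> = self.0.collect();
        Exported(format!("{:?}", items))
    }
}
```

With this split, a `Column` can be exported through either trait, while a `Stream` can only be exported once, by value. That mirrors why a stream-like reader is a natural fit for a consuming conversion.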

Type Aliases

PyArrowException
Represents an exception raised by PyArrow.