Pass Arrow objects from and to PyArrow, using Arrow’s C Data Interface and pyo3.
For the underlying implementation, see the ffi module.
One can use these to write Python functions that take and return PyArrow objects, with automatic conversion to corresponding arrow-rs types.
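use std::sync::Arc;
use arrow::array::{make_array, Array, ArrayData, Int32Array};
use arrow::pyarrow::PyArrowType;
use pyo3::exceptions::PyValueError;
use pyo3::prelude::*;

#[pyfunction]
fn double_array(array: PyArrowType<ArrayData>) -> PyResult<PyArrowType<ArrayData>> {
    let array = array.0; // Extract from PyArrowType wrapper
    let array: Arc<dyn Array> = make_array(array); // Convert ArrayData to ArrayRef
    // Reject anything that is not an Int32Array with a Python ValueError
    let array: &Int32Array = array.as_any().downcast_ref()
        .ok_or_else(|| PyValueError::new_err("expected int32 array"))?;
    // Double every non-null value; nulls stay null
    let array: Int32Array = array.iter().map(|x| x.map(|x| x * 2)).collect();
    Ok(PyArrowType(array.into_data()))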
}

| pyarrow type | arrow-rs type |
|---|---|
| pyarrow.DataType | DataType |
| pyarrow.Field | Field |
| pyarrow.Schema | Schema |
| pyarrow.Array | ArrayData |
| pyarrow.RecordBatch | RecordBatch |
| pyarrow.RecordBatchReader | ArrowArrayStreamReader / Box<dyn RecordBatchReader + Send> (1) |
| pyarrow.Table | Table (2) |

(1) pyarrow.RecordBatchReader can be imported as ArrowArrayStreamReader. Either
ArrowArrayStreamReader or Box<dyn RecordBatchReader + Send> can be exported
as pyarrow.RecordBatchReader. (Box<dyn RecordBatchReader + Send> is typically
easier to create.)
(2) Although arrow-rs offers Table, a convenience wrapper for pyarrow.Table
that internally holds Vec<RecordBatch>, it is meant primarily for use cases where you already
have Vec<RecordBatch> on the Rust side and want to export that in bulk as a pyarrow.Table.
In general, it is recommended to use streaming approaches instead of dealing with data in bulk.
For example, a pyarrow.Table (or any other object that implements the ArrayStream PyCapsule
interface) can be imported to Rust through PyArrowType<ArrowArrayStreamReader> instead of
forcing eager reading into Vec<RecordBatch>.
Structs
- ArrowException - A Rust type representing an exception defined in Python code.
- PyArrowType - A newtype wrapper for types implementing FromPyArrow or IntoPyArrow.
- Table - This is a convenience wrapper around Vec<RecordBatch> that tries to simplify conversion from and to pyarrow.Table.
Traits
- FromPyArrow - Trait for converting Python objects to arrow-rs types.
- IntoPyArrow - Convert an arrow-rs type into a PyArrow object.
- ToPyArrow - Create a new PyArrow object from an arrow-rs type.
Type Aliases
- PyArrowException - Represents an exception raised by PyArrow.