//! # ferray-numpy-interop
//!
//! A companion crate providing owning conversions between ferray arrays and
//! external array ecosystems:
//!
//! - **`NumPy`** (via `PyO3`): feature `"python"`
//! - **Apache Arrow**: feature `"arrow"`
//! - **Polars**: feature `"polars"`
//!
//! All three backends are feature-gated and disabled by default. Enable them
//! in your `Cargo.toml`:
//!
//! ```toml
//! [dependencies.ferray-numpy-interop]
//! version = "0.1"
//! features = ["arrow"]  # or "python", "polars"
//! ```
//!
//! ## Memory semantics
//!
//! Every conversion in this crate currently **copies** the data buffer.
//! Earlier documentation claimed "zero-copy where possible", but in
//! practice all six conversion paths (`NumPy` / Arrow / Polars × both
//! directions) allocate a new buffer and copy the elements into it:
//!
//! | Path                     | How the copy happens                         |
//! |--------------------------|----------------------------------------------|
//! | `NumPy → ferray`         | `PyReadonlyArray::iter().cloned().collect()` |
//! | `ferray → NumPy`         | `Array::to_vec_flat()` then `from_vec`       |
//! | `Arrow ↔ ferray`         | `PrimitiveArray::values()` cloned into `Vec` |
//! | `Polars ↔ ferray`        | `ChunkedArray` → `Vec<T>` via per-chunk copy |
//!
//! True zero-copy ferray↔NumPy would require ferray arrays to share the
//! raw buffer with a Python-owned `PyArray` (refcount handshake plus
//! pinning), which is a significant design change. Zero-copy to Arrow
//! would require ferray arrays to expose their backing buffer as an
//! `arrow::buffer::Buffer` with a compatible `Drop` hook. Both are
//! tracked as potential follow-ups; for now the crate provides a
//! correct, allocation-aware API that clearly acknowledges the copy.
//!
//! The copies are usually still *cheap enough* for interop boundaries:
//! each conversion is one bulk copy of the buffer, not per-element work
//! across the boundary. Callers on hot paths should still prefer to stay
//! inside one ecosystem.
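//!
//! As a rough illustration of what an owning conversion means for callers,
//! the sketch below uses only the standard library. `to_owned_buffer` is a
//! hypothetical stand-in for any of the conversions listed above; it is not
//! a function exported by this crate.
//!
//! ```rust
//! /// Hypothetical stand-in for an owning conversion: allocates a new
//! /// buffer and copies every element of `src` into it.
//! fn to_owned_buffer(src: &[f64]) -> Vec<f64> {
//!     src.to_vec()
//! }
//!
//! fn main() {
//!     let mut source = vec![1.0, 2.0, 3.0];
//!     let converted = to_owned_buffer(&source);
//!
//!     // The two buffers live at different addresses: nothing is shared.
//!     assert_ne!(source.as_ptr(), converted.as_ptr());
//!
//!     // Mutating the source afterwards does not affect the converted copy.
//!     source[0] = 42.0;
//!     assert_eq!(converted, vec![1.0, 2.0, 3.0]);
//! }
//! ```
//!
//! The practical consequence is that a converted array never aliases its
//! source, so neither side has to outlive the other.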
//!
//! ## Design principles
//!
//! 1. **Safety first**: every conversion validates dtypes and memory
//!    layout before returning. No silent reinterpretation of memory.
//! 2. **Honest about allocation**: see the table under *Memory semantics*.
//!    The doc comments on individual functions say "copy" explicitly.
//! 3. **Explicit errors**: dtype mismatches, null values, and
//!    unsupported types produce clear
//!    [`FerrayError`](ferray_core::FerrayError) messages.
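//!
//! A minimal sketch of principles 1 and 3, assuming a nullable integer
//! source column. `SketchError`, `DType`, and `checked_i32_column` are
//! hypothetical names used only for illustration; they do not reflect the
//! actual shape of `FerrayError` or of this crate's conversion functions.
//!
//! ```rust
//! #[derive(Debug, PartialEq)]
//! enum DType {
//!     I32,
//!     F64,
//! }
//!
//! #[derive(Debug, PartialEq)]
//! enum SketchError {
//!     DTypeMismatch { expected: DType, found: DType },
//!     NullValue { index: usize },
//! }
//!
//! /// Validates the dtype tag first, then copies the values out, failing
//! /// on the first null instead of silently substituting a default.
//! fn checked_i32_column(
//!     dtype: DType,
//!     values: &[Option<i32>],
//! ) -> Result<Vec<i32>, SketchError> {
//!     if dtype != DType::I32 {
//!         return Err(SketchError::DTypeMismatch {
//!             expected: DType::I32,
//!             found: dtype,
//!         });
//!     }
//!     values
//!         .iter()
//!         .copied()
//!         .enumerate()
//!         .map(|(index, v)| v.ok_or(SketchError::NullValue { index }))
//!         .collect()
//! }
//!
//! fn main() {
//!     assert_eq!(
//!         checked_i32_column(DType::I32, &[Some(1), Some(2)]),
//!         Ok(vec![1, 2])
//!     );
//!     assert_eq!(
//!         checked_i32_column(DType::I32, &[Some(1), None]),
//!         Err(SketchError::NullValue { index: 1 })
//!     );
//!     assert_eq!(
//!         checked_i32_column(DType::F64, &[]),
//!         Err(SketchError::DTypeMismatch {
//!             expected: DType::I32,
//!             found: DType::F64,
//!         })
//!     );
//! }
//! ```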

// Interop kernels marshal byte buffers across the `numpy`/`arrow`/`polars`
// type systems and routinely cross integer-width and signed/unsigned
// boundaries that those external types contractually represent, so the
// cast lints are allowed crate-wide.
// The workspace convention is to document `FerrayError` variants on the
// type itself, hence `missing_errors_doc`.
#![allow(
    clippy::cast_possible_truncation,
    clippy::cast_possible_wrap,
    clippy::cast_precision_loss,
    clippy::cast_sign_loss,
    clippy::cast_lossless,
    clippy::missing_errors_doc,
    clippy::missing_panics_doc,
    clippy::many_single_char_names,
    clippy::similar_names,
    clippy::items_after_statements,
    clippy::option_if_let_else,
    clippy::too_long_first_doc_paragraph,
    clippy::needless_pass_by_value,
    clippy::match_same_arms
)]

pub mod dtype_map;

#[cfg(any(feature = "arrow", feature = "polars"))]
pub mod extras;

#[cfg(feature = "python")]
pub mod numpy_conv;

#[cfg(feature = "arrow")]
pub mod arrow_conv;

#[cfg(feature = "polars")]
pub mod polars_conv;

// Re-export the main conversion traits and helper functions at the crate
// root for ergonomics.

#[cfg(feature = "arrow")]
pub use arrow_conv::{
    FromArrow, FromArrowBool, ToArrow, ToArrowBool, array2_from_arrow_columns,
    array2_to_arrow_columns, arrayd_from_arrow_flat, arrayd_to_arrow_flat,
};

#[cfg(feature = "arrow")]
pub use extras::{
    array1_to_arrow_with_mask, arrow_to_dynarray_with_mask, dynarray_to_arrow,
    dynarray_to_arrow_with_mask, record_batch_from_columns, record_batch_to_columns,
};

#[cfg(all(feature = "arrow", feature = "complex"))]
pub use extras::{arrow_to_complex32, arrow_to_complex64, complex32_to_arrow, complex64_to_arrow};

#[cfg(feature = "polars")]
pub use extras::dataframe_from_columns;

#[cfg(feature = "polars")]
pub use polars_conv::{
    FromPolars, FromPolarsBool, ToPolars, ToPolarsBool, array2_from_polars_dataframe,
    array2_to_polars_dataframe,
};

#[cfg(feature = "python")]
pub use numpy_conv::{AsFerray, IntoNumPy};