
Types to safely create references into NumPy arrays

It is assumed that unchecked code - which includes unsafe Rust and Python - is validated by its author. Together with the dynamic borrow checking performed by this crate, this ensures that safe Rust code cannot cause undefined behaviour by creating references into NumPy arrays.

With these borrows established, references to individual elements or reference-based views of whole arrays can be created safely. These are then the starting point for algorithms iterating over and operating on the elements of the array.

Examples

The first example shows that dynamic borrow checking works to constrain both what safe Rust code can invoke and how it is invoked.

use std::panic::{catch_unwind, AssertUnwindSafe};

use numpy::PyArray1;
use ndarray::Zip;
use pyo3::Python;

fn add(x: &PyArray1<f64>, y: &PyArray1<f64>, z: &PyArray1<f64>) {
    let x1 = x.readonly();
    let y1 = y.readonly();
    let mut z1 = z.readwrite();

    let x2 = x1.as_array();
    let y2 = y1.as_array();
    let z2 = z1.as_array_mut();

    Zip::from(x2)
        .and(y2)
        .and(z2)
        .for_each(|x3, y3, z3| *z3 = x3 + y3);

    // Will fail at runtime due to conflict with `x1`.
    let res = catch_unwind(AssertUnwindSafe(|| {
        let _x4 = x.readwrite();
    }));
    assert!(res.is_err());
}

Python::with_gil(|py| {
    let x = PyArray1::<f64>::zeros(py, 42, false);
    let y = PyArray1::<f64>::zeros(py, 42, false);
    let z = PyArray1::<f64>::zeros(py, 42, false);

    // Will work as the three arrays are distinct.
    add(x, y, z);

    // Will work as `x1` and `y1` are compatible borrows.
    add(x, x, z);

    // Will fail at runtime due to conflict between `y1` and `z1`.
    let res = catch_unwind(AssertUnwindSafe(|| {
        add(x, y, y);
    }));
    assert!(res.is_err());
});

The second example shows that non-overlapping and interleaved views are also supported.

use numpy::PyArray1;
use pyo3::{types::IntoPyDict, Python};

Python::with_gil(|py| {
    let array = PyArray1::arange(py, 0.0, 10.0, 1.0);
    let locals = [("array", array)].into_py_dict(py);

    let view1 = py.eval("array[:5]", None, Some(locals)).unwrap().downcast::<PyArray1<f64>>().unwrap();
    let view2 = py.eval("array[5:]", None, Some(locals)).unwrap().downcast::<PyArray1<f64>>().unwrap();
    let view3 = py.eval("array[::2]", None, Some(locals)).unwrap().downcast::<PyArray1<f64>>().unwrap();
    let view4 = py.eval("array[1::2]", None, Some(locals)).unwrap().downcast::<PyArray1<f64>>().unwrap();

    {
        let _view1 = view1.readwrite();
        let _view2 = view2.readwrite();
    }

    {
        let _view3 = view3.readwrite();
        let _view4 = view4.readwrite();
    }
});

The third example shows that some views are incorrectly rejected since the borrows are over-approximated.

use std::panic::{catch_unwind, AssertUnwindSafe};

use numpy::PyArray2;
use pyo3::{types::IntoPyDict, Python};

Python::with_gil(|py| {
    let array = PyArray2::<f64>::zeros(py, (10, 10), false);
    let locals = [("array", array)].into_py_dict(py);

    let view1 = py.eval("array[:, ::3]", None, Some(locals)).unwrap().downcast::<PyArray2<f64>>().unwrap();
    let view2 = py.eval("array[:, 1::3]", None, Some(locals)).unwrap().downcast::<PyArray2<f64>>().unwrap();

    // A false conflict as the views do not actually share any elements.
    let res = catch_unwind(AssertUnwindSafe(|| {
        let _view1 = view1.readwrite();
        let _view2 = view2.readwrite();
    }));
    assert!(res.is_err());
});

Rationale

Rust references require aliasing discipline to be maintained, i.e. at any time there may exist either a single mutable (aka exclusive) reference or multiple immutable (aka shared) references to each object, otherwise the program contains undefined behaviour.

The aim of this module is to ensure that safe Rust code is unable to violate these requirements on its own. We cannot prevent unchecked code - this includes unsafe Rust, Python or other native code like C or Fortran - from violating them. Therefore the responsibility to avoid this lies with the author of that code instead of the compiler. However, assuming that the unchecked code is correct, we can ensure that safe Rust is unable to introduce mistakes into an otherwise correct program by dynamically checking which arrays are currently borrowed and in what manner.
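As an analogy only, the standard library's RefCell applies the same rule at runtime to an ordinary Rust value. The sketch below is not this crate's implementation, and the helper name exclusive_allowed is hypothetical. It reports whether an exclusive borrow can be taken while a given number of shared borrows are alive.

```rust
use std::cell::RefCell;

/// Hypothetical helper: reports whether an exclusive borrow of `cell`
/// succeeds while `readers` shared borrows are still alive.
fn exclusive_allowed(cell: &RefCell<Vec<f64>>, readers: usize) -> bool {
    // Any number of shared borrows may coexist.
    let _shared: Vec<_> = (0..readers).map(|_| cell.borrow()).collect();

    // The dynamic check refuses an exclusive borrow if any shared one exists.
    let ok = cell.try_borrow_mut().is_ok();
    ok
}
```

With no readers the exclusive borrow is granted, and with one or more readers it is refused. This mirrors how readwrite fails in the first example while x1 is still active.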

This means that we follow the base object chain of each array to the original allocation backing it and track which parts of that allocation are covered by the array and thereby ensure that only a single read-write array or multiple read-only arrays overlapping with that region are borrowed at any time.
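The bookkeeping described above can be pictured with a small model. This is a sketch under simplifying assumptions, not the crate's actual data structure: each allocation is identified by a plain integer key, each borrow covers a half-open range of element offsets, and the names Region, Kind and Registry are hypothetical.

```rust
use std::collections::HashMap;

/// Half-open range of element offsets within one allocation (hypothetical model).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Region {
    start: usize,
    end: usize,
}

impl Region {
    fn overlaps(self, other: Region) -> bool {
        self.start < other.end && other.start < self.end
    }
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    ReadOnly,
    ReadWrite,
}

/// Active borrows, keyed by an identifier of the allocation found at the
/// end of the base object chain.
#[derive(Default)]
struct Registry {
    active: HashMap<usize, Vec<(Region, Kind)>>,
}

impl Registry {
    /// Registers a borrow, or reports a conflict immediately without blocking.
    fn acquire(&mut self, base: usize, region: Region, kind: Kind) -> Result<(), &'static str> {
        let borrows = self.active.entry(base).or_default();
        // Two borrows conflict if they overlap and at least one is read-write.
        let conflict = borrows.iter().any(|&(other, other_kind)| {
            region.overlaps(other) && (kind == Kind::ReadWrite || other_kind == Kind::ReadWrite)
        });
        if conflict {
            return Err("already borrowed");
        }
        borrows.push((region, kind));
        Ok(())
    }

    /// Removes one previously registered borrow, if present.
    fn release(&mut self, base: usize, region: Region, kind: Kind) {
        if let Some(borrows) = self.active.get_mut(&base) {
            if let Some(pos) = borrows.iter().position(|&b| b == (region, kind)) {
                borrows.swap_remove(pos);
            }
        }
    }
}
```

In this model, two read-only borrows of the same region coexist, a read-write borrow overlapping either of them is refused, and read-write borrows of disjoint regions or of different allocations are accepted.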

In contrast to Rust references, the mere existence of Python references or raw pointers is not an issue because these values are not assumed to follow aliasing discipline by the Rust compiler.

Dynamic borrow checking cannot prevent unchecked code from concurrently modifying an array via callbacks or using multiple threads, but that would lead to incorrect results even if the code that is interfered with is implemented in another language which does not require aliasing discipline.

Concerning multi-threading in particular: While the GIL needs to be acquired to create borrows, they are not bound to the GIL and will stay active after the GIL is released, for example by calling allow_threads. Borrows also do not provide synchronization, i.e. multiple threads borrowing the same array will lead to runtime panics; the borrow will not block those threads until already active borrows are released.

In summary, this crate takes the position that all unchecked code - unsafe Rust, Python, C, Fortran, etc. - must be checked for correctness by its author. Safe Rust code can then rely on this correctness, but should not be able to introduce memory safety issues on its own. Additionally, dynamic borrow checking can catch some mistakes introduced by unchecked code, e.g. Python calling a function with the same array as an input and as an output argument.

Limitations

Note that the current implementation of this is an over-approximation: It will consider borrows potentially conflicting if the initial arrays have the same object at the end of their base object chain. Then, multiple conditions which are sufficient but not necessary to show the absence of conflicts are checked.

While this is sufficient to handle common situations like slicing an array with a non-unit step size which divides the dimension along that axis, there are also cases which it does not handle, for example if the step size does not divide the dimension along the sliced axis. Under such conditions, borrows are rejected even though the arrays do not actually share any elements.
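To illustrate how conditions can be sufficient but not necessary, the following sketch checks two such conditions for views into the same allocation: the offset ranges do not overlap, or the distance between the first elements is not divisible by the greatest common divisor of all strides. It is an illustration in the spirit of the description above, not a statement of the crate's actual code. The names View, view and may_conflict are hypothetical, offsets and strides are counted in elements, and views are assumed to be non-empty with non-negative strides.

```rust
/// Hypothetical description of a non-empty strided view, in units of elements.
struct View {
    /// Offset of the first element from the start of the allocation.
    offset: usize,
    /// Number of elements along each axis.
    shape: Vec<usize>,
    /// Stride along each axis (assumed non-negative in this sketch).
    strides: Vec<usize>,
}

fn view(offset: usize, shape: &[usize], strides: &[usize]) -> View {
    View { offset, shape: shape.to_vec(), strides: strides.to_vec() }
}

fn gcd(a: usize, b: usize) -> usize {
    if b == 0 { a } else { gcd(b, a % b) }
}

impl View {
    /// Half-open range of offsets that the view can touch.
    fn range(&self) -> (usize, usize) {
        let span: usize = self
            .shape
            .iter()
            .zip(&self.strides)
            .map(|(&n, &s)| n.saturating_sub(1) * s)
            .sum();
        (self.offset, self.offset + span + 1)
    }

    fn gcd_strides(&self) -> usize {
        self.strides.iter().fold(0, |acc, &s| gcd(acc, s))
    }
}

/// Conservative check: `false` proves the views are disjoint,
/// `true` only means that they might share an element.
fn may_conflict(a: &View, b: &View) -> bool {
    let (a0, a1) = a.range();
    let (b0, b1) = b.range();
    // Condition 1: the touched ranges do not overlap at all.
    if a0 >= b1 || b0 >= a1 {
        return false;
    }
    // Condition 2: a shared element requires the distance between the first
    // elements to be an integer combination of the strides, which is only
    // possible if the GCD of all strides divides that distance.
    let g = gcd(a.gcd_strides(), b.gcd_strides());
    let diff = a.offset.abs_diff(b.offset);
    if g == 0 {
        // All strides are zero, so each view touches only its first element.
        return diff == 0;
    }
    diff % g == 0
}
```

Applied to the examples above, array[:5] and array[5:] are separated by the range condition, and array[::2] and array[1::2] by the divisibility condition. For the 10 by 10 array, the views array[:, ::3] and array[:, 1::3] have strides 10 and 3, whose greatest common divisor is 1. The sketch therefore reports a possible conflict even though the views share no element.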

This does limit the set of programs that can be written using safe Rust in a way similar to rustc itself, which ensures that all accepted programs are memory safe but does not necessarily accept all memory safe programs. However, the unsafe method PyArray::as_array_mut can be used as an escape hatch. More involved cases like the third example above may be supported in the future.

Structs

PyReadonlyArray: Read-only borrow of an array.
PyReadwriteArray: Read-write borrow of an array.

Type Definitions

PyReadonlyArray1: Read-only borrow of a one-dimensional array.
PyReadonlyArray2: Read-only borrow of a two-dimensional array.
PyReadonlyArray3: Read-only borrow of a three-dimensional array.
PyReadonlyArray4: Read-only borrow of a four-dimensional array.
PyReadonlyArray5: Read-only borrow of a five-dimensional array.
PyReadonlyArray6: Read-only borrow of a six-dimensional array.
PyReadonlyArrayDyn: Read-only borrow of an array whose dimensionality is determined at runtime.
PyReadwriteArray1: Read-write borrow of a one-dimensional array.
PyReadwriteArray2: Read-write borrow of a two-dimensional array.
PyReadwriteArray3: Read-write borrow of a three-dimensional array.
PyReadwriteArray4: Read-write borrow of a four-dimensional array.
PyReadwriteArray5: Read-write borrow of a five-dimensional array.
PyReadwriteArray6: Read-write borrow of a six-dimensional array.
PyReadwriteArrayDyn: Read-write borrow of an array whose dimensionality is determined at runtime.