Crate npyz

Serialize and deserialize NumPy’s *.npy binary format.

Overview

NPY is a simple binary data format. It stores the type, shape and endianness information in a header, which is followed by a flat binary data field. This crate offers a simple, mostly type-safe way to read and write *.npy files. Files are handled using iterators, so they don’t need to fit in memory.
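
To make the layout concrete, here is a minimal std-only sketch that splits a version 1.0 file into its header text and its data bytes. It follows NumPy's published format description (a 6-byte magic string, two version bytes, and a little-endian u16 header length) rather than npyz's internals. The name split_npy_v1 is hypothetical, and later format versions, which use a 4-byte header length, are deliberately not handled.

```rust
/// Splits an NPY version 1.0 buffer into (header dictionary text, raw data bytes).
/// Returns None if the buffer is not a well-formed version 1.0 file.
/// Hypothetical helper for illustration; not part of npyz.
fn split_npy_v1(bytes: &[u8]) -> Option<(&str, &[u8])> {
    const MAGIC: &[u8] = b"\x93NUMPY";

    // 6 magic bytes + 2 version bytes + 2 header-length bytes.
    if bytes.len() < 10 || &bytes[..6] != MAGIC {
        return None;
    }
    // Only version 1.0 is handled in this sketch.
    if bytes[6] != 1 || bytes[7] != 0 {
        return None;
    }

    let header_len = u16::from_le_bytes([bytes[8], bytes[9]]) as usize;
    let data_start = 10 + header_len;
    if bytes.len() < data_start {
        return None;
    }

    // The header is a Python dict literal, padded with spaces and ending in '\n'.
    let header = std::str::from_utf8(&bytes[10..data_start]).ok()?;
    Some((header.trim_end(), &bytes[data_start..]))
}
```

The returned header text still has to be parsed to recover the dtype, shape, and order; a reader such as NpyReader takes care of that step.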

Optional cargo features

  • "complex" enables parsing of num_complex::Complex.
  • "derive" enables derives of traits for working with structured arrays.

Reading

Let’s create a simple *.npy file in Python:

import numpy as np
a = np.array([1, 3.5, -6, 2.3])
np.save('examples/plain.npy', a)

Now, we can load it in Rust using NpyReader:

use npyz::NpyReader;

fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("examples/plain.npy")?;

    // Note: In addition to byte slices, this accepts any io::Read
    let data: NpyReader<f64, _> = NpyReader::new(&bytes[..])?;
    for number in data {
        let number = number?;
        eprintln!("{}", number);
    }
    Ok(())
}

And we can see our data:

1
3.5
-6
2.3

Inspecting properties of the array

NpyReader provides methods that let you inspect the array.

use npyz::NpyReader;

fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("tests/c-order.npy")?;

    let data: NpyReader<i64, _> = NpyReader::new(&bytes[..])?;
    assert_eq!(data.shape(), &[2, 3, 4]);
    assert_eq!(data.order(), npyz::Order::C);
    assert_eq!(data.strides(), &[12, 4, 1]);
    Ok(())
}
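
The strides asserted above are counted in elements, not bytes. As a rough sketch of where those numbers come from, the function below derives the strides of a contiguous array from its shape; contiguous_strides is a hypothetical helper written for this illustration and is not part of npyz.

```rust
/// Strides, in elements, of a contiguous array with the given shape.
/// Hypothetical helper for illustration; not part of npyz.
fn contiguous_strides(shape: &[u64], fortran_order: bool) -> Vec<u64> {
    let mut strides = vec![1; shape.len()];
    if fortran_order {
        // Fortran order: the first axis varies fastest.
        for i in 1..shape.len() {
            strides[i] = strides[i - 1] * shape[i - 1];
        }
    } else {
        // C order: the last axis varies fastest.
        for i in (0..shape.len().saturating_sub(1)).rev() {
            strides[i] = strides[i + 1] * shape[i + 1];
        }
    }
    strides
}
```

For the shape [2, 3, 4] this gives [12, 4, 1] in C order, matching the assertion above, and [1, 2, 6] in Fortran order.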

Writing

The primary interface for writing npy files is Builder.

fn main() -> std::io::Result<()> {
    // Any io::Write is supported.  For this example we'll
    // use Vec<u8> to serialize in-memory.
    let mut out_buf = vec![];
    let mut writer = {
        npyz::Builder::new()
            .default_dtype()
            .begin_nd(&mut out_buf, &[2, 3])?
    };

    writer.push(&100)?; writer.push(&101)?; writer.push(&102)?;
    writer.push(&200)?; writer.push(&201)?; writer.push(&202)?;
    writer.finish()?;

    eprintln!("{:02x?}", out_buf);
    Ok(())
}
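
In the example above the items are pushed row by row. Assuming the writer lays items out in C order (last axis fastest), the position of an item in the push sequence determines its multi-dimensional index. The sketch below shows that mapping; unravel_c_order is a hypothetical helper, not an npyz function, and it assumes every dimension is nonzero.

```rust
/// Converts the position of an item in C-order push sequence into its
/// multi-dimensional index. Assumes every entry of `shape` is nonzero.
/// Hypothetical helper for illustration; not part of npyz.
fn unravel_c_order(mut flat: u64, shape: &[u64]) -> Vec<u64> {
    let mut index = vec![0; shape.len()];
    // Peel off the fastest-varying (last) axis first.
    for (axis, &len) in shape.iter().enumerate().rev() {
        index[axis] = flat % len;
        flat /= len;
    }
    index
}
```

With shape [2, 3], the fifth pushed item (position 4, the value 201) lands at index [1, 1].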

Working with ndarray

Using the ndarray crate? No problem! At this time, no conversion API is provided by npyz, but one can easily be written:

use npyz::NpyReader;

// Example of parsing to an array with fixed NDIM.
fn to_array_3<T>(data: Vec<T>, shape: Vec<u64>, order: npyz::Order) -> ndarray::Array3<T> {
    use ndarray::ShapeBuilder;

    let shape = match shape[..] {
        [i1, i2, i3] => [i1 as usize, i2 as usize, i3 as usize],
        _  => panic!("expected 3D array"),
    };
    let true_shape = shape.set_f(order == npyz::Order::Fortran);

    ndarray::Array3::from_shape_vec(true_shape, data)
        .unwrap_or_else(|e| panic!("shape error: {}", e))
}

// Example of parsing to an array with dynamic NDIM.
fn to_array_d<T>(data: Vec<T>, shape: Vec<u64>, order: npyz::Order) -> ndarray::ArrayD<T> {
    use ndarray::ShapeBuilder;

    let shape = shape.into_iter().map(|x| x as usize).collect::<Vec<_>>();
    let true_shape = shape.set_f(order == npyz::Order::Fortran);

    ndarray::ArrayD::from_shape_vec(true_shape, data)
        .unwrap_or_else(|e| panic!("shape error: {}", e))
}

fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("tests/c-order.npy")?;
    let reader: NpyReader<i64, _> = NpyReader::new(&bytes[..])?;
    let shape = reader.shape().to_vec();
    let order = reader.order();
    let data = reader.into_vec()?;

    println!("{:?}", to_array_3(data.clone(), shape.clone(), order));
    println!("{:?}", to_array_d(data.clone(), shape.clone(), order));
    Ok(())
}

Likewise, here is a function that can be used to write an ndarray:

use ndarray::Array;
use std::io;
use std::fs::File;

// Example of writing an array with unknown shape.  The output is always C-order.
fn write_array<T, S, D>(writer: impl io::Write, array: &ndarray::ArrayBase<S, D>) -> io::Result<()>
where
    T: Clone + npyz::AutoSerialize,
    S: ndarray::Data<Elem=T>,
    D: ndarray::Dimension,
{
    let shape = array.shape().iter().map(|&x| x as u64).collect::<Vec<_>>();
    let c_order_items = array.iter();

    let mut writer = npyz::Builder::new().default_dtype().begin_nd(writer, &shape)?;
    for item in c_order_items {
        writer.push(item)?;
    }
    writer.finish()
}

fn main() -> io::Result<()> {
    let array = Array::from_shape_fn((6, 7, 8), |(i, j, k)| 100*i as i32 + 10*j as i32 + k as i32);
    // even weirdly-ordered axes and non-contiguous arrays are fine
    let view = array.view(); // shape (6, 7, 8), C-order
    let view = view.reversed_axes(); // shape (8, 7, 6), fortran order
    let view = view.slice(ndarray::s![.., .., ..;2]); // shape (8, 7, 3), non-contiguous
    assert_eq!(view.shape(), &[8, 7, 3]);

    let mut file = io::BufWriter::new(File::create("examples/ndarray-out.npy")?);
    write_array(&mut file, &view)
}
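
write_array works for any layout because iterating over the array yields items in logical C order regardless of how they sit in memory. The std-only sketch below illustrates that idea without ndarray: it walks a flat buffer in logical C order given a shape and element strides. The name gather_c_order is hypothetical, and the sketch assumes non-negative strides, which is narrower than what ndarray views allow.

```rust
/// Collects the items of a strided buffer in logical C order (last axis fastest).
/// `strides` are in elements and must be non-negative in this sketch.
/// Hypothetical helper for illustration; not part of npyz or ndarray.
fn gather_c_order<T: Clone>(data: &[T], shape: &[usize], strides: &[usize]) -> Vec<T> {
    let total: usize = shape.iter().product();
    let mut out = Vec::with_capacity(total);
    let mut index = vec![0usize; shape.len()];

    for _ in 0..total {
        // Memory offset of the current logical index.
        let offset: usize = index.iter().zip(strides).map(|(i, s)| i * s).sum();
        out.push(data[offset].clone());

        // Advance the index like an odometer, last axis first.
        for axis in (0..shape.len()).rev() {
            index[axis] += 1;
            if index[axis] < shape[axis] {
                break;
            }
            index[axis] = 0;
        }
    }
    out
}
```

For a 2×3 array stored in Fortran order as [1, 2, 3, 4, 5, 6] (strides [1, 2]), the items come out as [1, 3, 5, 2, 4, 6], which is the order a C-order file expects.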

Structured arrays

npyz supports structured arrays! Consider the following structured array created in Python:

import numpy as np
a = np.array([(1,2.5,4), (2,3.1,5)], dtype=[('a', 'i4'),('b', 'f4'),('c', 'i8')])
np.save('examples/simple.npy', a)

To load this in Rust, we need to create a corresponding struct. There are three derivable traits we can define for it:

  • Deserialize: Enables easy reading of .npy files.
  • AutoSerialize: Enables easy writing of .npy files (in a default format).
  • Serialize: Supertrait of AutoSerialize that allows one to specify a custom DType.

Enable the "derive" feature in Cargo.toml, and make sure the field names and types all match up:

use npyz::NpyReader;

// make sure to add `features = ["derive"]` in Cargo.toml!
#[derive(npyz::Deserialize, Debug)]
struct Struct {
    a: i32,
    b: f32,
    c: i64,
}

fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("examples/simple.npy")?;

    let data: NpyReader<Struct, _> = NpyReader::new(&bytes[..])?;
    for row in data {
        let row = row?;
        eprintln!("{:?}", row);
    }
    Ok(())
}

The output is:

Struct { a: 1, b: 2.5, c: 4 }
Struct { a: 2, b: 3.1, c: 5 }
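
To show why the field names and types must match, here is a hand-written sketch of what decoding one record of this dtype amounts to. It assumes the fields are little-endian and packed back to back with no padding (4 + 4 + 8 = 16 bytes per record), which is what NumPy produces for this dtype on a little-endian machine. Record and decode_record are hypothetical names; the derived Deserialize impl does this work for you and is not necessarily implemented this way.

```rust
/// Mirror of the dtype [('a', 'i4'), ('b', 'f4'), ('c', 'i8')].
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq)]
struct Record {
    a: i32,
    b: f32,
    c: i64,
}

/// Decodes one packed, little-endian record (assumed layout: 16 bytes).
/// Returns None if the slice has the wrong length.
fn decode_record(bytes: &[u8]) -> Option<Record> {
    if bytes.len() != 16 {
        return None;
    }
    let mut a = [0u8; 4];
    let mut b = [0u8; 4];
    let mut c = [0u8; 8];
    a.copy_from_slice(&bytes[0..4]); // 'i4' at offset 0
    b.copy_from_slice(&bytes[4..8]); // 'f4' at offset 4
    c.copy_from_slice(&bytes[8..16]); // 'i8' at offset 8
    Some(Record {
        a: i32::from_le_bytes(a),
        b: f32::from_le_bytes(b),
        c: i64::from_le_bytes(c),
    })
}
```

If a field in the Rust struct had a different width than its counterpart in the file, every later offset would shift and the decoded values would be wrong.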

Structs

Builder for an output .NPY file.

Indicates that a particular rust type does not support serialization or deserialization as a given DType.

A field of a record dtype

NpyData (Deprecated)

Legacy type for reading npy files.

Object for reading an npy file.

Serialize into a file one item at a time. To serialize an iterator, use the to_file function.

Error type returned by <TypeStr as FromStr>::parse.

Represents an Array Interface type-string.

Enums

Representation of a Numpy type

Order of axes in a file.

Traits

Subtrait of Serialize for types which have a reasonable default DType.

Trait that permits reading a type from an .npy file.

Trait that permits writing a type to an .npy file.

Like some sort of for<R: io::Read> Fn(R) -> io::Result<T>.

The proper trait to use for trait objects of TypeRead.

Like some sort of for<W: io::Write> Fn(W, &T) -> io::Result<()>.

The proper trait to use for trait objects of TypeWrite.

Functions

to_file (Deprecated)

Serialize an iterator over a struct to a NPY file.

Serialize an iterator over a struct to a NPY file.

Type Definitions

OutFile (Deprecated)

NpyWriter that writes an entire file.

Derive Macros