Crate npyz
Serialize and deserialize NumPy’s *.npy binary format.
Overview
NPY is a simple binary data format. It stores the type, shape and endianness information in a header, which is followed by a flat binary data field. This crate offers a simple, mostly type-safe way to read and write *.npy files. Files are handled using iterators, so they don’t need to fit in memory.
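To make the header/data split concrete, here is a minimal std-only sketch that separates the header text from the flat data field. It follows the published NPY layout (magic string, two version bytes, little-endian header length, then a Python-dict-style header). The names `RawHeader` and `split_npy` are hypothetical and are not part of npyz; the crate does this work for you.

```rust
/// Hypothetical helper type, not part of npyz.
#[derive(Debug, PartialEq)]
struct RawHeader {
    /// (major, minor) format version.
    version: (u8, u8),
    /// Header text, e.g. "{'descr': '<f8', 'fortran_order': False, 'shape': (4,), }".
    dict_text: String,
    /// Byte offset at which the flat binary data field begins.
    data_offset: usize,
}

/// Splits a byte buffer into the NPY header and the offset of the data field.
fn split_npy(bytes: &[u8]) -> Result<RawHeader, String> {
    const MAGIC: &[u8] = b"\x93NUMPY";
    if bytes.len() < 10 || &bytes[..6] != MAGIC {
        return Err("not an NPY file".to_string());
    }
    let version = (bytes[6], bytes[7]);

    // Version 1.x stores the header length as a little-endian u16;
    // versions 2.x and 3.x store it as a little-endian u32.
    let (header_len, header_start) = match version.0 {
        1 => (u16::from_le_bytes([bytes[8], bytes[9]]) as usize, 10),
        2 | 3 => {
            if bytes.len() < 12 {
                return Err("truncated prelude".to_string());
            }
            let len = u32::from_le_bytes([bytes[8], bytes[9], bytes[10], bytes[11]]);
            (len as usize, 12)
        }
        other => return Err(format!("unsupported major version {}", other)),
    };

    let data_offset = header_start + header_len;
    let header = bytes
        .get(header_start..data_offset)
        .ok_or("truncated header")?;

    // The header is padded with spaces and terminated by a newline.
    let dict_text = String::from_utf8_lossy(header).trim_end().to_string();
    Ok(RawHeader { version, dict_text, data_offset })
}
```

Everything from `data_offset` onward is the raw element data, which is why a reader can stream elements one at a time instead of loading the whole file.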
Optional cargo features
No features are enabled by default. Here is the list of existing features:
- "complex" enables the use of num_complex::Complex. This requires opt-in because it is a stability hazard; num_complex sometimes undergoes major semver version bumps, and it is your responsibility to make sure that your code and npyz are using the same version.
- "derive" enables derives of traits for working with structured arrays. This will add a build-time dependency on common proc macro utilities (syn, quote).
- "npz" enables adapters for working with NPZ files (including scipy sparse matrices), adding a public dependency on the zip crate. This requires opt-in because zip has a fair number of transitive dependencies. (Note that some npz-related helper functions are available even without the feature.)
Reading
Let’s create a simple *.npy file in Python:
import numpy as np
a = np.array([1, 3.5, -6, 2.3])
np.save('test-data/plain.npy', a)
Now, we can load it in Rust using NpyFile:
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let bytes = std::fs::read("test-data/plain.npy")?;

    // Note: In addition to byte slices, this accepts any io::Read
    let npy = npyz::NpyFile::new(&bytes[..])?;
    for number in npy.data::<f64>()? {
        let number = number?;
        eprintln!("{}", number);
    }
    Ok(())
}
And we can see our data:
1
3.5
-6
2.3
Inspecting properties of the array
NpyFile provides methods that let you inspect the array.
fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("test-data/c-order.npy")?;

    let data = npyz::NpyFile::new(&bytes[..])?;
    assert_eq!(data.shape(), &[2, 3, 4]);
    assert_eq!(data.order(), npyz::Order::C);
    assert_eq!(data.strides(), &[12, 4, 1]);

    // convenience method for reading to vec
    println!("{:?}", data.into_vec::<f64>());
    Ok(())
}
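The strides asserted above follow directly from the shape and the memory order. The sketch below recomputes them with plain std code, assuming strides are counted in elements rather than bytes (which is what the values `[12, 4, 1]` for shape `[2, 3, 4]` suggest). `Layout`, `strides`, and `flat_index` are hypothetical stand-ins written for illustration, not npyz items.

```rust
/// Hypothetical stand-in for the two memory orders an NPY file can declare.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Layout {
    /// Row-major: the last axis varies fastest.
    C,
    /// Column-major: the first axis varies fastest.
    Fortran,
}

/// Computes per-axis strides, measured in elements.
fn strides(shape: &[u64], layout: Layout) -> Vec<u64> {
    let mut out = vec![1; shape.len()];
    match layout {
        Layout::C => {
            // Walk from the second-to-last axis back to the first.
            for i in (0..shape.len().saturating_sub(1)).rev() {
                out[i] = out[i + 1] * shape[i + 1];
            }
        }
        Layout::Fortran => {
            // Walk from the second axis forward to the last.
            for i in 1..shape.len() {
                out[i] = out[i - 1] * shape[i - 1];
            }
        }
    }
    out
}

/// Maps a multi-dimensional index to a position in the flat data.
fn flat_index(index: &[u64], strides: &[u64]) -> u64 {
    index.iter().zip(strides).map(|(i, s)| i * s).sum()
}
```

With shape `[2, 3, 4]`, C order gives strides `[12, 4, 1]` and Fortran order gives `[1, 2, 6]`, so element `[1, 2, 3]` sits at flat position 23 in a C-order file.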
Writing
The primary interface for writing npy files is the WriterBuilder trait.
use npyz::WriterBuilder;

fn main() -> std::io::Result<()> {
    // Any io::Write is supported. For this example we'll
    // use Vec<u8> to serialize in-memory.
    let mut out_buf = vec![];
    let mut writer = {
        npyz::WriteOptions::new()
            .default_dtype()
            .shape(&[2, 3])
            .writer(&mut out_buf)
            .begin_nd()?
    };

    writer.push(&100)?;
    writer.push(&101)?;
    writer.push(&102)?;
    // you can also write multiple items at once
    writer.extend(vec![200, 201, 202])?;
    writer.finish()?;

    eprintln!("{:02x?}", out_buf);
    Ok(())
}
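To help interpret the hex dump printed by the example above, here is a rough std-only sketch that assembles a version 1.0 NPY file by hand for a little-endian `i32` array. It is not how npyz is implemented, and the exact header text and padding npyz emits may differ; the function name `npy_bytes_i32` is hypothetical. The padding to a multiple of 64 bytes mirrors what current NumPy releases do.

```rust
/// Hypothetical helper: builds a complete NPY v1.0 file in memory
/// for C-order, little-endian i32 data.
fn npy_bytes_i32(shape: &[usize], data: &[i32]) -> Vec<u8> {
    assert_eq!(shape.iter().product::<usize>(), data.len());

    // The header is a Python dict literal; "(2,3,)" is a valid tuple literal.
    let dims: String = shape.iter().map(|d| format!("{},", d)).collect();
    let mut dict = format!(
        "{{'descr': '<i4', 'fortran_order': False, 'shape': ({}), }}",
        dims
    );

    // Pad with spaces so that the 10-byte prelude plus the header
    // (including its final newline) is a multiple of 64 bytes long.
    let unpadded = 10 + dict.len() + 1;
    let padding = (64 - unpadded % 64) % 64;
    dict.push_str(&" ".repeat(padding));
    dict.push('\n');

    // Prelude: magic string, version 1.0, header length as little-endian u16.
    let mut out = b"\x93NUMPY\x01\x00".to_vec();
    out.extend_from_slice(&(dict.len() as u16).to_le_bytes());
    out.extend_from_slice(dict.as_bytes());

    // Flat data field: elements back to back, in C order.
    for x in data {
        out.extend_from_slice(&x.to_le_bytes());
    }
    out
}
```

Calling `npy_bytes_i32(&[2, 3], &[100, 101, 102, 200, 201, 202])` yields a buffer whose last 24 bytes are the six integers, preceded by the padded text header.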
Working with ndarray
Using the ndarray crate? No problem!
At this time, no conversion API is provided by npyz, but one can easily be written:
// Example of parsing to an array with fixed NDIM.
fn to_array_3<T>(data: Vec<T>, shape: Vec<u64>, order: npyz::Order) -> ndarray::Array3<T> {
    use ndarray::ShapeBuilder;

    let shape = match shape[..] {
        [i1, i2, i3] => [i1 as usize, i2 as usize, i3 as usize],
        _ => panic!("expected 3D array"),
    };
    let true_shape = shape.set_f(order == npyz::Order::Fortran);

    ndarray::Array3::from_shape_vec(true_shape, data)
        .unwrap_or_else(|e| panic!("shape error: {}", e))
}

// Example of parsing to an array with dynamic NDIM.
fn to_array_d<T>(data: Vec<T>, shape: Vec<u64>, order: npyz::Order) -> ndarray::ArrayD<T> {
    use ndarray::ShapeBuilder;

    let shape = shape.into_iter().map(|x| x as usize).collect::<Vec<_>>();
    let true_shape = shape.set_f(order == npyz::Order::Fortran);

    ndarray::ArrayD::from_shape_vec(true_shape, data)
        .unwrap_or_else(|e| panic!("shape error: {}", e))
}

pub fn main() -> std::io::Result<()> {
    let bytes = std::fs::read("test-data/c-order.npy")?;
    let reader = npyz::NpyFile::new(&bytes[..])?;
    let shape = reader.shape().to_vec();
    let order = reader.order();
    let data = reader.into_vec::<i64>()?;

    println!("{:?}", to_array_3(data.clone(), shape.clone(), order));
    println!("{:?}", to_array_d(data.clone(), shape.clone(), order));
    Ok(())
}
Likewise, here is a function that can be used to write an ndarray:
use std::io;
use std::fs::File;

use ndarray::Array;
use npyz::WriterBuilder;

// Example of writing an array with unknown shape. The output is always C-order.
fn write_array<T, S, D>(writer: impl io::Write, array: &ndarray::ArrayBase<S, D>) -> io::Result<()>
where
    T: Clone + npyz::AutoSerialize,
    S: ndarray::Data<Elem=T>,
    D: ndarray::Dimension,
{
    let shape = array.shape().iter().map(|&x| x as u64).collect::<Vec<_>>();
    let c_order_items = array.iter();

    let mut writer = npyz::WriteOptions::new().default_dtype().shape(&shape).writer(writer).begin_nd()?;
    writer.extend(c_order_items)?;
    writer.finish()
}

pub fn main() -> io::Result<()> {
    let array = Array::from_shape_fn((6, 7, 8), |(i, j, k)| 100*i as i32 + 10*j as i32 + k as i32);
    // even weirdly-ordered axes and non-contiguous arrays are fine
    let view = array.view(); // shape (6, 7, 8), C-order
    let view = view.reversed_axes(); // shape (8, 7, 6), fortran order
    let view = view.slice(ndarray::s![.., .., ..;2]); // shape (8, 7, 3), non-contiguous
    assert_eq!(view.shape(), &[8, 7, 3]);

    let mut file = io::BufWriter::new(File::create("examples/output/ndarray.npy")?);
    write_array(&mut file, &view)
}
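The example above can always emit C-order output because iterating an ndarray visits elements in logical (row-major) order regardless of how they are laid out in memory. The std-only sketch below shows the same idea without ndarray: it walks a Fortran-order buffer in logical order and collects the elements into a C-order buffer. The function `fortran_to_c` is a hypothetical illustration, not an npyz or ndarray API.

```rust
/// Hypothetical helper: reorders a column-major (Fortran-order) buffer
/// into a row-major (C-order) buffer of the same shape.
fn fortran_to_c<T: Clone>(data: &[T], shape: &[usize]) -> Vec<T> {
    let len: usize = shape.iter().product();
    assert_eq!(data.len(), len);

    // Fortran-order strides, in elements: the first axis varies fastest.
    let mut f_strides = vec![1usize; shape.len()];
    for i in 1..shape.len() {
        f_strides[i] = f_strides[i - 1] * shape[i - 1];
    }

    let mut out = Vec::with_capacity(len);
    let mut index = vec![0usize; shape.len()];
    for _ in 0..len {
        // Look up the current logical index in the Fortran-order buffer.
        let offset: usize = index.iter().zip(&f_strides).map(|(i, s)| i * s).sum();
        out.push(data[offset].clone());

        // Advance the logical index like an odometer, last axis fastest,
        // which is exactly the order in which C-order data is stored.
        for axis in (0..shape.len()).rev() {
            index[axis] += 1;
            if index[axis] < shape[axis] {
                break;
            }
            index[axis] = 0;
        }
    }
    out
}
```

For the 2×3 array `[[0, 1, 2], [3, 4, 5]]`, the Fortran-order buffer is `[0, 3, 1, 4, 2, 5]`, and the function returns `[0, 1, 2, 3, 4, 5]`.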
Structured arrays
npyz supports structured arrays! Consider the following structured array created in Python:
import numpy as np
a = np.array([(1,2.5,4), (2,3.1,5)], dtype=[('a', 'i4'),('b', 'f4'),('c', 'i8')])
np.save('test-data/structured.npy', a)
To load this in Rust, we need to create a corresponding struct. There are three derivable traits we can define for it:
- Deserialize: Enables easy reading of .npy files.
- AutoSerialize: Enables easy writing of .npy files (in a default format).
- Serialize: Supertrait of AutoSerialize that allows one to specify a custom DType.
Enable the "derive" feature in Cargo.toml, and make sure the field names and types all match up:
// make sure to add `features = ["derive"]` in Cargo.toml!
#[derive(npyz::Deserialize, Debug)]
struct Struct {
    a: i32,
    b: f32,
    c: i64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let bytes = std::fs::read("test-data/structured.npy")?;

    let npy = npyz::NpyFile::new(&bytes[..])?;
    for row in npy.data::<Struct>()? {
        let row = row?;
        eprintln!("{:?}", row);
    }
    Ok(())
}
The output is:
Struct { a: 1, b: 2.5, c: 4 }
Struct { a: 2, b: 3.1, c: 5 }
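To show why the field types have to match the dtype, here is a std-only sketch of decoding such records by hand. It assumes the data was saved on a little-endian machine, so the fields are `<i4`, `<f4`, `<i8`, and that the dtype is packed (NumPy's default for a list-of-tuples dtype), giving 16 bytes per record. `Record`, `take`, and `decode_records` are hypothetical names; the derived Deserialize impl handles this for you.

```rust
/// Hypothetical mirror of the dtype [('a', 'i4'), ('b', 'f4'), ('c', 'i8')].
#[derive(Debug, PartialEq)]
struct Record {
    a: i32,
    b: f32,
    c: i64,
}

/// Size of one packed record: 4 + 4 + 8 bytes, with no padding between fields.
const RECORD_SIZE: usize = 16;

/// Copies N bytes starting at `at` into a fixed-size array.
fn take<const N: usize>(bytes: &[u8], at: usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&bytes[at..at + N]);
    out
}

/// Decodes the flat data field of a structured array into records,
/// assuming little-endian fields laid out in declaration order.
fn decode_records(data: &[u8]) -> Vec<Record> {
    data.chunks_exact(RECORD_SIZE)
        .map(|chunk| Record {
            a: i32::from_le_bytes(take(chunk, 0)),
            b: f32::from_le_bytes(take(chunk, 4)),
            c: i64::from_le_bytes(take(chunk, 8)),
        })
        .collect()
}
```

If a field were declared as `f64` instead of `f32`, every later offset would shift and the decoded values would be garbage, which is why a mismatch between the struct and the file's dtype has to be rejected rather than silently accepted.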
.npz files
- To work with .npz files in general, see the npz module.
- To work with scipy.sparse matrices, see the sparse module.
Re-exports
pub use num_complex;
pub use zip;
Modules
Utilities for working with npz files.
Tools for reading and writing Scipy sparse matrices in NPZ format.
Types and traits related to the implementation of WriteOptions.
Structs
Indicates that a particular Rust type does not support serialization or deserialization as a given DType.
A field of a structured array dtype.
Legacy type for reading npy files.
Object for reading an npy file.
Iterator returned by NpyFile::data which reads elements of type T from the data portion of an NPY file.
Interface for writing an NPY file to a data stream.
Error type returned by <TypeStr as FromStr>::parse.
Represents an Array Interface type-string.
Represents an almost-empty configuration for an NpyWriter.
Enums
Traits
Trait that permits reading a type from an .npy file.
Trait that permits writing a type to an .npy file.
Like some sort of for<R: io::Read> Fn(R) -> io::Result<T>.
The proper trait to use for trait objects of TypeRead.
Like some sort of for<W: io::Write> Fn(W, &T) -> io::Result<()>.
The proper trait to use for trait objects of TypeWrite.
Trait that provides methods on WriteOptions.
Functions
Serialize an iterator over a struct to an NPY file.
Serialize an iterator over a struct to an NPY file.