Indexing
Many libraries represent tensors as N dimensional arrays, but there is often some semantic meaning to each dimension. You may have a batch of 2000 images, each 100 pixels wide and high, with each pixel represented by 3 numbers for its rgb values. This can be represented as a 2000 x 100 x 100 x 3 tensor, but a 4 dimensional array does not track the semantic meaning of each dimension and its associated index.
6 months later you could come back to the code and forget which order the dimensions were created in, at best getting the indexes out of bounds and causing a crash in your application, and at worst silently reading the wrong data without realising. Was it width then height or height then width?…
Easy ML moves the N dimensional array to an implementation detail, and most of its APIs work on the names of each dimension in a tensor instead of just the order. Instead of a 2000 x 100 x 100 x 3 tensor in which the last element is at [1999, 99, 99, 2], Easy ML tracks the names of the dimensions, so you have a [("batch", 2000), ("width", 100), ("height", 100), ("rgb", 3)] shaped tensor.
This can’t stop you from getting the math wrong, but it reduces confusion over which dimension means what. Tensors carry around their pairs of dimension name and length, so adding a [("batch", 2000), ("width", 100), ("height", 100), ("rgb", 3)] shaped tensor to a [("batch", 2000), ("height", 100), ("width", 100), ("rgb", 3)] shaped tensor will fail unless you reorder one first. You could access an element as tensor.index_by(["batch", "width", "height", "rgb"]).get([1999, 0, 99, 2]) or tensor.index_by(["batch", "height", "width", "rgb"]).get([1999, 99, 0, 2]) and read the same data, because you index into dimensions based on their name, not just the order they are stored in memory.
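To make the idea of indexing by name concrete, here is a minimal standalone sketch. It does not use Easy ML and is not how Easy ML is implemented: the NamedTensor type, its fields, and its get method are hypothetical names invented for illustration, and row-major storage is an assumption. It only shows that the same element can be reached with either ordering of "width" and "height".

```rust
/// A hypothetical, minimal named tensor (NOT Easy ML's actual type).
/// Assumption: data is stored row-major, in the order the dimensions
/// appear in `shape`, and every dimension name in `shape` is unique.
struct NamedTensor {
    shape: Vec<(&'static str, usize)>,
    data: Vec<f64>,
}

impl NamedTensor {
    /// Looks up one element from dimension names and one index per name,
    /// given in whatever order the caller prefers. Returns None if the
    /// names do not match the shape or an index is out of bounds.
    fn get(&self, names: &[&str], indexes: &[usize]) -> Option<f64> {
        if names.len() != self.shape.len() || indexes.len() != names.len() {
            return None;
        }
        let mut offset = 0;
        for (name, length) in &self.shape {
            // Find where the caller put this stored dimension.
            let position = names.iter().position(|n| n == name)?;
            let index = indexes[position];
            if index >= *length {
                return None;
            }
            // Row-major: scale the offset so far, then add this index.
            offset = offset * length + index;
        }
        self.data.get(offset).copied()
    }
}
```

With a tensor built this way, get(&["batch", "width", "height", "rgb"], ...) and get(&["batch", "height", "width", "rgb"], ...) return the same element as long as the indexes are swapped to match the names, and an rgb index of 3 on a dimension of length 3 is rejected rather than read.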
Even with a name for each dimension, at some point you still need to say what order you want to index each dimension with, and this is where TensorAccess comes in. It creates a mapping from the dimension name order you want to access elements with to the order the dimensions are stored in.
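The kind of mapping described above can be sketched as a permutation computed once and then reused for every lookup. This is an illustration only, not TensorAccess's real implementation or API: dimension_mapping and to_storage_order are hypothetical helper names, and the sketch assumes the stored dimension names are unique.

```rust
/// Hypothetical helper (not Easy ML's API): for each stored dimension, finds
/// the position of that dimension in the order the caller wants to use.
/// Returns None unless `requested` is a reordering of the stored names.
/// Assumption: the names in `stored` are unique.
fn dimension_mapping(stored: &[(&str, usize)], requested: &[&str]) -> Option<Vec<usize>> {
    if stored.len() != requested.len() {
        return None;
    }
    stored
        .iter()
        .map(|(name, _)| requested.iter().position(|r| r == name))
        .collect()
}

/// Reorders indexes given in the requested order into storage order,
/// using a mapping produced by `dimension_mapping`.
fn to_storage_order(mapping: &[usize], indexes: &[usize]) -> Option<Vec<usize>> {
    if mapping.len() != indexes.len() {
        return None;
    }
    Some(mapping.iter().map(|&position| indexes[position]).collect())
}
```

For a [("batch", 2000), ("width", 100), ("height", 100), ("rgb", 3)] shape accessed in the order ["batch", "height", "width", "rgb"], this sketch yields the mapping [0, 2, 1, 3], which turns the requested indexes [1999, 99, 0, 2] into the storage order indexes [1999, 0, 99, 2]. Computing the mapping up front means the name lookups happen once rather than on every element access.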
Re-exports
pub use crate::matrices::iterators::WithIndex;