Struct parenchyma::SharedTensor

pub struct SharedTensor<T = f32> {
    pub shape: Shape,
    // some fields omitted
}

A shared tensor for framework-agnostic, memory-aware, n-dimensional storage.

A SharedTensor is used to track the location of memory across devices for a single piece of data. SharedTensor handles the synchronization of memory of type T, by which it is parameterized, and provides the functionality for memory management across devices.

SharedTensor holds copies and their version numbers. A user can request any number of immutable Tensors or a single mutable Tensor (enforced by borrowck). This makes it possible to validate at runtime that tensor data is initialized when a user requests a tensor for reading, and to skip the initialization check when a tensor is requested only for writing.

Terminology

In Parenchyma, multidimensional Rust arrays represent tensors. A vector, a tensor with a rank of 1, in an n-dimensional space is represented by a one-dimensional Rust array of length n. Scalars, tensors with a rank of 0, are represented by numbers (e.g., 3). An array of arrays, such as [[1, 2, 3], [4, 5, 6]], represents a tensor with a rank of 2.
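The conventions above can be illustrated with plain Rust arrays (a self-contained sketch; the helper name is illustrative, not part of the crate):

```rust
/// Number of elements in a rank-2 tensor represented as rows x columns.
fn element_count(rows: usize, columns: usize) -> usize {
    rows * columns
}

fn main() {
    let scalar = 3;                      // rank 0: a plain number
    let vector = [1, 2, 3];              // rank 1: a point in 3-dimensional space
    let matrix = [[1, 2, 3], [4, 5, 6]]; // rank 2: an array of arrays

    assert_eq!(vector.len(), 3);               // length n for an n-dimensional space
    assert_eq!(matrix.len(), 2);               // 2 rows...
    assert_eq!(matrix[0].len(), 3);            // ...of 3 columns each
    assert_eq!(element_count(2, 3), 6);        // total elements in the rank-2 tensor
    let _ = scalar;
}
```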

A tensor is essentially a generalization of vectors. A Parenchyma shared tensor tracks the memory copies of a tensor's numeric data across the devices of the backend and manages:

  • the location of these memory copies
  • the location of the latest memory copy and
  • the synchronization of memory copies between devices

This is important, as it provides a unified data interface for executing tensor operations on CUDA, OpenCL, and common host CPUs.

Read/Write

The methods view, view_mut, and write use unsafe to extend the lifetime of the returned reference to the internally owned memory chunk. The borrowck guarantees that the shared tensor outlives all of its tensors, and that there is only one mutable borrow.

TODO:

  • Therefore, we only need to make sure the memory locations won't be dropped or moved while there are active tensors.

  • Contexts and devices should also remain in scope, although it's unlikely that a context will have the same ID as a previous context...

Summary

If the caller reads (view or view_mut), memory is synchronized and the latest memory object is returned. If the caller mutably borrows memory (view_mut and write), it's expected that the memory will be overwritten, so the other memory locations are immediately considered outdated.

Examples

TODO
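Until the crate's own examples land, the following self-contained mock (not the real SharedTensor API; every name here is illustrative) demonstrates the semantics summarized above: reads synchronize an outdated copy from the latest one, while mutable access bumps the version and outdates every other copy.

```rust
use std::collections::HashMap;

/// Illustrative stand-in for SharedTensor's bookkeeping; device ids are plain integers.
struct MockShared {
    copies: HashMap<usize, (u64, Vec<f32>)>, // device -> (version, data)
    latest: u64,                             // version of the most recent write
}

impl MockShared {
    fn new(device: usize, data: Vec<f32>) -> Self {
        let mut copies = HashMap::new();
        copies.insert(device, (1, data));
        MockShared { copies, latest: 1 }
    }

    /// Like `view`: synchronize an outdated copy from the latest one, then read.
    fn view(&mut self, device: usize) -> &[f32] {
        let latest = self.latest;
        if self.copies.get(&device).map(|c| c.0) != Some(latest) {
            let source = self
                .copies
                .values()
                .find(|c| c.0 == latest)
                .expect("uninitialized")
                .1
                .clone();
            self.copies.insert(device, (latest, source));
        }
        &self.copies[&device].1
    }

    /// Like `view_mut`: synchronize, then bump the version so other copies are outdated.
    fn view_mut(&mut self, device: usize) -> &mut Vec<f32> {
        self.view(device);
        self.latest += 1;
        let copy = self.copies.get_mut(&device).unwrap();
        copy.0 = self.latest;
        &mut copy.1
    }
}

fn main() {
    let mut t = MockShared::new(0, vec![1.0, 2.0]);
    assert_eq!(t.view(1), &[1.0, 2.0]); // device 1 synchronizes from device 0
    t.view_mut(1)[0] = 9.0;             // mutation outdates device 0's copy
    assert_eq!(t.view(0), &[9.0, 2.0]); // reading on device 0 re-synchronizes
}
```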

Fields

shape: Shape

The shape of the shared tensor.

Methods

impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>

Constructs a new SharedTensor with a shape of sh.

Constructs a new SharedTensor containing a chunk of data with a shape of sh.

Allocates memory on the active device and tracks it.

Drops the memory allocation on the specified device. Returns an error if no memory has been allocated on this device.

Changes the shape of the tensor.

Returns

Returns an error if the size of the new shape is not equal to the size of the old shape. If you want to change the shape to one of a different size, use SharedTensor::realloc.
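A sketch of the size check described above (illustrative, not the crate's implementation):

```rust
/// Reshaping is only valid when the element counts match; shapes with a
/// different total size require a reallocation instead.
fn can_reshape(old_shape: &[usize], new_shape: &[usize]) -> bool {
    old_shape.iter().product::<usize>() == new_shape.iter().product::<usize>()
}

fn main() {
    assert!(can_reshape(&[2, 3], &[3, 2]));  // 6 elements either way
    assert!(can_reshape(&[2, 3], &[6]));     // flattening is fine
    assert!(!can_reshape(&[2, 3], &[2, 4])); // 6 != 8: a reallocation is needed
}
```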

Returns the number of elements the tensor can hold without reallocating.

impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>

This block contains the read/write/auto-sync logic.

View an underlying tensor for reading on the active device.

This method can fail if memory allocation fails or if no memory is initialized. The borrowck guarantees that the shared tensor outlives all of its tensors.

Summary:

1) Check if there is initialized data anywhere
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) Check the version; if it's outdated, synchronize

View an underlying tensor for reading and writing on the active device. The memory location is set as the latest.

This method can fail if memory allocation fails or if no memory is initialized.

Summary:

1) Check if there is initialized data anywhere
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) Check the version; if it's outdated, synchronize
4) Increase the memory version and latest_version

View an underlying tensor for writing only.

This method skips synchronization and initialization logic since its data will be overwritten anyway. The caller must initialize all elements contained in the tensor. This convention isn't enforced, but failure to do so may result in undefined data later.

Summary:

1) *Skip the initialization check
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) *Skip synchronization
4) Increase the memory version and latest_version

TODO

  • Add an invalidate method:

    If the caller fails to overwrite memory, it must call invalidate to return the vector to an uninitialized state.
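The three access paths above differ only in which steps they run. A condensed, self-contained sketch (illustrative only; `prepare`, `Access`, and `DeviceCopy` are not parenchyma names, and allocation is assumed to have happened already):

```rust
#[derive(PartialEq)]
enum Access { Read, ReadWrite, WriteOnly }

/// One device's copy: its data and the version it was last synchronized at.
struct DeviceCopy { version: u64, data: Vec<f32> }

/// `latest` is the version of the most recent write; the return value is the
/// new `latest` after this access.
fn prepare(access: Access, latest: u64, copy: &mut DeviceCopy, source: &[f32]) -> u64 {
    if access != Access::WriteOnly && copy.version != latest {
        copy.data.copy_from_slice(source); // synchronize an outdated copy
        copy.version = latest;
    }
    match access {
        Access::Read => latest, // read-only access leaves versions untouched
        Access::ReadWrite | Access::WriteOnly => {
            copy.version = latest + 1; // bump the version: all other copies are now outdated
            copy.version
        }
    }
}

fn main() {
    let mut copy = DeviceCopy { version: 1, data: vec![0.0; 2] };
    let latest = prepare(Access::Read, 2, &mut copy, &[5.0, 6.0]);
    assert_eq!((latest, copy.data.clone()), (2, vec![5.0, 6.0])); // synced for reading
    let latest = prepare(Access::WriteOnly, latest, &mut copy, &[0.0, 0.0]);
    assert_eq!(latest, 3); // write-only skipped the sync but bumped the version
}
```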

impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>

Sync if necessary

TODO:

  • Choose the best source to copy data from. That would require some additional traits that return costs for transferring data between different backends.

Note: transfers will typically be between Native <-> GPU for the foreseeable future, so it's best not to over-engineer here.
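The cost-based source selection the TODO describes might look like the following sketch (hypothetical cost model; device 0 stands in for the native host):

```rust
/// Hypothetical transfer-cost model: host <-> GPU is a direct copy, while a
/// GPU <-> GPU transfer is assumed to go through the host and cost double.
fn transfer_cost(from: usize, to: usize) -> u32 {
    const HOST: usize = 0;
    if from == to { 0 } else if from == HOST || to == HOST { 1 } else { 2 }
}

/// Pick the cheapest device to copy from, among those holding an up-to-date copy.
fn best_source(up_to_date: &[usize], destination: usize) -> Option<usize> {
    up_to_date.iter().copied().min_by_key(|&src| transfer_cost(src, destination))
}

fn main() {
    // Up-to-date copies live on the host (0) and one GPU (2); synchronizing
    // GPU 1 should prefer the host over a GPU-to-GPU hop.
    assert_eq!(best_source(&[0, 2], 1), Some(0));
    assert_eq!(best_source(&[], 1), None); // nothing initialized anywhere
}
```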

Trait Implementations

impl<T: Debug> Debug for SharedTensor<T>

Formats the value using the given formatter.