Struct parenchyma::SharedTensor
pub struct SharedTensor<T = f32> {
    pub shape: Shape,
    // some fields omitted
}
A shared tensor for framework-agnostic, memory-aware, n-dimensional storage.
A SharedTensor is used to track the location of memory across devices for a single piece of data. SharedTensor handles synchronization of memory of type T, by which it is parameterized, and provides the functionality for memory management across devices.

SharedTensor holds copies of the data and their version numbers. A user can request any number of immutable Tensors or a single mutable Tensor (enforced by the borrow checker). This makes it possible to validate at runtime that tensor data is initialized when a user requests a tensor for reading, and to skip the initialization check when a tensor is requested only for writing.
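The borrow rules relied on here can be illustrated with plain Rust references; this is a std-only sketch, independent of parenchyma's API:

```rust
fn main() {
    let tensor = vec![1.0f32, 2.0, 3.0];

    // Any number of immutable views may coexist.
    let view_a = &tensor;
    let view_b = &tensor;
    assert_eq!(view_a.len(), view_b.len());

    // A mutable view demands exclusive access: a `&mut tensor` taken while
    // view_a or view_b is still in use would be rejected by the borrow checker.
}
```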
Terminology
In Parenchyma, multidimensional Rust arrays represent tensors. A vector, a tensor with a rank of 1, in an n-dimensional space is represented by a one-dimensional Rust array of length n. Scalars, tensors with a rank of 0, are represented by numbers (e.g., 3). An array of arrays, such as [[1, 2, 3], [4, 5, 6]], represents a tensor with a rank of 2.
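This terminology maps directly onto Rust literals; a minimal, self-contained illustration:

```rust
// Rank-0 tensor (scalar): a plain number.
const SCALAR: i32 = 3;

// Rank-1 tensor (vector) in a 3-dimensional space: a Rust array of length 3.
const VECTOR: [i32; 3] = [1, 2, 3];

// Rank-2 tensor: an array of arrays (2 rows of 3 columns).
const MATRIX: [[i32; 3]; 2] = [[1, 2, 3], [4, 5, 6]];

fn main() {
    assert_eq!(SCALAR, 3);
    assert_eq!(VECTOR.len(), 3);   // length n matches the space's dimension
    assert_eq!(MATRIX.len(), 2);   // rank 2: an outer array of inner arrays
    assert_eq!(MATRIX[1][2], 6);
}
```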
A tensor is essentially a generalization of a vector. A Parenchyma shared tensor tracks the memory copies of a tensor's numeric data across the devices of the backend and manages:
- the location of these memory copies
- the location of the latest memory copy, and
- the synchronization of memory copies between devices
This is important, as it provides a unified data interface for executing tensor operations on CUDA, OpenCL, and the common host CPU.
Read/Write
The methods read, read_write, and write use unsafe to extend the lifetime of the returned reference to the internally owned memory chunk. The borrow checker guarantees that the shared tensor outlives all of its tensors, and that there is only one mutable borrow.
TODO:
Therefore, we only need to make sure the memory locations won't be dropped or moved while there are active tensors. Contexts and devices should also remain in scope, although it's unlikely that a context will have the same ID as a previous context...
Summary
If the caller reads (read or read_write), memory is synchronized and the latest memory object is returned. If the caller mutably borrows memory (read_write and write), it's expected that the memory will be overwritten, so the other memory locations are immediately considered outdated.
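This versioning scheme can be modeled in a few dozen lines. The sketch below is a hypothetical, std-only model: device names, field names, and the f32 element type are placeholders, and parenchyma's real implementation differs in its types and device handling.

```rust
use std::collections::HashMap;

// Hypothetical model: one (data, version) pair per device name,
// plus the version number of the most recent write.
struct VersionedCopies {
    copies: HashMap<&'static str, (Vec<f32>, u64)>,
    latest_version: u64,
}

impl VersionedCopies {
    fn new(device: &'static str, data: Vec<f32>) -> Self {
        let mut copies = HashMap::new();
        copies.insert(device, (data, 1));
        VersionedCopies { copies, latest_version: 1 }
    }

    // Return the data of a copy whose version is current.
    fn latest_copy(&self) -> Vec<f32> {
        for (data, version) in self.copies.values() {
            if *version == self.latest_version {
                return data.clone();
            }
        }
        unreachable!("one copy is always up to date");
    }

    // `read`: allocate the copy if missing, synchronize it if outdated,
    // then return the up-to-date memory.
    fn read(&mut self, device: &'static str) -> &Vec<f32> {
        let latest = self.latest_copy();
        let latest_version = self.latest_version;
        let entry = self.copies.entry(device).or_insert((Vec::new(), 0));
        if entry.1 < latest_version {
            entry.0 = latest; // transfer from an up-to-date device
            entry.1 = latest_version;
        }
        &entry.0
    }

    // `write`: skip synchronization entirely; bumping `latest_version`
    // immediately outdates every other copy.
    fn write(&mut self, device: &'static str) -> &mut Vec<f32> {
        self.latest_version += 1;
        let version = self.latest_version;
        let entry = self.copies.entry(device).or_insert((Vec::new(), 0));
        entry.1 = version;
        &mut entry.0
    }
}

fn main() {
    let mut tensor = VersionedCopies::new("host", vec![1.0, 2.0]);
    // Reading on another device allocates a copy there and synchronizes it.
    assert_eq!(tensor.read("gpu"), &[1.0, 2.0]);
    // Writing on the GPU outdates the host copy...
    *tensor.write("gpu") = vec![3.0, 4.0];
    // ...so the next host read transfers the fresh data back.
    assert_eq!(tensor.read("host"), &[3.0, 4.0]);
}
```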
Examples
TODO
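Until an official example lands, here is an unverified sketch of how the constructors and accessors documented on this page might fit together. The Backend type and its construction are assumptions not covered here, and this snippet has not been compiled against the crate:

```rust
// Hypothetical: some context implementing Has<Device>.
let ref backend = Backend::default()?;

// Construct a rank-1 tensor of length 4 from existing data
// (the `with` constructor documented below).
let tensor: SharedTensor<f32> = SharedTensor::with(backend, 4, vec![1., 2., 3., 4.])?;

// `read` synchronizes if necessary and returns the latest memory copy.
let memory = tensor.read(backend)?;
```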
Fields
shape: Shape
The shape of the shared tensor.
Methods
impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>
fn new<A>(sh: A) -> Self where A: Into<Shape>
Constructs a new SharedTensor with a shape of sh.
fn with<H, I>(con: &H, sh: I, chunk: Vec<T>) -> Result<Self> where H: Has<Device>, I: Into<Shape>
Constructs a new SharedTensor containing a chunk of data with a shape of sh.
fn alloc<H, I>(con: &H, sh: I) -> Result<Self> where H: Has<Device>, I: Into<Shape>
Allocates memory on the active device and tracks it.
fn dealloc<H>(&mut self, con: &H) -> Result<Memory<T>> where H: Has<Device>
Drops the memory allocation on the specified device. Returns an error if no memory has been allocated on this device.
fn reshape<I>(&mut self, sh: I) -> Result where I: Into<Shape>
Changes the shape of the tensor.
Returns
Returns an error if the size of the new shape is not equal to the size of the old shape. If you want to change the shape to one of a different size, use SharedTensor::realloc.
fn capacity(&self) -> usize
Returns the number of elements the tensor can hold without reallocating.
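As an aside, the size invariant that reshape enforces (the same element count before and after) can be stated in a couple of lines. The size function and the shapes below are illustrative, not part of the crate:

```rust
// Number of elements a shape holds: the product of its dimensions.
fn size(shape: &[usize]) -> usize {
    shape.iter().product()
}

fn main() {
    // [2, 3] and [3, 2] both hold 6 elements, so reshaping between them is valid.
    assert_eq!(size(&[2, 3]), size(&[3, 2]));
    // [2, 3] -> [4, 2] changes the element count; reshape would return an error,
    // and realloc would be needed instead.
    assert_ne!(size(&[2, 3]), size(&[4, 2]));
}
```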
impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>
This block contains the read/write/auto-sync logic.
fn read<'shared, H>(&'shared self, dev: &H) -> Result<&'shared Memory<T>> where H: Has<Device>
View an underlying tensor for reading on the active device.
This method can fail if memory allocation fails or if no memory is initialized. The borrowck guarantees that the shared tensor outlives all of its tensors.
Summary:
1) Check if there is initialized data anywhere
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) Check the version; if it's old, synchronize
fn read_write<'shared, H>(&'shared mut self, dev: &H) -> Result<&'shared mut Memory<T>> where H: Has<Device>
View an underlying tensor for reading and writing on the active device. The memory location is set as the latest.
This method can fail if memory allocation fails or if no memory is initialized.
Summary:
1) Check if there is initialized data anywhere
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) Check the version; if it's old, synchronize
4) Increase the memory version and latest_version
fn write<'shared, H>(&'shared mut self, con: &H) -> Result<&'shared mut Memory<T>> where H: Has<Device>
View an underlying tensor for writing only.
This method skips synchronization and initialization logic since its data will be overwritten anyway. The caller must initialize all elements contained in the tensor. This convention isn't enforced, but failure to do so may result in undefined data later.
Summary:
1) *Skip the initialization check
2) Look up the memory and its version for the device, allocating it if it doesn't exist
3) *Skip synchronization
4) Increase the memory version and latest_version
TODO: Add an invalidate method: if the caller fails to overwrite the memory, it must call invalidate to return the tensor to an uninitialized state.
impl<T> SharedTensor<T> where Device: Alloc<T> + Synch<T>
fn autosync<H>(&self, dev: &H, tick: bool) -> Result<usize> where H: Has<Device>
Synchronizes memory if necessary.
TODO:
- Choose the best source to copy data from. That would require some additional traits that return costs for transferring data between different backends.
note: Typically, transfers would be between Native <-> GPU in the foreseeable future, so it's best not to over-engineer here.