pub struct TensorRef<'a, T, const N: usize>
where
    T: DeviceRepr + Copy + 'static,
{
    pub data: DeviceSlice<'a, T>,
    pub shape: [i32; N],
    pub stride: [i64; N],
}
Read-only view of a device-resident rank-N tensor.
shape[i] is the extent along axis i. stride[i] is the stride along
axis i, measured in elements rather than bytes: the offset to advance
one step along that axis. A stride of 0 along an axis signals a
broadcast operand: the kernel reads the same memory cell for every
step along that axis. The default row-major contiguous stride for
shape [d0, d1, …, dN-1] is [d1·d2·…·dN-1, …, dN-1, 1].
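The stride rule above can be sketched on plain arrays, without any device types. This is a minimal illustration, not the crate's code: contiguous_strides and element_offset are hypothetical helper names, and extents are assumed to be non-negative.

```rust
/// Hypothetical helper: row-major contiguous strides for `shape`,
/// i.e. [d1*d2*...*dN-1, ..., dN-1, 1], measured in elements.
fn contiguous_strides<const N: usize>(shape: [i32; N]) -> [i64; N] {
    let mut stride = [0i64; N];
    let mut acc: i64 = 1;
    // Walk from the rightmost axis to the leftmost one.
    for i in (0..N).rev() {
        stride[i] = acc;
        acc = acc.saturating_mul(shape[i] as i64);
    }
    stride
}

/// Hypothetical helper: element offset of a multi-index under `stride`.
/// An axis with stride 0 contributes nothing, so every index along that
/// axis maps to the same memory cell (the broadcast case).
fn element_offset<const N: usize>(stride: [i64; N], index: [i64; N]) -> i64 {
    stride.iter().zip(index.iter()).map(|(s, i)| s * i).sum()
}
```

Under these assumptions, shape [2, 3, 4] yields strides [12, 4, 1], and replacing the leading stride with 0 makes indices [0, 2, 3] and [1, 2, 3] resolve to the same offset.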
T is bounded by DeviceRepr + Copy + 'static only (not by
crate::Element / crate::IntElement), so the same view struct can
carry any scalar payload: input, output, mask, or auxiliary buffer.
Per-op element-class enforcement happens at the Plan layer.
Fields§
data: DeviceSlice<'a, T>
Device-resident element storage.
shape: [i32; N]
Extent along each axis (in elements).
stride: [i64; N]
Element stride along each axis. Stride 0 marks a broadcast axis.
Implementations§
impl<'a, T, const N: usize> TensorRef<'a, T, N>
where
    T: DeviceRepr + Copy + 'static,
pub fn numel(&self) -> i64
Total number of logical elements (product of shape).
Returns 1 for the rank-0 (scalar) case. Saturates on overflow
rather than wrapping — a tensor with overflowing element count
cannot fit in CUDA’s int64_t element-count surface anyway, so
reporting i64::MAX is a fine sentinel.
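One way the saturating product described above could be written is shown here as a free function over a shape array. The function name and signature are illustrative assumptions rather than the crate's actual implementation, and extents are assumed to be non-negative.

```rust
/// Hypothetical stand-in for `TensorRef::numel`: product of all extents,
/// saturating at i64::MAX instead of wrapping on overflow.
fn numel_of<const N: usize>(shape: [i32; N]) -> i64 {
    // The fold starts at 1, so an empty shape (rank 0) yields 1.
    shape
        .iter()
        .fold(1i64, |acc, &d| acc.saturating_mul(d as i64))
}
```

With this sketch, a shape of three i32::MAX extents would overflow a wrapping multiply but reports i64::MAX instead.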
pub fn is_contiguous(&self) -> bool
true iff stride matches the standard row-major contiguous
layout (rightmost axis has stride 1; each prior axis multiplies
by the extent to its right).
The rank-0 case is always contiguous. A broadcast tensor
(any stride[i] == 0) is not contiguous.
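The contiguity rule can be sketched as a right-to-left walk that compares each stride against the running product of the extents to its right. This is an assumed, strict reading of the description: the name is_contiguous_layout is hypothetical, and the real method may treat edge cases such as extent-1 axes differently.

```rust
/// Hypothetical stand-in for `TensorRef::is_contiguous`, operating on
/// plain shape and stride arrays.
fn is_contiguous_layout<const N: usize>(shape: [i32; N], stride: [i64; N]) -> bool {
    let mut expected: i64 = 1;
    for i in (0..N).rev() {
        // Per the documented rule, any broadcast axis (stride 0)
        // disqualifies the view from being contiguous.
        if stride[i] == 0 || stride[i] != expected {
            return false;
        }
        expected = expected.saturating_mul(shape[i] as i64);
    }
    // Rank 0 never enters the loop, so it is always contiguous.
    true
}
```

For example, shape [2, 3, 4] with strides [12, 4, 1] passes, while the same shape with strides [0, 4, 1], or a transposed view such as shape [3, 2] with strides [1, 3], does not.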