pub struct Tensor { /* private fields */ }
Tensor represents a multi-dimensional array with lazy evaluation.
Operations like addition and multiplication build a computation graph without allocating buffers. Buffers are only allocated when:
- Creating input tensors via from_slice()
- Evaluating the computation graph via realize()
§Global Graph Substitution
Tensors are registered in a global registry to support atomic graph substitution.
When rangeify transforms a UOp (e.g., NEG → BUFFERIZE(NEG)), all tensors
referencing it are updated atomically via apply_map_to_tensors().
This is critical for diamond patterns (like argmin’s NEG feeding both MAX and EQ) where different consumers must see the same transformed version.
§Buffer Ownership (RAII)
Tensors own their buffers via Arc<Buffer>. When all Tensor clones referencing
a buffer are dropped, the buffer is automatically freed. This provides RAII
cleanup without manual buffer management.
§Examples
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let b = Tensor::from_slice(&[4.0f32, 5.0, 6.0]);
let mut c = &a + &b; // Lazy - only builds UOp graph
c.realize().unwrap(); // Executes the computation
§Implementations
impl Tensor
pub fn relu(&self) -> Result<Self>
Rectified Linear Unit: max(0, x).
ReLU is one of the most common activation functions in deep learning. It’s simple, efficient, and helps mitigate the vanishing gradient problem.
§Examples
let x = Tensor::from_slice(&[-2.0f32, -1.0, 0.0, 1.0, 2.0]);
let y = x.relu()?;
// y = [0.0, 0.0, 0.0, 1.0, 2.0]
pub fn softmax(&self, axis: impl Into<AxisSpec>) -> Result<Self>
Softmax activation: exp(x - max(x)) / sum(exp(x - max(x))).
Converts logits to a probability distribution over the specified axis. The implementation is numerically stable because it subtracts the maximum before exponentiating.
§Arguments
- axis: Axis along which to compute softmax (default: -1, last axis)
§Examples
let logits = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]);
let probs = logits.softmax(-1)?;
// sum(probs) = 1.0, probs[i] > 0 for all i
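The max-subtraction formula can be illustrated without the tensor API. The sketch below is plain Rust over a flat slice standing in for one axis; the function name `softmax` is used here only for illustration and is not this crate's implementation.

```rust
/// Numerically stable softmax over a flat slice (one axis of a tensor).
fn softmax(logits: &[f32]) -> Vec<f32> {
    // Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    // Dividing by the sum makes the outputs add up to 1.
    exps.iter().map(|&e| e / sum).collect()
}
```

Without the subtraction, inputs such as 1000.0 would overflow `exp` in f32 and yield NaN after the division.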
pub fn log_softmax(&self, axis: impl Into<AxisSpec>) -> Result<Self>
Log-softmax activation: log(softmax(x)).
Numerically stable implementation: x - max(x) - log(sum(exp(x - max(x)))).
More numerically stable than computing log(softmax(x)) separately.
§Arguments
- axis: Axis along which to compute log-softmax (default: -1, last axis)
§Examples
let logits = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]);
let log_probs = logits.log_softmax(-1)?;
// More numerically stable than logits.softmax(-1)?.try_log()
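To see why the fused formula matters, here is a plain-Rust sketch of x - max(x) - log(sum(exp(x - max(x)))) over a flat slice. It is an illustration under that formula only, not the crate's code; `log_softmax` here is a free function written for this example.

```rust
/// Log-softmax over a flat slice using the fused, stable formula.
fn log_softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // The sum contains exp(0) = 1 for the maximum element, so ln() never sees 0.
    let log_sum = logits.iter().map(|&x| (x - max).exp()).sum::<f32>().ln();
    logits.iter().map(|&x| x - max - log_sum).collect()
}
```

For inputs like [0.0, 1000.0], a separate softmax followed by a logarithm would underflow the first probability to 0 and produce negative infinity, while the fused form returns roughly -1000.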
pub fn gelu_exact(&self) -> Result<Self>
Exact GELU: 0.5 * x * (1 + erf(x / sqrt(2))).
pub fn hard_sigmoid(&self, alpha: f64, beta: f64) -> Result<Self>
Hard Sigmoid: clamp(alpha * x + beta, 0, 1).
Piecewise linear approximation of sigmoid. Faster to compute.
§Arguments
- alpha: Slope (default 0.2 in ONNX)
- beta: Offset (default 0.5 in ONNX)
pub fn leaky_relu(&self, alpha: f64) -> Result<Self>
Leaky ReLU: x if x > 0, alpha * x otherwise.
§Arguments
- alpha: Negative slope (default 0.01 in ONNX)
pub fn prelu(&self, slope: &Tensor) -> Result<Self>
PReLU: x if x > 0, slope * x otherwise.
Like LeakyReLU but with a learned per-channel slope.
pub fn thresholded_relu(&self, alpha: f64) -> Result<Self>
pub fn elu(&self, alpha: f64) -> Result<Self>
ELU: x if x > 0, alpha * (exp(x) - 1) otherwise.
§Arguments
- alpha: Scale for negative part (default 1.0 in ONNX)
pub fn selu(&self, alpha: f64, gamma: f64) -> Result<Self>
SELU: gamma * (alpha * exp(x) - alpha) if x <= 0, gamma * x if x > 0.
Self-normalizing activation with fixed constants.
§Arguments
- alpha: Default 1.6732632…
- gamma: Default 1.0507010…
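A scalar sketch of the SELU formula follows. The constants are the standard published SELU values; that they are what the truncated defaults above refer to is an assumption, and `SELU_ALPHA`, `SELU_GAMMA` and the free function `selu` are names introduced only for this example.

```rust
// Standard SELU constants (assumed to match the truncated defaults above).
const SELU_ALPHA: f64 = 1.6732632423543772;
const SELU_GAMMA: f64 = 1.0507009873554805;

/// SELU for one element: gamma * x for x > 0, gamma * (alpha * exp(x) - alpha) otherwise.
fn selu(x: f64, alpha: f64, gamma: f64) -> f64 {
    if x > 0.0 {
        gamma * x
    } else {
        gamma * (alpha * x.exp() - alpha)
    }
}
```

For very negative inputs the output saturates at -alpha * gamma, about -1.758 with these constants.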
pub fn glu(&self, dim: isize) -> Result<Self>
Gated Linear Unit: splits self along dim into two halves,
returns first_half * sigmoid(second_half).
pub fn softplus(&self, beta: f64) -> Result<Self>
Softplus: log(1 + exp(beta*x)) / beta, numerically stable via logaddexp.
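One common way to evaluate log(1 + exp(z)) stably is the logaddexp identity max(z, 0) + ln(1 + exp(-|z|)). The scalar sketch below uses that identity; whether the crate uses exactly this rearrangement is an assumption, and `softplus` is a free function written for this example.

```rust
/// Softplus for one element, using logaddexp(0, beta * x) / beta.
fn softplus(x: f64, beta: f64) -> f64 {
    let z = beta * x;
    // exp(-|z|) is always in (0, 1], so it cannot overflow for large |z|.
    (z.max(0.0) + (-z.abs()).exp().ln_1p()) / beta
}
```

The direct formula would overflow `exp` for large positive z; this form returns approximately z in that regime and approximately 0 for large negative z.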
pub fn celu(&self, alpha: f64) -> Result<Self>
CELU: max(0, x) + min(0, alpha*(exp(x/alpha)-1)).
pub fn batchnorm<'f1, 'f2, 'f3, 'f4, 'f5>( &'f1 self, ) -> TensorBatchnormBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Batch Normalization.
Applies: y = scale * (x - mean) * invstd + bias
where invstd = 1 / sqrt(var + epsilon)
This is the inference mode batchnorm (no running stats update). The caller provides pre-computed mean and inverse standard deviation.
§Arguments
- scale: Gamma/weight parameter (optional, defaults to 1)
- bias: Beta parameter (optional, defaults to 0)
- mean: Running mean
- invstd: Inverse standard deviation (1 / sqrt(var + eps))
- axis: Axis/axes to normalize over (default: 1 for NCHW)
§Examples
let x = Tensor::randn(&[8, 4, 16, 16]);
let mean = x.mean(AxisSpec::Multiple(vec![0, 2, 3]))?;
let var = x.var(AxisSpec::Multiple(vec![0, 2, 3]))?;
let eps = Tensor::from_slice([1e-5]);
let invstd = var.try_add(&eps)?.try_rsqrt()?;
let normalized = x.batchnorm().mean(&mean).invstd(&invstd).call()?;
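The arithmetic behind the inference-mode formula can be checked on a single channel in plain Rust. This is a sketch of y = scale * (x - mean) * invstd + bias only; `inv_std` and `batchnorm_channel` are hypothetical helpers for this example, not part of the crate.

```rust
/// invstd = 1 / sqrt(var + eps), computed once per channel by the caller.
fn inv_std(var: f32, eps: f32) -> f32 {
    1.0 / (var + eps).sqrt()
}

/// Applies y = scale * (x - mean) * invstd + bias to every value of one channel.
fn batchnorm_channel(x: &[f32], mean: f32, invstd: f32, scale: f32, bias: f32) -> Vec<f32> {
    x.iter().map(|&v| scale * (v - mean) * invstd + bias).collect()
}
```

With scale = 1 and bias = 0 the channel [1, 2, 3] (mean 2, population variance 2/3) maps to roughly [-1.22, 0, 1.22].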
impl Tensor
pub fn try_add(&self, other: &Tensor) -> Result<Tensor>
pub fn try_sub(&self, other: &Tensor) -> Result<Tensor>
pub fn try_mul(&self, other: &Tensor) -> Result<Tensor>
pub fn try_div(&self, other: &Tensor) -> Result<Tensor>
pub fn try_mod(&self, other: &Tensor) -> Result<Tensor>
pub fn try_pow(&self, other: &Tensor) -> Result<Tensor>
pub fn try_eq(&self, other: &Tensor) -> Result<Tensor>
pub fn try_ne(&self, other: &Tensor) -> Result<Tensor>
pub fn try_lt(&self, other: &Tensor) -> Result<Tensor>
pub fn try_le(&self, other: &Tensor) -> Result<Tensor>
pub fn try_gt(&self, other: &Tensor) -> Result<Tensor>
pub fn try_ge(&self, other: &Tensor) -> Result<Tensor>
pub fn try_bitor(&self, other: &Tensor) -> Result<Tensor>
pub fn try_bitand(&self, other: &Tensor) -> Result<Tensor>
pub fn try_bitxor(&self, other: &Tensor) -> Result<Tensor>
pub fn try_shl(&self, other: &Tensor) -> Result<Tensor>
pub fn try_shr(&self, other: &Tensor) -> Result<Tensor>
pub fn try_neg(&self) -> Result<Tensor>
pub fn try_abs(&self) -> Result<Tensor>
pub fn try_sqrt(&self) -> Result<Tensor>
pub fn try_rsqrt(&self) -> Result<Tensor>
pub fn try_exp(&self) -> Result<Tensor>
pub fn try_exp2(&self) -> Result<Tensor>
pub fn try_log(&self) -> Result<Tensor>
pub fn try_log2(&self) -> Result<Tensor>
pub fn logical_not(&self) -> Result<Tensor>
Logical NOT for boolean tensors.
Converts to boolean dtype and applies logical negation. For non-boolean tensors, treats zero as false, non-zero as true.
§Examples
let t = Tensor::from_slice(&[true, false, true]);
let result = t.logical_not()?; // [false, true, false]
let nums = Tensor::from_slice(&[0.0f32, 1.0, 2.0]);
let result = nums.logical_not()?; // [true, false, false]
pub fn bitwise_not(&self) -> Result<Tensor>
impl Tensor
pub fn bitwise_and(&self, other: &Tensor) -> Result<Tensor>
Bitwise AND operation.
Performs element-wise bitwise AND between two tensors with broadcasting. Both tensors must have integer or boolean dtype.
pub fn bitwise_or(&self, other: &Tensor) -> Result<Tensor>
Bitwise OR operation.
Performs element-wise bitwise OR between two tensors with broadcasting. Both tensors must have integer or boolean dtype.
pub fn bitwise_xor(&self, other: &Tensor) -> Result<Tensor>
Bitwise XOR operation.
Performs element-wise bitwise XOR between two tensors with broadcasting. Both tensors must have integer or boolean dtype.
impl Tensor
pub fn broadcast_to(&self, target_shape: &Shape) -> Result<Tensor>
Broadcast tensor to a target shape.
This is the low-level broadcast operation that reshapes (adds explicit 1 dimensions) and then expands (replicates data along size-1 dimensions).
§Algorithm
- If shape already matches, return self
- Pad shape with 1s on the left to match rank
- Reshape to add explicit 1 dimensions
- Expand size-1 dimensions to target size
§Examples
// [3] -> [2, 3]
let t = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let target = vec![SInt::from(2), SInt::from(3)];
let broadcasted = t.broadcast_to(&target)?;
§Errors
Returns error if:
- Shape has more dimensions than target
- Dimension sizes are incompatible (not 1 and not equal to target)
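The padding and compatibility checks described above can be sketched on plain shape vectors. This is an illustration of the algorithm on `usize` dimensions, not the crate's code (which works on symbolic `SInt` shapes); `reshape_for_broadcast` is a hypothetical helper that returns the intermediate shape used by the reshape step.

```rust
/// Left-pads `shape` with 1s to the rank of `target` and checks that every
/// dimension is either 1 or equal to the target dimension.
fn reshape_for_broadcast(shape: &[usize], target: &[usize]) -> Result<Vec<usize>, String> {
    if shape.len() > target.len() {
        return Err(format!("rank {} exceeds target rank {}", shape.len(), target.len()));
    }
    // Pad on the left so trailing dimensions line up.
    let mut padded = vec![1; target.len() - shape.len()];
    padded.extend_from_slice(shape);
    for (i, (&s, &t)) in padded.iter().zip(target.iter()).enumerate() {
        if s != 1 && s != t {
            return Err(format!("dimension {}: {} cannot broadcast to {}", i, s, t));
        }
    }
    // The expand step then replicates each size-1 dimension up to the target size.
    Ok(padded)
}
```

For [3] broadcast to [2, 3] this yields the intermediate shape [1, 3]; both error cases listed above map to the two `Err` branches.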
impl Tensor
pub fn where_(&self, condition: &Tensor, other: &Tensor) -> Result<Self>
Element-wise conditional selection: condition ? self : other.
For each element, returns self[i] if condition[i] is true, else other[i].
§Arguments
- condition: Boolean tensor (dtype should be Bool or will be treated as boolean)
- other: Alternative value tensor
§Shape Requirements
All three tensors (self, condition, other) must be broadcastable to the same shape.
§Examples
let x = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]);
let condition = &x.gt(&Tensor::from_slice(&[2.0f32]))?; // [false, false, true, true]
let zeros = Tensor::from_slice(&[0.0f32]);
// Keep values > 2.0, replace everything else with 0
let result = x.where_(condition, &zeros)?;
// result = [0.0, 0.0, 3.0, 4.0]
pub fn maximum(&self, other: &Tensor) -> Result<Self>
Element-wise maximum: max(self, other).
Returns the element-wise maximum of two tensors. This is NOT a reduction - it returns a tensor of the same shape.
§Shape Requirements
Both tensors must be broadcastable to the same shape.
§Examples
let a = Tensor::from_slice(&[1.0f32, 5.0, 3.0]);
let b = Tensor::from_slice(&[2.0f32, 3.0, 4.0]);
let result = a.maximum(&b)?;
// result = [2.0, 5.0, 4.0]
pub fn minimum(&self, other: &Tensor) -> Result<Self>
Element-wise minimum: min(self, other).
Returns the element-wise minimum of two tensors. This is NOT a reduction - it returns a tensor of the same shape.
§Shape Requirements
Both tensors must be broadcastable to the same shape.
§Examples
let a = Tensor::from_slice(&[1.0f32, 5.0, 3.0]);
let b = Tensor::from_slice(&[2.0f32, 3.0, 4.0]);
let result = a.minimum(&b)?;
// result = [1.0, 3.0, 3.0]
pub fn clamp<'f1, 'f2, 'f3>(&'f1 self) -> TensorClampBuilder<'f1, 'f2, 'f3>
Clamp values to a range: max(min_val, min(self, max_val)).
Constrains all elements to be within [min_val, max_val].
§Examples
let x = Tensor::from_slice(&[-1.0f32, 0.0, 1.0, 2.0, 3.0]);
let min = Tensor::from_slice(&[0.0f32, 0.0, 0.0, 0.0, 0.0]);
let max = Tensor::from_slice(&[2.0f32, 2.0, 2.0, 2.0, 2.0]);
// Clamp to [0, 2]
let result = x.clamp().min(&min).max(&max).call()?;
// result = [0.0, 0.0, 1.0, 2.0, 2.0]
// Clamp only lower bound
let result = x.clamp().min(&min).call()?;
// result = [0.0, 0.0, 1.0, 2.0, 3.0]
// Clamp only upper bound
let result = x.clamp().max(&max).call()?;
// result = [-1.0, 0.0, 1.0, 2.0, 2.0]
pub fn clip<'f1, 'f2, 'f3>(&'f1 self) -> TensorClipBuilder<'f1, 'f2, 'f3>
Alias for clamp (matches NumPy/PyTorch naming).
§Examples
let x = Tensor::from_slice(&[-1.0f32, 0.0, 1.0, 2.0, 3.0]);
let min = Tensor::from_slice(&[0.0f32, 0.0, 0.0, 0.0, 0.0]);
let max = Tensor::from_slice(&[2.0f32, 2.0, 2.0, 2.0, 2.0]);
// Clip to [0, 2]
let result = x.clip().min(&min).max(&max).call()?;
impl Tensor
pub fn from_slice<T: HasDType, C: AsRef<[T]>>(source: C) -> Self
Create tensor from slice on CPU (default device).
§Examples
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
pub fn from_slice_with<T: HasDType, C: AsRef<[T]>>() -> TensorFromSliceWithBuilder<T, C>
Create tensor from slice with explicit device specification using builder pattern.
impl Tensor
pub fn from_raw_bytes( data: &[u8], shape: &[usize], dtype: DType, ) -> Result<Self>
Create tensor from raw bytes with explicit dtype and shape.
The bytes are interpreted as little-endian values of the given dtype.
Length must equal product(shape) * dtype.bytes().
Used for types without a native Rust representation (Float16, BFloat16, FP8).
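The length rule and the little-endian interpretation can be illustrated with f32, which does have a native Rust type. The sketch below is plain Rust and does not call the crate; `decode_f32_le` is a hypothetical helper that mirrors the check "length must equal product(shape) * dtype.bytes()" with a 4-byte element.

```rust
/// Decodes little-endian f32 values after validating the byte length against the shape.
fn decode_f32_le(data: &[u8], shape: &[usize]) -> Result<Vec<f32>, String> {
    // An empty shape has product 1, i.e. a scalar.
    let numel: usize = shape.iter().product();
    let expected = numel * std::mem::size_of::<f32>();
    if data.len() != expected {
        return Err(format!("expected {} bytes, got {}", expected, data.len()));
    }
    Ok(data
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect())
}
```

For 16-bit or 8-bit dtypes the same check applies with an element size of 2 or 1 bytes; only the decoding step differs.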
pub fn from_ndarray<T, S, D>(array: &ArrayBase<S, D>) -> Self
Create tensor from an ndarray (owned Array or ArrayView).
When the array is already C-contiguous, uses the backing slice directly
(no intermediate allocation). Falls back to .iter().cloned().collect()
for Fortran-order or non-contiguous layouts.
§Examples
let t = Tensor::from_ndarray(&array![[1.0f32, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let view = t.array_view::<f32>().unwrap();
assert_eq!(view[[1, 2]], 6.0);
pub fn buffer(&self) -> Option<Buffer>
Get a reference to the underlying buffer.
Returns None for lazy tensors that haven’t been realized yet.
Returns Some(buffer) for input tensors and realized tensors.
pub fn as_ndarray<T: HasDType + Default + Clone>(&self) -> Result<ArrayD<T>>
Read realized tensor data as an ndarray.
The tensor must have a buffer (from from_slice, realize(), etc.).
Returns error if the tensor has not been realized.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let result = t.as_ndarray::<f32>().unwrap();
assert_eq!(result.shape(), &[3]);
pub fn as_vec<T: HasDType + Default + Clone>(&self) -> Result<Vec<T>>
Read realized tensor data as a flat Vec<T>.
The tensor must have a buffer (from from_slice, realize(), etc.).
Returns error if the tensor has not been realized.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let v = t.as_vec::<f32>().unwrap();
assert_eq!(v, vec![1.0, 2.0, 3.0]);
pub fn array_view<T: HasDType>(&self) -> Result<ArrayViewD<'_, T>>
Typed immutable view into the buffer, shaped by the tensor’s logical shape.
Uses the tensor’s concrete shape for multidimensional indexing. Falls back to the buffer’s flat shape for symbolic tensors.
§Examples
let t = Tensor::from_ndarray(&array![[1.0f32, 2.0], [3.0, 4.0]]);
let view = t.array_view::<f32>().unwrap();
assert_eq!(view[[0, 1]], 2.0);
pub fn array_view_mut<T: HasDType>(&self) -> Result<ArrayViewMutD<'_, T>>
Typed mutable view into the buffer, shaped by the tensor’s logical shape.
§Examples
let t = Tensor::from_ndarray(&array![[0.0f32, 0.0, 0.0], [0.0, 0.0, 0.0]]);
t.array_view_mut::<f32>().unwrap()[[1, 2]] = 42.0;
assert_eq!(t.array_view::<f32>().unwrap()[[1, 2]], 42.0);
impl Tensor
pub fn gather(&self, dim: isize, index: &Tensor) -> Result<Self>
Gather values along an axis specified by dim, using index for element selection.
pub fn index_select(&self, dim: isize, index: &Tensor) -> Result<Self>
Select elements along dim using a 1D index tensor.
For input shape [A, B, C] with dim=1 and index shape [K],
returns shape [A, K, C].
pub fn one_hot_along_dim( &self, num_classes: usize, dim: isize, ) -> Result<Tensor>
One-hot encoding: self == arange(num_classes) broadcast along dim. Returns a boolean tensor with True at the class positions.
pub fn normalize_negative_indices(&self, dim_size: i64) -> Result<Tensor>
Normalize negative indices: indices[i] = indices[i] < 0 ? indices[i] + dim_size : indices[i]
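The same rule, written element by element on a plain slice. This is a sketch of the formula only; the free function below is not the tensor method and, like the formula, performs no bounds checking.

```rust
/// Maps each negative index i to i + dim_size and leaves non-negative indices unchanged.
fn normalize_negative_indices(indices: &[i64], dim_size: i64) -> Vec<i64> {
    indices
        .iter()
        .map(|&i| if i < 0 { i + dim_size } else { i })
        .collect()
}
```

With dim_size = 3, the index -1 refers to the last element (2) and -3 to the first (0).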
pub fn scatter( &self, dim: isize, index: &Tensor, src: &Tensor, ) -> Result<Tensor>
Scatter values along dim using index positions.
For each position in index, places the corresponding src value into self at the specified index along dim. When multiple indices map to the same position, the last value wins (matching PyTorch/Tinygrad semantics).
pub fn scatter_reduce( &self, dim: isize, index: &Tensor, src: &Tensor, reduce: ScatterReduction, include_self: bool, ) -> Result<Tensor>
Scatter with reduction. Applies reduce (sum/prod/amax/amin) at scatter positions.
pub fn masked_select(&self, mask: &Tensor) -> Result<Tensor>
Select elements where mask is true, returning a flat tensor.
Requires realize() internally (data-dependent output size).
pub fn compress( &self, condition: &[bool], axis: Option<isize>, ) -> Result<Tensor>
Select elements along an axis where condition is true.
If axis is None, the input is flattened first and selection is along axis 0.
The condition is a 1D boolean slice; true entries select.
pub fn sort(&self, dim: isize, descending: bool) -> Result<(Tensor, Tensor)>
Bitonic sort along a dimension. Returns (sorted_values, indices).
pub fn topk( &self, k: usize, dim: isize, largest: bool, ) -> Result<(Tensor, Tensor)>
Top-k elements along a dimension. Returns (values, indices).
pub fn nonzero(&self) -> Result<Tensor>
Indices of non-zero elements. Returns [num_nonzero, ndim] tensor.
pub fn reverse_sequence( &self, sequence_lens: &Tensor, time_axis: usize, batch_axis: usize, ) -> Result<Self>
Reverse the first sequence_lens[i] elements along time_axis for each
batch element i along batch_axis, leaving the rest unchanged.
pub fn gather_nd(&self, indices: &Tensor, batch_dims: usize) -> Result<Tensor>
Gather values using N-dimensional indices.
impl Tensor
pub fn round(&self) -> Result<Tensor>
Round function: round to nearest integer (half to even).
Rounds each element to the nearest integer. Ties are rounded to the nearest even number. For integer dtypes, returns the tensor unchanged.
§Examples
let t = Tensor::from_slice(&[1.2f32, 1.5, 2.5, -1.5]);
let result = t.round()?; // [1.0, 2.0, 2.0, -2.0]
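Half-to-even rounding differs from Rust's `f32::round`, which rounds ties away from zero. A scalar sketch of the tie rule follows; `round_half_even` is a helper written for this example and says nothing about how the crate lowers the operation.

```rust
/// Rounds to the nearest integer, resolving exact .5 ties toward the even neighbor.
fn round_half_even(x: f32) -> f32 {
    let floor = x.floor();
    let diff = x - floor;
    if diff < 0.5 {
        floor
    } else if diff > 0.5 {
        floor + 1.0
    } else if floor % 2.0 == 0.0 {
        // Exact tie and the lower neighbor is even: keep it.
        floor
    } else {
        // Exact tie and the lower neighbor is odd: the upper neighbor is even.
        floor + 1.0
    }
}
```

Applied to the example above, 1.5 and 2.5 both round to 2.0, and -1.5 rounds to -2.0.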
pub fn reciprocal(&self) -> Result<Tensor>
pub fn lerp(&self, end: &Tensor, weight: &Tensor) -> Result<Tensor>
Linear interpolation: self + (end - self) * weight.
pub fn isinf( &self, detect_positive: bool, detect_negative: bool, ) -> Result<Tensor>
Returns true where elements are infinite.
Detects ±∞ via bitcast to the corresponding unsigned integer type and a
bit-pattern compare. Operating in integer space sidesteps Svod’s float
range analysis, which folds x == ±inf to false because dtype_bounds
returns finite ±max for floats. Tinygrad gets away with the float compare
because its dtype.min/max are ±inf.
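The bit-pattern comparison can be shown for f32 in plain Rust. In IEEE 754 binary32, positive infinity is 0x7F800000 and negative infinity is 0xFF800000; the sketch below compares against those patterns in integer space. `is_inf_bits` is a hypothetical scalar helper, not the tensor method.

```rust
/// Detects infinities by comparing the raw IEEE 754 bits instead of the float value.
fn is_inf_bits(x: f32, detect_positive: bool, detect_negative: bool) -> bool {
    const POS_INF: u32 = 0x7f80_0000; // sign 0, exponent all ones, mantissa 0
    const NEG_INF: u32 = 0xff80_0000; // sign 1, exponent all ones, mantissa 0
    let bits = x.to_bits();
    (detect_positive && bits == POS_INF) || (detect_negative && bits == NEG_INF)
}
```

NaN shares the all-ones exponent but has a nonzero mantissa, so it never matches either pattern.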
pub fn asin(&self) -> Result<Tensor>
Arcsine using polynomial approximation (Abramowitz & Stegun 4.4.46).
impl Tensor
pub fn dot(&self, other: &Tensor) -> Result<Tensor>
Dot product / matrix multiplication.
Core method following Tinygrad’s API:
- 1D @ 1D: dot product (scalar)
- 2D @ 2D: matrix multiplication
- 1D @ 2D: vector @ matrix
- 2D @ 1D: matrix @ vector
- 3D+: batched matmul (batch dims broadcast)
§Arguments
- other: Right-hand tensor
§Examples
// Vector dot product
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let b = Tensor::from_slice(&[4.0f32, 5.0, 6.0]);
let result = a.dot(&b)?; // scalar: 32.0
// Matrix multiplication
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]).try_reshape(&[2, 2])?;
let b = Tensor::from_slice(&[5.0f32, 6.0, 7.0, 8.0]).try_reshape(&[2, 2])?;
let result = a.dot(&b)?; // [2, 2]
impl Tensor
pub fn matmul_with<'f1, 'f2>(&'f1 self) -> TensorMatmulWithBuilder<'f1, 'f2>
pub fn gemm<'f1, 'f2, 'f3>(&'f1 self) -> TensorGemmBuilder<'f1, 'f2, 'f3>
General Matrix Multiplication: alpha * A @ B + beta * C
pub fn linear<'f1, 'f2, 'f3>(&'f1 self) -> TensorLinearBuilder<'f1, 'f2, 'f3>
Linear transformation: self @ weight.T + bias.
Common operation in neural networks (fully connected layers).
Follows PyTorch convention where weight has shape [out_features, in_features]
and is transposed before multiplication.
§Arguments
- weight: Weight matrix (shape: [out_features, in_features])
- bias: Optional bias vector (shape: [out_features])
§Shape Requirements
- self: [..., in_features]
- weight: [out_features, in_features]
- bias: [out_features] or None
- result: [..., out_features]
§Examples
let input = Tensor::from_slice(&[1.0f32, 2.0, 3.0]).try_reshape(&[1, 3])?;
let weight = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0]).try_reshape(&[2, 3])?;
let bias = Tensor::from_slice(&[0.1f32, 0.2f32]);
let result = input.linear().weight(&weight).bias(&bias).call()?;
// result shape: [1, 2]
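The values behind that [1, 2] result can be worked out with a reference implementation on plain vectors. This sketch handles a single input row and stores the weight as [out_features][in_features], so "self @ weight.T" becomes one dot product per weight row; the free function `linear` is written for this example only.

```rust
/// y[o] = sum_i input[i] * weight[o][i] + bias[o], for one input row.
fn linear(input: &[f32], weight: &[Vec<f32>], bias: Option<&[f32]>) -> Vec<f32> {
    weight
        .iter()
        .enumerate()
        .map(|(o, row)| {
            // Each weight row has in_features entries, matching the input length.
            let dot: f32 = row.iter().zip(input.iter()).map(|(w, x)| w * x).sum();
            dot + bias.map_or(0.0, |b| b[o])
        })
        .collect()
}
```

For the inputs of the example above this gives approximately [14.1, 32.2]: 1*1 + 2*2 + 3*3 + 0.1 and 1*4 + 2*5 + 3*6 + 0.2.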
impl Tensor
pub fn conv2d<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>( &'f1 self, ) -> TensorConv2dBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>
N-d convolution. Input (N, Cin, *spatial), Weight (Cout, Cin/groups, *kernel).
Computes cross-correlation (conv without kernel flip) by extracting sliding
windows via pool, then contracting against the weight tensor.
Supports grouped convolution, strided/dilated kernels, and asymmetric padding.
§Examples
Basic 2D convolution with uniform data:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 5, 5), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv2d().weight(&w).call().unwrap();
y.realize().unwrap();
// 3x3 kernel of ones on input of ones => each output element is 9.0
assert_eq!(y.as_vec::<f32>().unwrap(), vec![9.0; 9]);
With stride:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 5, 5), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv2d().weight(&w).stride(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![9.0; 4]);
With padding:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
// padding=1 on each side: output matches input spatial dims
let mut y = x.conv2d().weight(&w).padding(&[(1, 1), (1, 1)]).call().unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 9); // 3x3 output
// Center element sees full 3x3 window of ones = 9.0
assert_eq!(vals[4], 9.0);
// Corner element sees 2x2 window = 4.0
assert_eq!(vals[0], 4.0);
With bias:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let b = Tensor::from_slice([10.0f32]);
let mut y = x.conv2d().weight(&w).bias(&b).call().unwrap();
y.realize().unwrap();
// Each output element: 9.0 + 10.0 = 19.0
assert_eq!(y.as_vec::<f32>().unwrap(), vec![19.0]);
pub fn conv_transpose2d<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>( &'f1 self, ) -> TensorConvTranspose2dBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>
Transposed convolution (fractionally-strided convolution).
Computes the gradient of a forward convolution, commonly used for upsampling.
Internally flips the kernel, interleaves zeros for stride > 1, computes
transposed padding, then delegates to conv2d.
Input (N, Cin, *spatial), Weight (Cin, Cout/groups, *kernel).
§Examples
Basic transposed convolution (upsampling):
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv_transpose2d().weight(&w).call().unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 16); // 4x4 output
// Interior elements receive contributions from all four input positions
assert_eq!(vals[5], 4.0);
With stride (stronger upsampling):
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv_transpose2d().weight(&w).stride(&[2, 2]).call().unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 25); // 5x5 output
With padding and output padding:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv_transpose2d()
.weight(&w)
.stride(&[2, 2])
.padding(&[(1, 1), (1, 1)])
.output_padding(&[1, 1])
.call()
.unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 16); // 4x4 output
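The output sizes in the three examples are consistent with the usual transposed-convolution size formula for dilation 1: out = (in - 1) * stride - pad_begin - pad_end + kernel + output_padding. That the crate computes sizes exactly this way is an assumption; `conv_transpose_out_dim` is a hypothetical helper for one spatial axis.

```rust
/// Output length of a transposed convolution along one spatial axis (dilation 1).
fn conv_transpose_out_dim(
    input: usize,
    kernel: usize,
    stride: usize,
    pad: (usize, usize),
    output_padding: usize,
) -> usize {
    // Additions first so the unsigned subtraction cannot underflow for valid configurations.
    (input - 1) * stride + kernel + output_padding - pad.0 - pad.1
}
```

For a 2x2 input and a 3x3 kernel this gives 4 per axis with stride 1, 5 with stride 2, and 4 with stride 2, padding (1, 1) and output padding 1, matching the 16, 25 and 16 elements asserted above.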
impl Tensor
pub fn affine_grid<'f1, 'f2>() -> TensorAffineGridBuilder<'f1, 'f2>
Generate an affine sampling grid from transformation parameters.
Produces a grid of normalized coordinates suitable for grid_sample.
theta holds affine matrices of shape [N, spatial_dims, spatial_dims+1].
size is the target output shape [N, C, *spatial_dims].
§Examples
Identity transform producing a 4x4 grid:
let theta = Tensor::from_ndarray(&array![[[1.0f32, 0.0, 0.0], [0.0, 1.0, 0.0]]]);
let grid = Tensor::affine_grid().theta(&theta).size(&[1, 1, 4, 4]).call().unwrap();
let shape: Vec<usize> = grid.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 4, 4, 2]); // [N, H, W, 2]
With align_corners:
let theta = Tensor::from_ndarray(&array![[[1.0f32, 0.0, 0.0], [0.0, 1.0, 0.0]]]);
let grid = Tensor::affine_grid()
.theta(&theta)
.size(&[1, 1, 4, 4])
.align_corners(true)
.call()
.unwrap();
let shape: Vec<usize> = grid.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 4, 4, 2]);
pub fn grid_sample<'f1, 'f2>(&'f1 self) -> TensorGridSampleBuilder<'f1, 'f2>
Sample input at positions specified by a coordinate grid.
- self: Input tensor [N, C, *spatial_dims]
- grid: Coordinate grid [N, *output_spatial_dims, n_spatial] with values in [-1, 1]
- Returns: [N, C, *output_spatial_dims]
§Examples
Sample with a grid from affine_grid:
let theta = Tensor::from_ndarray(&array![[[1.0f32, 0.0, 0.0], [0.0, 1.0, 0.0]]]);
let grid = Tensor::affine_grid().theta(&theta).size(&[1, 1, 4, 4]).call().unwrap();
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let y = x.grid_sample().grid(&grid).call().unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
With nearest-mode interpolation:
let theta = Tensor::from_ndarray(&array![[[1.0f32, 0.0, 0.0], [0.0, 1.0, 0.0]]]);
let grid = Tensor::affine_grid().theta(&theta).size(&[1, 1, 4, 4]).call().unwrap();
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let y = x.grid_sample()
.grid(&grid)
.mode(GridSampleMode::Nearest)
.call()
.unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
impl Tensor
pub fn layernorm(&self, axis: isize, eps: f64) -> Result<Tensor>
Layer normalization over axes [axis..ndim). Casts to f32 internally
for numerical stability.
Normalizes the input so that the slice along the specified trailing axes has zero mean and unit variance, then returns the result cast back to the original dtype.
§Examples
let x = Tensor::from_ndarray(&array![[1.0f32, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let mut y = x.layernorm(-1, 1e-5).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
// Each row is independently normalized to mean~0, std~1
assert!((vals[0] + vals[1] + vals[2]).abs() < 1e-5);
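What "zero mean and unit variance" means for one normalized slice can be checked in plain Rust. The sketch below uses the population variance and adds eps inside the square root, which is the common layer-norm convention and an assumption about this crate; `layernorm_row` is a hypothetical helper.

```rust
/// Normalizes one slice to zero mean and (approximately) unit variance.
fn layernorm_row(row: &[f32], eps: f32) -> Vec<f32> {
    let n = row.len() as f32;
    let mean = row.iter().sum::<f32>() / n;
    // Population variance: mean of squared deviations.
    let var = row.iter().map(|&v| (v - mean) * (v - mean)).sum::<f32>() / n;
    let inv_std = 1.0 / (var + eps).sqrt();
    row.iter().map(|&v| (v - mean) * inv_std).collect()
}
```

For [1, 2, 3] the mean is 2 and the variance 2/3, so the output is roughly [-1.22, 0, 1.22].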
pub fn layernorm_with_stats( &self, axis: isize, eps: f64, ) -> Result<(Tensor, Tensor, Tensor)>
Layer normalization returning (normalized, mean, inv_std_dev).
Computes in f32 for numerical stability (matches ONNX stash_type=1).
The mean and inv_std_dev tensors remain in f32 regardless of input dtype.
§Examples
let x = Tensor::from_ndarray(&array![[1.0f32, 2.0, 3.0]]);
let (_normed, mut mean, _inv_std) = x.layernorm_with_stats(-1, 1e-5).unwrap();
mean.realize().unwrap();
let mean_val = mean.as_vec::<f32>().unwrap();
assert!((mean_val[0] - 2.0).abs() < 1e-5);
pub fn rms_norm(&self, axis: isize, eps: f64) -> Result<Tensor>
RMS normalization over axes [axis..ndim).
Like layernorm but without mean subtraction: divides each element by the root-mean-square of its slice. Computes the normalization factor in f32, then multiplies the original (unconverted) input.
§Examples
let x = Tensor::from_ndarray(&array![[1.0f32, 2.0, 3.0]]);
let mut y = x.rms_norm(-1, 1e-5).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
// RMS of [1,2,3] = sqrt((1+4+9)/3) ≈ 2.16
// Output ≈ [0.46, 0.93, 1.39]
assert!((vals[0] - 1.0 / (14.0f32 / 3.0).sqrt()).abs() < 1e-4);
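The arithmetic in the comments above can be reproduced on a plain slice. The sketch assumes eps is added to the mean of squares before the square root, a common convention that this page does not confirm; `rms_norm_row` is a hypothetical helper.

```rust
/// Divides each element by sqrt(mean(x^2) + eps); no mean subtraction.
fn rms_norm_row(row: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = row.iter().map(|&v| v * v).sum::<f32>() / row.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    row.iter().map(|&v| v * inv_rms).collect()
}
```

For [1, 2, 3] the mean of squares is 14/3, so the output is roughly [0.46, 0.93, 1.39]. Unlike layer normalization, the result does not have zero mean.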
pub fn lp_normalize(&self, axis: isize, p: i64) -> Result<Tensor>
Lp normalization along an axis.
Divides each element by the Lp norm of its slice along axis,
so that every such slice has unit Lp norm. Only p=1 (L1) and
p=2 (L2) are implemented; any other value of p is treated as p=2.
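A minimal plain-Rust sketch of this behavior on one slice, including the fallback to L2 for any p other than 1. The function name is hypothetical, and the sketch does not guard against a zero norm.

```rust
/// Divide a slice by its L1 norm when p == 1, otherwise by its L2 norm.
fn lp_normalize_slice(xs: &[f32], p: i64) -> Vec<f32> {
    let norm = if p == 1 {
        xs.iter().map(|x| x.abs()).sum::<f32>()
    } else {
        // Every other p falls back to the Euclidean norm.
        xs.iter().map(|x| x * x).sum::<f32>().sqrt()
    };
    xs.iter().map(|x| x / norm).collect()
}
```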
§Examples
L2 normalization (default p=2):
let x = Tensor::from_ndarray(&array![[3.0f32, 4.0]]);
let mut y = x.lp_normalize(-1, 2).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
// L2 norm of [3,4] = 5, so output ≈ [0.6, 0.8]
assert!((vals[0] - 0.6).abs() < 1e-5);
assert!((vals[1] - 0.8).abs() < 1e-5);
L1 normalization (p=1):
let x = Tensor::from_ndarray(&array![[3.0f32, 4.0]]);
let mut y = x.lp_normalize(-1, 1).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
// L1 norm of [3,4] = 7, so output ≈ [3/7, 4/7]
assert!((vals[0] - 3.0 / 7.0).abs() < 1e-5);
pub fn mean_variance_normalize(&self, axes: &[isize], eps: f64) -> Result<Tensor>
Mean Variance Normalization.
Subtracts the mean and divides by the population standard deviation
(plus eps) over the given axes. Implements the ONNX
MeanVarianceNormalization operator.
§Examples
let x = Tensor::from_ndarray(&array![[1.0f32, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let mut y = x.mean_variance_normalize(&[0, 1], 1e-5).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
// Global mean = 3.5, std ≈ 1.708
assert!((vals[0] - (1.0 - 3.5) / (35.0f32 / 12.0).sqrt()).abs() < 1e-4);
assert!(vals[0] < 0.0);
assert!(vals[5] > 0.0);
pub fn group_norm<'f1, 'f2, 'f3>(&'f1 self) -> TensorGroupNormBuilder<'f1, 'f2, 'f3>
Group normalization: reshape into groups, layernorm each group, then apply per-channel scale and bias.
Input must be at least 2-D with shape [N, C, ...]. Channels are split
into num_groups groups and each group is independently normalized.
Casts to f32 internally for numerical stability.
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 4, 2, 2), 1.0f32));
let scale = Tensor::from_slice([1.0f32; 4]);
let bias = Tensor::from_slice([0.0f32; 4]);
let y = x.group_norm().scale(&scale).bias(&bias).num_groups(2).call().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 4, 2, 2]);
Custom epsilon:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 4, 2, 2), 1.0f32));
let scale = Tensor::from_slice([1.0f32; 4]);
let bias = Tensor::from_slice([0.0f32; 4]);
let y = x.group_norm().scale(&scale).bias(&bias).num_groups(2).eps(1e-6).call().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 4, 2, 2]);
impl Tensor
pub fn try_pad_value(&self, padding: &[(isize, isize)], value: f64) -> Result<Tensor>
Pad with a custom fill value. Delegates to try_pad when value == 0.0.
Each element of padding is (before, after) for the corresponding dimension.
Non-zero fill is implemented via an additive mask to avoid nested WHERE conditions.
§Examples
Zero padding (delegates to try_pad):
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.try_pad_value(&[(1, 1)], 0.0).unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![0.0, 1.0, 2.0, 3.0, 0.0]);
Negative-infinity padding (useful for max pooling):
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.try_pad_value(&[(1, 0)], f64::NEG_INFINITY).unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![f32::NEG_INFINITY, 1.0, 2.0, 3.0]);
pub fn pad_with<'f1, 'f2>(&'f1 self) -> TensorPadWithBuilder<'f1, 'f2>
Pad with configurable mode and fill value.
Supports four padding modes via PadMode:
- Constant (default): fill with value (default 0.0)
- Replicate: repeat boundary values
- Reflect: mirror without repeating boundary
- Circular: wrap around
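The four modes can be illustrated in one dimension with a small index-mapping sketch in plain Rust. The enum Mode and the function pad_1d are hypothetical stand-ins, not the crate's PadMode or its implementation. The sketch assumes a non-empty input, and at least two elements for Reflect.

```rust
#[derive(Clone, Copy)]
enum Mode {
    Constant(f32),
    Replicate,
    Reflect,
    Circular,
}

/// Pad a 1-D slice with `before` leading and `after` trailing elements.
fn pad_1d(xs: &[f32], before: usize, after: usize, mode: Mode) -> Vec<f32> {
    let n = xs.len() as isize;
    let total = before + xs.len() + after;
    (0..total as isize)
        .map(|i| {
            // Index into the unpadded input; out of range inside the padding.
            let j = i - before as isize;
            if (0..n).contains(&j) {
                return xs[j as usize];
            }
            match mode {
                Mode::Constant(v) => v,
                Mode::Replicate => xs[j.clamp(0, n - 1) as usize],
                Mode::Reflect => {
                    // Mirror about the edges without repeating them.
                    let period = 2 * (n - 1);
                    let m = j.rem_euclid(period);
                    xs[(if m < n { m } else { period - m }) as usize]
                }
                Mode::Circular => xs[j.rem_euclid(n) as usize],
            }
        })
        .collect()
}
```

With input [1.0, 2.0, 3.0] and padding (2, 2), this mapping reproduces the Replicate, Reflect, and Circular results shown in the examples below.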
§Examples
Constant padding (default mode):
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.pad_with().padding(&[(1, 1)]).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![0.0, 1.0, 2.0, 3.0, 0.0]);
Constant padding with a custom fill value:
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.pad_with().padding(&[(1, 1)]).value(-f64::INFINITY).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![f32::NEG_INFINITY, 1.0, 2.0, 3.0, f32::NEG_INFINITY]);
Replicate (edge) padding:
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.pad_with().padding(&[(2, 2)]).mode(PadMode::Replicate).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0, 1.0, 1.0, 2.0, 3.0, 3.0, 3.0]);
Reflect padding:
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.pad_with().padding(&[(2, 2)]).mode(PadMode::Reflect).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![3.0, 2.0, 1.0, 2.0, 3.0, 2.0, 1.0]);
Circular (wrap) padding:
let x = Tensor::from_slice([1.0f32, 2.0, 3.0]);
let mut y = x.pad_with().padding(&[(2, 2)]).mode(PadMode::Circular).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0]);
impl Tensor
pub fn pool(&self, kernel: &[usize], stride: &[usize], dilation: &[usize]) -> Result<Tensor>
Sliding window extraction via shape manipulation (Tinygrad’s _pool).
Input: (..., *spatial) → Output: (..., *out_spatial, *kernel).
This is a low-level building block for pooling and convolution. It extracts all sliding windows of the given kernel size, stride, and dilation from the spatial dimensions, appending the kernel dimensions at the end.
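To make the window extraction concrete, here is a one-dimensional plain-Rust sketch that materializes each window. The real method works through shape manipulation rather than copying; the function names are hypothetical and kernel is assumed to be at least 1.

```rust
/// Number of windows that fit along one axis (no padding, floor division).
fn pool_out_len(len: usize, kernel: usize, stride: usize, dilation: usize) -> usize {
    // Input elements spanned by one dilated window.
    let span = dilation * (kernel - 1) + 1;
    if len < span { 0 } else { (len - span) / stride + 1 }
}

/// Extract every window: result[o][k] = xs[o * stride + k * dilation].
fn windows_1d(xs: &[f32], kernel: usize, stride: usize, dilation: usize) -> Vec<Vec<f32>> {
    (0..pool_out_len(xs.len(), kernel, stride, dilation))
        .map(|o| (0..kernel).map(|k| xs[o * stride + k * dilation]).collect())
        .collect()
}
```

The outer dimension of the result plays the role of out_spatial and the inner one the role of kernel in the output layout described above.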
impl Tensor
pub fn avg_pool2d<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorAvgPool2dBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Average pooling over spatial dimensions.
Computes the mean of each sliding window. Supports padding, dilation,
count_include_pad (whether padded zeros count in the denominator),
and ceil_mode (round output size up instead of down).
Stride defaults to kernel_size when not specified.
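The effect of count_include_pad is easiest to see in one dimension. The following plain-Rust sketch uses symmetric zero padding and floor-mode output size; the function name is hypothetical, and it assumes every window overlaps at least one real element.

```rust
/// 1-D average pooling with symmetric zero padding (floor mode).
fn avg_pool_1d(
    xs: &[f32],
    kernel: usize,
    stride: usize,
    pad: usize,
    count_include_pad: bool,
) -> Vec<f32> {
    let padded_len = xs.len() + 2 * pad;
    let out_len = (padded_len - kernel) / stride + 1;
    (0..out_len)
        .map(|o| {
            let start = o * stride; // window start in padded coordinates
            let mut sum = 0.0;
            let mut valid = 0usize;
            for p in start..start + kernel {
                if p >= pad && p < pad + xs.len() {
                    sum += xs[p - pad];
                    valid += 1;
                }
            }
            // Padded zeros enter the denominator only when requested.
            let denom = if count_include_pad { kernel } else { valid };
            sum / denom as f32
        })
        .collect()
}
```

For input [1.0, 1.0] with kernel 2, stride 1, and padding 1, excluding the padding gives [1.0, 1.0, 1.0], while including it gives [0.5, 1.0, 0.5].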
§Examples
Basic 2x2 average pooling:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.avg_pool2d().kernel_size(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 4]);
With explicit stride:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let y = x.avg_pool2d().kernel_size(&[2, 2]).stride(&[1, 1]).call().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 3, 3]);
With padding and count_include_pad disabled:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let mut y = x.avg_pool2d()
.kernel_size(&[2, 2])
.stride(&[1, 1])
.padding(&[(1, 1), (1, 1)])
.count_include_pad(false)
.call()
.unwrap();
y.realize().unwrap();
// With count_include_pad=false, only non-padded elements count in the average
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 9]);
pub fn max_pool2d<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorMaxPool2dBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Max pooling over spatial dimensions.
Returns the maximum value in each sliding window. Padded positions are
filled with -inf (float) or i64::MIN (integer) so they never win.
Stride defaults to kernel_size when not specified.
§Examples
Basic 2x2 max pooling:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.max_pool2d().kernel_size(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 4]);
With stride and padding:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.max_pool2d()
.kernel_size(&[3, 3])
.stride(&[1, 1])
.padding(&[(1, 1), (1, 1)])
.call()
.unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 16]);
pub fn max_pool2d_with_indices<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorMaxPool2dWithIndicesBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Max pooling returning both values and flat indices.
Returns (values, indices) where indices are flat offsets into the
input spatial dimensions. Indices can be passed to
max_unpool2d to invert the operation.
Uses a reverse-arange trick (from Tinygrad) to compute first-occurrence indices without explicit argmax.
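One way to read the reverse-arange trick is sketched below for a single window in plain Rust. This is an interpretation, not the crate's code: positions are weighted by a descending range, so the maximum weighted match identifies the earliest occurrence of the maximum using only elementwise operations and a max reduction.

```rust
/// Index of the first maximum in a window, using a descending weight ramp.
fn first_argmax(window: &[f32]) -> usize {
    let n = window.len();
    let max = window.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    // Position i gets weight n - i, so earlier positions weigh more.
    let best = window
        .iter()
        .enumerate()
        .map(|(i, &v)| if v == max { n - i } else { 0 })
        .max()
        .unwrap_or(0);
    // Undo the reversal to recover the index.
    n - best
}
```

On a window of equal values the result is index 0, which is consistent with the first-occurrence behavior described above.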
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let (mut values, indices) = x.max_pool2d_with_indices()
.kernel_size(&[2, 2])
.call()
.unwrap();
let _ = indices;
values.realize().unwrap();
let shape: Vec<_> = values.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 2, 2]);
assert_eq!(values.as_vec::<f32>().unwrap(), vec![1.0; 4]);
pub fn max_unpool2d<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>(&'f1 self) -> TensorMaxUnpool2dBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>
Inverse of max pooling: scatter pooled values back to their original positions.
Indices are flat offsets into the inferred output spatial shape (computed
from kernel/stride/padding). When output_size exceeds the inferred shape,
the result is zero-padded to match.
Uses one-hot encoding of indices to scatter values: one_hot(idx) * vals -> sum.
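The one-hot scatter can be written out on flat arrays as a plain-Rust sketch. The function name is hypothetical; each output position sums the values whose index matches it, which is what multiplying by a one-hot mask and reducing amounts to.

```rust
/// Scatter `vals` into a zero-filled buffer of length `out_len`:
/// out[j] = sum over i of (idx[i] == j) * vals[i].
fn unpool_flat(vals: &[f32], idx: &[usize], out_len: usize) -> Vec<f32> {
    (0..out_len)
        .map(|j| {
            vals.iter()
                .zip(idx)
                // One-hot mask: keep v only where the index equals j.
                .map(|(&v, &i)| if i == j { v } else { 0.0 })
                .sum::<f32>()
        })
        .collect()
}
```

Positions that no index points to stay at zero, and duplicate indices accumulate.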
§Examples
Round-trip with max_pool2d_with_indices:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let (values, indices) = x.max_pool2d_with_indices()
.kernel_size(&[2, 2])
.call()
.unwrap();
let unpooled = values.max_unpool2d()
.indices(&indices)
.kernel_size(&[2, 2])
.call()
.unwrap();
let shape: Vec<_> = unpooled.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
pub fn col2im<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>(&'f1 self) -> TensorCol2imBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6>
Col2Im: adjoint of im2col. Reconstructs an image from columns, summing overlaps.
Input shape: [N, C * prod(block_shape), L] where L is the number of sliding positions.
Output shape: [N, C, *image_shape].
Uses the adjoint of pool: for each kernel position, stride-dilate
the column data, pad to the correct offset, and accumulate. O(output_size) memory,
O(bl * output_size) compute – no large one-hot intermediates.
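The accumulation over kernel positions can be sketched in one dimension in plain Rust. The function name and the nested-Vec layout are hypothetical, padding is omitted, and the loop form stands in for the stride-dilate-and-pad steps described above.

```rust
/// 1-D col2im without padding. cols[k][l] is element k of the block at
/// sliding position l; overlapping contributions are summed.
fn col2im_1d(cols: &[Vec<f32>], image_len: usize, stride: usize, dilation: usize) -> Vec<f32> {
    let mut image = vec![0.0f32; image_len];
    for (k, row) in cols.iter().enumerate() {
        for (l, &v) in row.iter().enumerate() {
            // Kernel offset k at position l lands on this image element.
            image[l * stride + k * dilation] += v;
        }
    }
    image
}
```

With a block of 2 and stride 2 the blocks tile the image without overlap; with stride 1 the middle element receives two contributions.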
§Examples
Reconstruct a 4x4 image from 2x2 blocks with no overlap:
// 1 batch, 1 channel, 2x2 block = 4 cols, 4 sliding positions
let cols = Tensor::from_ndarray(&Array3::from_elem((1, 4, 4), 1.0f32));
let mut img = cols.col2im()
.image_shape(&[4, 4])
.block_shape(&[2, 2])
.strides(&[2, 2])
.call()
.unwrap();
img.realize().unwrap();
let shape: Vec<_> = img.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
// Non-overlapping blocks of ones reconstruct to all ones
assert_eq!(img.as_vec::<f32>().unwrap(), vec![1.0; 16]);
impl Tensor
pub fn clamp_cast(&self, dtype: DType) -> Result<Self>
Clamp to the representable range of dtype, then cast.
Values outside the target type’s range are saturated to its min/max before casting, preventing overflow wrap-around.
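The difference between saturating and wrapping is shown below in plain Rust for a u8 target. The function names are hypothetical. Rust's own float-to-integer `as` cast already saturates, so the explicit clamp only spells the step out; how the crate rounds fractional values is not specified here, and this sketch truncates toward zero.

```rust
/// Saturate to the u8 range, then cast (fractions truncate toward zero).
fn clamp_cast_u8(xs: &[f32]) -> Vec<u8> {
    xs.iter()
        .map(|&x| x.clamp(u8::MIN as f32, u8::MAX as f32) as u8)
        .collect()
}

/// For contrast: an integer-to-integer cast keeps only the low 8 bits.
fn wrapping_cast_u8(x: i32) -> u8 {
    x as u8
}
```

Here 300 saturates to 255 under clamp_cast_u8, whereas wrapping_cast_u8(300) yields 44.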
§Examples
let x = Tensor::from_slice([300.0f32, -10.0, 128.0]);
let mut y = x.clamp_cast(DType::UInt8).unwrap();
y.realize().unwrap();
let vals = y.as_vec::<u8>().unwrap();
assert_eq!(vals, vec![255, 0, 128]);
pub fn qlinear_conv<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9, 'f10, 'f11, 'f12, 'f13>(&'f1 self) -> TensorQlinearConvBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9, 'f10, 'f11, 'f12, 'f13>
Quantized convolution: zero-point–adjust inputs, convolve in int32, rescale and requantize to the output dtype.
Implements the ONNX QLinearConv operator. The flow is:
- Subtract zero points from input and weights
- Perform integer convolution
- Rescale by (x_scale * w_scale) / y_scale and add y_zero_point
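The three steps can be traced on scalars with a plain-Rust sketch for u8 data. The function names are hypothetical, a dot product stands in for the convolution, and the rounding mode is an assumption: f32::round rounds halves away from zero, which may differ from what the crate does.

```rust
/// Steps 1 and 2: subtract zero points, then accumulate products in i32.
fn qdot(x: &[u8], x_zp: u8, w: &[u8], w_zp: u8) -> i32 {
    x.iter()
        .zip(w)
        .map(|(&a, &b)| (a as i32 - x_zp as i32) * (b as i32 - w_zp as i32))
        .sum()
}

/// Step 3: rescale the accumulator, add the output zero point, saturate.
fn requantize(acc: i32, x_scale: f32, w_scale: f32, y_scale: f32, y_zero_point: u8) -> u8 {
    let scaled = acc as f32 * (x_scale * w_scale) / y_scale;
    // Assumption: round half away from zero.
    let shifted = scaled.round() + y_zero_point as f32;
    shifted.clamp(0.0, 255.0) as u8
}
```

With all scales at 0.1 and zero points at 128, the quantized inputs 138 and 148 represent 1.0 and 2.0; their product 2.0 requantizes to 148.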
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 128u8));
let x_scale = Tensor::from_slice([0.1f32]);
let x_zp = Tensor::from_slice([128u8]);
let weight = Tensor::from_ndarray(&Array4::from_elem((1, 1, 1, 1), 128u8));
let w_scale = Tensor::from_slice([0.1f32]);
let w_zp = Tensor::from_slice([128u8]);
let y_scale = Tensor::from_slice([0.1f32]);
let y_zp = Tensor::from_slice([128u8]);
let y = x.qlinear_conv()
.x_scale(&x_scale).x_zero_point(&x_zp)
.weight(&weight).w_scale(&w_scale).w_zero_point(&w_zp)
.y_scale(&y_scale).y_zero_point(&y_zp)
.call()
.unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 3, 3]);
pub fn conv_integer<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9>(&'f1 self) -> TensorConvIntegerBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9>
Integer convolution: zero-point–adjust inputs and convolve in int32. No rescaling — returns raw int32 result.
Implements the ONNX ConvInteger operator. Subtracts optional zero points
from input and weights, then convolves in int32. Unlike qlinear_conv,
no output rescaling is applied.
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 10u8));
let weight = Tensor::from_ndarray(&Array4::from_elem((1, 1, 1, 1), 1u8));
let y = x.conv_integer().weight(&weight).call().unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 3, 3]);
pub fn qlinear_matmul<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8>(&'f1 self) -> TensorQlinearMatmulBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8>
Quantized matrix multiplication: zero-point–adjust inputs, matmul in int32, rescale and requantize to the output dtype.
Implements the ONNX QLinearMatMul operator. The flow is:
- Subtract zero points from both inputs
- Perform integer matrix multiplication
- Rescale by (a_scale * b_scale) / y_scale and add y_zero_point
§Examples
let a = Tensor::from_ndarray(&Array2::from_elem((2, 3), 128u8));
let a_scale = Tensor::from_slice([0.1f32]);
let a_zp = Tensor::from_slice([128u8]);
let b = Tensor::from_ndarray(&Array2::from_elem((3, 4), 128u8));
let b_scale = Tensor::from_slice([0.1f32]);
let b_zp = Tensor::from_slice([128u8]);
let y_scale = Tensor::from_slice([0.1f32]);
let y_zp = Tensor::from_slice([128u8]);
let y = a.qlinear_matmul()
.a_scale(&a_scale).a_zero_point(&a_zp)
.b(&b).b_scale(&b_scale).b_zero_point(&b_zp)
.y_scale(&y_scale).y_zero_point(&y_zp)
.call()
.unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![2, 4]);
impl Tensor
pub fn resize<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorResizeBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Resize a tensor using interpolation (ONNX Resize operator).
Supports nearest, linear, and cubic interpolation modes with various
coordinate transformation modes. Either scales or sizes must be
provided to specify the target dimensions.
§Examples
Nearest-mode 2x upscale via scales:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let mut y = x.resize().scales(&[1.0, 1.0, 2.0, 2.0]).call().unwrap();
y.realize().unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
assert!(y.as_vec::<f32>().unwrap().iter().all(|&v| (v - 1.0).abs() < 1e-5));
Resize to explicit output sizes:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let mut y = x.resize().sizes(&[1, 1, 6, 6]).call().unwrap();
y.realize().unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 6, 6]);
assert!(y.as_vec::<f32>().unwrap().iter().all(|&v| (v - 1.0).abs() < 1e-5));
Linear interpolation mode:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let mut y = x.resize()
.scales(&[1.0, 1.0, 2.0, 2.0])
.mode(ResizeMode::Linear)
.call()
.unwrap();
y.realize().unwrap();
let shape: Vec<usize> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, vec![1, 1, 4, 4]);
assert!(y.as_vec::<f32>().unwrap().iter().all(|&v| (v - 1.0).abs() < 1e-5));
impl Tensor
pub fn rnn<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorRnnBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Simple RNN (Elman network).
H_t = tanh(X_t @ W^T + H_{t-1} @ R^T + Wb + Rb)
- x: input [seq_length, batch_size, input_size] (layout=0) or [batch_size, seq_length, input_size] (layout=1)
- w: input weights [num_directions, hidden_size, input_size]
- r: recurrence weights [num_directions, hidden_size, hidden_size]
- bias: optional bias [num_directions, 2 * hidden_size] (Wb ++ Rb)
- initial_h: optional initial hidden state [num_directions, batch_size, hidden_size]
- layout: 0 = seq-first (default), 1 = batch-first
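A single time step of the recurrence H_t = tanh(X_t @ W^T + H_{t-1} @ R^T + Wb + Rb) for one batch element and one direction can be sketched in plain Rust. The helper names are hypothetical and the weights are stored as rows, so multiplying by W^T becomes a matrix-vector product.

```rust
/// y = M * v for a matrix stored as a list of rows.
fn matvec(m: &[Vec<f32>], v: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

/// One Elman RNN step. w is [hidden, input], r is [hidden, hidden].
fn rnn_step(
    x: &[f32],
    h_prev: &[f32],
    w: &[Vec<f32>],
    r: &[Vec<f32>],
    wb: &[f32],
    rb: &[f32],
) -> Vec<f32> {
    let wx = matvec(w, x);
    let rh = matvec(r, h_prev);
    (0..h_prev.len())
        .map(|i| (wx[i] + rh[i] + wb[i] + rb[i]).tanh())
        .collect()
}
```

Calling rnn_step once per sequence element and feeding each result back in as h_prev produces the hidden-state sequence.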
§Examples
// seq=2, batch=1, input=3
let x = Tensor::from_ndarray(&Array3::from_elem((2, 1, 3), 0.1f32));
let w = Tensor::from_ndarray(&Array3::from_elem((1, 4, 3), 0.1f32)); // [1, hidden=4, input=3]
let r = Tensor::from_ndarray(&Array3::from_elem((1, 4, 4), 0.1f32)); // [1, hidden=4, hidden=4]
let out = x.rnn().w(&w).r(&r).hidden_size(4).call().unwrap();
let y_shape: Vec<usize> = out.y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(y_shape, vec![2, 1, 1, 4]); // [seq, num_directions, batch, hidden]
let yh_shape: Vec<usize> = out.y_h.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(yh_shape, vec![1, 1, 4]); // [num_directions, batch, hidden]
pub fn gru<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorGruBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
GRU (Gated Recurrent Unit).
Gate order: [z, r, h] (update, reset, hidden).
Equations (default, linear_before_reset=0):
- z = sigmoid(X @ W_z^T + H @ R_z^T + w_bz + r_bz)
- r = sigmoid(X @ W_r^T + H @ R_r^T + w_br + r_br)
- h = tanh(X @ W_h^T + (r * H) @ R_h^T + w_bh + r_bh)
- H_new = (1 - z) * h + z * H_prev
When linear_before_reset=1:
- h = tanh(X @ W_h^T + r * (H @ R_h^T + r_bh) + w_bh)

- x: input [seq_length, batch_size, input_size] (layout=0) or [batch_size, seq_length, input_size] (layout=1)
- w: input weights [num_directions, 3*hidden_size, input_size]
- r_weights: recurrence weights [num_directions, 3*hidden_size, hidden_size]
- bias: optional [num_directions, 6*hidden_size] (Wb ++ Rb)
- initial_h: optional [num_directions, batch_size, hidden_size]
- linear_before_reset: 0 (default) or 1
- layout: 0 = seq-first (default), 1 = batch-first
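The GRU equations above, including both linear_before_reset variants, can be sketched for one time step and one batch element in plain Rust. The Gate struct and function names are hypothetical; each Gate holds the slice of W, R, and the two biases belonging to one of the z, r, h gates.

```rust
fn matvec(m: &[Vec<f32>], v: &[f32]) -> Vec<f32> {
    m.iter()
        .map(|row| row.iter().zip(v).map(|(a, b)| a * b).sum::<f32>())
        .collect()
}

fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Per-gate parameters: w is [hidden, input], r is [hidden, hidden].
struct Gate {
    w: Vec<Vec<f32>>,
    r: Vec<Vec<f32>>,
    wb: Vec<f32>,
    rb: Vec<f32>,
}

fn gru_step(
    x: &[f32],
    h_prev: &[f32],
    z: &Gate,
    r: &Gate,
    h: &Gate,
    linear_before_reset: bool,
) -> Vec<f32> {
    let n = h_prev.len();
    let gate = |g: &Gate| -> Vec<f32> {
        let (wx, rh) = (matvec(&g.w, x), matvec(&g.r, h_prev));
        (0..n).map(|i| sigmoid(wx[i] + rh[i] + g.wb[i] + g.rb[i])).collect()
    };
    let (zt, rt) = (gate(z), gate(r));
    let wx = matvec(&h.w, x);
    let cand: Vec<f32> = if linear_before_reset {
        // Reset gate applied after the recurrent linear map and its bias.
        let rh = matvec(&h.r, h_prev);
        (0..n).map(|i| (wx[i] + rt[i] * (rh[i] + h.rb[i]) + h.wb[i]).tanh()).collect()
    } else {
        // Reset gate applied to the hidden state before the linear map.
        let gated: Vec<f32> = (0..n).map(|i| rt[i] * h_prev[i]).collect();
        let rh = matvec(&h.r, &gated);
        (0..n).map(|i| (wx[i] + rh[i] + h.wb[i] + h.rb[i]).tanh()).collect()
    };
    (0..n).map(|i| (1.0 - zt[i]) * cand[i] + zt[i] * h_prev[i]).collect()
}
```

The two variants differ only in where the reset gate multiplies, which matters as soon as the recurrent bias r_bh is nonzero.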
§Examples
// seq=2, batch=1, input=3, hidden=4
let x = Tensor::from_ndarray(&Array3::from_elem((2, 1, 3), 0.1f32));
// GRU: w is [num_directions, 3*hidden_size, input_size]
let w = Tensor::from_ndarray(&Array3::from_elem((1, 12, 3), 0.1f32));
// GRU: r is [num_directions, 3*hidden_size, hidden_size]
let r = Tensor::from_ndarray(&Array3::from_elem((1, 12, 4), 0.1f32));
let out = x.gru().w(&w).r_weights(&r).hidden_size(4).call().unwrap();
let y_shape: Vec<usize> = out.y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(y_shape, vec![2, 1, 1, 4]); // [seq, num_directions, batch, hidden]
pub fn lstm<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>(&'f1 self) -> TensorLstmBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>
LSTM (Long Short-Term Memory).
Gate order: [i, o, f, c] (input, output, forget, cell).
- x: input [seq_length, batch_size, input_size] (layout=0) or [batch_size, seq_length, input_size] (layout=1)
- w: input weights [num_directions, 4*hidden_size, input_size]
- r: recurrence weights [num_directions, 4*hidden_size, hidden_size]
- bias: optional [num_directions, 8*hidden_size] (Wb ++ Rb)
- initial_h: optional [num_directions, batch_size, hidden_size]
- initial_c: optional [num_directions, batch_size, hidden_size]
- peepholes: optional [num_directions, 3*hidden_size] (p_i, p_o, p_f)
- layout: 0 = seq-first (default), 1 = batch-first
§Examples
// seq=2, batch=1, input=3, hidden=4
let x = Tensor::from_ndarray(&Array3::from_elem((2, 1, 3), 0.1f32));
// LSTM: w is [num_directions, 4*hidden_size, input_size]
let w = Tensor::from_ndarray(&Array3::from_elem((1, 16, 3), 0.1f32));
// LSTM: r is [num_directions, 4*hidden_size, hidden_size]
let r = Tensor::from_ndarray(&Array3::from_elem((1, 16, 4), 0.1f32));
let out = x.lstm().w(&w).r(&r).hidden_size(4).call().unwrap();
let y_shape: Vec<usize> = out.y.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(y_shape, vec![2, 1, 1, 4]); // [seq, num_directions, batch, hidden]
let yc_shape: Vec<usize> = out.y_c.shape().unwrap().iter()
.map(|d| d.as_const().unwrap()).collect();
assert_eq!(yc_shape, vec![1, 1, 4]); // [num_directions, batch, hidden]
impl Tensor
pub fn space_to_depth(&self, blocksize: usize) -> Result<Tensor>
Rearrange spatial data into depth (inverse of depth_to_space).
Reshapes a [N, C, H, W] tensor to [N, C*b*b, H/b, W/b] where b
is the blocksize. Both H and W must be divisible by blocksize.
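The rearrangement can be written as an explicit index mapping on a flat [C, H, W] buffer (batch omitted). This plain-Rust sketch is not the crate's implementation, and the output channel ordering follows the ONNX SpaceToDepth convention, which is an assumption about this library.

```rust
/// x is a flat [C, H, W] buffer; the result is flat [C*b*b, H/b, W/b].
/// Assumption: output channel = (row offset * b + col offset) * C + c.
fn space_to_depth(x: &[f32], c: usize, h: usize, w: usize, b: usize) -> Vec<f32> {
    assert!(h % b == 0 && w % b == 0, "H and W must be divisible by blocksize");
    let (oh, ow) = (h / b, w / b);
    let mut out = vec![0.0f32; x.len()];
    for ch in 0..c {
        for y in 0..h {
            for xx in 0..w {
                // The position inside the b x b block selects the channel group.
                let oc = ((y % b) * b + (xx % b)) * c + ch;
                out[(oc * oh + y / b) * ow + xx / b] = x[(ch * h + y) * w + xx];
            }
        }
    }
    out
}
```

For a 4x4 single-channel input with blocksize 2, output channel 0 collects the elements at even rows and even columns.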
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.space_to_depth(2).unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 4, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 16]);
pub fn nll_loss<'f1, 'f2, 'f3>(&'f1 self) -> TensorNllLossBuilder<'f1, 'f2, 'f3>
Negative log-likelihood loss.
self is [N, C, ...] log-probabilities, target is [N, ...] class indices
(dtype i64). Gathers the log-prob at the target class and negates it.
Supports optional per-class weight, ignore_index to mask out a class,
and reduction (default Mean).
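The gather-and-negate step with the two reductions used in the examples can be sketched in plain Rust for a 2-D [N, C] input. The Reduction enum here is a local stand-in rather than the crate's type, and per-class weights and ignore_index are left out.

```rust
enum Reduction {
    Mean,
    Sum,
}

/// logprobs is [N][C]; target holds one class index per row.
fn nll_loss(logprobs: &[Vec<f32>], target: &[usize], reduction: Reduction) -> f32 {
    // Pick the log-probability of the target class in each row and negate it.
    let total: f32 = logprobs.iter().zip(target).map(|(row, &t)| -row[t]).sum();
    match reduction {
        Reduction::Sum => total,
        Reduction::Mean => total / target.len() as f32,
    }
}
```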
§Examples
let logprobs = Tensor::from_ndarray(&array![[-0.5f32, -1.0, -2.0]]);
let target = Tensor::from_slice([0i64]);
let mut loss = logprobs.nll_loss().target(&target).call().unwrap();
loss.realize().unwrap();
let val = loss.as_vec::<f32>().unwrap();
// -(-0.5) = 0.5
assert!((val[0] - 0.5).abs() < 1e-5);
With sum reduction:
let logprobs = Tensor::from_ndarray(&array![[-0.5f32, -1.0], [-2.0, -0.3]]);
let target = Tensor::from_slice([0i64, 1]);
let mut loss = logprobs.nll_loss().target(&target).reduction(Reduction::Sum).call().unwrap();
loss.realize().unwrap();
let val = loss.as_vec::<f32>().unwrap();
// sum of 0.5 + 0.3 = 0.8
assert!((val[0] - 0.8).abs() < 1e-5);
pub fn dropout<'f1>(&'f1 self) -> TensorDropoutBuilder<'f1>
Dropout: randomly zeros elements during training, passes through in inference.
Returns (output, mask) where mask is a boolean tensor (true = kept).
In inference mode (training=false, the default), the output is identical
to the input and the mask is all-true.
Note: Training mode is not yet implemented (requires RNG); currently
returns identity regardless of training.
§Examples
let x = Tensor::from_ndarray(&array![1.0f32, 2.0, 3.0]);
let (mut out, mut mask) = x.dropout().p(0.5).call().unwrap();
out.realize().unwrap();
mask.realize().unwrap();
// Default is inference mode: output == input
assert_eq!(out.as_vec::<f32>().unwrap(), vec![1.0, 2.0, 3.0]);
assert_eq!(mask.as_vec::<bool>().unwrap(), vec![true, true, true]);
pub fn conv<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>(&'f1 self) -> TensorConvBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7>
Convolution with ONNX-style parameters.
Wraps the lower-level conv2d after resolving ONNX padding conventions
(auto_pad, flat pads). Input shape is [N, C, H, W, ...] and weight
shape is [out_channels, in_channels/group, kH, kW, ...].
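For checking expected values by hand, a naive single-channel reference in plain Rust is sketched below. It is not how the crate computes convolution; the function name is hypothetical, padding is symmetric, dilation and groups are omitted, and the kernel is applied without flipping (cross-correlation, as in ONNX Conv).

```rust
/// Naive 2-D convolution of one channel with symmetric zero padding.
fn conv2d_single(x: &[Vec<f32>], k: &[Vec<f32>], pad: usize, stride: usize) -> Vec<Vec<f32>> {
    let (h, w) = (x.len() as isize, x[0].len() as isize);
    let (kh, kw) = (k.len(), k[0].len());
    let out_h = (x.len() + 2 * pad - kh) / stride + 1;
    let out_w = (x[0].len() + 2 * pad - kw) / stride + 1;
    let mut out = vec![vec![0.0f32; out_w]; out_h];
    for oy in 0..out_h {
        for ox in 0..out_w {
            for ky in 0..kh {
                for kx in 0..kw {
                    // Input coordinates; negative or too large means padding.
                    let iy = (oy * stride + ky) as isize - pad as isize;
                    let ix = (ox * stride + kx) as isize - pad as isize;
                    if iy >= 0 && iy < h && ix >= 0 && ix < w {
                        out[oy][ox] += x[iy as usize][ix as usize] * k[ky][kx];
                    }
                }
            }
        }
    }
    out
}
```

On a 5x5 input of ones with a 3x3 kernel of ones, padding 1 and stride 2 give corners of 4, edges of 6, and a center of 9, matching the second example below.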
§Examples
Basic convolution with no padding:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 5, 5), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv().weight(&w).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 3, 3]);
// Each output element sums a 3x3 window of ones = 9.0
assert_eq!(y.as_vec::<f32>().unwrap(), vec![9.0; 9]);
With explicit padding and strides:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 5, 5), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv().weight(&w).pads(&[1, 1, 1, 1]).strides(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 3, 3]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![4.0, 6.0, 4.0, 6.0, 9.0, 6.0, 4.0, 6.0, 4.0]);
pub fn conv_transpose<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9>(&'f1 self) -> TensorConvTransposeBuilder<'f1, 'f2, 'f3, 'f4, 'f5, 'f6, 'f7, 'f8, 'f9>
Transposed convolution with ONNX-style parameters.
Wraps conv_transpose2d after resolving ONNX padding conventions.
Supports output_shape and output_padding for precise output size control.
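The output sizes in the examples follow from the scatter form of transposed convolution, sketched here in one dimension in plain Rust. The function name is hypothetical, the input is assumed non-empty, and padding, output_padding, and dilation are left out.

```rust
/// 1-D transposed convolution: each input element scatters a scaled copy
/// of the kernel, and overlapping contributions add up.
fn conv_transpose_1d(x: &[f32], k: &[f32], stride: usize) -> Vec<f32> {
    let out_len = (x.len() - 1) * stride + k.len();
    let mut out = vec![0.0f32; out_len];
    for (i, &xv) in x.iter().enumerate() {
        for (j, &kv) in k.iter().enumerate() {
            out[i * stride + j] += xv * kv;
        }
    }
    out
}
```

Two inputs and a kernel of three give length 4 at stride 1 and length 5 at stride 2, which per axis matches the 4x4 and 5x5 outputs below.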
§Examples
Basic transposed convolution (upsampling):
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv_transpose().weight(&w).call().unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 16); // 4x4 output
assert_eq!(vals[5], 4.0); // center sees full overlap
With stride (larger upsampling factor):
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 2, 2), 1.0f32));
let w = Tensor::from_ndarray(&Array4::from_elem((1, 1, 3, 3), 1.0f32));
let mut y = x.conv_transpose().weight(&w).strides(&[2, 2]).call().unwrap();
y.realize().unwrap();
let vals = y.as_vec::<f32>().unwrap();
assert_eq!(vals.len(), 25); // 5x5 output
pub fn avg_pool<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorAvgPoolBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Average pooling with ONNX-style parameters.
Wraps avg_pool2d after resolving ONNX padding and stride conventions.
Stride defaults to 1 (unlike avg_pool2d which defaults to kernel_size).
Input shape is [N, C, H, W].
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.avg_pool().kernel_shape(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 3, 3]);
// Average of all-ones windows is 1.0
assert!(y.as_vec::<f32>().unwrap().iter().all(|&v| (v - 1.0).abs() < 1e-6));
With strides:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.avg_pool().kernel_shape(&[2, 2]).strides(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 4]);
pub fn lp_pool<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorLpPoolBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Lp norm pooling with ONNX-style parameters.
Computes (sum(|x|^p))^(1/p) over each pooling window. Defaults to
p=2 (L2 pooling). Input shape is [N, C, H, W].
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let mut y = x.lp_pool().kernel_shape(&[2, 2]).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 3, 3]);
// L2 pool of 2x2 window of ones = sqrt(4) = 2.0
assert!((y.as_vec::<f32>().unwrap()[0] - 2.0).abs() < 1e-5);
pub fn depth_to_space<'f1>(&'f1 self) -> TensorDepthToSpaceBuilder<'f1>
Rearrange depth data into spatial blocks (inverse of space_to_depth).
Reshapes a [N, C, H, W] tensor to [N, C/(b*b), H*b, W*b] where b is the blocksize.
With DepthToSpaceMode::Crd the channel ordering matches PyTorch’s F.pixel_shuffle.
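The two channel orderings differ only in which input channel feeds a given output position. The plain-Rust sketch below follows the ONNX DepthToSpace definitions of DCR and CRD on a flat [C, H, W] buffer (batch omitted). The local Mode enum and the assumption that this crate's non-CRD mode is ONNX's DCR are not confirmed by this page.

```rust
#[derive(Clone, Copy)]
enum Mode {
    Dcr, // depth-column-row (ONNX default)
    Crd, // column-row-depth (PyTorch pixel_shuffle order)
}

/// x is a flat [C, H, W] buffer; the result is flat [C/(b*b), H*b, W*b].
fn depth_to_space(x: &[f32], c: usize, h: usize, w: usize, b: usize, mode: Mode) -> Vec<f32> {
    let oc = c / (b * b);
    let (oh, ow) = (h * b, w * b);
    let mut out = vec![0.0f32; x.len()];
    for co in 0..oc {
        for y in 0..oh {
            for xx in 0..ow {
                // (i, j) is the position inside the b x b output block.
                let (i, j) = (y % b, xx % b);
                let ci = match mode {
                    Mode::Dcr => (i * b + j) * oc + co,
                    Mode::Crd => co * b * b + i * b + j,
                };
                out[(co * oh + y) * ow + xx] = x[(ci * h + y / b) * w + xx / b];
            }
        }
    }
    out
}
```

When C equals b*b there is a single output channel and both modes agree, which is why the all-ones examples below cannot tell them apart; with more output channels the orderings diverge.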
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 4, 1, 1), 1.0f32));
let mut y = x.depth_to_space().blocksize(2).call().unwrap();
y.realize().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 2, 2]);
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 4]);
Using CRD mode (PyTorch pixel_shuffle order):
let x = Tensor::from_ndarray(&Array4::from_elem((1, 4, 1, 1), 1.0f32));
let mut y = x.depth_to_space().blocksize(2).mode(DepthToSpaceMode::Crd).call().unwrap();
y.realize().unwrap();
assert_eq!(y.as_vec::<f32>().unwrap(), vec![1.0; 4]);
pub fn max_pool<'f1, 'f2, 'f3, 'f4, 'f5>(&'f1 self) -> TensorMaxPoolBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Max pooling with ONNX-style parameters.
Always returns (values, indices) where indices are flattened positions
(dtype i64). Wraps max_pool2d_with_indices after resolving ONNX
padding conventions.
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let (vals, indices) = x.max_pool().kernel_shape(&[2, 2]).call().unwrap();
let shape: Vec<_> = vals.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 3, 3]);
With strides:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 1, 4, 4), 1.0f32));
let (vals, _) = x.max_pool().kernel_shape(&[2, 2]).strides(&[2, 2]).call().unwrap();
let shape: Vec<_> = vals.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 1, 2, 2]);
Sourcepub fn lrn<'f1>(&'f1 self) -> TensorLrnBuilder<'f1>
pub fn lrn<'f1>(&'f1 self) -> TensorLrnBuilder<'f1>
Local Response Normalization (LRN).
Normalizes each element by dividing by a scaled sum of squares over a
local neighborhood of size channels:
y = x / (bias + alpha * avg_pool(x^2, size))^beta.
Input must be 4-D [N, C, H, W].
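To make the formula concrete, here is a sketch of LRN across the channel axis at a single spatial position, in plain Rust. It assumes the window is centered as in the ONNX LRN operator (floor((size-1)/2) channels before, the rest after) and that channels outside the tensor count as zeros while the average still divides by size. Both points are assumptions about edge handling, and lrn_channels is a hypothetical name.

```rust
/// LRN over the channel values at one (n, h, w) position:
/// y[c] = x[c] / (bias + alpha * avg(x^2 over the window around c))^beta.
/// Assumption: out-of-range channels contribute zero and the average divides by `size`.
fn lrn_channels(x: &[f32], size: usize, alpha: f32, beta: f32, bias: f32) -> Vec<f32> {
    assert!(size >= 1, "window size must be at least 1");
    let c = x.len();
    let before = (size - 1) / 2; // channels included before the current one
    let after = size - 1 - before; // channels included after the current one
    (0..c)
        .map(|i| {
            let lo = i.saturating_sub(before);
            let hi = (i + after).min(c - 1);
            let sq_sum: f32 = x[lo..=hi].iter().map(|v| v * v).sum();
            let avg = sq_sum / size as f32;
            x[i] / (bias + alpha * avg).powf(beta)
        })
        .collect()
}
```

With three channels of ones, size 3, alpha 3.0, beta 1.0 and bias 1.0, the middle channel sees an average of 1 and becomes 0.25, while each edge channel sees an average of 2/3 and becomes 1/3.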
§Examples
let x = Tensor::from_ndarray(&Array4::from_elem((1, 3, 2, 2), 1.0f32));
let y = x.lrn().size(3).call().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 3, 2, 2]);
Custom alpha, beta, and bias:
let x = Tensor::from_ndarray(&Array4::from_elem((1, 3, 2, 2), 1.0f32));
let y = x.lrn().size(3).alpha(0.001).beta(0.5).bias(2.0).call().unwrap();
let shape: Vec<_> = y.shape().unwrap().iter().map(|d| d.as_const().unwrap()).collect();
assert_eq!(shape, [1, 3, 2, 2]);
Source§impl Tensor
impl Tensor
Sourcepub fn sequential(&self, layers: &[&dyn Layer]) -> Result<Tensor>
pub fn sequential(&self, layers: &[&dyn Layer]) -> Result<Tensor>
Apply a sequence of layers to this tensor.
Source§impl Tensor
impl Tensor
Sourcepub fn uniform(shape: &[usize], low: f64, high: f64) -> Result<Tensor>
pub fn uniform(shape: &[usize], low: f64, high: f64) -> Result<Tensor>
Uniform [low, high) random tensor, float32, on the default (CPU) device.
Convenience wrapper around Tensor::uniform_with_dtype with f32 output.
Sourcepub fn uniform_with_dtype(
shape: &[usize],
low: f64,
high: f64,
dtype: DType,
) -> Result<Tensor>
pub fn uniform_with_dtype( shape: &[usize], low: f64, high: f64, dtype: DType, ) -> Result<Tensor>
Uniform [low, high) random tensor with explicit float dtype.
Generates a [0, 1) sample at f32, scales by (high - low), casts
to the target dtype, then adds low. Casting before the offset
keeps the addition honest in low-precision targets (f16/bf16) where
low might otherwise be lost to rounding if applied at f32.
Sourcepub fn randn(shape: &[usize]) -> Result<Tensor>
pub fn randn(shape: &[usize]) -> Result<Tensor>
Standard normal N(0, 1) random tensor (float32, Box-Muller).
Each output element draws from two [0, 1) uniforms via one combined
rand([2, *shape]) call, so the RNG counter advances exactly once per
randn invocation regardless of shape.
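For readers unfamiliar with Box-Muller, the per-element transform can be sketched in plain Rust as below. This illustrates the textbook cosine branch only; whether the crate uses the cosine or sine branch, and how it guards against ln(0), is not stated here, so the 1 - u1 flip is an assumption and box_muller is a hypothetical name.

```rust
use std::f64::consts::PI;

/// One standard-normal sample from two uniforms u1, u2 in [0, 1).
/// Assumption: 1 - u1 is used so the logarithm sees (0, 1] and never ln(0).
fn box_muller(u1: f64, u2: f64) -> f64 {
    let radius = (-2.0 * (1.0 - u1).ln()).sqrt();
    let angle = 2.0 * PI * u2;
    radius * angle.cos()
}
```

Each output element consumes exactly two uniforms, which is why a single rand call over a leading axis of length 2 is enough.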
Sourcepub fn normal(shape: &[usize], mean: f64, std: f64) -> Result<Tensor>
pub fn normal(shape: &[usize], mean: f64, std: f64) -> Result<Tensor>
Normal N(mean, std²) random tensor, where std is the standard deviation. Requires std >= 0.
Sourcepub fn randint(shape: &[usize], low: i64, high: i64) -> Result<Tensor>
pub fn randint(shape: &[usize], low: i64, high: i64) -> Result<Tensor>
Uniform integer tensor [low, high), dtype int32. Requires low < high.
Truncates (high - low) · rand to int32 before adding low.
Casting after the add would truncate-toward-zero asymmetrically for
negative low (e.g. low=-3, rand≈0.005 would yield -2 instead of
the correct -3).
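The ordering issue is easy to reproduce with scalar casts, since a Rust float-to-int `as` cast also truncates toward zero. The two helpers below are hypothetical stand-ins for the per-element arithmetic, not functions of this crate.

```rust
/// Truncate the scaled sample first, then add `low` in integer arithmetic.
fn randint_from_unit(r: f32, low: i64, high: i64) -> i32 {
    let span = (high - low) as f32;
    (span * r) as i32 + low as i32
}

/// Add `low` in floating point, then truncate (the ordering the text warns against).
fn randint_cast_after_add(r: f32, low: i64, high: i64) -> i32 {
    let span = (high - low) as f32;
    (span * r + low as f32) as i32
}
```

With low = -3, high = 3 and r = 0.005, the first helper returns -3, while the second computes -2.97 and truncates it to -2.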
Sourcepub fn scaled_uniform(shape: &[usize]) -> Result<Tensor>
pub fn scaled_uniform(shape: &[usize]) -> Result<Tensor>
uniform(-1, 1) · prod(shape)^(-½). Same dtype contract as uniform.
Sourcepub fn glorot_uniform(shape: &[usize]) -> Result<Tensor>
pub fn glorot_uniform(shape: &[usize]) -> Result<Tensor>
Glorot/Xavier uniform initializer, float32 output.
Sourcepub fn glorot_uniform_with_dtype(
shape: &[usize],
dtype: DType,
) -> Result<Tensor>
pub fn glorot_uniform_with_dtype( shape: &[usize], dtype: DType, ) -> Result<Tensor>
Glorot/Xavier uniform initializer with explicit dtype.
bound = √(6 / (shape[0] + prod(shape[1..]))); uniform(-bound, bound).
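As a quick check of the bound formula, a sketch in plain Rust; glorot_bound is a hypothetical helper and it assumes a non-empty shape.

```rust
/// bound = sqrt(6 / (shape[0] + prod(shape[1..]))).
/// Assumption: `shape` has at least one dimension.
fn glorot_bound(shape: &[usize]) -> f64 {
    let rest: usize = shape[1..].iter().product();
    (6.0 / (shape[0] as f64 + rest as f64)).sqrt()
}
```

For a [3, 3] weight the denominator is 6, so samples are drawn from uniform(-1, 1).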
Sourcepub fn kaiming_uniform(shape: &[usize], a: f64) -> Result<Tensor>
pub fn kaiming_uniform(shape: &[usize], a: f64) -> Result<Tensor>
Kaiming/He uniform initializer for ReLU-family activations, float32 output.
Sourcepub fn kaiming_uniform_with_dtype(
shape: &[usize],
a: f64,
dtype: DType,
) -> Result<Tensor>
pub fn kaiming_uniform_with_dtype( shape: &[usize], a: f64, dtype: DType, ) -> Result<Tensor>
Kaiming/He uniform initializer with explicit dtype.
bound = √(6 / ((1 + a²) · prod(shape[1..]))); uniform(-bound, bound).
a is the negative slope of the activation:
- 0.0: plain ReLU (PyTorch default).
- 0.01: leaky-ReLU with default slope.
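The Kaiming bound can be evaluated the same way. A sketch in plain Rust with a hypothetical helper name, assuming a non-empty shape:

```rust
/// bound = sqrt(6 / ((1 + a^2) * prod(shape[1..]))).
/// Assumption: `shape` has at least one dimension.
fn kaiming_bound(shape: &[usize], a: f64) -> f64 {
    let fan: usize = shape[1..].iter().product();
    (6.0 / ((1.0 + a * a) * fan as f64)).sqrt()
}
```

For a [4, 6] weight, a = 0.0 gives a bound of 1.0, and a larger negative slope shrinks the bound.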
Source§impl Tensor
impl Tensor
Sourcepub fn rand_like_with_dtype(&self, dtype: DType) -> Result<Tensor>
pub fn rand_like_with_dtype(&self, dtype: DType) -> Result<Tensor>
rand_like with a dtype override (device and shape still inherited).
Sourcepub fn rand_like(&self) -> Result<Tensor>
pub fn rand_like(&self) -> Result<Tensor>
Uniform [0, 1) random tensor with the same shape/dtype/device as self.
Sourcepub fn randn_like_with_dtype(&self, dtype: DType) -> Result<Tensor>
pub fn randn_like_with_dtype(&self, dtype: DType) -> Result<Tensor>
randn_like with a dtype override.
Internally generates f32 samples via Box-Muller, then casts to the target dtype. Using f32 inside Box-Muller keeps cos/log/sqrt accurate even when the caller wants low-precision output.
Sourcepub fn randn_like(&self) -> Result<Tensor>
pub fn randn_like(&self) -> Result<Tensor>
Standard normal N(0, 1) random tensor with the same shape/dtype/device as self.
Sourcepub fn randint_like(&self, low: i64, high: i64) -> Result<Tensor>
pub fn randint_like(&self, low: i64, high: i64) -> Result<Tensor>
Uniform integer [low, high) random tensor with the same shape/dtype/device as self.
The underlying Tensor::randint returns Int32; if self’s dtype
differs the result is cast to match (e.g. Int64 template → Int64
result). Requires low < high.
Source§impl Tensor
impl Tensor
Sourcepub fn rand(shape: &[usize]) -> Result<Tensor>
pub fn rand(shape: &[usize]) -> Result<Tensor>
Uniform [0, 1) random tensor with float32 dtype on the default CPU device.
THREEFRY-backed; deterministic for a fixed seed (set via
crate::rand::manual_seed).
Sourcepub fn rand_with(
shape: &[usize],
dtype: DType,
device: DeviceSpec,
) -> Result<Tensor>
pub fn rand_with( shape: &[usize], dtype: DType, device: DeviceSpec, ) -> Result<Tensor>
Variant of Tensor::rand with explicit dtype and device.
Supported dtypes: Float16, BFloat16, Float32, Float64. Integer
dtypes are not supported here — use Tensor::randint instead.
Source§impl Tensor
impl Tensor
Sourcepub fn realize(&mut self) -> Result<()>
pub fn realize(&mut self) -> Result<()>
Realize (execute) this tensor’s computation graph.
This is a convenience method that prepares and executes in one call.
For repeated executions of the same computation, use prepare() instead.
§Pipeline
- Prepare: Creates an ExecutionPlan (compiles kernels, allocates buffers)
- Execute: Runs all kernels in dependency order
- Return: Links output buffer to this tensor’s UOp
§Example
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let b = Tensor::from_slice(&[4.0f32, 5.0, 6.0]);
let mut c = &a + &b; // realize() takes &mut self and returns Result<()>
c.realize()?;
// c's buffer now contains [5.0, 7.0, 9.0]
§Errors
Returns error if preparation or execution fails.
Sourcepub fn realize_with(&mut self, config: &PrepareConfig) -> Result<()>
pub fn realize_with(&mut self, config: &PrepareConfig) -> Result<()>
Realize tensor with custom configuration.
Like realize() but allows specifying optimization strategy
and codegen backend.
§Example
use svod_tensor::PrepareConfig;
use svod_schedule::{OptStrategy, OptimizerConfig};
let mut c = a.matmul(&b)?;
let config = PrepareConfig::from(
    OptimizerConfig::builder()
        .strategy(OptStrategy::Beam { width: 4 })
        .build()
);
c.realize_with(&config)?; // realizes in place; returns Result<()>
Sourcepub fn prepare(&mut self) -> Result<ExecutionPlan>
pub fn prepare(&mut self) -> Result<ExecutionPlan>
Prepare an execution plan for this tensor’s computation graph.
This performs all one-time work:
- Creates schedule from computation graph
- Instantiates strict range-expanded callable schedule items
- Compiles all kernels
- Allocates all buffers
- Builds dependency-ordered prepared op execution plan
The returned ExecutionPlan can then be executed multiple times
without recompilation overhead.
§Example
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let b = Tensor::from_slice(&[4.0f32, 5.0, 6.0]);
let mut c = &a + &b;
// One-time preparation (wires output tensor to plan buffer)
let plan = c.prepare()?;
// Fast execution (can be called many times)
plan.execute()?;
// Get results
let output = plan.output_buffer();
§Errors
Returns error if:
- Rangeify transformation fails
- No kernels found after scheduling
- Kernel compilation fails
- Buffer allocation fails
Sourcepub fn prepare_with(&mut self, config: &PrepareConfig) -> Result<ExecutionPlan>
pub fn prepare_with(&mut self, config: &PrepareConfig) -> Result<ExecutionPlan>
Prepare an execution plan with explicit configuration.
This method allows fine-grained control over kernel optimization settings and codegen backend selection.
§Example
use svod_tensor::PrepareConfig;
use svod_schedule::{OptimizerConfig, OptStrategy, BeamConfig};
// Beam search with width 8 and 120s timeout
let config = PrepareConfig::from(
OptimizerConfig::builder()
.strategy(OptStrategy::Beam { width: 8 })
.beam(BeamConfig::builder()
.timeout_secs(120)
.build())
.build()
);
let plan = tensor.prepare_with(&config)?;
plan.execute()?;
Sourcepub fn realize_batch<'a>(
tensors: impl IntoIterator<Item = &'a mut Tensor>,
) -> Result<()>
pub fn realize_batch<'a>( tensors: impl IntoIterator<Item = &'a mut Tensor>, ) -> Result<()>
Realize multiple tensors in a single batch, sharing computation.
Merges all tensor computation graphs into one SINK, enabling the scheduler
to share kernels across outputs. More efficient than calling realize()
individually when tensors share subgraphs.
Sourcepub fn realize_batch_with<'a>(
tensors: impl IntoIterator<Item = &'a mut Tensor>,
config: &PrepareConfig,
) -> Result<()>
pub fn realize_batch_with<'a>( tensors: impl IntoIterator<Item = &'a mut Tensor>, config: &PrepareConfig, ) -> Result<()>
Realize multiple tensors with custom configuration.
Sourcepub fn prepare_batch<'a>(
tensors: impl IntoIterator<Item = &'a mut Tensor>,
) -> Result<ExecutionPlan>
pub fn prepare_batch<'a>( tensors: impl IntoIterator<Item = &'a mut Tensor>, ) -> Result<ExecutionPlan>
Prepare a batch execution plan for multiple tensors.
Output tensors are wired to plan buffers — after execute/execute_with_vars,
results are readable directly via tensor.as_vec() or tensor.array_view().
Sourcepub fn prepare_batch_with<'a>(
tensors: impl IntoIterator<Item = &'a mut Tensor>,
config: &PrepareConfig,
) -> Result<ExecutionPlan>
pub fn prepare_batch_with<'a>( tensors: impl IntoIterator<Item = &'a mut Tensor>, config: &PrepareConfig, ) -> Result<ExecutionPlan>
Prepare a batch execution plan with custom configuration.
Source§impl Tensor
impl Tensor
Sourcepub fn sum(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn sum(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Sum of tensor elements over given axes.
Auto-promotes accumulation dtype (bool→int32, float16→float32) like Tinygrad.
Use sum_with().promote(false) to preserve input dtype.
Sourcepub fn prod(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn prod(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Product of tensor elements over given axes.
Preserves input dtype. Use prod_with().promote(true) or .dtype(...) for different accumulation.
Sourcepub fn max(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn max(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Maximum of tensor elements over given axes.
Always preserves input dtype.
Sourcepub fn min(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn min(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Minimum of tensor elements over given axes.
Always preserves input dtype.
Sourcepub fn mean(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn mean(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Mean of tensor elements over given axes.
For integer inputs, automatically uses float32 accumulation. For float inputs, preserves input dtype.
Sourcepub fn var(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn var(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Variance of tensor elements over given axes.
Computes unbiased sample variance (divides by N-1). For integer inputs, automatically uses float32 accumulation. For float inputs, preserves input dtype.
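The N-1 divisor (Bessel's correction) can be illustrated on a plain slice. This is a scalar sketch, not the tensor implementation; sample_variance is a hypothetical name, and returning None for fewer than two elements is an assumption made here to avoid dividing by zero.

```rust
/// Unbiased sample variance: sum((x - mean)^2) / (N - 1).
/// Assumption: inputs with fewer than two elements yield None.
fn sample_variance(xs: &[f64]) -> Option<f64> {
    let n = xs.len();
    if n < 2 {
        return None;
    }
    let mean = xs.iter().sum::<f64>() / n as f64;
    let sum_sq: f64 = xs.iter().map(|x| (x - mean).powi(2)).sum();
    Some(sum_sq / (n as f64 - 1.0))
}
```

For [1.0, 2.0, 3.0, 4.0] the mean is 2.5 and the squared deviations sum to 5.0, so the variance is 5/3 and the standard deviation is its square root.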
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]);
let v = t.var(())?; // Variance over all elements
Sourcepub fn std(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn std(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Standard deviation of tensor elements over given axes.
Computes unbiased sample standard deviation (divides by N-1). For integer inputs, automatically uses float32 accumulation. For float inputs, preserves input dtype.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0]);
let s = t.std(())?; // Std dev over all elements
Sourcepub fn sum_with<'f1, I1>(&'f1 self) -> TensorSumWithBuilder<'f1, I1>
pub fn sum_with<'f1, I1>(&'f1 self) -> TensorSumWithBuilder<'f1, I1>
Sourcepub fn prod_with<'f1, I1>(&'f1 self) -> TensorProdWithBuilder<'f1, I1>
pub fn prod_with<'f1, I1>(&'f1 self) -> TensorProdWithBuilder<'f1, I1>
Product with additional options (keepdim, dtype, promote).
Sourcepub fn max_with<'f1, I1>(&'f1 self) -> TensorMaxWithBuilder<'f1, I1>
pub fn max_with<'f1, I1>(&'f1 self) -> TensorMaxWithBuilder<'f1, I1>
Maximum with keepdim option.
Sourcepub fn min_with<'f1, I1>(&'f1 self) -> TensorMinWithBuilder<'f1, I1>
pub fn min_with<'f1, I1>(&'f1 self) -> TensorMinWithBuilder<'f1, I1>
Minimum with keepdim option.
Sourcepub fn mean_with<'f1, I1>(&'f1 self) -> TensorMeanWithBuilder<'f1, I1>
pub fn mean_with<'f1, I1>(&'f1 self) -> TensorMeanWithBuilder<'f1, I1>
Mean with keepdim option.
Sourcepub fn var_with<'f1, I1>(&'f1 self) -> TensorVarWithBuilder<'f1, I1>
pub fn var_with<'f1, I1>(&'f1 self) -> TensorVarWithBuilder<'f1, I1>
Variance with keepdim option.
Sourcepub fn std_with<'f1, I1>(&'f1 self) -> TensorStdWithBuilder<'f1, I1>
pub fn std_with<'f1, I1>(&'f1 self) -> TensorStdWithBuilder<'f1, I1>
Standard deviation with keepdim option.
Sourcepub fn var_mean_with<'f1, I1>(&'f1 self) -> TensorVarMeanWithBuilder<'f1, I1>
pub fn var_mean_with<'f1, I1>(&'f1 self) -> TensorVarMeanWithBuilder<'f1, I1>
Variance and mean with keepdim option.
Sourcepub fn std_mean_with<'f1, I1>(&'f1 self) -> TensorStdMeanWithBuilder<'f1, I1>
pub fn std_mean_with<'f1, I1>(&'f1 self) -> TensorStdMeanWithBuilder<'f1, I1>
Standard deviation and mean with keepdim option.
Source§impl Tensor
impl Tensor
Sourcepub fn argmax(&self, axis: impl Into<Option<isize>>) -> Result<Self>
pub fn argmax(&self, axis: impl Into<Option<isize>>) -> Result<Self>
Index of maximum value along axis.
Returns int32 tensor with indices of maximum values. For ties, returns the index of the first occurrence.
§Arguments
- axis - Axis to reduce (None = flatten first)
§Examples
let t = Tensor::from_slice(&[[1.0, 3.0, 2.0], [4.0, 2.0, 5.0]]);
t.argmax(None)?; // 5 (flattened: max is at index 5)
t.argmax(Some(0))?; // [1, 0, 1] (row indices of max per column)
t.argmax(Some(1))?; // [1, 2] (column indices of max per row)
Sourcepub fn hardmax(&self, axis: isize) -> Result<Self>
pub fn hardmax(&self, axis: isize) -> Result<Self>
Hard maximum: one-hot encoding of the argmax along an axis.
Returns a tensor of the same shape with 1.0 at the position of the
maximum value along axis and 0.0 elsewhere, cast to the input dtype.
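The relationship between argmax and hardmax can be sketched for a single 1-D lane in plain Rust. Both helpers are hypothetical. The tie rule (first occurrence wins) is taken from the argmax description, and applying it to hardmax is an assumption based on hardmax being described as a one-hot encoding of the argmax.

```rust
/// Index of the maximum, keeping the earliest index on ties.
fn argmax_first(xs: &[f32]) -> Option<usize> {
    let mut best: Option<usize> = None;
    for (i, &v) in xs.iter().enumerate() {
        match best {
            // Only a strictly greater value replaces the current best.
            Some(b) if v <= xs[b] => {}
            _ => best = Some(i),
        }
    }
    best
}

/// One-hot encoding of argmax_first: 1.0 at the maximum, 0.0 elsewhere.
fn hardmax(xs: &[f32]) -> Vec<f32> {
    let hot = argmax_first(xs);
    (0..xs.len())
        .map(|i| if Some(i) == hot { 1.0 } else { 0.0 })
        .collect()
}
```

For [1.0, 3.0, 3.0, 2.0] the maximum appears at indices 1 and 2, so the result has its single 1.0 at index 1.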
Sourcepub fn argmin(&self, axis: impl Into<Option<isize>>) -> Result<Self>
pub fn argmin(&self, axis: impl Into<Option<isize>>) -> Result<Self>
Index of minimum value along axis.
Returns int32 tensor with indices of minimum values. For ties, returns the index of the first occurrence.
Sourcepub fn any(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn any(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Test if any element is true along axes.
Logical OR reduction. Returns bool dtype. Non-zero values are treated as true.
§Examples
let t = Tensor::from_slice(&[[true, false], [false, false]]);
t.any(())?; // true (any element is true)
t.any(0)?; // [true, false] (any true per column)
t.any(1)?; // [true, false] (any true per row)
Sourcepub fn all(&self, axes: impl Into<AxisSpec>) -> Result<Self>
pub fn all(&self, axes: impl Into<AxisSpec>) -> Result<Self>
Test if all elements are true along axes.
Logical AND reduction. Returns bool dtype. Non-zero values are treated as true.
§Examples
let t = Tensor::from_slice(&[[true, true], [true, false]]);
t.all(())?; // false (not all elements are true)
t.all(0)?; // [true, false] (all true per column)
t.all(1)?; // [true, false] (all true per row)
Sourcepub fn argmax_with<'f1, I1>(&'f1 self) -> TensorArgmaxWithBuilder<'f1, I1>
pub fn argmax_with<'f1, I1>(&'f1 self) -> TensorArgmaxWithBuilder<'f1, I1>
Argmax with keepdim option.
Sourcepub fn argmin_with<'f1, I1>(&'f1 self) -> TensorArgminWithBuilder<'f1, I1>
pub fn argmin_with<'f1, I1>(&'f1 self) -> TensorArgminWithBuilder<'f1, I1>
Argmin with keepdim option.
Sourcepub fn any_with<'f1, I1>(&'f1 self) -> TensorAnyWithBuilder<'f1, I1>
pub fn any_with<'f1, I1>(&'f1 self) -> TensorAnyWithBuilder<'f1, I1>
Any with keepdim option.
Sourcepub fn all_with<'f1, I1>(&'f1 self) -> TensorAllWithBuilder<'f1, I1>
pub fn all_with<'f1, I1>(&'f1 self) -> TensorAllWithBuilder<'f1, I1>
All with keepdim option.
Source§impl Tensor
impl Tensor
Sourcepub fn try_reshape(
&self,
new_shape: impl IntoIterator<Item = impl Into<SInt>>,
) -> Result<Tensor>
pub fn try_reshape( &self, new_shape: impl IntoIterator<Item = impl Into<SInt>>, ) -> Result<Tensor>
Reshape tensor to a new shape.
The total number of elements must remain the same. At most one dimension may be given as -1, which means “infer this dimension from the others”.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0]);
let reshaped = t.try_reshape(&[2, 3]).unwrap(); // [6] -> [2, 3]
let inferred = t.try_reshape(&[-1, 2]).unwrap(); // [6] -> [3, 2]
§Errors
Returns error if:
- Shape contains negative values other than -1
- Multiple -1 dimensions specified
- Total elements don’t match
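The -1 inference and the three error conditions can be sketched on plain integers. infer_shape is a hypothetical helper working on concrete sizes only; the error strings are illustrative and do not reproduce this crate's error type.

```rust
/// Resolve a requested shape against a known element count.
/// A single -1 is replaced by numel / product(other dims).
fn infer_shape(numel: usize, new_shape: &[isize]) -> Result<Vec<usize>, String> {
    let mut known: usize = 1;
    let mut infer_at: Option<usize> = None;
    for (i, &d) in new_shape.iter().enumerate() {
        match d {
            -1 if infer_at.is_none() => infer_at = Some(i),
            -1 => return Err("multiple -1 dimensions".to_string()),
            d if d < 0 => return Err(format!("negative dimension {}", d)),
            d => known *= d as usize,
        }
    }
    // Placeholder 0 for the -1 slot; it is overwritten below.
    let mut out: Vec<usize> = new_shape.iter().map(|&d| d.max(0) as usize).collect();
    match infer_at {
        Some(i) => {
            if known == 0 || numel % known != 0 {
                return Err("cannot infer dimension".to_string());
            }
            out[i] = numel / known;
        }
        None if known != numel => return Err("total elements don't match".to_string()),
        None => {}
    }
    Ok(out)
}
```

With 6 elements, [-1, 2] resolves to [3, 2], while [-1, -1], [-2, 3] and [4, 2] are each rejected.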
Sourcepub fn try_expand(
&self,
new_shape: impl IntoIterator<Item = impl Into<SInt>>,
) -> Result<Tensor>
pub fn try_expand( &self, new_shape: impl IntoIterator<Item = impl Into<SInt>>, ) -> Result<Tensor>
Expand (broadcast) tensor to a new shape with mixed concrete/symbolic dimensions.
Dimensions of size 1 can be expanded to larger sizes. Use -1 to keep the current dimension size.
§Examples
// Tensor with shape [1, 3, 1]
// t.try_expand(&[4, -1, 5]) -> shape [4, 3, 5]
Sourcepub fn try_permute(&self, axes: &[isize]) -> Result<Tensor>
pub fn try_permute(&self, axes: &[isize]) -> Result<Tensor>
Permute (reorder) tensor dimensions.
The axes parameter specifies the new order of dimensions. Each axis index in 0..ndim must appear exactly once.
§Examples
// Tensor with shape [2, 3, 4]
// t.try_permute(&[2, 0, 1]) -> shape [4, 2, 3]
// t.try_permute(&[1, 0, 2]) -> shape [3, 2, 4]
§Errors
Returns error if:
- Axes is not a valid permutation
- Axis indices out of range
Sourcepub fn try_transpose(&self, dim0: isize, dim1: isize) -> Result<Tensor>
pub fn try_transpose(&self, dim0: isize, dim1: isize) -> Result<Tensor>
Transpose two dimensions.
Convenience method for swapping two dimensions. Equivalent to permute with the two dimensions swapped.
§Examples
// Tensor with shape [2, 3, 4]
// t.try_transpose(0, 1) -> shape [3, 2, 4]
// t.try_transpose(-1, 0) -> shape [4, 3, 2] (negative indices supported)
§Errors
Returns error if axis indices are out of range.
Sourcepub fn try_squeeze(&self, dim: Option<isize>) -> Result<Tensor>
pub fn try_squeeze(&self, dim: Option<isize>) -> Result<Tensor>
Squeeze dimensions of size 1.
If dim is None, removes all dimensions of size 1. If dim is Some(axis), removes only that dimension if it’s size 1.
§Examples
// Tensor with shape [1, 3, 1, 4]
// t.try_squeeze(None) -> shape [3, 4]
// t.try_squeeze(Some(0)) -> shape [3, 1, 4]
// t.try_squeeze(Some(2)) -> shape [1, 3, 4]
§Errors
Returns error if:
- Specified dimension is not size 1
- Axis index out of range
Sourcepub fn try_unsqueeze(&self, dim: isize) -> Result<Tensor>
pub fn try_unsqueeze(&self, dim: isize) -> Result<Tensor>
Add a dimension of size 1.
Inserts a new dimension at the specified position. Supports negative indices: -1 means after the last dimension.
§Examples
// Tensor with shape [3, 4]
// t.try_unsqueeze(0) -> shape [1, 3, 4]
// t.try_unsqueeze(1) -> shape [3, 1, 4]
// t.try_unsqueeze(-1) -> shape [3, 4, 1]
§Errors
Returns error if axis index is out of range.
Sourcepub fn repeat(&self, repeats: &[SInt]) -> Result<Tensor>
pub fn repeat(&self, repeats: &[SInt]) -> Result<Tensor>
Repeat tensor along each dimension.
repeats[i] is the number of times to repeat along dimension i.
Accepts &[SInt] — supports both concrete and symbolic repeat counts.
§Examples
use svod_ir::SInt;
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0]).try_reshape(&[1, 3])?;
let tiled = t.repeat(&[SInt::from(3), SInt::from(2)])?; // Shape [3, 6]
Sourcepub fn try_pad(&self, padding: &[(isize, isize)]) -> Result<Tensor>
pub fn try_pad(&self, padding: &[(isize, isize)]) -> Result<Tensor>
Pad tensor with zeros (or other padding value).
Each tuple in padding specifies (begin, end) padding for a dimension.
Use 0 for no padding on that side.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0]); // Shape [3]
let padded = t.try_pad(&[(1, 2)]).unwrap(); // Shape [6]: [0, 1, 2, 3, 0, 0]
§Errors
Returns error if:
- Padding values are symbolic (not concrete)
- Number of padding pairs doesn’t match dimensions
Sourcepub fn cat(tensors: &[&Tensor], dim: isize) -> Result<Tensor>
pub fn cat(tensors: &[&Tensor], dim: isize) -> Result<Tensor>
Concatenate tensors along an axis.
All tensors must have the same shape except in the concatenating dimension.
§Examples
let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0]).try_reshape(&[3]).unwrap();
let b = Tensor::from_slice(&[4.0f32, 5.0]).try_reshape(&[2]).unwrap();
let c = Tensor::cat(&[&a, &b], 0).unwrap(); // Shape [5]: [1, 2, 3, 4, 5]
§Errors
Returns error if:
- Tensors have different number of dimensions
- Non-concat dimensions don’t match
Sourcepub fn stack(tensors: &[&Tensor], dim: isize) -> Result<Tensor>
pub fn stack(tensors: &[&Tensor], dim: isize) -> Result<Tensor>
Stack tensors along a new dimension.
Creates a new axis at dim by unsqueezing each tensor, then concatenating.
Sourcepub fn unflatten(&self, dim: isize, sizes: &[isize]) -> Result<Tensor>
pub fn unflatten(&self, dim: isize, sizes: &[isize]) -> Result<Tensor>
Replace a single dimension with multiple dimensions.
Inverse of flatten: splits dimension dim into the shape given by sizes.
Sourcepub fn meshgrid(
tensors: &[&Tensor],
indexing: MeshgridIndexing,
) -> Result<Vec<Tensor>>
pub fn meshgrid( tensors: &[&Tensor], indexing: MeshgridIndexing, ) -> Result<Vec<Tensor>>
Create coordinate grids from 1D tensors.
indexing: Ij (matrix/default) or Xy (Cartesian, swaps first two inputs).
Sourcepub fn shape_tensor(&self) -> Result<Tensor>
pub fn shape_tensor(&self) -> Result<Tensor>
Get the shape of this tensor as a new tensor.
Returns a 1D tensor of int64 containing the shape dimensions. This is useful for ONNX Shape operator compatibility.
§Examples
let t = Tensor::from_slice(&[1.0f32; 6]).try_reshape(&[2, 3]).unwrap();
let shape_tensor = t.shape_tensor().unwrap(); // Tensor([2, 3]) with dtype int64
§Errors
Supports symbolic dimensions — symbolic dims produce scalar UOp tensors.
Sourcepub fn try_shrink<R: IntoShrinkRange>(
&self,
ranges: impl IntoIterator<Item = R>,
) -> Result<Tensor>
pub fn try_shrink<R: IntoShrinkRange>( &self, ranges: impl IntoIterator<Item = R>, ) -> Result<Tensor>
Shrink (slice) tensor along each dimension.
Each tuple in ranges specifies (begin, end) for a dimension.
Use (0, size) to keep full dimension.
§Examples
let t = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0, 5.0]);
let sliced = t.try_shrink(&[(1, 4)]).unwrap(); // Elements [2, 3, 4]
§Errors
Returns error if negative indices are used with symbolic shape dimensions.
Sourcepub fn center_crop_pad(
&self,
target_shape: &[usize],
axes: Option<&[usize]>,
) -> Result<Tensor>
pub fn center_crop_pad( &self, target_shape: &[usize], axes: Option<&[usize]>, ) -> Result<Tensor>
Center-crop or center-pad each specified axis to the target size.
For axes where target < current, crops from the center.
For axes where target > current, pads symmetrically around the center.
Axes where target == current are unchanged.
axes specifies which dimensions to apply (default: all).
Sourcepub fn numel(&self) -> Result<usize>
pub fn numel(&self) -> Result<usize>
Total number of elements. Fails if any dimension is symbolic.
Source§impl Tensor
impl Tensor
Sourcepub fn slice_with<'f1, 'f2, 'f3, 'f4, 'f5>(
&'f1 self,
) -> TensorSliceWithBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
pub fn slice_with<'f1, 'f2, 'f3, 'f4, 'f5>( &'f1 self, ) -> TensorSliceWithBuilder<'f1, 'f2, 'f3, 'f4, 'f5>
Slice tensor with Python-style indexing: negative indices, steps, and axis selection.
Source§impl Tensor
impl Tensor
Sourcepub fn embedding(&self, indices: &Tensor) -> Result<Tensor>
pub fn embedding(&self, indices: &Tensor) -> Result<Tensor>
Embedding lookup: self is the weight table [vocab_size, embed_dim].
Returns self[indices] with shape [*indices.shape, embed_dim].
Sourcepub fn apply_rotary_emb(
&self,
cos: &Tensor,
sin: &Tensor,
interleaved: bool,
) -> Result<Tensor>
pub fn apply_rotary_emb( &self, cos: &Tensor, sin: &Tensor, interleaved: bool, ) -> Result<Tensor>
Apply rotary position embedding rotation.
self: [..., rot_dim] tensor to rotate.
cos, sin: broadcastable to self’s shape [..., rot_dim/2].
If interleaved: pairs are (even, odd) indices.
If not interleaved: pairs are (first_half, second_half).
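The two pairing schemes can be compared on a single vector of length rot_dim. The sketch below uses the standard rotation (a, b) -> (a·cos - b·sin, a·sin + b·cos) and writes each rotated pair back to the positions it was read from. The write-back layout and the helper name apply_rotary are assumptions, not a description of this crate's kernel.

```rust
/// Rotate pairs of `x` by per-pair angles given as cos/sin of length x.len() / 2.
/// interleaved = true pairs (0,1), (2,3), ...; false pairs (i, i + half).
fn apply_rotary(x: &[f32], cos: &[f32], sin: &[f32], interleaved: bool) -> Vec<f32> {
    let half = x.len() / 2;
    assert!(x.len() % 2 == 0 && cos.len() == half && sin.len() == half);
    let mut out = vec![0.0; x.len()];
    for i in 0..half {
        let (ia, ib) = if interleaved { (2 * i, 2 * i + 1) } else { (i, i + half) };
        let (a, b) = (x[ia], x[ib]);
        out[ia] = a * cos[i] - b * sin[i];
        out[ib] = a * sin[i] + b * cos[i];
    }
    out
}
```

With a 90-degree rotation (cos = 0, sin = 1) each pair (a, b) becomes (-b, a), which makes the difference between the two pairings easy to see on [1, 2, 3, 4].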
Source§impl Tensor
impl Tensor
Sourcepub fn scaled_dot_product_attention<'f1, 'f2, 'f3, 'f4>(
&'f1 self,
) -> TensorScaledDotProductAttentionBuilder<'f1, 'f2, 'f3, 'f4>
pub fn scaled_dot_product_attention<'f1, 'f2, 'f3, 'f4>( &'f1 self, ) -> TensorScaledDotProductAttentionBuilder<'f1, 'f2, 'f3, 'f4>
Scaled dot-product attention.
self (Q): [B, H, Sq, D], key (K): [B, H, Sk, D], value (V): [B, H, Sk, Dv].
Returns [B, H, Sq, Dv].
Source§impl Tensor
impl Tensor
Sourcepub fn from_lazy(uop: Arc<UOp>) -> Self
pub fn from_lazy(uop: Arc<UOp>) -> Self
Create a lazy tensor from a UOp graph (no buffer allocated). Used for deferred computation graphs like ONNX weight views.
Sourcepub fn from_path(path: &Path) -> Result<Self>
pub fn from_path(path: &Path) -> Result<Self>
Create a file-backed tensor using the DISK device (Tinygrad: Tensor(pathlib.Path)).
The file is memory-mapped lazily — no data is read until the tensor is realized.
The resulting tensor has dtype uint8 and shape (file_size,).
Sourcepub fn uop(&self) -> Arc<UOp>
pub fn uop(&self) -> Arc<UOp>
Get the current UOp for this tensor.
This reads from the registry, so it reflects any global substitutions.
Sourcepub fn kernels(&self) -> Vec<KernelInfo>
pub fn kernels(&self) -> Vec<KernelInfo>
Get kernels for THIS tensor.
Note: Kernel tracking is not yet implemented with the new registry. This returns an empty list for now.
Sourcepub fn empty(shape: &[usize], dtype: DType) -> Self
pub fn empty(shape: &[usize], dtype: DType) -> Self
Create an uninitialized buffer-backed tensor with the given shape and dtype.
No device memory is allocated — only the BUFFER UOp is created.
Use assign() to bind real data before realize().
Matches Tinygrad’s Tensor.empty(*shape).
Sourcepub fn empty_dynamic(shape: &[SInt], dtype: DType) -> Self
pub fn empty_dynamic(shape: &[SInt], dtype: DType) -> Self
Create an uninitialized buffer-backed tensor with symbolic (dynamic) dimensions.
Buffer is sized to prod(vmax) — each symbolic dim uses its Variable’s
max_val for allocation. This enables rebinding to any value in [min, max]
without reallocation. Matches Tinygrad’s
prod([x.vmax if isinstance(x, UOp) else x for x in shape]).
Sourcepub fn empty_zero(dtype: DType) -> Self
pub fn empty_zero(dtype: DType) -> Self
Create an empty 0-element tensor with the given dtype and shape [0].
Sourcepub fn full(
shape: &[usize],
value: impl Into<ConstValue>,
dtype: DType,
) -> Result<Self>
pub fn full( shape: &[usize], value: impl Into<ConstValue>, dtype: DType, ) -> Result<Self>
Create a tensor filled with a constant value, broadcast to the given shape.
Sourcepub fn zeros(shape: &[usize], dtype: DType) -> Result<Self>
pub fn zeros(shape: &[usize], dtype: DType) -> Result<Self>
Create a zero-filled tensor with the given concrete shape.
Sourcepub fn ones(shape: &[usize], dtype: DType) -> Result<Self>
pub fn ones(shape: &[usize], dtype: DType) -> Result<Self>
Create a one-filled tensor with the given concrete shape.
Sourcepub fn full_dynamic(
shape: &[SInt],
value: impl Into<ConstValue>,
dtype: DType,
) -> Result<Self>
pub fn full_dynamic( shape: &[SInt], value: impl Into<ConstValue>, dtype: DType, ) -> Result<Self>
Create a tensor filled with a constant value, using symbolic (dynamic) dimensions.
Dimensions can be concrete (SInt::Const) or symbolic (SInt::Symbolic
from Variable::bind()).
§Example
use svod_tensor::{Tensor, Variable};
use svod_dtype::DType;
let batch = Variable::new("batch", 1, 32);
let x = Tensor::full_dynamic(&[batch.bind(16)?.into(), 784.into()], 0.0, DType::Float32)?;
Sourcepub fn zeros_dynamic(shape: &[SInt], dtype: DType) -> Result<Self>
pub fn zeros_dynamic(shape: &[SInt], dtype: DType) -> Result<Self>
Create a zero-filled tensor with symbolic (dynamic) dimensions.
Sourcepub fn ones_dynamic(shape: &[SInt], dtype: DType) -> Result<Self>
pub fn ones_dynamic(shape: &[SInt], dtype: DType) -> Result<Self>
Create a one-filled tensor with symbolic (dynamic) dimensions.
Sourcepub fn arange(start: i64, stop: Option<i64>, step: Option<i64>) -> Result<Self>
pub fn arange(start: i64, stop: Option<i64>, step: Option<i64>) -> Result<Self>
Create 1D tensor with evenly spaced Int32 values.
Sourcepub fn arange_f64(
start: f64,
stop: f64,
step: f64,
dtype: DType,
) -> Result<Self>
pub fn arange_f64( start: f64, stop: f64, step: f64, dtype: DType, ) -> Result<Self>
Create 1D tensor with evenly spaced values (float parameters).
Sourcepub fn linspace(
start: f64,
end: f64,
steps: usize,
dtype: DType,
) -> Result<Self>
pub fn linspace( start: f64, end: f64, steps: usize, dtype: DType, ) -> Result<Self>
Create 1D tensor with steps evenly spaced values from start to end (inclusive).
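Inclusive spacing means the step is (end - start) / (steps - 1). A scalar sketch in plain Rust; the helper is hypothetical, and its behavior for steps = 0 (empty) and steps = 1 (just start) is an assumption.

```rust
/// `steps` evenly spaced values from `start` to `end`, both endpoints included.
/// Assumption: steps == 0 yields an empty vector and steps == 1 yields [start].
fn linspace(start: f64, end: f64, steps: usize) -> Vec<f64> {
    match steps {
        0 => Vec::new(),
        1 => vec![start],
        _ => {
            let delta = (end - start) / (steps - 1) as f64;
            (0..steps).map(|i| start + delta * i as f64).collect()
        }
    }
}
```

linspace(0.0, 1.0, 5) produces five values spaced 0.25 apart, ending exactly at 1.0.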
Sourcepub fn const_<T: Into<ConstValue>>(value: T, dtype: DType) -> Self
pub fn const_<T: Into<ConstValue>>(value: T, dtype: DType) -> Self
Create a scalar constant tensor.
Creates a 0-dimensional tensor containing a single constant value. The constant is embedded directly in the IR and does not allocate a buffer until realized (if needed).
§Arguments
- value - The constant value (will be converted to ConstValue)
- dtype - The data type for the tensor
§Examples
// Float constant
let pi = Tensor::const_(3.14159, DType::Float32);
// Integer constant
let forty_two = Tensor::const_(42i64, DType::Int64);
Sourcepub fn from_const<T: Into<ConstValue> + HasDType>(value: T) -> Self
pub fn from_const<T: Into<ConstValue> + HasDType>(value: T) -> Self
Sourcepub fn device(&self) -> DeviceSpec
pub fn device(&self) -> DeviceSpec
Get device specification from underlying UOp graph.
Returns the device where this tensor’s data resides. For lazy tensors (not yet realized), returns the target device. Defaults to CPU if no device is found in the graph.
§Examples
let cpu_tensor = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
assert_eq!(cpu_tensor.device(), DeviceSpec::Cpu);
Sourcepub fn to(&self, device: DeviceSpec) -> Self
pub fn to(&self, device: DeviceSpec) -> Self
Move tensor to a different device.
Creates a lazy COPY operation. Data is not transferred until realize().
If already on target device, returns a clone (no-op).
§Examples
let cpu_tensor = Tensor::from_slice(&[1.0f32, 2.0, 3.0]);
let mut gpu_tensor = cpu_tensor.to(DeviceSpec::Cuda { device_id: 0 });
gpu_tensor.realize()?; // Actually transfers data
Sourcepub fn custom_kernel<F>(
&self,
others: &[&Tensor],
fxn: F,
) -> Result<Vec<Tensor>>
pub fn custom_kernel<F>( &self, others: &[&Tensor], fxn: F, ) -> Result<Vec<Tensor>>
Build and apply a custom UOp kernel over this tensor and additional inputs.
The closure receives PARAM placeholders (as UOps) corresponding to
[self, others...] and must return the kernel body UOp (typically a SINK).
Returns tensors wrapped with AFTER(CALL) dependencies in argument order.
pub fn custom_kernel_with<F>( &self, others: &[&Tensor], info: CallInfo, fxn: F, ) -> Result<Vec<Tensor>>
Like custom_kernel, but with explicit CALL metadata passed as info.
pub fn bitcast(&self, dtype: DType) -> Result<Self>
Bitcast tensor to a different dtype, reinterpreting bits.
For equal-itemsize dtypes (e.g. f32 ↔ i32) this is the pure
IR-level reinterpretation. For different-itemsize dtypes (e.g.
u32 → u16 or u32 → u64) the last axis is split or combined via
shifts + reshape, matching Tinygrad’s tensor.py::bitcast. The total
byte count is preserved; the last axis grows (src_size > dst_size)
or shrinks (src_size < dst_size) by rate = max(...)/min(...).
Requires:
- source and destination are both scalar (vector dtypes unsupported);
- (shape[-1] * src_size) is evenly divisible by dst_size;
- the last shape dim is concrete (not symbolic).
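The last-axis arithmetic and the shift-based split can be modeled in plain Rust. This is a sketch under stated assumptions, not the library's code: `bitcast_last_dim` and `split_u32_to_u16` are hypothetical helpers, item sizes are given in bytes, and the low half is assumed to come first (little-endian element order).

```rust
/// New last-axis length after a bitcast between scalar dtypes with
/// different item sizes (in bytes). Returns None when the byte count
/// of the last axis does not divide evenly by the destination size.
/// Hypothetical helper, not part of the Tensor API.
fn bitcast_last_dim(last_dim: usize, src_size: usize, dst_size: usize) -> Option<usize> {
    let bytes = last_dim * src_size;
    if dst_size == 0 || bytes % dst_size != 0 {
        return None;
    }
    // The total byte count is preserved, so the axis scales by
    // src_size / dst_size.
    Some(bytes / dst_size)
}

/// Splits each u32 into two u16 halves using a cast and a shift.
/// Assumption: the low 16 bits come first.
fn split_u32_to_u16(values: &[u32]) -> Vec<u16> {
    values
        .iter()
        .flat_map(|&v| [v as u16, (v >> 16) as u16])
        .collect()
}
```

With these helpers, a last axis of 4 `u32` elements becomes 8 elements as `u16` and 2 elements as `u64`, while a last axis of 3 `u32` elements cannot be bitcast to `u64` because 12 bytes is not a multiple of 8.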
pub fn arange_with_dtype() -> TensorArangeWithDtypeBuilder
Create 1D tensor with evenly spaced values and explicit dtype.
Matches Tinygrad’s Tensor.arange(): full(step) → cumsum → + (start - step).
Accepts concrete i64 or symbolic Arc<UOp> for start/stop/step.
If stop is None, treats start as stop and starts from 0.
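The full(step) → cumsum → + (start - step) recipe can be traced on plain integers. The sketch below is illustrative only: `arange_via_cumsum` is a hypothetical helper working on concrete `i64` values, and the length formula (a ceiling division clamped at zero) is an assumption about how the element count is derived.

```rust
/// Plain-Rust model of arange built from a running sum
/// (hypothetical helper, not part of the Tensor API).
fn arange_via_cumsum(start: i64, stop: i64, step: i64) -> Vec<i64> {
    assert!(step != 0, "step must be non-zero");
    let span = stop - start;
    // Assumed length: ceil(span / step), or zero when the range is
    // empty or step points away from stop.
    let len = if span != 0 && (span > 0) == (step > 0) {
        ((span.abs() + step.abs() - 1) / step.abs()) as usize
    } else {
        0
    };
    // full(step): every element equals step.
    // cumsum: the running sum gives step, 2*step, 3*step, ...
    // + (start - step): shifts the sequence so it begins at start.
    let mut running = 0i64;
    (0..len)
        .map(|_| {
            running += step;
            running + (start - step)
        })
        .collect()
}
```

For `start = 2`, `stop = 10`, `step = 3`, the filled values are `[3, 3, 3]`, the cumulative sum is `[3, 6, 9]`, and adding `start - step = -1` gives `[2, 5, 8]`.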
impl Tensor
pub fn try_assign(&self, value: &Tensor) -> Result<()>
Assign a value tensor to this tensor in-place.
Embeds the write as AFTER(target, STORE(target, value)).
§Example
let placeholder = Tensor::empty(&[2, 3], DType::Float32);
let real_data = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0])
.try_reshape(&[2, 3]).unwrap();
placeholder.assign(&real_data);
pub fn assign(&self, value: &Tensor)
pub fn contiguous(&self) -> Self
Ensure this tensor has contiguous memory layout.
Creates a CONTIGUOUS UOp that forces materialization when realized.
Following Tinygrad’s approach, calling .contiguous().realize() on
a pure constant tensor will create an actual buffer.
§Examples
// Force a constant to be materialized
let mut c = Tensor::const_(5.0f32, DType::Float32).contiguous();
c.realize()?;
assert!(c.buffer().is_some());
impl Tensor
pub fn cumsum_with<'f1>(&'f1 self) -> TensorCumsumWithBuilder<'f1>
Cumulative sum with exclusive and reverse options.
pub fn cumprod_with<'f1>(&'f1 self) -> TensorCumprodWithBuilder<'f1>
Cumulative product with exclusive and reverse options.
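The builder options are not spelled out here, so the following plain-Rust sketch shows one common reading of them (the convention used by ONNX CumSum). It is an assumption, not a statement of this library's behavior: exclusive means each output omits its own element, and reverse means accumulation runs from the last element toward the first. `cumsum_model` is a hypothetical helper; a cumulative product would follow the same pattern with multiplication and an initial value of 1.

```rust
/// Plain-Rust model of a 1D cumulative sum with exclusive and reverse
/// options (hypothetical helper, not part of the Tensor API).
fn cumsum_model(values: &[i64], exclusive: bool, reverse: bool) -> Vec<i64> {
    let n = values.len();
    let mut out = vec![0i64; n];
    let mut running = 0i64;
    for k in 0..n {
        // Walk from the end when reverse is set.
        let i = if reverse { n - 1 - k } else { k };
        if exclusive {
            // Record the sum of the elements seen so far, then add
            // the current element.
            out[i] = running;
            running += values[i];
        } else {
            // Add the current element first, then record the sum.
            running += values[i];
            out[i] = running;
        }
    }
    out
}
```

Under this reading, the input `[1, 2, 3]` yields `[1, 3, 6]` by default, `[0, 1, 3]` with exclusive, `[6, 5, 3]` with reverse, and `[5, 3, 0]` with both.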
Trait Implementations§
Auto Trait Implementations§
impl Freeze for Tensor
impl !RefUnwindSafe for Tensor
impl Send for Tensor
impl Sync for Tensor
impl Unpin for Tensor
impl UnsafeUnpin for Tensor
impl !UnwindSafe for Tensor
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.