
Implementations of all ops for tensors, including activations like relu(), binary operations like matmul(), and more.

Generic function and struct methods:

All functionality is provided in two ways.

  1. A generic standalone function that takes a generic parameter, e.g. mean().
  2. A struct method on the tensor structs, e.g. Tensor1D::mean().

The struct methods are all just pass-throughs to the generic functions.
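As a plain-Rust sketch of this pattern (the types and names here are hypothetical stand-ins, not the crate's actual API), the struct method simply forwards to the free function:

```rust
// Hypothetical illustration of the "struct method forwards to generic
// function" pattern. `Vec1` and `mean` are stand-ins, not the crate's types.
struct Vec1<const N: usize>([f32; N]);

// The generic standalone function.
fn mean<const N: usize>(t: &Vec1<N>) -> f32 {
    t.0.iter().sum::<f32>() / N as f32
}

impl<const N: usize> Vec1<N> {
    // The struct method is just a pass-through to the generic function.
    fn mean(&self) -> f32 {
        mean(self)
    }
}

fn main() {
    let t = Vec1([1.0, 2.0, 3.0, 4.0]);
    // Both call styles give the same result.
    assert_eq!(mean(&t), t.mean());
    println!("mean = {}", t.mean());
}
```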

Reductions:

A number of functions reduce a dimension (e.g. mean_last_dim()). These functions are all suffixed with _last_dim().

Reducing a dimension means removing that dimension from the tensor by collapsing it to a single number. For example, calling sum_last_dim() on a Tensor2D<2, 5> results in a Tensor1D<2>.

See relevant functions for more examples.
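A conceptual sketch of that shape change, using plain arrays rather than the crate's tensor types:

```rust
// Conceptual sketch (not the crate's API): summing the last dimension of a
// 2x5 "tensor" yields a length-2 "tensor", one number per row.
fn sum_last_dim(t: [[f32; 5]; 2]) -> [f32; 2] {
    let mut out = [0.0; 2];
    for (i, row) in t.iter().enumerate() {
        out[i] = row.iter().sum();
    }
    out
}

fn main() {
    let t = [
        [1.0, 2.0, 3.0, 4.0, 5.0],
        [10.0, 10.0, 10.0, 10.0, 10.0],
    ];
    assert_eq!(sum_last_dim(t), [15.0, 50.0]);
    println!("{:?}", sum_last_dim(t));
}
```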

Broadcasts:

Some binary functions need to broadcast one argument to be the same size as the other (e.g. add_broadcast_rhs_last()). These methods are named <operation>_broadcast_<argument>_<dimension>(). Currently all of these functions broadcast the second argument (rhs), and there are versions that broadcast along the first dimension and versions that broadcast along the last dimension:

  1. add_broadcast_rhs_last() (and others) broadcasts the last dimension of rhs.
  2. add_broadcast_rhs_first() (and others) broadcasts the entire rhs array according to the first dimension of lhs.

See relevant functions for more examples.
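The two broadcast flavors can be sketched in plain Rust (fixed 2x3 shapes and stand-in function names, not the crate's API), assuming the index patterns described above:

```rust
// Conceptual sketch of the two broadcast flavors (plain arrays, not the
// crate's API). lhs is a 2x3 "tensor" throughout.
const M: usize = 2;
const N: usize = 3;

// *_broadcast_rhs_last: rhs has shape [M]; each rhs[i] is repeated across
// the last dimension of lhs, so out[i][j] = lhs[i][j] + rhs[i].
fn add_broadcast_rhs_last(lhs: [[f32; N]; M], rhs: [f32; M]) -> [[f32; N]; M] {
    let mut out = lhs;
    for i in 0..M {
        for j in 0..N {
            out[i][j] = lhs[i][j] + rhs[i];
        }
    }
    out
}

// *_broadcast_rhs_first: rhs has shape [N]; the whole rhs array is repeated
// M times along the first dimension, so out[i][j] = lhs[i][j] + rhs[j].
fn add_broadcast_rhs_first(lhs: [[f32; N]; M], rhs: [f32; N]) -> [[f32; N]; M] {
    let mut out = lhs;
    for i in 0..M {
        for j in 0..N {
            out[i][j] = lhs[i][j] + rhs[j];
        }
    }
    out
}

fn main() {
    let lhs = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]];
    assert_eq!(
        add_broadcast_rhs_last(lhs, [1.0, 2.0]),
        [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
    );
    assert_eq!(
        add_broadcast_rhs_first(lhs, [1.0, 2.0, 3.0]),
        [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
    );
}
```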

Functions

The absolute value (abs) computes |x|

Add two Tensors of the same shape together: lhs + &rhs

Add together two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Add two Tensors together: lhs + rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Clamps all values in t to between min and max

Similar to map(), but doesn’t take ownership of the Tensor t.

The cos function computes cos(x)

Divides two Tensors of the same shape: lhs / &rhs.

Divides two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Divides two Tensors: lhs / rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Randomly drops out elements from t with probability p, and multiplies all elements by 1 / (1 - p).

The exponential function (exp) computes e ^ x

Reduces the last dimension of the tensor by gathering the value specified by indices. Resulting Tensor has the last dimension removed (e.g. a 2d tensor will become 1d).

The Natural Logarithm (ln) computes ln(x)

Numerically stable computation of log(softmax(t)). Does t - logsumexp(t) under the hood.

Computes the LogSumExp function. Equivalent to log(sum(exp(data))) or data.exp().sum(-1).log().
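The standard way to make this computation numerically stable is the max trick: logsumexp(x) = m + log(sum(exp(x - m))) where m = max(x). A plain-Rust sketch (not the crate's implementation):

```rust
// Sketch of a numerically stable logsumexp: subtracting the max before
// exponentiating keeps exp() from overflowing for large inputs.
fn logsumexp(data: &[f32]) -> f32 {
    let m = data.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    m + data.iter().map(|x| (x - m).exp()).sum::<f32>().ln()
}

fn main() {
    // Naively, exp(1000.0) overflows f32 to infinity; the max trick does not.
    let stable = logsumexp(&[1000.0, 1000.0]);
    assert!(stable.is_finite());
    // Exact answer is 1000 + ln(2).
    assert!((stable - (1000.0 + 2.0f32.ln())).abs() < 1e-3);
    println!("logsumexp = {}", stable);
}
```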

Applies a function f to every element of the Tensor. The derivative df must also be provided.

Matrix multiplication.

Matrix multiplication with the transpose of rhs. Equivalent to matmul(lhs, transpose(rhs)).

Reduces the last dimension of the tensor by gathering the maximum value from that dimension. Resulting Tensor has the last dimension removed (e.g. a 2d tensor will become 1d).

Sums all the values in self and divides by the number of values.

Reduces the last dimension of the tensor by taking the mean of all values in the last dimension. The resulting Tensor has the last dimension removed.

Takes the element wise minimum of two Tensors of the same shape: min(lhs, &rhs).

Multiplies two Tensors of the same shape together: lhs * &rhs.

Multiplies two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Multiplies two Tensors: lhs * rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Replaces any nans in t with value.

Negates all values in t.

Normalizes t to have mean 0.0 and stddev 1.0: (t - t.mean_last_dim()) / (t.var_last_dim() + epsilon).sqrt().
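The formula above can be sketched on a slice in plain Rust (not the crate's API; this assumes the population divide-by-N variance, which may differ from the crate's convention):

```rust
// Sketch of normalize: (t - mean) / sqrt(var + epsilon).
// Uses population variance (divide by N) - an assumption, not necessarily
// the crate's convention.
fn normalize(t: &[f32], epsilon: f32) -> Vec<f32> {
    let n = t.len() as f32;
    let mean = t.iter().sum::<f32>() / n;
    let var = t.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
    t.iter().map(|x| (x - mean) / (var + epsilon).sqrt()).collect()
}

fn main() {
    let out = normalize(&[1.0, 2.0, 3.0, 4.0], 1e-5);
    // The output should have mean approximately 0.
    let mean: f32 = out.iter().sum::<f32>() / out.len() as f32;
    assert!(mean.abs() < 1e-6);
    println!("{:?}", out);
}
```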

Adds val to all elements of t.

Divides all elements of t by val.

Multiplies all elements of t by val.

Subtracts val from all elements of t.

Sigmoid computes 1 / (1 + exp(-x)).

The sine function computes sin(x)

Computes the softmax function. Equivalent to t.log_softmax().exp() or exp(log_softmax(t)) or exp(t) / sum(exp(t))
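A plain-Rust sketch of the exp(log_softmax(t)) identity, i.e. exp(t - logsumexp(t)) (stand-in functions, not the crate's API):

```rust
// Sketch of softmax via the identity softmax(t) = exp(t - logsumexp(t)),
// which is numerically stable because of the max trick in logsumexp.
fn logsumexp(data: &[f32]) -> f32 {
    let m = data.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    m + data.iter().map(|x| (x - m).exp()).sum::<f32>().ln()
}

fn softmax(t: &[f32]) -> Vec<f32> {
    let lse = logsumexp(t);
    t.iter().map(|x| (x - lse).exp()).collect()
}

fn main() {
    let p = softmax(&[1.0, 2.0, 3.0]);
    // Softmax outputs are positive and sum to 1.
    let total: f32 = p.iter().sum();
    assert!((total - 1.0).abs() < 1e-6);
    assert!(p.iter().all(|&x| x > 0.0));
    println!("{:?}", p);
}
```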

Square root computes x ^ 0.5 or √x.

Square computes x * x.

Reduces the last dimension of the tensor by computing the standard deviation of all values in the last dimension. The resulting Tensor has the last dimension removed.

Subtracts two Tensors of the same shape from each other: lhs - &rhs

Subtract two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Subtracts two Tensors: lhs - rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Sums all the values in self. Returns a Tensor0D (i.e. one number).

Reduces the last dimension of the tensor by summing all the values in that dimension. The resulting Tensor has the last dimension removed.

Sets elements of t to value wherever the corresponding element of mask equals value.

Reduces the last dimension of the tensor by computing the variance of all values in the last dimension. The resulting Tensor has the last dimension removed.

Vector * matrix multiplication.

Vector * matrix multiplication with the transpose of rhs. Equivalent to lhs * transpose(rhs).