
Implementations of all ops for tensors, including activations like relu(), binary operations like matmul(), and more.

Generic function and struct methods:

All functionality is provided in two ways.

  1. A generic standalone function that takes a generic parameter, e.g. mean().
  2. A struct method on the tensor structs, e.g. Tensor1D::mean().

The struct methods are all just pass-throughs to the generic functions.
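As a plain-Rust sketch of this pattern (the types and names here are hypothetical stand-ins, not the crate's actual API), the struct method simply forwards to the free function:

```rust
// Hypothetical illustration of the "struct method forwards to generic
// function" pattern. `Vec1` and `mean` are stand-ins, not the crate's types.
struct Vec1<const N: usize>([f32; N]);

// The generic standalone function.
fn mean<const N: usize>(t: &Vec1<N>) -> f32 {
    t.0.iter().sum::<f32>() / N as f32
}

impl<const N: usize> Vec1<N> {
    // The struct method is just a pass-through to the generic function.
    fn mean(&self) -> f32 {
        mean(self)
    }
}

fn main() {
    let t = Vec1([1.0, 2.0, 3.0, 4.0]);
    // Both call styles give the same result.
    assert_eq!(mean(&t), t.mean());
    println!("mean = {}", t.mean());
}
```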

Reductions:

A number of functions reduce a dimension (e.g. mean_last_dim()). These functions are all suffixed with _last_dim().

Reducing a dimension means removing that dimension from the tensor by collapsing it to a single number. For example, calling sum_last_dim() on a Tensor2D<2, 5> results in a Tensor1D<2>.

See relevant functions for more examples.
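A conceptual sketch of that shape change, using plain arrays rather than the crate's tensor types:

```rust
// Conceptual sketch (not the crate's API): summing the last dimension of a
// 2x5 "tensor" yields a length-2 "tensor", one number per row.
fn sum_last_dim(t: [[f32; 5]; 2]) -> [f32; 2] {
    let mut out = [0.0; 2];
    for (i, row) in t.iter().enumerate() {
        out[i] = row.iter().sum();
    }
    out
}

fn main() {
    let t = [
        [1.0, 2.0, 3.0, 4.0, 5.0],
        [10.0, 10.0, 10.0, 10.0, 10.0],
    ];
    assert_eq!(sum_last_dim(t), [15.0, 50.0]);
    println!("{:?}", sum_last_dim(t));
}
```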

Broadcasts:

Some binary functions need to broadcast one argument to be the same size as the other (e.g. add_broadcast_rhs_last()). These methods are named <operation>_broadcast_<argument>_<dimension>(). Currently all of these functions broadcast the second argument (rhs), and there are versions that broadcast along the first dimension and versions that broadcast along the last dimension:

  1. add_broadcast_rhs_last() (and others) broadcasts the last dimension of rhs.
  2. add_broadcast_rhs_first() (and others) broadcasts the entire rhs array according to the first dimension of lhs.

See relevant functions for more examples.
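The two broadcast flavors can be sketched in plain Rust (fixed 2x3 shapes and stand-in function names, not the crate's API), assuming the index patterns described above:

```rust
// Conceptual sketch of the two broadcast flavors (plain arrays, not the
// crate's API). lhs is a 2x3 "tensor" throughout.
const M: usize = 2;
const N: usize = 3;

// *_broadcast_rhs_last: rhs has shape [M]; each rhs[i] is repeated across
// the last dimension of lhs, so out[i][j] = lhs[i][j] + rhs[i].
fn add_broadcast_rhs_last(lhs: [[f32; N]; M], rhs: [f32; M]) -> [[f32; N]; M] {
    let mut out = lhs;
    for i in 0..M {
        for j in 0..N {
            out[i][j] = lhs[i][j] + rhs[i];
        }
    }
    out
}

// *_broadcast_rhs_first: rhs has shape [N]; the whole rhs array is repeated
// M times along the first dimension, so out[i][j] = lhs[i][j] + rhs[j].
fn add_broadcast_rhs_first(lhs: [[f32; N]; M], rhs: [f32; N]) -> [[f32; N]; M] {
    let mut out = lhs;
    for i in 0..M {
        for j in 0..N {
            out[i][j] = lhs[i][j] + rhs[j];
        }
    }
    out
}

fn main() {
    let lhs = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]];
    assert_eq!(
        add_broadcast_rhs_last(lhs, [1.0, 2.0]),
        [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]
    );
    assert_eq!(
        add_broadcast_rhs_first(lhs, [1.0, 2.0, 3.0]),
        [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
    );
}
```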

Functions

The absolute value (abs) computes |x|

Add two Tensors of the same shape together: lhs + &rhs

Add together two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Add two Tensors together: lhs + rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Clamps all values in t to between min and max

Similar to map(), but doesn’t take ownership of the Tensor t.

The cos function computes cos(x)

Divides two Tensors of the same shape: lhs / &rhs.

Divides two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Divides two Tensors: lhs / rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Randomly drops out elements from t with probability p, and multiplies all elements by 1 / (1 - p).

The exponential function (exp) computes e ^ x

Reduces the last dimension of the tensor by gathering the value specified by indices. Resulting Tensor has the last dimension removed (e.g. a 2d tensor will become 1d).

The Natural Logarithm (ln) computes ln(x)

Numerically stable computation of log(softmax(t)). Does t - logsumexp(t) under the hood.

Computes the LogSumExp function. Equivalent to log(sum(exp(data))) or data.exp().sum(-1).log().
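The standard way to make this computation numerically stable is the max trick: logsumexp(x) = m + log(sum(exp(x - m))) where m = max(x). A plain-Rust sketch (not the crate's implementation):

```rust
// Sketch of a numerically stable logsumexp: subtracting the max before
// exponentiating keeps exp() from overflowing for large inputs.
fn logsumexp(data: &[f32]) -> f32 {
    let m = data.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    m + data.iter().map(|x| (x - m).exp()).sum::<f32>().ln()
}

fn main() {
    // Naively, exp(1000.0) overflows f32 to infinity; the max trick does not.
    let stable = logsumexp(&[1000.0, 1000.0]);
    assert!(stable.is_finite());
    // Exact answer is 1000 + ln(2).
    assert!((stable - (1000.0 + 2.0f32.ln())).abs() < 1e-3);
    println!("logsumexp = {}", stable);
}
```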

Applies a function f to every element of the Tensor. The derivative df must also be provided.

Matrix multiplication.

Matrix multiplication with the transpose of rhs. Equivalent to matmul(lhs, transpose(rhs)).

Reduces the last dimension of the tensor by gathering the maximum value from that dimension. Resulting Tensor has the last dimension removed (e.g. a 2d tensor will become 1d).

Sums all the values in self and divides by the number of values.

Reduces the last dimension of the tensor by taking the mean of all values in the last dimension. The resulting Tensor has the last dimension removed.

Takes the element wise minimum of two Tensors of the same shape: min(lhs, &rhs).

Multiplies two Tensors of the same shape together: lhs * &rhs.

Multiplies two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Multiplies two Tensors: lhs * rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Replaces any nans in t with value.

Negates all values in t.

Normalizes t to have mean 0.0 and stddev 1.0: (t - t.mean_last_dim()) / (t.var_last_dim() + epsilon).sqrt().
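The formula above can be sketched on a slice in plain Rust (not the crate's API; this assumes the population divide-by-N variance, which may differ from the crate's convention):

```rust
// Sketch of normalize: (t - mean) / sqrt(var + epsilon).
// Uses population variance (divide by N) - an assumption, not necessarily
// the crate's convention.
fn normalize(t: &[f32], epsilon: f32) -> Vec<f32> {
    let n = t.len() as f32;
    let mean = t.iter().sum::<f32>() / n;
    let var = t.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
    t.iter().map(|x| (x - mean) / (var + epsilon).sqrt()).collect()
}

fn main() {
    let out = normalize(&[1.0, 2.0, 3.0, 4.0], 1e-5);
    // The output should have mean approximately 0.
    let mean: f32 = out.iter().sum::<f32>() / out.len() as f32;
    assert!(mean.abs() < 1e-6);
    println!("{:?}", out);
}
```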

Adds val to all elements of t.

Divides all elements of t by val.

Multiplies all elements of t by val.

Subtracts val from all elements of t.

Sigmoid computes 1 / (1 + exp(-x)).

The sine function computes sin(x)

Computes the softmax function. Equivalent to t.log_softmax().exp() or exp(log_softmax(t)) or exp(t) / sum(exp(t))
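A plain-Rust sketch of the exp(log_softmax(t)) identity, i.e. exp(t - logsumexp(t)) (stand-in functions, not the crate's API):

```rust
// Sketch of softmax via the identity softmax(t) = exp(t - logsumexp(t)),
// which is numerically stable because of the max trick in logsumexp.
fn logsumexp(data: &[f32]) -> f32 {
    let m = data.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    m + data.iter().map(|x| (x - m).exp()).sum::<f32>().ln()
}

fn softmax(t: &[f32]) -> Vec<f32> {
    let lse = logsumexp(t);
    t.iter().map(|x| (x - lse).exp()).collect()
}

fn main() {
    let p = softmax(&[1.0, 2.0, 3.0]);
    // Softmax outputs are positive and sum to 1.
    let total: f32 = p.iter().sum();
    assert!((total - 1.0).abs() < 1e-6);
    assert!(p.iter().all(|&x| x > 0.0));
    println!("{:?}", p);
}
```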

Square root computes x ^ 0.5 or √x.

Square computes x * x.

Reduces the last dimension of the tensor by computing the standard deviation of all values in the last dimension. The resulting Tensor has the last dimension removed.

Subtracts two Tensors of the same shape from each other: lhs - &rhs

Subtract two Tensors by broadcasting rhs M times, where M is the first dimension of lhs.

Subtracts two Tensors: lhs - rhs. rhs’s last dimension is broadcasted to be the same size as lhs.

Sums all the values in self. Returns a Tensor0D (i.e. one number).

Reduces the last dimension of the tensor by summing all the values in that dimension. The resulting Tensor has the last dimension removed.

Sets elements of t to value wherever the corresponding element of mask equals value.

Reduces the last dimension of the tensor by computing the variance of all values in the last dimension. The resulting Tensor has the last dimension removed.

Vector * matrix multiplication.

Vector * matrix multiplication with the transpose of rhs. Equivalent to lhs * transpose(rhs).