Module dfdx::tensor_ops
Operations on tensors like relu(), matmul(), softmax(), and more.
Generic function and struct methods
All functionality is provided in two ways.
- The generic standalone function that takes a generic parameter. e.g. relu().
- The struct method for tensor structs. e.g. crate::tensor::Tensor1D::relu().
The struct methods are all just pass-throughs to the generic functions.
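For example, both call styles compute the same thing (a minimal sketch, assuming the dfdx prelude is in scope as in the examples below):
let t: Tensor1D<5> = TensorCreator::zeros();
let a = relu(t.clone()); // generic standalone function
let b = t.relu(); // struct method; forwards to the call above
assert_eq!(a.data(), b.data());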
Axes/Dimensions for broadcasting/reductions/selecting
For the following sections, some traits/functions utilize a const isize parameter to determine the axis to apply the transformation to.
Here are the valid axes for each tensor:
- Tensor0D: Axis<0>
- Tensor1D: Axis<0>
- Tensor2D: Axis<0>, Axis<1>
- Tensor3D: Axis<0>, Axis<1>, Axis<2>
- Tensor4D: Axis<0>, Axis<1>, Axis<2>, Axis<3>
Additionally, AllAxes is valid for all tensors. To specify multiple axes you can use Axes2, Axes3, and Axes4.
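For example, a reduction over AllAxes collapses a tensor to 0d (a sketch using the sum() reduction covered in the next section; the axes are inferred from the output type):
let t: Tensor3D<2, 4, 6> = TensorCreator::zeros();
let _: Tensor0D = t.sum(); // AllAxes inferred from the 0d output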
Reductions
There are a number of functions that reduce 1 or more axes. Valid axes and reductions can be seen by viewing the Reduce or ReduceTo traits. Anything that can be Reduce’d can also be BroadcastTo the same tensor.
There are 2 ways to call each axis reducing function:
- The tensor method (e.g. crate::tensor::Tensor1D::sum()), where the axes are inferred based on the output type.
let t: Tensor3D<2, 4, 6> = TensorCreator::zeros();
let _: Tensor1D<4> = t.sum();
- The generic function (e.g. sum), where you need to specify the axes as generic parameters:
let t: Tensor3D<2, 4, 6> = TensorCreator::zeros();
let _: Tensor1D<4> = sum::<_, Axes2<0, 2>>(t);
Complete list of reductions: logsumexp(), max(), mean(), min(), std(), sum(), and var().
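The other reductions follow the same inference pattern as sum(); for instance, a sketch with max():
let t: Tensor3D<2, 4, 6> = TensorCreator::zeros();
let _: Tensor2D<2, 4> = t.max(); // reduces Axis<2>, inferred from the output type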
Broadcasts
Broadcasting tensors is provided through the BroadcastTo trait. Generally the axes can be inferred by the type of the output, so you don’t have to explicitly specify them.
To broadcast a tensor to be the same size as another tensor, you can use broadcast() like so:
let big: Tensor2D<2, 5> = TensorCreator::zeros();
// broadcast the 1st axis
let a: Tensor2D<2, 5> = Tensor1D::<5>::zeros().broadcast();
add(a, big.clone());
// broadcast the 2nd axis
let a: Tensor2D<2, 5> = Tensor1D::<2>::zeros().broadcast();
add(a, big);
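Broadcasting can also add more than one axis at a time; here is a sketch (under the same API assumptions) where Axes2<0, 2> is inferred from the output type:
let a: Tensor3D<2, 5, 3> = Tensor1D::<5>::zeros().broadcast(); // adds axes 0 and 2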
Permutating axes
Permutating axes is done via PermuteTo, and similar to broadcasting/reducing, you can just specify the output type and the axes will be inferred.
2D version:
let t: Tensor2D<2, 3> = TensorCreator::zeros();
let _: Tensor2D<3, 2> = t.permute();
3D version:
let t: Tensor3D<2, 3, 4> = TensorCreator::zeros();
let _: Tensor3D<3, 4, 2> = t.permute();
4D version:
let t: Tensor4D<2, 3, 4, 5> = TensorCreator::zeros();
let _: Tensor4D<3, 5, 2, 4> = t.permute();
Selects/Indexing
Selecting or indexing into a tensor is done via SelectTo::select(). This trait enables 2 behaviors for each axis of a given tensor:
- Select exactly 1 element from that axis.
- Select Z elements (can be different from the size of the axis) from that axis
For example here is selecting from the 0th axis of a 2d tensor:
let t = Tensor2D::new([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let a: Tensor1D<3> = t.clone().select(&0); // select the first row
assert_eq!(a.data(), &[1.0, 2.0, 3.0]);
let b: Tensor2D<5, 3> = t.select(&[0, 0, 1, 1, 1]); // select each row multiple times
This can be done per axis as well, which allows a number of combinations. Here is the same example but selecting from the last axis of a 2d tensor:
let t = tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]);
let a: Tensor1D<2> = t.clone().select(&[0, 2]); // select one element from the last axis
assert_eq!(a.data(), &[1.0, 6.0]);
let b: Tensor2D<2, 2> = t.select(&[[0, 2], [1, 1]]); // select multiple from the last axis
assert_eq!(b.data(), &[[1.0, 3.0], [5.0, 5.0]]);
Traits
- PermuteTo: Permutes Self into T with the new order of axes specified via Axes.
- Reduce: Removes Axes of a tensor by reducing them. Opposite of BroadcastTo.
- ReduceTo: Reduces Axes of Self to produce a T.
- BroadcastTo: Broadcasts Self into T.
- SelectTo: Select values along Axes resulting in T. Equivalent to torch.select and torch.gather from pytorch.
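To see that Reduce and BroadcastTo really are opposites, a small sketch (same API assumptions as the examples above):
let t: Tensor2D<2, 3> = TensorCreator::zeros();
let r: Tensor1D<3> = t.sum(); // reduce Axis<0>
let b: Tensor2D<2, 3> = r.broadcast(); // broadcast it right back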
Functions
- add_scalar: t + val. val is used for all elements of t.
- backward: Runs the backprop algorithm with all operations contained in the tape that t has.
- div_scalar: t / val. val is used for all elements of t.
- dropout: Zeros elements of t with probability p and scales all elements by 1 / (1 - p). See Tape::OWNS_TAPE.
- ln: log_e(t).
- log_softmax: log(softmax(t)) in a numerically stable way across Axes. Does t - logsumexp(t) under the hood.
- matmul_transpose: Matrix multiplication with a transposed rhs. Equivalent to matmul(lhs, transpose(rhs)). This supports the same variants as matmul (broadcasted, batched, etc).
- max: Reduces Axes of the tensor by gathering the maximum value from that dimension.
- mean: Averages the values along Axes of T.
- min: Reduces Axes of the tensor by gathering the minimum value from the axes.
- mul_scalar: t * val. val is used for all elements of t.
- nans_to: Replaces any NaN values with value.
- normalize: Normalizes t to have mean 0.0 and stddev 1.0 along Axes of T. epsilon is passed to stddev(). Computes (t - t.mean(Axes)) / t.std(Axes, epsilon).
- powf: Raises t to a float power; t^i.
- powi: Raises t to an integer power; t^i.
- relu: max(0, t).
- softmax: Computes the softmax across Axes.
- sqrt: √t or t^0.5.
- square: t^2.
- std: Reduces Axes of T by computing the std deviation of all values in those axes. Result Tensor has a smaller number of dimensions.
- sub_scalar: t - val. val is used for all elements of t.
- sum: Sums the values along Axes of T.
- value_mask: Sets t to value anywhere mask equals value.
- var: Reduces Axes of T by computing the variance of all values in those axes. Result Tensor has a smaller number of dimensions.
- vecmat_mul_transpose: Vector-matrix multiplication where rhs is transposed: y * transpose(rhs).
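As a small worked example combining a couple of these (a sketch; relu and square behave as described above):
let t = tensor([-1.0, 0.0, 2.0]);
let r = t.relu(); // max(0, t) -> [0.0, 0.0, 2.0]
let s = r.square(); // t^2 -> [0.0, 0.0, 4.0]
assert_eq!(s.data(), &[0.0, 0.0, 4.0]);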