Struct neuronika::VarDiff
pub struct VarDiff<T, U> where
    T: Data + 'static,
    U: Gradient + Overwrite + 'static,
{ /* fields omitted */ }
A differentiable variable.
Differentiable variables can be created in two ways:
- By calling .requires_grad() on a non-differentiable leaf.
- By performing any binary operation between a Var and a VarDiff.
Differentiability is thus a contagious property: if a VarDiff is used during a computation, the result of the computation itself, and that of any subsequent computations performed on it, will also be differentiable. As an obvious consequence, the result of an operation between two VarDiff is also a VarDiff.
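For instance, a minimal sketch of both creation paths (the shapes here are arbitrary):

use neuronika;

// A differentiable leaf, obtained by calling .requires_grad() on a
// non-differentiable one.
let x = neuronika::rand((3, 3)).requires_grad();

// A binary operation between a Var and a VarDiff: the result is differentiable.
let y = neuronika::rand((3, 3)) + x;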
Implementations
Returns an immutable reference to the data inside self.
At the differentiable variable’s creation the data is filled with zeros. You can populate it with a call to .forward().
Returns a mutable reference to the data inside self.
At the differentiable variable’s creation the data is filled with zeros. You can populate it with a call to .forward().
Returns an immutable reference to the gradient inside self.
At the differentiable variable’s creation the gradient is filled with zeros. You can populate it with a call to .backward().
Back-propagates through the computational graph and populates the gradients of the differentiable leaves that are ancestors of self. Before back-propagating, the gradient of self is seeded with seed; the leaves’ gradients are thus scaled accordingly.
The graph is differentiated through the chain rule.
The leaves whose gradients are populated by this method are also those referred to by the vector of Param returned by .parameters().
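A minimal sketch of the resulting workflow; the seed is assumed here to be an f32:

use neuronika;

// A differentiable leaf and a computation depending on it.
let x = neuronika::rand((2, 2)).requires_grad();
let mut y = x.clone() + x.clone();

// Populate the data of y, then back-propagate with a seed of 1.0: the
// gradient of the leaf x is populated and scaled by the seed.
y.forward();
y.backward(1.0);
println!("{}", x.grad());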
Disables gradient computation and de-allocates the gradient for self and all of its ancestors.
Re-enables gradient computation and re-allocates the gradient for self and all of its ancestors.
This has effect only on certain ancestor variables of self. It sets such variables and differentiable variables in training mode.
See also .dropout().
This has effect only on certain ancestor variables of self. It sets such variables and differentiable variables in evaluation mode.
See also .dropout().
Performs a vector-matrix multiplication between the vector variable self and the matrix variable rhs.
If self is n and rhs is (n, m) the output will be m.
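A shape sketch; the method is assumed to be the .vm() of the vector-matrix multiplication trait:

use neuronika;

// self has length 3 and rhs is (3, 4): the result has length 4.
let v = neuronika::ones(3).requires_grad();
let m = neuronika::ones((3, 4));
let mut r = v.vm(m);
r.forward();
assert_eq!(r.data().len(), 4);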
Performs a matrix multiplication between the matrix variables self and rhs. If self is (n, m) and rhs is (m, o) the output will be (n, o).
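A shape sketch; the method is assumed to be the .mm() of the matrix-matrix multiplication trait:

use neuronika;

// (2, 3) times (3, 4) gives (2, 4).
let a = neuronika::ones((2, 3)).requires_grad();
let b = neuronika::ones((3, 4));
let mut c = a.mm(b);
c.forward();
assert_eq!(c.data().shape(), &[2, 4]);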
pub fn mm_t<Rhs>(self, rhs: Rhs) -> <Self as MatMatMulT<Rhs>>::Output where
    Self: MatMatMulT<Rhs>,
Performs a matrix multiplication between the matrix variables self and rhs.
This is a fused operation, as rhs is implicitly transposed. Fusing the two operations is marginally faster than computing the matrix multiplication and the transposition separately.
If self is (n, m) and rhs is (o, m) the output will be (n, o).
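A shape sketch of the fused operation:

use neuronika;

// (2, 3) times the implicit transpose of (4, 3) gives (2, 4).
let a = neuronika::ones((2, 3)).requires_grad();
let b = neuronika::ones((4, 3));
let mut c = a.mm_t(b);
c.forward();
assert_eq!(c.data().shape(), &[2, 4]);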
Returns a vector of Param referencing all the differentiable leaves that are ancestors of the variable.
If directly called on a differentiable leaf, the resulting vector will include only a single Param referencing self.
Ancestors that appear multiple times in the computation of the variable are listed only once. Thus, the parameters of a differentiable variable z resulting from a binary operation involving two other differentiable variables x and y will be the set union of the parameters of x and y. This can be extended to the general case.
Examples
use neuronika;
// x has 2 parameters, as it is the result of an addition.
let x = neuronika::rand((3,3)).requires_grad() + neuronika::rand((3,3)).requires_grad();
// The same holds for y.
let y = neuronika::rand(3).requires_grad() + neuronika::rand(1).requires_grad();
assert!(x.parameters().len() == y.parameters().len() && y.parameters().len() == 2);
// z is the result of an addition between x and y, so it will have 4 parameters.
let z = x.clone() + y;
assert_eq!(z.parameters().len(), 4);
// If we add x to z there still will be 4 parameters, as x is already present among them.
let w = z + x;
assert_eq!(w.parameters().len(), 4);
Returns the mean of all elements in self.
Takes the power of each element in self with exponent exp and returns a differentiable variable with the result.
Takes the square root element-wise and returns a differentiable variable with the result.
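A sketch of these three methods together; the names .mean(), .pow() and .sqrt() and the i32 exponent are assumed from the descriptions above:

use neuronika;
use ndarray;

let x = neuronika::full((2, 2), 4.).requires_grad();

// Mean of all elements, element-wise square, element-wise square root.
let mut m = x.clone().mean();
let mut p = x.clone().pow(2);
let mut s = x.sqrt();
m.forward();
p.forward();
s.forward();
assert_eq!(*s.data(), ndarray::array![[2., 2.], [2., 2.]]);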
Applies the rectified linear unit element-wise and returns a differentiable variable with the result.
ReLU(x) = max(0, x)
Applies the leaky rectified linear unit element-wise and returns a differentiable variable with the result.
LeakyReLU(x) = max(0, x) + 0.01 * min(0, x)
Applies the softplus element-wise and returns a differentiable variable with the result.
Softplus(x) = log(1 + exp(x))
Applies the sigmoid element-wise and returns a differentiable variable with the result.
Applies the tanh element-wise and returns a differentiable variable with the result.
Applies the natural logarithm element-wise and returns a differentiable variable with the result.
Applies the exponential element-wise and returns a differentiable variable with the result.
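A sketch chaining a few of these activations; the method names .relu(), .sigmoid() and .tanh() are assumed from the descriptions above:

use neuronika;

// Each activation consumes the variable and returns a new differentiable one.
let x = neuronika::rand((2, 2)).requires_grad();
let mut y = x.relu().sigmoid().tanh();
y.forward();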
Applies the softmax to self and returns a differentiable variable with the result.
The softmax is applied to all slices along axis, and will re-scale them so that the elements lie in the range [0, 1] and sum to 1.0.
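For instance, a sketch of a row-wise softmax; the name .softmax() and the usize axis are assumed by analogy with the log_softmax signature below:

use neuronika;

// Softmax along axis 1: every row of the result sums to 1.0.
let x = neuronika::rand((2, 3)).requires_grad();
let mut y = x.softmax(1);
y.forward();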
pub fn log_softmax(
    self,
    axis: usize
) -> VarDiff<LogSoftmax<T>, LogSoftmaxBackward<U, LogSoftmax<T>>>
Applies the log-softmax to self and returns a differentiable variable with the result.
Applies a softmax followed by a logarithm. While mathematically equivalent to log(softmax(x)), doing these two operations separately is slower and numerically unstable. This function uses an alternative formulation to compute the output and gradient correctly.
See also .softmax().
Returns a differentiable variable equivalent to self with its dimensions reversed.
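A sketch, assuming this is the .t() transposition method:

use neuronika;

// Transposing a (2, 3) variable yields a (3, 2) one.
let x = neuronika::ones((2, 3)).requires_grad();
let mut y = x.t();
y.forward();
assert_eq!(y.data().shape(), &[3, 2]);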
Applies dropout to self and returns a differentiable variable with the result.
It is strongly suggested to use nn::Dropout instead of this method when working with neural networks.
During training, randomly zeroes some of the elements of self with probability p using samples from a Bernoulli distribution. Each channel will be zeroed out independently on every forward call.
This has proven to be an effective technique for regularization and preventing the co-adaptation of neurons, as described in the paper Improving neural networks by preventing co-adaptation of feature detectors.
Furthermore, the outputs are scaled by a factor of 1/(1 - p) during training. This means that during evaluation the resulting variable simply computes an identity function.
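A sketch of how dropout interacts with the training and evaluation modes described above; the f64 probability and the .train()/.eval() calls are assumptions:

use neuronika;

let x = neuronika::ones((2, 2)).requires_grad();
let mut y = x.dropout(0.5);

// In training mode roughly half of the elements are zeroed and the
// survivors are scaled by 1 / (1 - 0.5).
y.train();
y.forward();

// In evaluation mode dropout computes an identity function.
y.eval();
y.forward();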
Splits self into a certain number of chunks of size chunk_size, skipping the remainder along each dimension that doesn’t fit evenly.
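A sketch under the assumption that chunk_size is shape-like and that the method returns a vector of differentiable chunks; check the actual signature before relying on this:

use neuronika;

// Splitting a (4, 2) variable into (2, 2) chunks should yield two chunks.
let x = neuronika::ones((4, 2)).requires_grad();
let chunks = x.chunks((2, 2));
assert_eq!(chunks.len(), 2);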
Concatenates the given sequence of differentiable variables variables, including self, along the given axis, and returns a differentiable variable with the results.
Arguments
- variables - sequence of differentiable variables.
- axis - the axis along which to concatenate.
Panics
If the variables have mismatching shapes apart from along axis, if the variables are empty, if axis is out of bounds, or if the result is larger than is possible to represent.
Examples
use std::boxed::Box;
use neuronika;
use ndarray;
let a = neuronika::ones((3, 2)).requires_grad();
let b = neuronika::full((3, 2), 4.).requires_grad();
let c = neuronika::full((3, 2), 3.).requires_grad();
let mut d = a.cat(&[Box::new(b), Box::new(c)], 1);
d.forward();
assert_eq!(*d.data(), ndarray::array![[1., 1., 4., 4., 3., 3.],
[1., 1., 4., 4., 3., 3.],
[1., 1., 4., 4., 3., 3.]]);
Stacks the given sequence of differentiable variables variables, including self, along the given axis, and returns a differentiable variable with the results.
All variables must have the same shape.
Arguments
- variables - sequence of differentiable variables.
- axis - the axis along which to stack.
Panics
If the variables have mismatching shapes, if the variables are empty, if axis is out of bounds, or if the result is larger than is possible to represent.
Examples
use std::boxed::Box;
use neuronika;
use ndarray;
let a = neuronika::ones((2, 2)).requires_grad();
let b = neuronika::ones((2, 2)).requires_grad();
let c = neuronika::ones((2, 2)).requires_grad();
let mut d = a.stack(&[Box::new(b), Box::new(c)], 0);
d.forward();
assert_eq!(*d.data(), ndarray::array![[[1., 1.],
[1., 1.]],
[[1., 1.],
[1., 1.]],
[[1., 1.],
[1., 1.]]]);
Trait Implementations
impl<F1, F2, B2, Pad> Convolve<Var<F1>, VarDiff<F2, B2>, Pad> for Var<F1> where
    F1: NData + 'static,
    F1::Dim: RemoveAxis,
    <F1::Dim as Dimension>::Smaller: RemoveAxis,
    <<F1::Dim as Dimension>::Smaller as Dimension>::Smaller: ReflPad + ReplPad,
    F2: NData<Dim = F1::Dim> + 'static,
    B2: Gradient<Dim = F2::Dim> + Overwrite + Display + Debug,
    Pad: PaddingMode + 'static,
The type of the convolution’s result. See the differentiability arithmetic for more details.
impl<F1, B1, F2, B2, Pad> Convolve<VarDiff<F1, B1>, VarDiff<F2, B2>, Pad> for VarDiff<F1, B1> where
    F1: NData + Debug + Display + 'static,
    F1::Dim: RemoveAxis,
    <F1::Dim as Dimension>::Smaller: RemoveAxis,
    <<F1::Dim as Dimension>::Smaller as Dimension>::Smaller: ReflPad + ReplPad,
    B1: Gradient<Dim = F1::Dim> + Overwrite,
    F2: NData<Dim = F1::Dim> + Debug + Display + 'static,
    B2: Gradient<Dim = F2::Dim> + Overwrite,
    Pad: PaddingMode + 'static,
The type of the convolution’s result. See the differentiability arithmetic for more details.
impl<F1, F2, B2, Pad> ConvolveWithGroups<Var<F1>, VarDiff<F2, B2>, Pad> for Var<F1> where
    F1: NData + 'static,
    F1::Dim: RemoveAxis,
    <F1::Dim as Dimension>::Smaller: RemoveAxis,
    <<F1::Dim as Dimension>::Smaller as Dimension>::Smaller: ReflPad + ReplPad,
    F2: NData<Dim = F1::Dim> + 'static,
    B2: Gradient<Dim = F2::Dim> + Overwrite,
    Pad: PaddingMode + 'static,
The type of the grouped convolution’s result. See the differentiability arithmetic for more details.
impl<F1, B1, F2, B2, Pad> ConvolveWithGroups<VarDiff<F1, B1>, VarDiff<F2, B2>, Pad> for VarDiff<F1, B1> where
    F1: NData + Debug + Display + 'static,
    F1::Dim: RemoveAxis,
    <F1::Dim as Dimension>::Smaller: RemoveAxis,
    <<F1::Dim as Dimension>::Smaller as Dimension>::Smaller: ReflPad + ReplPad,
    B1: Gradient<Dim = F1::Dim> + Overwrite,
    F2: NData<Dim = F1::Dim> + Debug + Display + 'static,
    B2: Gradient<Dim = F2::Dim> + Overwrite,
    Pad: PaddingMode + 'static,
The type of the grouped convolution’s result. See the differentiability arithmetic for more details.
The type of the matrix-matrix multiplication’s result. See the differentiability arithmetic for more details.
The type of the matrix-matrix multiplication with transposed right hand side operand’s result. See the differentiability arithmetic for more details.
Registers self’s parameters to the model’s parameters params.
Registers self’s status to the model’s status status.
The type of the vector-matrix multiplication’s result. See the differentiability arithmetic for more details.
Auto Trait Implementations
impl<T, U> !RefUnwindSafe for VarDiff<T, U>
impl<T, U> !UnwindSafe for VarDiff<T, U>
Blanket Implementations
Mutably borrows from an owned value.