Struct relearn::torch::critic::Gae

pub struct Gae<V> {
    pub gamma: f64,
    pub lambda: f64,
    pub value_fn: V,
}

Generalized Advantage Estimator critic.

Note

Non-terminal episode ends (episodes cut off before reaching a terminal state) are not currently handled properly. The estimator assumes that all episodes end with a reward of 0.
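To illustrate why this matters, the sketch below compares the one-step residual at the last step of an episode under two treatments: taking the following state to be worth 0, and bootstrapping from a value estimate of that state. The helper `final_residual` and all numbers are hypothetical and are not part of relearn. Reading the assumption as "the value after the final step is 0" is also an interpretation of this note, not something confirmed by the crate's source.

```rust
/// Hypothetical helper (not part of relearn): one-step residual for the last
/// step of an episode. `next_value` is the estimated value of the state that
/// follows the last step, or `None` to treat that state as worth 0.
fn final_residual(reward: f64, value: f64, next_value: Option<f64>, discount: f64) -> f64 {
    reward + discount * next_value.unwrap_or(0.0) - value
}

fn main() {
    // Illustrative numbers only.
    let (reward, value, discount) = (1.0, 5.0, 0.9);
    // Treating the episode end as terminal: nothing follows the last step.
    let as_terminal = final_residual(reward, value, None, discount);
    // Bootstrapping from an estimate of the state where the episode was cut off.
    let bootstrapped = final_residual(reward, value, Some(5.0), discount);
    println!("terminal: {}, bootstrapped: {}", as_terminal, bootstrapped);
}
```

When an episode is cut off in a state that still has substantial value, the zero assumption can make the final residual much more negative than the bootstrapped one.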

Reference

High-Dimensional Continuous Control Using Generalized Advantage Estimation by John Schulman, Philipp Moritz, Sergey Levine, Michael I. Jordan, and Pieter Abbeel. ICLR 2016. https://arxiv.org/pdf/1506.02438.pdf
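For orientation, the estimator defined in the cited paper can be written as below. The notation follows the paper rather than this crate's source: V is the state value function, r_t is the reward at step t, and γ and λ play the roles of the `gamma` and `lambda` fields.

```latex
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t), \qquad
\hat{A}_t^{\mathrm{GAE}(\gamma,\lambda)} = \sum_{l=0}^{\infty} (\gamma\lambda)^{l} \, \delta_{t+l}
```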

Fields

gamma: f64

Maximum discount factor. The environment discount factor is clipped to be no more than this value.

lambda: f64

Advantage interpolation factor between one-step residuals (lambda = 0) and the full return (lambda = 1).

value_fn: V

State value function module.
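How the fields could fit together can be sketched with plain slices. This is a minimal illustration under stated assumptions and not relearn's implementation: `gae_advantages`, its parameters, and the sample data are hypothetical, the value function is replaced by precomputed per-step estimates, and the value after the final step is taken to be 0 (one reading of the note above).

```rust
/// Hypothetical helper (not part of relearn): GAE advantages for one episode.
///
/// `rewards[t]` is the reward for step `t` and `values[t]` is the estimated
/// value of the state at step `t`. The value after the final step is assumed
/// to be 0.
fn gae_advantages(
    rewards: &[f64],
    values: &[f64],
    env_discount: f64,
    gamma: f64,
    lambda: f64,
) -> Vec<f64> {
    assert_eq!(rewards.len(), values.len());
    // Clip the environment discount factor to be no more than `gamma`.
    let discount = env_discount.min(gamma);

    let mut advantages = vec![0.0; rewards.len()];
    let mut next_value = 0.0; // assumed value after the episode ends
    let mut next_advantage = 0.0;
    for t in (0..rewards.len()).rev() {
        // One-step residual: r_t + discount * V(s_{t+1}) - V(s_t)
        let residual = rewards[t] + discount * next_value - values[t];
        // Backward recursion: A_t = residual_t + discount * lambda * A_{t+1}
        next_advantage = residual + discount * lambda * next_advantage;
        advantages[t] = next_advantage;
        next_value = values[t];
    }
    advantages
}

fn main() {
    let rewards = [1.0, 1.0, 1.0];
    let values = [0.5, 0.5, 0.5];
    // lambda = 0: each advantage is just the one-step residual.
    println!("{:?}", gae_advantages(&rewards, &values, 1.0, 0.9, 0.0));
    // lambda = 1: each advantage is the discounted return minus the value estimate.
    println!("{:?}", gae_advantages(&rewards, &values, 1.0, 0.9, 1.0));
}
```

The backward loop avoids recomputing the discounted sum for every step, so the cost is linear in the episode length.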

Trait Implementations

Returns a copy of the value.

Performs copy-assignment from source.

Whether this critic has trainable internal parameters.

Get the discount factor to use when calculating step returns.

Provide values for a packed sequence of steps.

The loss of any trainable internal variables given the observed history features.

Formats the value using the given formatter.

This method tests for self and other values to be equal, and is used by ==.

This method tests for !=.
