pub trait StdBatchBase {
    type ObsBatch;
    type ActBatch;

    // Required methods
    fn unpack(
        self
    ) -> (Self::ObsBatch, Self::ActBatch, Self::ObsBatch, Vec<f32>, Vec<i8>, Option<Vec<usize>>, Option<Vec<f32>>);
    fn len(&self) -> usize;
    fn obs(&self) -> &Self::ObsBatch;
    fn act(&self) -> &Self::ActBatch;
    fn next_obs(&self) -> &Self::ObsBatch;
    fn reward(&self) -> &Vec<f32>;
    fn is_done(&self) -> &Vec<i8>;
    fn weight(&self) -> &Option<Vec<f32>>;
    fn ix_sample(&self) -> &Option<Vec<usize>>;
    fn empty() -> Self;
}

A batch of transitions for training agents.

This trait represents a standard transition (o, a, o', r, is_done), where o is an observation, a is an action, o' is a later observation, r is a reward, and is_done is a done flag. Typically, o' is the observation at the next step and is used for a single-step backup. o' can also be the observation several steps after o, in which case the update is sometimes called an n-step backup.

The type of o and o' is the associated type ObsBatch. The type of a is the associated type ActBatch.
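As a rough illustration of how the associated types and required methods fit together, the sketch below implements the trait for a toy batch type. The trait is copied locally so the snippet compiles on its own; `ToyBatch`, its field layout, the choice of `Vec<[f32; 2]>` and `Vec<u8>` for the associated types, and the helper `single_transition` are all hypothetical and not part of the crate.

```rust
// Local copy of the trait so this sketch compiles on its own;
// in real code, import the trait from the crate instead.
pub trait StdBatchBase {
    type ObsBatch;
    type ActBatch;

    fn unpack(
        self,
    ) -> (Self::ObsBatch, Self::ActBatch, Self::ObsBatch, Vec<f32>, Vec<i8>, Option<Vec<usize>>, Option<Vec<f32>>);
    fn len(&self) -> usize;
    fn obs(&self) -> &Self::ObsBatch;
    fn act(&self) -> &Self::ActBatch;
    fn next_obs(&self) -> &Self::ObsBatch;
    fn reward(&self) -> &Vec<f32>;
    fn is_done(&self) -> &Vec<i8>;
    fn weight(&self) -> &Option<Vec<f32>>;
    fn ix_sample(&self) -> &Option<Vec<usize>>;
    fn empty() -> Self;
}

/// Hypothetical batch: 2-element observations and discrete action ids.
pub struct ToyBatch {
    obs: Vec<[f32; 2]>,
    act: Vec<u8>,
    next_obs: Vec<[f32; 2]>,
    reward: Vec<f32>,
    is_done: Vec<i8>,
    ix_sample: Option<Vec<usize>>,
    weight: Option<Vec<f32>>,
}

impl StdBatchBase for ToyBatch {
    type ObsBatch = Vec<[f32; 2]>;
    type ActBatch = Vec<u8>;

    // Consumes the batch and returns its parts in the order fixed by the trait:
    // indices come before weights in the returned tuple.
    fn unpack(
        self,
    ) -> (Self::ObsBatch, Self::ActBatch, Self::ObsBatch, Vec<f32>, Vec<i8>, Option<Vec<usize>>, Option<Vec<f32>>) {
        (self.obs, self.act, self.next_obs, self.reward, self.is_done, self.ix_sample, self.weight)
    }
    // One reward per transition, so the reward vector gives the batch size.
    fn len(&self) -> usize { self.reward.len() }
    fn obs(&self) -> &Self::ObsBatch { &self.obs }
    fn act(&self) -> &Self::ActBatch { &self.act }
    fn next_obs(&self) -> &Self::ObsBatch { &self.next_obs }
    fn reward(&self) -> &Vec<f32> { &self.reward }
    fn is_done(&self) -> &Vec<i8> { &self.is_done }
    fn weight(&self) -> &Option<Vec<f32>> { &self.weight }
    fn ix_sample(&self) -> &Option<Vec<usize>> { &self.ix_sample }
    fn empty() -> Self {
        ToyBatch {
            obs: Vec::new(),
            act: Vec::new(),
            next_obs: Vec::new(),
            reward: Vec::new(),
            is_done: Vec::new(),
            ix_sample: None,
            weight: None,
        }
    }
}

/// Hypothetical helper: a batch holding one transition and no PER metadata.
pub fn single_transition() -> ToyBatch {
    ToyBatch {
        obs: vec![[0.0, 0.0]],
        act: vec![1],
        next_obs: vec![[0.5, -0.5]],
        reward: vec![1.0],
        is_done: vec![1],
        ix_sample: None,
        weight: None,
    }
}
```

Code that is generic over `B: StdBatchBase` can then read fields through the accessors, or take ownership of them all at once with `unpack`.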

Required Associated Types

type ObsBatch

A set of observations in a batch.

type ActBatch

A set of actions in a batch.

Required Methods

fn unpack( self ) -> (Self::ObsBatch, Self::ActBatch, Self::ObsBatch, Vec<f32>, Vec<i8>, Option<Vec<usize>>, Option<Vec<f32>>)

Unpacks the data (o_t, a_t, o_t+n, r_t, is_done_t).

Optionally, the return value also includes the sample indices in the replay buffer and their weights, in that order. These are used for prioritized experience replay (PER).
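The optional indices and weights might be consumed along the following lines. This is only a sketch of one common PER pattern, not the crate's own training code: `weighted_mse` and `new_priorities` are hypothetical helpers, the per-sample TD errors are assumed to have been computed elsewhere, and using the absolute TD error as the new priority is an assumption.

```rust
/// Mean squared TD error over the batch. When PER weights are present,
/// each squared error is scaled by its weight before averaging.
/// Assumes a non-empty batch and, if given, one weight per sample.
pub fn weighted_mse(td_errors: &[f32], weight: &Option<Vec<f32>>) -> f32 {
    let n = td_errors.len() as f32;
    match weight {
        Some(ws) => td_errors.iter().zip(ws).map(|(e, w)| w * e * e).sum::<f32>() / n,
        None => td_errors.iter().map(|e| e * e).sum::<f32>() / n,
    }
}

/// Pairs each sampled replay-buffer index with |TD error| as its new priority.
/// Returns an empty list when the batch carries no indices (PER not in use).
pub fn new_priorities(td_errors: &[f32], ix_sample: &Option<Vec<usize>>) -> Vec<(usize, f32)> {
    match ix_sample {
        Some(ix) => ix
            .iter()
            .copied()
            .zip(td_errors.iter().map(|e| e.abs()))
            .collect(),
        None => Vec::new(),
    }
}
```

Matching on the `Option` values keeps the same code path usable for both uniform sampling, where both are `None`, and prioritized sampling.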

fn len(&self) -> usize

Returns the number of samples in the batch.

fn obs(&self) -> &Self::ObsBatch

Returns o_t.

fn act(&self) -> &Self::ActBatch

Returns a_t.

fn next_obs(&self) -> &Self::ObsBatch

Returns o_t+n, the next observation (o_t+1 for a single-step backup).

fn reward(&self) -> &Vec<f32>

Returns r_t.

fn is_done(&self) -> &Vec<i8>

Returns is_done_t.
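As a hedged illustration of how the `i8` flags are typically used in a single-step backup, the sketch below computes targets of the form r_t + gamma * (1 - is_done_t) * V(o_t+1). The function `td_targets`, the `next_value` slice of value estimates, and the convention that 1 marks a finished episode and 0 an ongoing one are assumptions for this example; the trait itself does not prescribe them.

```rust
/// Single-step backup targets: r + gamma * (1 - done) * V(next_obs).
/// Assumes `is_done` holds 1 for terminal transitions and 0 otherwise,
/// and that all three slices have one entry per sample.
pub fn td_targets(reward: &[f32], is_done: &[i8], next_value: &[f32], gamma: f32) -> Vec<f32> {
    reward
        .iter()
        .zip(is_done)
        .zip(next_value)
        // A done flag of 1 zeroes the bootstrap term, leaving only the reward.
        .map(|((r, d), v)| r + gamma * (1.0 - f32::from(*d)) * v)
        .collect()
}
```

With a batch `b`, the first two arguments would come from `b.reward()` and `b.is_done()`.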

fn weight(&self) -> &Option<Vec<f32>>

Returns the sample weights, if any. They are used for PER.

fn ix_sample(&self) -> &Option<Vec<usize>>

Returns the sample indices in the replay buffer, if any. They are used for PER.

fn empty() -> Self

Creates an empty batch.

Implementors

impl<O, A> StdBatchBase for StdBatch<O, A> where O: SubBatch, A: SubBatch

type ObsBatch = O

type ActBatch = A