Struct border_core::core::trainer::Trainer
Manages the training process.
Training loop
For training an agent with the standard RL algorithms in the library, the agent and environment interact as follows:

1. Call Env::reset to reset the environment and get an observation. An episode starts.
2. Call std::cell::RefCell::replace to place the observation in PrevObs.
3. Call Agent::push_obs to place the observation in PrevObs'.
4. Call Policy::sample to sample an action from Policy.
5. Call Env::step to take the action, get a new observation, and create a Step object.
6. Call Agent::observe to update the replay buffer with the new and previous observations.
7. Call some methods in the agent to update the policy parameters.
8. Go back to step 1.
Actually, Trainer is not responsible for step 6; the Agent does it.
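The loop above can be sketched with toy types. The trait-like method names (reset, push_obs, sample, step, observe) mirror the ones in the list, but the structs and signatures here are simplified assumptions for illustration, not the actual border_core API:

```rust
/// Toy environment: the episode ends once the observation reaches 3.
struct CountEnv { state: i32 }

impl CountEnv {
    /// Step 1: reset the environment and return the first observation.
    fn reset(&mut self) -> i32 { self.state = 0; self.state }
    /// Step 5: apply an action, return (new observation, done flag).
    fn step(&mut self, action: i32) -> (i32, bool) {
        self.state += action;
        (self.state, self.state >= 3)
    }
}

/// Toy agent holding the previous observation and a replay buffer.
struct CountAgent { prev_obs: Option<i32>, replay: Vec<(i32, i32)> }

impl CountAgent {
    /// Steps 2-3: remember the current observation as the previous one.
    fn push_obs(&mut self, obs: i32) { self.prev_obs = Some(obs); }
    /// Step 4: sample an action from the (here: constant) policy.
    fn sample(&self, _obs: i32) -> i32 { 1 }
    /// Steps 6-7: store (prev_obs, new_obs) in the replay buffer; a real
    /// agent would also run its optimization step here.
    fn observe(&mut self, new_obs: i32) {
        if let Some(prev) = self.prev_obs.take() {
            self.replay.push((prev, new_obs));
        }
        self.prev_obs = Some(new_obs);
    }
}

/// One episode of the interaction loop described above.
fn run_episode(env: &mut CountEnv, agent: &mut CountAgent) -> usize {
    let mut obs = env.reset();                  // step 1
    let mut n_steps = 0;
    loop {
        agent.push_obs(obs);                    // steps 2-3
        let action = agent.sample(obs);         // step 4
        let (new_obs, done) = env.step(action); // step 5
        agent.observe(new_obs);                 // steps 6-7
        n_steps += 1;
        obs = new_obs;
        if done { break; }
    }
    n_steps
}

fn main() {
    let mut env = CountEnv { state: 0 };
    let mut agent = CountAgent { prev_obs: None, replay: Vec::new() };
    let n = run_episode(&mut env, &mut agent);
    println!("episode length: {}, transitions stored: {}", n, agent.replay.len());
}
```

Note that the replay-buffer update lives inside the agent's observe, matching the remark that the Trainer is not responsible for it.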
Model evaluation and saving
Trainer::train() evaluates the agent being trained at the interval of optimization steps specified by TrainerBuilder::eval_interval(). If the evaluation reward is greater than the maximum seen so far during training, the agent is saved in the directory specified by TrainerBuilder::model_dir().
A trained agent often consists of several neural networks, such as an action-value network, its target network, and a policy network. Typically, the Agent saves all of these networks in a directory.
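The "save on best evaluation reward" rule can be sketched as follows. The interval check and saving condition follow the description above; the struct and method names are illustrative assumptions, not the border_core implementation:

```rust
/// Tracks when the agent should be checkpointed during training.
struct Checkpointer {
    eval_interval: usize, // evaluate every this many optimization steps
    best_reward: f64,     // maximum evaluation reward seen so far
}

impl Checkpointer {
    fn new(eval_interval: usize) -> Self {
        Self { eval_interval, best_reward: f64::NEG_INFINITY }
    }

    /// Returns true if the agent should be saved after this optimization
    /// step, i.e. it is an evaluation step and the reward is a new maximum.
    fn should_save(&mut self, n_opts: usize, eval_reward: f64) -> bool {
        if n_opts % self.eval_interval != 0 {
            return false; // not an evaluation step
        }
        if eval_reward > self.best_reward {
            self.best_reward = eval_reward; // new best model
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut cp = Checkpointer::new(100);
    // Optimization step 100: first evaluation, so any reward is a new best.
    println!("save at step 100? {}", cp.should_save(100, 1.0));
}
```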
Implementations
impl<E: Env, A: Agent<E>> Trainer<E, A>
pub fn get_env_eval(&self) -> &E
Get the reference to the environment for evaluation.
pub fn train<T: Recorder>(&mut self, recorder: &mut T)
Train the agent.
In the training loop, the following values are recorded in the recorder:

- n_steps - The number of steps interacting with the environment.
- n_opts - The number of optimization steps.
- datetime - Date and time.
- mean_cum_eval_reward - Mean of the cumulative rewards over evaluation runs.
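A recorder of this kind can be pictured as a map from metric names to value histories. The real Recorder trait has its own interface; this HashMap-backed version is only an assumption for demonstration:

```rust
use std::collections::HashMap;

/// Toy recorder: stores a history of scalar values per metric name.
#[derive(Default)]
struct ToyRecorder {
    scalars: HashMap<String, Vec<f64>>,
}

impl ToyRecorder {
    /// Append one scalar value to the named metric's history.
    fn record(&mut self, name: &str, value: f64) {
        self.scalars.entry(name.to_string()).or_default().push(value);
    }
}

fn main() {
    let mut rec = ToyRecorder::default();
    // The kind of values Trainer::train() records at each evaluation:
    rec.record("n_steps", 1000.0);
    rec.record("n_opts", 250.0);
    rec.record("mean_cum_eval_reward", 195.5);
    println!("{:?}", rec.scalars.get("mean_cum_eval_reward"));
}
```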
Auto Trait Implementations
impl<E, A> !RefUnwindSafe for Trainer<E, A>
impl<E, A> Send for Trainer<E, A> where
A: Send,
E: Send,
<E as Env>::Obs: Send,
impl<E, A> !Sync for Trainer<E, A>
impl<E, A> Unpin for Trainer<E, A> where
A: Unpin,
E: Unpin,
<E as Env>::Obs: Unpin,
impl<E, A> UnwindSafe for Trainer<E, A> where
A: UnwindSafe,
E: UnwindSafe,
<E as Env>::Obs: UnwindSafe,
Blanket Implementations
impl<T> BorrowMut<T> for T where
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Pointable for T