pub trait Actor<O: ?Sized, A> {
    type EpisodeState;
    fn initial_state(&self, rng: &mut Prng) -> Self::EpisodeState;
    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &O,
        rng: &mut Prng
    ) -> A;
}
Take actions in an environment.
The actions may depend on the action-observation history within an episode
but not across episodes. This is managed with an explicit EpisodeState
associated type.
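As a rough illustration of the per-episode state pattern, the sketch below implements the trait for a toy actor and drives it for one episode. `EchoPrevious`, `run_episode`, and the stand-in `Prng` struct are hypothetical names introduced only for this example; the real `Prng` is defined elsewhere in the crate.

```rust
/// Stand-in for the crate's `Prng` type (assumption: the real type lives elsewhere).
pub struct Prng(pub u64);

/// The trait as declared above, repeated so the sketch compiles on its own.
pub trait Actor<O: ?Sized, A> {
    type EpisodeState;
    fn initial_state(&self, rng: &mut Prng) -> Self::EpisodeState;
    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &O,
        rng: &mut Prng,
    ) -> A;
}

/// Hypothetical actor: replies with the previous observation of the current
/// episode, or `default_action` on the first step.
pub struct EchoPrevious {
    pub default_action: i32,
}

impl Actor<i32, i32> for EchoPrevious {
    type EpisodeState = Option<i32>;

    fn initial_state(&self, _rng: &mut Prng) -> Self::EpisodeState {
        // Nothing has been observed yet at the start of an episode.
        None
    }

    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &i32,
        _rng: &mut Prng,
    ) -> i32 {
        let action = episode_state.unwrap_or(self.default_action);
        // Remember this observation for the next step of the same episode.
        *episode_state = Some(*observation);
        action
    }
}

/// Hypothetical driver: creates one fresh `EpisodeState` per episode.
pub fn run_episode<T: Actor<i32, i32>>(
    actor: &T,
    observations: &[i32],
    rng: &mut Prng,
) -> Vec<i32> {
    let mut state = actor.initial_state(rng);
    let mut actions = Vec::with_capacity(observations.len());
    for observation in observations {
        actions.push(actor.act(&mut state, observation, rng));
    }
    actions
}
```

Because `act` takes `&self`, anything that changes between steps has to live in the `EpisodeState` value, and that value is dropped when the episode ends.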
Design Discussion
Episode State
If Generic Associated Types were stable, an alternate strategy would be to have a self-contained EpisodeActor<'a> associated type with an act(&mut self, observation: &O) method. However, this would make it challenging to store an Actor together with its EpisodeActor (for example, when a single object should act over multiple sequential episodes). As such, the current EpisodeState strategy might still be preferable.
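One way the GAT-based alternative might look is sketched below. All names here (`GatActor`, `EpisodeAct`, `new_episode`, `Scaler`, `ScalerEpisode`) are hypothetical and not part of this crate, and the sketch assumes a toolchain with GAT support and an episode actor that borrows from its parent.

```rust
/// Hypothetical per-episode interface with the `act` signature described above.
pub trait EpisodeAct<O: ?Sized, A> {
    fn act(&mut self, observation: &O) -> A;
}

/// Hypothetical GAT-based counterpart of `Actor`.
pub trait GatActor<O: ?Sized, A> {
    type EpisodeActor<'a>: EpisodeAct<O, A>
    where
        Self: 'a;

    /// Start a new episode; the returned value may borrow from `self`.
    fn new_episode<'a>(&'a self) -> Self::EpisodeActor<'a>;
}

/// Toy actor: action = factor * observation + steps taken so far.
pub struct Scaler {
    pub factor: i32,
}

/// Episode-scoped actor holding a borrow of its parent `Scaler`.
pub struct ScalerEpisode<'a> {
    parent: &'a Scaler,
    steps: i32,
}

impl<'a> EpisodeAct<i32, i32> for ScalerEpisode<'a> {
    fn act(&mut self, observation: &i32) -> i32 {
        self.steps += 1;
        self.parent.factor * *observation + self.steps
    }
}

impl GatActor<i32, i32> for Scaler {
    type EpisodeActor<'a> = ScalerEpisode<'a>
    where
        Self: 'a;

    fn new_episode<'a>(&'a self) -> ScalerEpisode<'a> {
        ScalerEpisode { parent: self, steps: 0 }
    }
}
```

In this sketch, `ScalerEpisode<'a>` borrows the `Scaler`, so a struct holding both would be self-referential. That is one plausible reading of the storage difficulty mentioned above.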
Another strategy (allowed without GAT) is for the Actor to internally manage episode state and provide a reset() method for resetting between episodes. This lacks the benefit of being able to guarantee independence between episodes via the type system.
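A minimal sketch of the reset()-based strategy follows, showing how state can leak when the caller forgets to reset. `ResettableActor` and `StepCounter` are hypothetical names, and the method signatures are assumptions made for illustration.

```rust
/// Hypothetical trait for an actor that manages its own episode state.
pub trait ResettableActor<O: ?Sized, A> {
    /// Clear all per-episode state; the caller must remember to call this.
    fn reset(&mut self);
    fn act(&mut self, observation: &O) -> A;
}

/// Toy actor whose action is the number of steps taken since the last reset.
pub struct StepCounter {
    steps: u32,
}

impl StepCounter {
    pub fn new() -> Self {
        StepCounter { steps: 0 }
    }
}

impl ResettableActor<(), u32> for StepCounter {
    fn reset(&mut self) {
        self.steps = 0;
    }

    fn act(&mut self, _observation: &()) -> u32 {
        self.steps += 1;
        self.steps
    }
}

/// Runs two one-step "episodes" back to back, with or without a reset between them.
pub fn two_episodes(reset_between: bool) -> (u32, u32) {
    let mut actor = StepCounter::new();
    let first = actor.act(&());
    if reset_between {
        actor.reset();
    }
    let second = actor.act(&());
    (first, second)
}
```

Nothing in the types stops a caller from skipping `reset()`, in which case the second episode starts from the first episode's step count. With the EpisodeState design, starting an episode requires calling `initial_state`, so there is no separate reset step to forget.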
Random State
The actor is not responsible for managing its own pseudo-random state. This avoids having to frequently re-initialize the random number generator on each episode and simplifies episode state definitions.
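The sketch below illustrates the caller owning the random state: one generator is seeded once and passed by mutable reference through several episodes, while the episode state stays empty. The xorshift-based `Prng` is only a stand-in for the crate's real type, and `UniformActor` and `run_episodes` are hypothetical names.

```rust
/// Stand-in for the crate's `Prng` (assumption): a small xorshift64 generator.
pub struct Prng(u64);

impl Prng {
    pub fn new(seed: u64) -> Self {
        // xorshift needs a nonzero state.
        Prng(seed.max(1))
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// The trait as declared above, repeated so the sketch compiles on its own.
pub trait Actor<O: ?Sized, A> {
    type EpisodeState;
    fn initial_state(&self, rng: &mut Prng) -> Self::EpisodeState;
    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &O,
        rng: &mut Prng,
    ) -> A;
}

/// Hypothetical actor that picks one of `num_actions` actions at random.
pub struct UniformActor {
    pub num_actions: u64,
}

impl Actor<(), u64> for UniformActor {
    // No generator is stored here, so the episode state can be empty.
    type EpisodeState = ();

    fn initial_state(&self, _rng: &mut Prng) -> Self::EpisodeState {}

    fn act(&self, _episode_state: &mut (), _observation: &(), rng: &mut Prng) -> u64 {
        // Modulo is slightly biased; acceptable for a sketch.
        rng.next_u64() % self.num_actions
    }
}

/// Hypothetical driver: seeds the generator once and reuses it for every episode.
pub fn run_episodes(
    actor: &UniformActor,
    seed: u64,
    episodes: usize,
    steps: usize,
) -> Vec<Vec<u64>> {
    let mut rng = Prng::new(seed);
    let mut all_actions = Vec::with_capacity(episodes);
    for _ in 0..episodes {
        let mut state = actor.initial_state(&mut rng);
        let mut actions = Vec::with_capacity(steps);
        for _ in 0..steps {
            actions.push(actor.act(&mut state, &(), &mut rng));
        }
        all_actions.push(actions);
    }
    all_actions
}
```

With this layout, reproducing a run only requires the caller to reuse the same seed. The actor itself holds no generator to re-seed.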
Required Associated Types
type EpisodeState
Stores state for each episode.
Required Methods
fn initial_state(&self, rng: &mut Prng) -> Self::EpisodeState
Create the initial actor state for the start of a new episode.
fn act(
    &self,
    episode_state: &mut Self::EpisodeState,
    observation: &O,
    rng: &mut Prng
) -> A
Select an action in response to an observation.
May depend on and update the episode state.
The observation, the selected action, and any other internal state may be stored into episode_state.
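As an illustration of storing observations and actions in the episode state, the sketch below keeps a history of (observation, action) pairs and uses it to choose the next action. `History`, `Accumulate`, and the stand-in `Prng` struct are hypothetical names used only for this example.

```rust
/// Stand-in for the crate's `Prng` type (assumption: the real type lives elsewhere).
pub struct Prng(pub u64);

/// The trait as declared above, repeated so the sketch compiles on its own.
pub trait Actor<O: ?Sized, A> {
    type EpisodeState;
    fn initial_state(&self, rng: &mut Prng) -> Self::EpisodeState;
    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &O,
        rng: &mut Prng,
    ) -> A;
}

/// Hypothetical episode state: every (observation, action) pair seen so far.
#[derive(Default)]
pub struct History {
    pub steps: Vec<(i32, i32)>,
}

/// Toy actor: each action is the previous action plus the new observation.
pub struct Accumulate;

impl Actor<i32, i32> for Accumulate {
    type EpisodeState = History;

    fn initial_state(&self, _rng: &mut Prng) -> Self::EpisodeState {
        History::default()
    }

    fn act(
        &self,
        episode_state: &mut Self::EpisodeState,
        observation: &i32,
        _rng: &mut Prng,
    ) -> i32 {
        // Read the last recorded action (0 at the start of an episode).
        let previous_action = episode_state.steps.last().map_or(0, |&(_, action)| action);
        let action = previous_action + *observation;
        // Record both the observation and the chosen action.
        episode_state.steps.push((*observation, action));
        action
    }
}
```

A caller would create the state with `initial_state`, pass it mutably to each `act` call, and could inspect `steps` once the episode ends.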