Reinforcement learning environments
Re-exports
pub use meta::MetaEnv;
Modules
Structs
A multi-armed bandit
Cart-Pole environment
Configuration for the CartPole environment.
Chain Environment
Random distribution over MDPs with Dirichlet sampled transition probabilities.
Wraps a two-player game as a one-player game for the first player.
Cooperative two-agent fruit collecting game.
Environment wrapper that interrupts episodes after a set number of steps.
Memory Game Environment
Distribution over deterministic bandits in which one arm has reward 1 and the rest have 0.
Environment where the goal is to partition vectors based on supervision
Wraps a two-player game as a one-player game for the second player.
Stored copy of an environment structure.
A distribution over Bernoulli bandit environments with uniformly sampled means.
Environment wrapper that interrupts episodes after a set number of steps.
A basic wrapped object.
Enums
Error building an environment
The successor state or outcome of an episode step.
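The "successor state or outcome of an episode step" can be pictured as a three-way enum. The following is a minimal sketch, not the crate's actual definition: the variant names (`Continue`, `Terminate`, `Interrupt`), the second type parameter, the helper methods, and the two alias names are all assumptions chosen to match the descriptions on this page, including the two successor type definitions that avoid storing an owned next state.

```rust
/// Hypothetical sketch of a step outcome; names are assumptions, not the crate's API.
///
/// `S` is the state type; `U` is what gets stored when the episode continues.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Successor<S, U = S> {
    /// The episode continues from the stored value.
    Continue(U),
    /// The episode ended for good; no further reward is possible.
    Terminate,
    /// The episode was cut short; it could have continued from this state.
    Interrupt(S),
}

impl<S, U> Successor<S, U> {
    /// True if this step ends the episode, by termination or interruption.
    pub fn episode_done(&self) -> bool {
        !matches!(self, Successor::Continue(_))
    }

    /// Discard the continuing state, keeping only the kind of outcome.
    pub fn into_partial(self) -> Successor<S, ()> {
        match self {
            Successor::Continue(_) => Successor::Continue(()),
            Successor::Terminate => Successor::Terminate,
            Successor::Interrupt(s) => Successor::Interrupt(s),
        }
    }
}

/// Assumed alias: does not store the successor state if continuing.
pub type PartialSuccessor<S> = Successor<S, ()>;

/// Assumed alias: stores only a reference to the successor state if continuing.
pub type RefSuccessor<'a, S> = Successor<S, &'a S>;
```

Separating `Terminate` from `Interrupt` matters for value estimation: an interrupted episode still has a well-defined next state whose future value is nonzero, while a terminated one does not.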
Traits
Build a StructuredEnvironment.
Build an EnvDistribution.
Marker indicating that this object can build itself by cloning.
A distribution of Environment sharing the same external structure.
The external structure of a reinforcement learning environment.
A reinforcement learning environment.
A Partially Observable Markov Decision Process
Marker trait for a wrapper that does not modify the environment structure.
An environment distribution with consistent EnvStructure.
A reinforcement learning Environment with consistent EnvStructure.
Trait providing a wrap method for all sized types.
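A trait that provides a `wrap` method for all sized types is typically written as an extension trait with a blanket implementation. The sketch below illustrates that pattern under assumptions: the struct name `Wrapped`, its field names, and the exact signature of `wrap` are hypothetical and may differ from the crate's own definitions.

```rust
/// Hypothetical "basic wrapped object": an inner value plus a wrapper configuration.
#[derive(Debug, Clone, PartialEq)]
pub struct Wrapped<T, W> {
    pub inner: T,
    pub wrapper: W,
}

/// Extension trait adding a `wrap` method (signature is an assumption).
pub trait Wrap: Sized {
    /// Pair `self` with a wrapper value.
    fn wrap<W>(self, wrapper: W) -> Wrapped<Self, W> {
        Wrapped { inner: self, wrapper }
    }
}

/// Blanket implementation: every sized type gets `wrap` for free.
impl<T> Wrap for T {}
```

With the blanket implementation in scope, any value can be wrapped in method-call style, for example `5u32.wrap("limit")`, which keeps chains of wrappers readable from left to right.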
Type Definitions
A multi-armed bandit where each arm samples from a Bernoulli distribution.
A multi-armed bandit where each arm has a deterministic distribution.
A successor that does not store the successor state if continuing.
A successor that only stores a reference to the successor state if continuing.
Wrap an environment with a per-episode step limit.
Wrap an environment with a per-episode step limit.
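To illustrate what a per-episode step limit does, here is a minimal self-contained sketch. It does not use the crate's real traits: `Env`, `Outcome`, `StepLimited`, `Counter`, and `run_episode` are hypothetical names introduced for this example, and the environment is deterministic so that no random number generator is needed.

```rust
/// Outcome of one step (hypothetical, simplified).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome<S> {
    Continue(S),
    Terminate,
    Interrupt(S),
}

/// Minimal deterministic environment interface (an assumption for this sketch).
pub trait Env {
    type State: Copy;
    type Action;
    fn initial_state(&self) -> Self::State;
    fn step(&self, state: Self::State, action: &Self::Action) -> (Outcome<Self::State>, f64);
}

/// Wrapper that interrupts any episode still running after `max_steps` steps.
pub struct StepLimited<E> {
    pub inner: E,
    pub max_steps: u64,
}

impl<E: Env> Env for StepLimited<E> {
    /// Inner state paired with the number of steps taken so far.
    type State = (E::State, u64);
    type Action = E::Action;

    fn initial_state(&self) -> Self::State {
        (self.inner.initial_state(), 0)
    }

    fn step(&self, state: Self::State, action: &Self::Action) -> (Outcome<Self::State>, f64) {
        let (inner_state, steps) = state;
        let (outcome, reward) = self.inner.step(inner_state, action);
        let steps = steps + 1;
        let outcome = match outcome {
            // Convert a continuing step into an interruption once the limit is hit.
            Outcome::Continue(s) if steps >= self.max_steps => Outcome::Interrupt((s, steps)),
            Outcome::Continue(s) => Outcome::Continue((s, steps)),
            Outcome::Terminate => Outcome::Terminate,
            Outcome::Interrupt(s) => Outcome::Interrupt((s, steps)),
        };
        (outcome, reward)
    }
}

/// Toy environment: a counter that never terminates and pays reward 1 per step.
pub struct Counter;

impl Env for Counter {
    type State = u32;
    type Action = ();
    fn initial_state(&self) -> u32 {
        0
    }
    fn step(&self, state: u32, _action: &()) -> (Outcome<u32>, f64) {
        (Outcome::Continue(state + 1), 1.0)
    }
}

/// Run one episode with a fixed action; returns (steps taken, total reward).
pub fn run_episode<E: Env>(env: &E, action: &E::Action) -> (u64, f64) {
    let mut state = env.initial_state();
    let (mut steps, mut total) = (0u64, 0.0);
    loop {
        let (outcome, reward) = env.step(state, action);
        steps += 1;
        total += reward;
        match outcome {
            Outcome::Continue(next) => state = next,
            Outcome::Terminate | Outcome::Interrupt(_) => return (steps, total),
        }
    }
}
```

In this sketch the step count lives in the wrapped state rather than in the wrapper itself, so the wrapper stays immutable and can be shared. Whether the real wrappers expose the count to the agent or keep it hidden is not stated on this page.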