Reinforcement learning environments

Re-exports

pub use meta::MetaEnv;

Modules

Fruit collection gridworlds.

Meta reinforcement learning environment.

Structs

A multi-armed bandit

Cart-Pole environment

Configuration for the CartPole environment.

Chain Environment

Random distribution over MDPs with Dirichlet sampled transition probabilities.

Wraps a two-player game as a one-player game for the first player.

Cooperative two-agent fruit collecting game.

Environment wrapper that interrupts episodes after a set number of steps.

Memory Game Environment

Distribution over deterministic bandits in which one arm has reward 1 and the rest have 0.

Environment where the goal is to partition vectors based on supervision

Wraps a two-player game as a one-player game for the second player.

Stored copy of an environment structure.

A distribution over Bernoulli bandit environments with uniformly sampled means.

Environment wrapper that interrupts episodes after a set number of steps.

A basic wrapped object.

Enums

Error building an environment

The successor state or outcome of an episode step.
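
To make the idea of a step's successor concrete, the following is a minimal sketch of how such an enum could be shaped. The names `Step`, `PartialStep`, `episode_done`, and `into_partial`, as well as the three variants, are hypothetical choices for illustration and are not taken from this crate's API.

```rust
/// Hypothetical successor type; all names here are assumptions for illustration.
/// `S` is the full state type, `C` is what gets stored when the episode continues.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Step<S, C = S> {
    /// The episode continues from the stored value.
    Continue(C),
    /// The episode ended in a terminal state.
    Terminate,
    /// The episode was cut off; the state it would have continued from is kept.
    Interrupt(S),
}

/// Variant that stores nothing when the episode continues.
pub type PartialStep<S> = Step<S, ()>;

impl<S, C> Step<S, C> {
    /// Whether the episode is over after this step.
    pub fn episode_done(&self) -> bool {
        !matches!(self, Step::Continue(_))
    }

    /// Drop the continuing value while keeping the kind of outcome.
    pub fn into_partial(self) -> PartialStep<S> {
        match self {
            Step::Continue(_) => Step::Continue(()),
            Step::Terminate => Step::Terminate,
            Step::Interrupt(s) => Step::Interrupt(s),
        }
    }
}
```

Parameterising the continuing payload separately lets one enum cover an owned successor, a borrowed one (`C = &S`), or none at all (`C = ()`).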

Traits

Marker indicating that this object can build itself by cloning.

A distribution of environments sharing the same external structure.

The external structure of a reinforcement learning environment.

A reinforcement learning environment.

A Partially Observable Markov Decision Process

Marker trait for a wrapper that does not modify the environment structure.

An environment distribution with consistent EnvStructure.

A reinforcement learning Environment with consistent EnvStructure.

Trait providing a wrap method for all sized types.

Type Definitions

A multi-armed bandit where each arm samples from a Bernoulli distribution.

A multi-armed bandit where each arm has a deterministic distribution.

A successor that does not store the successor state if continuing.

A successor that only stores a reference to the successor state if continuing.

Wrap an environment with a per-episode step limit.

Wrap an environment with a per-episode step limit.
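
As an illustration of a per-episode step limit, here is a minimal sketch of a wrapper that cuts an episode off after a fixed number of steps. The trait `SimpleEnv`, the types `Endless` and `StepLimit`, and the helper `episode_length` are all hypothetical and much simpler than this crate's traits. In particular, the sketch folds interruption into a single `done` flag instead of reporting it separately from termination.

```rust
/// Hypothetical minimal environment interface (not this crate's trait).
pub trait SimpleEnv {
    /// Reset to the start of a new episode.
    fn reset(&mut self);
    /// Take one step; returns (reward, episode_done).
    fn step(&mut self, action: usize) -> (f64, bool);
}

/// Hypothetical environment that never ends an episode on its own.
pub struct Endless;

impl SimpleEnv for Endless {
    fn reset(&mut self) {}
    fn step(&mut self, _action: usize) -> (f64, bool) {
        (1.0, false)
    }
}

/// Wrapper that ends the episode once `max_steps` steps have been taken.
pub struct StepLimit<E> {
    inner: E,
    max_steps: u64,
    steps: u64,
}

impl<E: SimpleEnv> StepLimit<E> {
    pub fn new(inner: E, max_steps: u64) -> Self {
        Self { inner, max_steps, steps: 0 }
    }
}

impl<E: SimpleEnv> SimpleEnv for StepLimit<E> {
    fn reset(&mut self) {
        // The step counter is per episode, so it restarts with the episode.
        self.steps = 0;
        self.inner.reset();
    }

    fn step(&mut self, action: usize) -> (f64, bool) {
        let (reward, done) = self.inner.step(action);
        self.steps += 1;
        // Done if the inner environment finished or the limit was reached.
        (reward, done || self.steps >= self.max_steps)
    }
}

/// Run one episode with a fixed action and return the number of steps taken.
pub fn episode_length<E: SimpleEnv>(env: &mut E) -> u64 {
    env.reset();
    let mut n = 0;
    loop {
        n += 1;
        if env.step(0).1 {
            return n;
        }
    }
}
```

Because the wrapper implements the same trait as the environment it wraps, it can be used wherever the unwrapped environment is accepted.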