Module relearn::envs

Reinforcement learning environments

Structs

A multi-armed bandit

Chain Environment

Random distribution over MDPs with Dirichlet sampled transition probabilities.

Wrapper that provides the inner environment structure of a meta environment.

Memory Game Environment

A meta reinforcement learning environment with internal state.

Configuration for MetaEnv

A meta reinforcement learning environment that treats RL itself as an environment.

The state of a MetaPomdp.

Distribution over deterministic bandits in which one arm has reward 1 and the rest have 0.

Wraps a Pomdp as an Environment.

Environment wrapper that cuts off episodes after a set number of steps.

Stored copy of an environment structure.

A distribution over Bernoulli bandit environments with uniformly sampled means.

A basic wrapped object.
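
As an illustration of the "distribution over deterministic bandits in which one arm has reward 1 and the rest have 0" listed above, here is a minimal self-contained sketch. It does not use the relearn API: the names `Lcg`, `DeterministicBandit`, `OneHotBandits`, `pull`, and `sample` are hypothetical, and the tiny generator only stands in for a real random number source.

```rust
/// Tiny linear congruential generator (hypothetical stand-in for a real RNG).
pub struct Lcg(pub u64);

impl Lcg {
    /// Return a pseudo-random index in `0..n` (slightly biased; fine for a sketch).
    pub fn next_index(&mut self, n: usize) -> usize {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 33) as usize) % n
    }
}

/// A bandit whose arms each pay a fixed reward (hypothetical type).
pub struct DeterministicBandit {
    rewards: Vec<f64>,
}

impl DeterministicBandit {
    /// Reward for pulling `arm`; always the same value for the same arm.
    pub fn pull(&self, arm: usize) -> f64 {
        self.rewards[arm]
    }

    pub fn num_arms(&self) -> usize {
        self.rewards.len()
    }
}

/// Distribution over bandits in which exactly one arm pays 1 and the rest pay 0.
pub struct OneHotBandits {
    pub num_arms: usize,
}

impl OneHotBandits {
    /// Sample one bandit by choosing which arm is the rewarding one.
    pub fn sample(&self, rng: &mut Lcg) -> DeterministicBandit {
        assert!(self.num_arms > 0, "need at least one arm");
        let goal = rng.next_index(self.num_arms);
        let mut rewards = vec![0.0; self.num_arms];
        rewards[goal] = 1.0;
        DeterministicBandit { rewards }
    }
}
```

Every sampled bandit shares the same external structure (the number of arms) and differs only in which arm is rewarding, which is what makes such a distribution usable as a meta-learning task family.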

Enums

Error building an environment

Traits

Build a Pomdp.

Marker indicating that this object can build itself by cloning.

A distribution of environments sharing the same external structure.

The external structure of a reinforcement learning environment.

A reinforcement learning environment with internal state.

Convert into an Environment.

A Markov decision process (MDP).

A partially observable Markov decision process (POMDP).

A distribution of Pomdp environments sharing the same external structure.

Type Definitions

A multi-armed bandit where each arm samples from a Bernoulli distribution.

A multi-armed bandit where each arm has a deterministic distribution.

Meta-environment observation space. See MetaPomdp for details.

Wrap an environment with a per-episode step limit.
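
The idea of a per-episode step limit can be sketched as a wrapper that counts steps and forces the episode to end once the count reaches a maximum. This is a simplified illustration under assumed names, not the relearn implementation: the `Env` trait, the `Endless` toy environment, the `StepLimit` struct, and the `episode_length` helper are all hypothetical.

```rust
/// Hypothetical minimal environment interface (not the relearn traits).
pub trait Env {
    /// Reset to the start of a new episode.
    fn reset(&mut self);
    /// Take one step; returns `(reward, episode_done)`.
    fn step(&mut self, action: usize) -> (f64, bool);
}

/// Toy environment that never ends on its own; the reward equals the action index.
pub struct Endless;

impl Env for Endless {
    fn reset(&mut self) {}

    fn step(&mut self, action: usize) -> (f64, bool) {
        (action as f64, false)
    }
}

/// Wrapper that cuts off an episode after `max_steps` steps.
pub struct StepLimit<E> {
    inner: E,
    max_steps: u64,
    steps_taken: u64,
}

impl<E: Env> StepLimit<E> {
    pub fn new(inner: E, max_steps: u64) -> Self {
        Self { inner, max_steps, steps_taken: 0 }
    }
}

impl<E: Env> Env for StepLimit<E> {
    fn reset(&mut self) {
        // The step counter is per episode, so it restarts on reset.
        self.steps_taken = 0;
        self.inner.reset();
    }

    fn step(&mut self, action: usize) -> (f64, bool) {
        let (reward, done) = self.inner.step(action);
        self.steps_taken += 1;
        // End the episode if the inner environment ended or the limit was hit.
        (reward, done || self.steps_taken >= self.max_steps)
    }
}

/// Run one episode with a fixed action and return the number of steps taken.
pub fn episode_length<E: Env>(env: &mut E, action: usize) -> u64 {
    env.reset();
    let mut steps = 0;
    loop {
        steps += 1;
        if env.step(action).1 {
            return steps;
        }
    }
}
```

Because the wrapper implements the same trait as the environment it wraps, it can be stacked with other wrappers or passed anywhere the inner environment is accepted. Note that this sketch does not distinguish a genuine terminal state from a cutoff; a real implementation may need to report the two differently.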