Module relearn::envs

Reinforcement learning environments

Re-exports

pub use meta::MetaEnv;

Modules

fruit

Fruit collection gridworlds.

Structs

Bandit

A multi-armed bandit

CartPole

Cart-Pole environment

CartPoleConfig

Configuration for the CartPole environment.

Chain

Chain Environment

DirichletRandomMdps

Random distribution over MDPs with Dirichlet sampled transition probabilities.

FirstPlayerView

Wraps a two-player game as a one-player game for the first player.

FruitGame

Cooperative two-agent fruit collecting game.

LatentStepLimit

Environment wrapper that interrupts episodes after a set number of steps.

MemoryGame

Memory Game Environment

OneHotBandits

Distribution over deterministic bandits in which one arm has reward 1 and the rest have 0.

PartitionGame

Environment where the goal is to partition vectors based on supervision

SecondPlayerView

Wraps a two-player game as a one-player game for the second player.

StoredEnvStructure

Stored copy of an environment structure.

UniformBernoulliBandits

A distribution over Bernoulli bandit environments with uniformly sampled means.

VisibleStepLimit

Environment wrapper that interrupts episodes after a set number of steps.

Wrapped

A basic wrapped object.
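
The step-limit wrappers listed above interrupt an episode after a set number of steps. The following is a minimal sketch of that idea only, not the relearn implementation: `Outcome`, `ToyEnv`, `Endless`, and `StepLimit` are hypothetical names introduced for illustration, and the real `LatentStepLimit` and `VisibleStepLimit` types may differ in interface and behavior.

```rust
/// Outcome of one step (hypothetical; not the relearn `Successor` type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Outcome {
    Continue,
    Terminate,
    Interrupt,
}

/// Minimal hypothetical environment interface.
trait ToyEnv {
    fn reset(&mut self);
    /// Take one step and return the reward and the outcome.
    fn step(&mut self, action: usize) -> (f64, Outcome);
}

/// Toy environment that never ends on its own.
struct Endless;

impl ToyEnv for Endless {
    fn reset(&mut self) {}

    fn step(&mut self, _action: usize) -> (f64, Outcome) {
        (1.0, Outcome::Continue)
    }
}

/// Wrapper that interrupts an episode once `max_steps` steps have been taken.
struct StepLimit<E> {
    inner: E,
    max_steps: u64,
    steps: u64,
}

impl<E> StepLimit<E> {
    fn new(inner: E, max_steps: u64) -> Self {
        Self { inner, max_steps, steps: 0 }
    }
}

impl<E: ToyEnv> ToyEnv for StepLimit<E> {
    fn reset(&mut self) {
        self.steps = 0;
        self.inner.reset();
    }

    fn step(&mut self, action: usize) -> (f64, Outcome) {
        let (reward, outcome) = self.inner.step(action);
        self.steps += 1;
        // Only a continuing episode is cut short; a natural termination is passed through.
        if outcome == Outcome::Continue && self.steps >= self.max_steps {
            (reward, Outcome::Interrupt)
        } else {
            (reward, outcome)
        }
    }
}
```

Keeping `Interrupt` distinct from `Terminate` lets a learner treat a cut-off episode differently from one that ended naturally, for example by still estimating future value from the last state.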

Enums

BuildEnvError

Error building an environment

Successor

The successor state or outcome of an episode step.

Traits

BuildEnv

Build a StructuredEnvironment.

BuildEnvDist

Build an EnvDistribution.

CloneBuild

Marker indicating that this object can build itself by cloning.

EnvDistribution

A distribution of Environment sharing the same external structure.

EnvStructure

The external structure of a reinforcement learning environment.

Environment

A reinforcement learning environment.

Pomdp

A Partially Observable Markov Decision Process

PomdpStructure

StructurePreservingWrapper

Marker trait for a wrapper that does not modify the environment structure.

StructuredEnvDist

An environment distribution with consistent EnvStructure.

StructuredEnvironment

A reinforcement learning Environment with consistent EnvStructure.

Wrap

Trait providing a wrap method for all sized types.
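
A `wrap` method available on all sized types is commonly provided through an extension trait with a blanket implementation. The sketch below illustrates that pattern under assumptions: `ToyWrap`, `ToyWrapped`, `MaxSteps`, and the field names are hypothetical and are not taken from the relearn source.

```rust
/// Hypothetical wrapped object pairing an inner value with a wrapper.
#[derive(Debug, Clone, PartialEq)]
struct ToyWrapped<T, W> {
    inner: T,
    wrapper: W,
}

/// Extension trait that adds a `wrap` method.
trait ToyWrap: Sized {
    /// Consume `self` and pair it with `wrapper`.
    fn wrap<W>(self, wrapper: W) -> ToyWrapped<Self, W> {
        ToyWrapped { inner: self, wrapper }
    }
}

// Blanket implementation: every sized type gets `wrap`.
impl<T> ToyWrap for T {}

/// Hypothetical wrapper configuration.
#[derive(Debug, Clone, PartialEq)]
struct MaxSteps(u64);

/// Example use: wrap a stand-in environment value with a step limit setting.
fn demo() -> ToyWrapped<&'static str, MaxSteps> {
    "cart-pole".wrap(MaxSteps(200))
}
```

With this pattern, wrappers can be chained in method-call style instead of nesting constructor calls.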

Type Definitions

BernoulliBandit

A multi-armed bandit where each arm samples from a Bernoulli distribution.

DeterministicBandit

A multi-armed bandit where each arm has a deterministic distribution.

PartialSuccessor

A successor that does not store the successor state if continuing.

RefSuccessor

A successor that only stores a reference to the successor state if continuing.

WithLatentStepLimit

Wrap an environment with a per-episode step limit.

WithVisibleStepLimit

Wrap an environment with a per-episode step limit.
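
The successor aliases above differ only in what they store when an episode continues. One way to express that in Rust is a second type parameter with a default, as in the sketch below. This is an assumption about the general shape, not the relearn definition: `ToySuccessor`, its variants, and the alias names are hypothetical.

```rust
/// Hypothetical successor type.
/// `T` is the owned state type; `U` is what is stored when the episode continues.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToySuccessor<T, U = T> {
    /// The episode continues with the given successor.
    Continue(U),
    /// The episode ended naturally.
    Terminate,
    /// The episode was cut off; the state it would have continued from is kept.
    Interrupt(T),
}

/// Stores nothing for a continuing step.
type ToyPartialSuccessor<T> = ToySuccessor<T, ()>;

/// Stores only a reference for a continuing step.
type ToyRefSuccessor<'a, T> = ToySuccessor<T, &'a T>;

impl<T, U> ToySuccessor<T, U> {
    /// Drop the continuing state, keeping only the kind of outcome.
    fn into_partial(self) -> ToyPartialSuccessor<T> {
        match self {
            ToySuccessor::Continue(_) => ToySuccessor::Continue(()),
            ToySuccessor::Terminate => ToySuccessor::Terminate,
            ToySuccessor::Interrupt(state) => ToySuccessor::Interrupt(state),
        }
    }
}
```

A partial form like this is useful when the next state is already stored elsewhere, such as the following entry of a trajectory buffer.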