Module vrp_core::algorithms::mdp

This module contains the definition of a Markov Decision Process (MDP) model and related reinforcement learning logic.

Structs

ActionEstimates

Keeps track of action estimates.

EpsilonGreedy

An epsilon-greedy action selection strategy which acts greedily, except that it selects a random action with the specified probability.
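The epsilon-greedy rule described above can be illustrated with a minimal, standalone sketch. It does not use this crate's `EpsilonGreedy` type or its API; the function name `epsilon_greedy` and the two caller-supplied random samples (`explore_sample`, `pick_sample`) are assumptions made to keep the example dependency-free and deterministic.

```rust
/// Picks an action index from `estimates` using an epsilon-greedy rule.
///
/// `explore_sample` and `pick_sample` are uniform random numbers in [0, 1)
/// supplied by the caller. They are hypothetical parameters of this sketch,
/// not part of the crate's API.
fn epsilon_greedy(
    estimates: &[f64],
    epsilon: f64,
    explore_sample: f64,
    pick_sample: f64,
) -> Option<usize> {
    if estimates.is_empty() {
        return None;
    }

    if explore_sample < epsilon {
        // Explore: map the second sample onto a uniformly chosen index.
        let idx = (pick_sample * estimates.len() as f64) as usize;
        return Some(idx.min(estimates.len() - 1));
    }

    // Exploit: index of the highest estimate (the first one wins on ties).
    let mut best = 0;
    for (i, value) in estimates.iter().enumerate() {
        if *value > estimates[best] {
            best = i;
        }
    }
    Some(best)
}
```

With `epsilon = 0.0` the function behaves like a purely greedy selection, and with `epsilon = 1.0` it always explores.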

EpsilonWeighted

An epsilon-weighted action selection strategy.

Greedy

A greedy strategy.

MonteCarlo

Applies the Monte Carlo learning strategy to calculate values for taken actions.
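As a rough illustration of Monte Carlo value estimation in general, the sketch below computes discounted returns for one finished episode and moves an estimate toward an observed return with a constant step size. This is the textbook constant-alpha form, not necessarily what `MonteCarlo` does internally; the names `discounted_returns` and `monte_carlo_update` and the parameters `alpha` and `gamma` are assumptions of this example.

```rust
/// Computes the discounted return G_t for every step of a finished episode,
/// walking the rewards backwards: G_t = r_t + gamma * G_{t+1}.
fn discounted_returns(rewards: &[f64], gamma: f64) -> Vec<f64> {
    let mut returns = vec![0.0; rewards.len()];
    let mut g = 0.0;
    for i in (0..rewards.len()).rev() {
        g = rewards[i] + gamma * g;
        returns[i] = g;
    }
    returns
}

/// Moves an existing estimate toward the observed episode return
/// by a step of size `alpha` (hypothetical helper for this sketch).
fn monte_carlo_update(old: f64, episode_return: f64, alpha: f64) -> f64 {
    old + alpha * (episode_return - old)
}
```

Unlike a temporal-difference method, this update needs the whole episode before any estimate can change, because each return depends on all later rewards.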

QLearning

Applies the Q-learning strategy to calculate values for taken actions.
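For reference, the standard Q-learning update can be sketched as a single function. It shows the general rule Q(s, a) ← Q(s, a) + alpha * (r + gamma * max Q(s', a') - Q(s, a)), not the signature or internals of the `QLearning` struct; the function name and all parameter names are assumptions of this example.

```rust
/// Returns the updated estimate for the action just taken.
///
/// `next_estimates` holds the estimates of the actions available in the
/// next state. An empty slice is treated as a terminal state with value 0.
fn q_learning_update(
    old: f64,
    reward: f64,
    next_estimates: &[f64],
    alpha: f64,
    gamma: f64,
) -> f64 {
    let max_next = if next_estimates.is_empty() {
        0.0
    } else {
        next_estimates
            .iter()
            .copied()
            .fold(f64::NEG_INFINITY, f64::max)
    };

    // Temporal-difference step toward the bootstrapped target.
    old + alpha * (reward + gamma * max_next - old)
}
```

Here `alpha` is the learning rate and `gamma` the discount factor; the update can be applied after every step, without waiting for the episode to end.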

Simulator

A simulator which trains an agent over multiple episodes.

Traits

Agent

Represents an agent in the MDP.

LearningStrategy

A learning strategy for the MDP.

PolicyStrategy

A policy strategy for the MDP.

State

Represents a state in the MDP.

Type Definitions

StateEstimates

A type which keeps track of all state-action estimates.