Module vrp_core::algorithms::mdp

This module contains the definition of a Markov Decision Process (MDP) model and related reinforcement learning logic.

Structs

ActionEstimates	Keeps track of action estimates.
EpsilonGreedy	An e-greedy action selection strategy which acts greedily, except that it selects a random action with the specified probability.
EpsilonWeighted	An e-weighted action selection strategy.
Greedy	A greedy strategy.
MonteCarlo	Applies the Monte Carlo learning strategy to calculate values for taken actions.
QLearning	Applies q-learning strategy to calculate values for taken actions.
Simulator	A simulator to train an agent over multiple episodes.
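
To make the selection and learning strategies listed above concrete, here is a minimal, self-contained sketch of greedy selection, e-greedy selection, and the standard Q-learning update rule. It does not use this module's API: the `Action` enum, the function names, and the caller-supplied random values (`draw`, `index`) are all assumptions made for illustration, and the crate's actual signatures may differ.

```rust
/// Hypothetical action type for this sketch (not the crate's own type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Action {
    Left,
    Right,
}

/// Greedy selection: returns the action with the highest estimate,
/// or `None` when there are no estimates.
fn greedy(estimates: &[(Action, f64)]) -> Option<Action> {
    estimates
        .iter()
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(action, _)| *action)
}

/// E-greedy selection: explores with probability `epsilon`, otherwise acts greedily.
/// `draw` is assumed to be a uniform sample from [0, 1) and `index` an arbitrary
/// random integer, both supplied by the caller so the sketch needs no RNG crate.
fn epsilon_greedy(
    estimates: &[(Action, f64)],
    epsilon: f64,
    draw: f64,
    index: usize,
) -> Option<Action> {
    if estimates.is_empty() {
        return None;
    }
    if draw < epsilon {
        // Exploration: pick an action by index, ignoring its estimate.
        Some(estimates[index % estimates.len()].0)
    } else {
        // Exploitation: pick the best-known action.
        greedy(estimates)
    }
}

/// Standard Q-learning update:
/// Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a)).
fn q_learning_update(old: f64, reward: f64, next_max: f64, alpha: f64, gamma: f64) -> f64 {
    old + alpha * (reward + gamma * next_max - old)
}
```

Passing the random values in as arguments keeps the sketch deterministic; a real implementation would typically draw them from a random number generator.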

Traits

Agent	Represents an agent in MDP.
LearningStrategy	A learning strategy for the MDP.
PolicyStrategy	A policy strategy for MDP.
State	Represents a state in MDP.

Type Definitions

A type which keeps track of all state-action estimates.
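
One way to picture a table of state-action estimates is as a map from states to per-action values. The sketch below assumes such a nested map and applies a constant-alpha, every-visit Monte Carlo update over one finished episode. The names `StateId`, `ActionName`, `EstimateTable`, and `monte_carlo_update` are hypothetical and are not taken from this crate, and the crate's own learning strategy may compute its values differently.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the state and action types.
type StateId = u32;
type ActionName = &'static str;

/// Assumed layout: state -> (action -> estimated value).
type EstimateTable = HashMap<StateId, HashMap<ActionName, f64>>;

/// Constant-alpha, every-visit Monte Carlo update.
/// `episode` lists (state, action, reward) steps in the order they were taken.
fn monte_carlo_update(
    estimates: &mut EstimateTable,
    episode: &[(StateId, ActionName, f64)],
    alpha: f64,
    gamma: f64,
) {
    // Discounted return, accumulated from the last step backwards.
    let mut ret = 0.0;
    for &(state, action, reward) in episode.iter().rev() {
        ret = reward + gamma * ret;
        // Unseen state-action pairs start from an estimate of 0.0.
        let value = estimates
            .entry(state)
            .or_default()
            .entry(action)
            .or_insert(0.0);
        // Move the estimate a fraction `alpha` towards the observed return.
        *value += alpha * (ret - *value);
    }
}
```

Unlike Q-learning, which updates after every step from the next state's best estimate, this update waits until the episode ends and uses the full observed return.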