# Rurel
Rurel is a flexible, reusable reinforcement learning (Q learning) implementation in Rust.
In `Cargo.toml`:

```toml
[dependencies]
rurel = "0.3.0"
```
An example is included. It teaches an agent on a 21x21 grid how to arrive at (10, 10), using the actions go left, go up, go right, and go down:

```shell
cargo run --example eucdist
```
## Getting started
There are two main traits you need to implement: `rurel::mdp::State` and `rurel::mdp::Agent`.
A `State` is something which defines a `Vec` of actions that can be taken from this state, and has a certain reward. A `State` needs to define the corresponding action type `A`.
An `Agent` is something which has a current state, and given an action, can take the action and evaluate the next state.
## Example
Let's implement the example behind `cargo run --example eucdist`. We want to make an agent which is taught how to arrive at (10, 10) on a 21x21 grid.
First, let's define a State
, which should represent a position on a 21x21, and the correspoding Action, which is either up, down, left or right.
```rust
use rurel::mdp::State;

#[derive(PartialEq, Eq, Hash, Clone)]
struct MyState { x: i32, y: i32 }
#[derive(PartialEq, Eq, Hash, Clone)]
struct MyAction { dx: i32, dy: i32 }

impl State for MyState {
    type A = MyAction;
    fn reward(&self) -> f64 {
        // Negative Euclidean distance to (10, 10)
        let (dx, dy) = (10 - self.x, 10 - self.y);
        -(((dx * dx + dy * dy) as f64).sqrt())
    }
    fn actions(&self) -> Vec<MyAction> {
        vec![
            MyAction { dx: 0, dy: -1 }, // up
            MyAction { dx: 0, dy: 1 },  // down
            MyAction { dx: -1, dy: 0 }, // left
            MyAction { dx: 1, dy: 0 },  // right
        ]
    }
}
```
Then define the agent:
```rust
use rurel::mdp::Agent;

struct MyAgent { state: MyState }

impl Agent<MyState> for MyAgent {
    fn current_state(&self) -> &MyState {
        &self.state
    }
    fn take_action(&mut self, action: &MyAction) {
        // Move on the grid, wrapping around at the edges
        self.state = MyState {
            x: (((self.state.x + action.dx) % 21) + 21) % 21, // (x + dx) mod 21
            y: (((self.state.y + action.dy) % 21) + 21) % 21, // (y + dy) mod 21
        }
    }
}
```
That's all. Now make a trainer and train the agent with Q learning, with learning rate 0.2, discount factor 0.01 and an initial Q value of 2.0. We let the trainer run for 100,000 iterations, randomly exploring new states.
```rust
use rurel::AgentTrainer;
use rurel::strategy::learn::QLearning;
use rurel::strategy::explore::RandomExploration;
use rurel::strategy::terminate::FixedIterations;

let mut trainer = AgentTrainer::new();
let mut agent = MyAgent { state: MyState { x: 0, y: 0 } };
trainer.train(
    &mut agent,
    &QLearning::new(0.2, 0.01, 2.),
    &mut FixedIterations::new(100_000),
    &RandomExploration::new(),
);
```
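Each iteration applies the standard Q-learning backup with these parameters. A minimal sketch of that update rule, for intuition only (the `q_update` function below is illustrative, not part of rurel's API):

```rust
/// One Q-learning backup:
/// Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
fn q_update(q_sa: f64, reward: f64, max_q_next: f64, alpha: f64, gamma: f64) -> f64 {
    q_sa + alpha * (reward + gamma * max_q_next - q_sa)
}
```

With alpha = 0.2, gamma = 0.01 and the optimistic initial Q of 2.0, the update is dominated by the immediate reward, which here is the negative distance to (10, 10).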
After this, you can query the learned value (Q) for a certain action in a certain state by:

```rust
trainer.expected_value(&state, &action) // : Option<f64>
```
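For instance, to act greedily after training, you can query the expected value for every action in the current state and pick the best one. A sketch, assuming the `Option<f64>` return type shown above (the `greedy` helper is hypothetical, not part of rurel):

```rust
// Pick the action with the highest learned Q value. Actions whose value was
// never learned (None) are skipped; returns None if nothing was learned yet.
fn greedy<A: Clone>(values: &[(A, Option<f64>)]) -> Option<A> {
    values
        .iter()
        .filter_map(|(a, v)| v.map(|v| (a, v)))
        .max_by(|x, y| x.1.partial_cmp(&y.1).unwrap())
        .map(|(a, _)| a.clone())
}
```

In the grid example, `values` would be built by calling `trainer.expected_value(&state, &action)` for each action in `state.actions()`.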
## Development

- Run `cargo +nightly fmt` to format the code.