# Crate rurel

Rurel is a flexible, reusable reinforcement learning (Q learning) implementation in Rust.

Implement the `Agent` and `State` traits for your process, then create an `AgentTrainer` and train it on your process.
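Under the hood, Q-learning maintains a table of expected values Q(s, a) and nudges each entry after every step with the update Q(s, a) ← Q(s, a) + α(r + γ·max Q(s', a') − Q(s, a)), where α is the learning rate and γ the discount factor. The following is a minimal, self-contained sketch of that update, independent of rurel's internals; the names `q_update`, `alpha`, and `gamma` are illustrative and not part of rurel's API:

```rust
use std::collections::HashMap;

/// One Q-learning update on a table keyed by (state, action).
/// `alpha` is the learning rate, `gamma` the discount factor.
/// Illustrative only; rurel performs this kind of update internally.
fn q_update(
    q: &mut HashMap<(i32, i32), f64>,
    state: i32,
    action: i32,
    reward: f64,
    next_state: i32,
    next_actions: &[i32],
    alpha: f64,
    gamma: f64,
) {
    // max over a' of Q(next_state, a'), treating unseen pairs as 0.0
    let max_next = next_actions
        .iter()
        .map(|&a| q.get(&(next_state, a)).copied().unwrap_or(0.0))
        .fold(f64::NEG_INFINITY, f64::max);
    let max_next = if max_next.is_finite() { max_next } else { 0.0 };

    let entry = q.entry((state, action)).or_insert(0.0);
    // Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    *entry += alpha * (reward + gamma * max_next - *entry);
}

fn main() {
    let mut q = HashMap::new();
    // One step with reward -1.0; with an empty table the update moves
    // Q(s, a) from 0.0 toward the reward: 0.0 + 0.2 * (-1.0) = -0.2
    q_update(&mut q, 0, 1, -1.0, 0, &[0, 1], 0.2, 0.9);
    println!("Q(0, 1) = {}", q[&(0, 1)]);
}
```

Repeating this update over many exploratory steps is what makes the values of reward-increasing actions rise, which is exactly what the trainer example below relies on.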

# Basic Example

The following example defines the `State` as a position on a 21x21 2D matrix. The `Action`s that can be taken are: go up, go down, go left and go right. Positions closer to (10, 10) are assigned a higher reward.

After training, the AgentTrainer will have assigned higher values to actions which move closer to (10, 10).

```rust
use rurel::mdp::{State, Agent};

#[derive(PartialEq, Eq, Hash, Clone)]
struct MyState { x: i32, y: i32 }
#[derive(PartialEq, Eq, Hash, Clone)]
struct MyAction { dx: i32, dy: i32 }

impl State for MyState {
    type A = MyAction;

    fn reward(&self) -> f64 {
        // Negative Euclidean distance to the goal (10, 10)
        -(((10 - self.x).pow(2) + (10 - self.y).pow(2)) as f64).sqrt()
    }

    fn actions(&self) -> Vec<MyAction> {
        vec![
            MyAction { dx: 0, dy: -1 }, // up
            MyAction { dx: 0, dy: 1 },  // down
            MyAction { dx: -1, dy: 0 }, // left
            MyAction { dx: 1, dy: 0 },  // right
        ]
    }
}

struct MyAgent { state: MyState }

impl Agent<MyState> for MyAgent {
    fn current_state(&self) -> &MyState {
        &self.state
    }

    fn take_action(&mut self, action: &MyAction) {
        let &MyAction { dx, dy } = action;
        // Wrap around the edges of the 21x21 grid
        self.state = MyState {
            x: (((self.state.x + dx) % 21) + 21) % 21, // (x + dx) mod 21
            y: (((self.state.y + dy) % 21) + 21) % 21, // (y + dy) mod 21
        };
    }
}

use rurel::AgentTrainer;
use rurel::strategy::learn::QLearning;
use rurel::strategy::explore::RandomExploration;
use rurel::strategy::terminate::FixedIterations;

let mut trainer = AgentTrainer::new();
let mut agent = MyAgent { state: MyState { x: 0, y: 0 } };
trainer.train(
    &mut agent,
    &QLearning::new(0.2, 0.01, 2.),
    &mut FixedIterations::new(100_000),
    &RandomExploration::new(),
);

// Test to see if it worked
let test_state = MyState { x: 10, y: 9 };
let go_up = MyAction { dx: 0, dy: -1 };
let go_down = MyAction { dx: 0, dy: 1 };
// From (10, 9), going down (toward (10, 10)) is better than going up
assert!(trainer.expected_value(&test_state, &go_down)
        > trainer.expected_value(&test_state, &go_up));
```

## Modules

- `mdp`
- `strategy`

## Structs

- `AgentTrainer`: An `AgentTrainer` can be trained using a certain `Agent`. After training, the `AgentTrainer` contains learned knowledge about the process and can be queried for it. For example, you can ask the `AgentTrainer` for the expected values of all possible actions in a given state.