Crate rurel
Rurel is a flexible, reusable reinforcement learning (Q learning) implementation in Rust.
Implement the State and Agent traits for your process, then create an AgentTrainer and train it.
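For orientation, the two traits have roughly the following shape. This is a sketch inferred from the example below, not the authoritative definitions; see the mdp module for the exact trait bounds and any provided methods:

```rust
use std::hash::Hash;

// Sketch of the State trait: a state knows its own reward and the legal
// actions that can be taken from it. The Eq + Hash + Clone bounds are an
// assumption; consult the mdp module docs for the real definition.
pub trait State: Eq + Hash + Clone {
    type A: Eq + Hash + Clone;
    /// The immediate reward of being in this state.
    fn reward(&self) -> f64;
    /// All actions that can be taken from this state.
    fn actions(&self) -> Vec<Self::A>;
}

// Sketch of the Agent trait: an agent exposes its current state and
// mutates itself when an action is applied to it.
pub trait Agent<S: State> {
    fn current_state(&self) -> &S;
    fn take_action(&mut self, action: &S::A);
}
```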
Basic Example
The following example defines the State as a position on a 21x21 2D matrix that wraps around at the edges. The actions that can be taken are: go up, go down, go left and go right. Positions closer to (10, 10) are assigned a higher reward.

After training, the AgentTrainer will have assigned higher values to actions which move closer to (10, 10).
```rust
use rurel::mdp::{State, Agent};

#[derive(PartialEq, Eq, Hash, Clone)]
struct MyState { x: i32, y: i32 }

#[derive(PartialEq, Eq, Hash, Clone)]
struct MyAction { dx: i32, dy: i32 }

impl State for MyState {
    type A = MyAction;

    fn reward(&self) -> f64 {
        // Negative Euclidean distance to (10, 10)
        -((((10 - self.x).pow(2) + (10 - self.y).pow(2)) as f64).sqrt())
    }

    fn actions(&self) -> Vec<MyAction> {
        vec![
            MyAction { dx: 0, dy: -1 }, // up
            MyAction { dx: 0, dy: 1 },  // down
            MyAction { dx: -1, dy: 0 }, // left
            MyAction { dx: 1, dy: 0 },  // right
        ]
    }
}

struct MyAgent { state: MyState }

impl Agent<MyState> for MyAgent {
    fn current_state(&self) -> &MyState {
        &self.state
    }

    fn take_action(&mut self, action: &MyAction) {
        let MyAction { dx, dy } = *action;
        self.state = MyState {
            x: (((self.state.x + dx) % 21) + 21) % 21, // (x + dx) mod 21
            y: (((self.state.y + dy) % 21) + 21) % 21, // (y + dy) mod 21
        };
    }
}

use rurel::AgentTrainer;
use rurel::strategy::learn::QLearning;
use rurel::strategy::explore::RandomExploration;
use rurel::strategy::terminate::FixedIterations;

let mut trainer = AgentTrainer::new();
let mut agent = MyAgent { state: MyState { x: 0, y: 0 } };
trainer.train(
    &mut agent,
    &QLearning::new(0.2, 0.01, 2.),
    &mut FixedIterations::new(100_000),
    &RandomExploration::new(),
);

// Test to see if it worked
let test_state = MyState { x: 10, y: 9 };
let go_up = MyAction { dx: 0, dy: -1 };
let go_down = MyAction { dx: 0, dy: 1 };

// Going down is better than going up
assert!(trainer.expected_value(&test_state, &go_down)
      > trainer.expected_value(&test_state, &go_up));
```
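The three arguments to QLearning::new above are presumably the learning rate α, the discount factor γ and the initial Q value; check strategy::learn::QLearning for their exact meaning. The standard Q-learning update such a strategy performs is Q(s, a) ← Q(s, a) + α · (r + γ · max_a′ Q(s′, a′) − Q(s, a)).

Beyond asserting on individual values, the learned table can drive a greedy policy. The following is a minimal sketch, not part of rurel's API: best_action is a hypothetical helper built only from State::actions and AgentTrainer::expected_value as used above, assuming expected_value returns Option<f64> (None for unvisited state/action pairs):

```rust
use rurel::AgentTrainer;
use rurel::mdp::State;

// Hypothetical helper (not part of rurel): pick the action with the highest
// learned expected value in `state`, or None if no action has been visited.
fn best_action<S: State>(trainer: &AgentTrainer<S>, state: &S) -> Option<S::A> {
    state
        .actions()
        .into_iter()
        // Keep only actions the trainer has a learned value for.
        .filter_map(|a| trainer.expected_value(state, &a).map(|v| (a, v)))
        // Q values are non-NaN here, so partial_cmp is safe to unwrap.
        .max_by(|(_, v1), (_, v2)| v1.partial_cmp(v2).unwrap())
        .map(|(a, _)| a)
}
```

For the trained example above, best_action(&trainer, &MyState { x: 10, y: 9 }) should then yield the "down" action, MyAction { dx: 0, dy: 1 }.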
Modules
mdp
strategy
Structs
AgentTrainer