Border
Border is a reinforcement learning library in Rust.
Status
Border is currently under development.
Prerequisites
To run the examples, install Python 3.7 or later and gym. Gym is currently the only built-in environment, but the library itself works with any kind of environment.
Examples
- Random policy: the following command runs a random controller (policy) for 5 episodes in CartPole-v0:
  It renders the environment during the episodes and generates a CSV file in examples/model containing the sequences of observation and reward values from the episodes.
- DQN agent: the following command trains a DQN agent:
  After training, the trained agent runs for 5 episodes. In the code, the parameters of the trained Q-network (and the target network) are saved in examples/model/dqn_cartpole and then loaded again, which tests saving and loading of trained models.
- SAC agent: the following command trains a SAC agent on Pendulum-v0, which takes continuous actions:
  The code defines an action filter that doubles the torque in the environment.
- Pong: the following command trains a DQN agent on PongNoFrameskip-v4:
  This demonstrates how to use vectorized environments, in which 4 environments run synchronously (see code). Training took about 11 hours for 2M steps on a g3s.xlarge instance on EC2. Hyperparameter values, tuned specifically for Pong rather than for all Atari games, are adapted from the book Deep Reinforcement Learning Hands-On. The learning curve is shown below.
  After training, you can see how the agent plays:
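Two ideas from the examples above, the action filter in the SAC example and the synchronously stepped environments in the Pong example, can be illustrated with a small standalone sketch. It does not use Border's API: ToyEnv, ActionFilter, DoubleTorque, and VecEnv are hypothetical names chosen for illustration, and the toy dynamics stand in for a real gym environment.

```rust
/// Hypothetical toy environment (not part of Border): a 1-D point whose
/// position is shifted by a torque-like action at every step.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyEnv {
    pub position: f32,
    pub steps: usize,
}

impl ToyEnv {
    pub fn new() -> Self {
        ToyEnv { position: 0.0, steps: 0 }
    }

    /// Applies the action and returns (observation, reward).
    /// The reward is the negative distance from the origin.
    pub fn step(&mut self, action: f32) -> (f32, f32) {
        self.position += action;
        self.steps += 1;
        (self.position, -self.position.abs())
    }
}

/// Hypothetical action filter: converts the raw output of a policy into
/// the action that is actually sent to the environment.
pub trait ActionFilter {
    fn filter(&self, action: f32) -> f32;
}

/// Filter that doubles the torque, mirroring the idea in the SAC example.
pub struct DoubleTorque;

impl ActionFilter for DoubleTorque {
    fn filter(&self, action: f32) -> f32 {
        2.0 * action
    }
}

/// Hypothetical vectorized environment: holds several independent
/// environments and steps all of them in a single call.
pub struct VecEnv {
    envs: Vec<ToyEnv>,
}

impl VecEnv {
    pub fn new(n: usize) -> Self {
        VecEnv { envs: (0..n).map(|_| ToyEnv::new()).collect() }
    }

    /// Steps every environment once, in order, with its own filtered action.
    pub fn step<F: ActionFilter>(&mut self, actions: &[f32], filter: &F) -> Vec<(f32, f32)> {
        assert_eq!(actions.len(), self.envs.len(), "one action per environment");
        self.envs
            .iter_mut()
            .zip(actions)
            .map(|(env, &a)| env.step(filter.filter(a)))
            .collect()
    }
}
```

Under these assumptions, VecEnv::new(4) followed by a call to step with four actions and &DoubleTorque returns one (observation, reward) pair per environment, each computed from the doubled action. Keeping the filter separate from the environment means the same policy output can be rescaled without touching either the agent or the environment code.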
Features
- Environments which wrap gym using PyO3 and ndarray
- Interfaces for recording quantities during training or evaluation
- TensorBoard support using tensorboard-rs
- Vectorized environment using a tweaked atari_wrapper.py, adapted from the RL example in tch
- Agents based on tch
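To make the recording interface in the list above more concrete, here is a minimal sketch of what such an interface could look like. It is an assumption for illustration only and is not taken from Border's code: Record, Recorder, BufferedRecorder, and to_csv are hypothetical names, and only the standard library is used.

```rust
use std::collections::BTreeMap;

/// Hypothetical record type: one row of named scalar values.
pub type Record = BTreeMap<String, f32>;

/// Hypothetical recording interface: anything that accepts records
/// produced during training or evaluation.
pub trait Recorder {
    fn write(&mut self, record: Record);
}

/// Recorder that keeps every record in memory.
#[derive(Default)]
pub struct BufferedRecorder {
    records: Vec<Record>,
}

impl Recorder for BufferedRecorder {
    fn write(&mut self, record: Record) {
        self.records.push(record);
    }
}

impl BufferedRecorder {
    /// Renders the buffered records as CSV text. The header is taken from
    /// the keys of the first record; a key missing in a later record
    /// yields an empty cell.
    pub fn to_csv(&self) -> String {
        let keys: Vec<&String> = match self.records.first() {
            Some(first) => first.keys().collect(),
            None => return String::new(),
        };
        let header: Vec<&str> = keys.iter().map(|k| k.as_str()).collect();
        let mut out = header.join(",");
        out.push('\n');
        for record in &self.records {
            let row: Vec<String> = keys
                .iter()
                .map(|k| record.get(*k).map(|v| v.to_string()).unwrap_or_default())
                .collect();
            out.push_str(&row.join(","));
            out.push('\n');
        }
        out
    }
}
```

With a trait like this, the training loop only depends on write, so an in-memory buffer, a CSV writer, or a TensorBoard backend could be swapped in without changing the agent code.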
Roadmap
- More tests and documentation
- Investigate a performance issue (https://github.com/taku-y/border/issues/5)
- More environments
- More RL algorithms
License
Border is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0).