Core traits for reinforcement learning environments, policies, and agents.
rl-traits defines the shared vocabulary used across the ecosystem:
- [ember-rl]: algorithm implementations using Burn for neural networks
- [bevy-gym]: Bevy ECS plugin for visualising and parallelising environments
§Design goals
- Type-safe by default: observation and action spaces are associated types, not runtime objects. The compiler catches mismatches.
- Correct `Terminated` vs `Truncated` distinction: algorithms that bootstrap value estimates (PPO, DQN, SAC) need this distinction. It is baked into `EpisodeStatus` from day one.
- Rendering-free: this crate has no concept of visualisation. That belongs in bevy-gym.
- Bevy-compatible: `Send + Sync + 'static` bounds on associated types mean any `Environment` implementation can be a Bevy `Component`, enabling free ECS parallelisation via `Query::par_iter_mut()`.
- Minimal dependencies: only `rand` for RNG abstractions.
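The `Terminated`/`Truncated` distinction has a concrete consequence for bootstrapping: a value-based update may still bootstrap from the next state after a time-limit cutoff, but never after a true terminal state. A minimal sketch of that rule (the enum below is a local mirror of `EpisodeStatus` for illustration, and `td_target` is a hypothetical helper, not part of this crate):

```rust
// Local mirror of `EpisodeStatus`, for illustration only.
enum EpisodeStatus {
    Continuing,
    Terminated, // the MDP itself ended: no future reward exists
    Truncated,  // an external cutoff (e.g. a time limit): the MDP would continue
}

/// One-step TD target: bootstrap from V(s') unless the episode truly terminated.
fn td_target(reward: f32, next_value: f32, gamma: f32, status: EpisodeStatus) -> f32 {
    match status {
        EpisodeStatus::Terminated => reward,
        // Continuing or Truncated: the successor state still has value.
        EpisodeStatus::Continuing | EpisodeStatus::Truncated => reward + gamma * next_value,
    }
}

fn main() {
    // Treating a time-limit cutoff as a true terminal would drop gamma * V(s').
    assert_eq!(td_target(1.0, 10.0, 0.5, EpisodeStatus::Terminated), 1.0);
    assert_eq!(td_target(1.0, 10.0, 0.5, EpisodeStatus::Truncated), 6.0);
    println!("ok");
}
```

Collapsing both cases into a single "done" flag, as some older APIs do, silently biases the target by `gamma * V(s')` on every truncation.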
§Quick start
```rust
use rl_traits::{Environment, StepResult, EpisodeStatus};
use rand::Rng;

struct MyEnv;

impl Environment for MyEnv {
    type Observation = f32;
    type Action = usize;
    type Info = ();

    fn step(&mut self, _action: usize) -> StepResult<f32, ()> {
        // A fixed step result; a real environment would advance its state here.
        StepResult::new(0.0, 1.0, EpisodeStatus::Continuing, ())
    }

    fn reset(&mut self, _seed: Option<u64>) -> (f32, ()) {
        (0.0, ())
    }

    fn sample_action(&self, rng: &mut impl Rng) -> usize {
        rng.gen_range(0..4)
    }
}
```

§Re-exports
pub use agent::Agent;
pub use buffer::ReplayBuffer;
pub use environment::Environment;
pub use episode::EpisodeStatus;
pub use episode::StepResult;
pub use experience::Experience;
pub use multi_agent::AecEnvironment;
pub use multi_agent::ParallelEnvironment;
pub use policy::Policy;
pub use policy::StochasticPolicy;
pub use wrappers::TimeLimit;
pub use wrappers::Wrapper;
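For reference, a full episode loop over an `Environment` follows the reset/step/status cycle shown in the Quick start. The sketch below re-declares minimal local stand-ins for the re-exported items so it compiles on its own; the `StepResult` field names and the toy `CountdownEnv` are assumptions for illustration, not this crate's API.

```rust
// Local stand-ins mirroring the shapes used in the Quick start above.
#[allow(dead_code)]
enum EpisodeStatus { Continuing, Terminated, Truncated }

#[allow(dead_code)]
struct StepResult<O, I> { observation: O, reward: f32, status: EpisodeStatus, info: I }

trait Environment {
    type Observation;
    type Action;
    type Info;
    fn step(&mut self, action: Self::Action) -> StepResult<Self::Observation, Self::Info>;
    fn reset(&mut self, seed: Option<u64>) -> (Self::Observation, Self::Info);
}

// Hypothetical toy environment: terminates after three steps, reward 1.0 each.
struct CountdownEnv { remaining: u32 }

impl Environment for CountdownEnv {
    type Observation = f32;
    type Action = usize;
    type Info = ();

    fn step(&mut self, _action: usize) -> StepResult<f32, ()> {
        self.remaining -= 1;
        let status = if self.remaining == 0 {
            EpisodeStatus::Terminated
        } else {
            EpisodeStatus::Continuing
        };
        StepResult { observation: self.remaining as f32, reward: 1.0, status, info: () }
    }

    fn reset(&mut self, _seed: Option<u64>) -> (f32, ()) {
        self.remaining = 3;
        (self.remaining as f32, ())
    }
}

fn main() {
    let mut env = CountdownEnv { remaining: 0 };
    let (_obs, _info) = env.reset(Some(42));
    let mut total_reward = 0.0;
    loop {
        let result = env.step(0); // fixed action; a policy would choose here
        total_reward += result.reward;
        match result.status {
            EpisodeStatus::Continuing => continue,
            // Both endings stop stepping; only learning code treats them differently.
            EpisodeStatus::Terminated | EpisodeStatus::Truncated => break,
        }
    }
    assert_eq!(total_reward, 3.0);
    println!("episode return: {total_reward}");
}
```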