RelayRL learning algorithms and a small trainer façade for constructing them.
§Layout
- PpoTrainer — independent PPO (and the IPPO naming alias for the same type).
- ReinforceTrainer — independent REINFORCE (and the IREINFORCE alias).
- MultiagentTrainer — MAPPO / MAREINFORCE; no external step kernel type parameter.
Use RelayRLTrainer as a convenience namespace with the same constructors as those types.
After construction, drive training through AlgorithmTrait: pass trajectories whose type
implements TrajectoryData (for example RelayRL, CSV, or Arrow trajectory wrappers from
relayrl_types), then call receive_trajectory, train_model, log_epoch, and save as your
integration loop requires.
§Re-exports
This module re-exports algorithm structs and hyperparameter types (PPOParams, MAPPOParams,
…) plus kernel traits (PPOKernelTrait, StepKernelTrait, TrainableKernelTrait) so
callers can supply custom kernels without digging through submodule paths.
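As a hedged sketch of what "supplying a custom kernel" looks like: the stand-in trait below is a placeholder with a hypothetical signature, not the real StepKernelTrait re-exported here (which has its own associated types and methods); it only illustrates the shape of plugging a user-defined policy step into a trainer.

```rust
// Illustrative only: `StepKernelTrait` here is a local stand-in with a
// made-up signature, mimicking the pattern of implementing a kernel trait
// on your own type and handing it to a trainer's kernel parameter `K`.
trait StepKernelTrait {
    fn step(&mut self, obs: &[f32]) -> Vec<f32>;
}

/// A trivial deterministic "policy" for the sketch: squashes each
/// observation component through tanh.
struct TanhKernel;

impl StepKernelTrait for TanhKernel {
    fn step(&mut self, obs: &[f32]) -> Vec<f32> {
        obs.iter().map(|x| x.tanh()).collect()
    }
}

fn main() {
    let mut kernel = TanhKernel;
    let action = kernel.step(&[0.0, 1.0]);
    println!("{}", action.len()); // 2
}
```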
§Generics
B, InK, and OutK are the Burn backend and tensor kinds your environment uses. Independent
trainers also take a kernel type K. If the compiler cannot infer them, use a turbofish on the
constructor, e.g. RelayRLTrainer::mappo::<B, InK, OutK>(args, None)?, or give the result a
concrete type in a let binding.
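The same inference pattern in miniature, using a toy generic constructor rather than the crate's API (`Trainer` and `NdArrayBackend` below are hypothetical names for illustration): when the compiler cannot infer a type parameter from the arguments alone, either a turbofish on the constructor or a concrete type on the let binding pins it.

```rust
use std::marker::PhantomData;

// Toy generic wrapper standing in for a trainer parameterized over a
// backend type `B` that never appears in the constructor's arguments,
// so the compiler cannot infer it.
struct Trainer<B> {
    _backend: PhantomData<B>,
}

impl<B> Trainer<B> {
    fn new() -> Self {
        Trainer { _backend: PhantomData }
    }
}

// Placeholder backend type for the sketch.
struct NdArrayBackend;

fn main() {
    // Turbofish on the constructor pins `B`...
    let _a = Trainer::<NdArrayBackend>::new();
    // ...or annotate the binding with a concrete type; both compile.
    let _b: Trainer<NdArrayBackend> = Trainer::new();
}
```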
§Examples in this file
Fenced examples are illustrative and marked `ignore`: substitute your real B, InK, OutK, kernel
K, and async runtime. They are not run as doctests by default, so environments without optional
backends (for example libtorch) still build docs cleanly.
§End-to-end training flow
```rust
use std::path::PathBuf;

use relayrl_algorithms::{AlgorithmError, AlgorithmTrait, RelayRLTrainer, TrainerArgs};
use relayrl_types::prelude::trajectory::RelayRLTrajectory;

async fn run_training_loop<B, InK, OutK>() -> Result<(), AlgorithmError> {
    let args = TrainerArgs {
        env_dir: PathBuf::from("./env"),
        save_model_path: PathBuf::from("./checkpoints"),
        obs_dim: 64,
        act_dim: 8,
        buffer_size: 1_000_000,
    };
    let mut trainer = RelayRLTrainer::mappo::<B, InK, OutK>(args, None)?;

    let mut trajectory = RelayRLTrajectory::new(1024);
    // Populate `trajectory` from your environment loop before handing it to the trainer.
    trainer.receive_trajectory(trajectory).await?;

    trainer.train_model();
    trainer.log_epoch();
    trainer.save("epoch-0001");
    Ok(())
}
```

Re-exports§
pub use algorithms::PPO::IPPOAlgorithm;
pub use algorithms::PPO::IPPOParams;
pub use algorithms::PPO::MAPPOAlgorithm;
pub use algorithms::PPO::MAPPOParams;
pub use algorithms::PPO::PPOAlgorithm;
pub use algorithms::PPO::PPOKernelTrait;
pub use algorithms::PPO::PPOParams;
pub use algorithms::REINFORCE::IREINFORCEAlgorithm;
pub use algorithms::REINFORCE::IREINFORCEParams;
pub use algorithms::REINFORCE::MAREINFORCEAlgorithm;
pub use algorithms::REINFORCE::MAREINFORCEParams;
pub use algorithms::REINFORCE::REINFORCEParams;
pub use algorithms::REINFORCE::ReinforceAlgorithm;
pub use templates::base_algorithm::AlgorithmError;
pub use templates::base_algorithm::AlgorithmTrait;
pub use templates::base_algorithm::StepKernelTrait;
pub use templates::base_algorithm::TrainableKernelTrait;
pub use templates::base_algorithm::TrajectoryData;
pub use templates::base_algorithm::WeightProvider;
Structs§
- RelayRLTrainer - Namespace type with static constructors for each trainer family.
- TrainerArgs - Shared filesystem and shape arguments for every trainer constructor in this module.
Enums§
- MultiagentTrainer - Runtime wrapper for multi-agent MAPPO and MAREINFORCE.
- MultiagentTrainerSpec - Describes which multi-agent trainer to build (MAPPO or MAREINFORCE).
- PpoTrainer - Runtime wrapper for independent PPO-family algorithms, parameterized by your step kernel K.
- PpoTrainerSpec - Describes which independent PPO trainer to build, before you supply a kernel K.
- ReinforceTrainer - Runtime wrapper for independent REINFORCE-family algorithms with kernel K.
- ReinforceTrainerSpec - Describes which independent REINFORCE trainer to build, before you supply a kernel K.