Asynchronous trainer with parallel sampling processes.
Code using this trainer might look like the following:
```rust
type Env = TestEnv;
type ObsBatch = TestObsBatch;
type ActBatch = TestActBatch;
type ReplayBuffer = SimpleReplayBuffer<ObsBatch, ActBatch>;
type StepProcessor = SimpleStepProcessor<Env, ObsBatch, ActBatch>;

// Create a new agent by wrapping the existing agent in order to implement SyncModel.
struct TestAgent2(TestAgent);

impl border_core::Configurable for TestAgent2 {
    type Config = TestAgentConfig;

    fn build(config: Self::Config) -> Self {
        Self(TestAgent::build(config))
    }
}

impl border_core::Agent<Env, ReplayBuffer> for TestAgent2 {
    // Boilerplate code to delegate the method calls to the inner agent.
    fn train(&mut self) {
        self.0.train();
    }

    // ...and likewise for the other methods.
}

impl border_core::Policy<Env> for TestAgent2 {
    // Boilerplate code to delegate the method calls to the inner agent.
    // ...
}

impl border_async_trainer::SyncModel for TestAgent2 {
    // Self::ModelInfo should include the model parameters.
    type ModelInfo = usize;

    fn model_info(&self) -> (usize, Self::ModelInfo) {
        // Extracts the model parameters and returns them as Self::ModelInfo.
        // The first element of the tuple is the number of optimization steps.
        (0, 0)
    }

    fn sync_model(&mut self, _model_info: &Self::ModelInfo) {
        // Implements synchronization of the model based on _model_info.
    }
}

let agent_configs: Vec<_> = vec![agent_config()];
let env_config_train = env_config();
let env_config_eval = env_config();
let replay_buffer_config = SimpleReplayBufferConfig::default();
let step_proc_config = SimpleStepProcessorConfig::default();
let actor_man_config = ActorManagerConfig::default();
let async_trainer_config = AsyncTrainerConfig::default();
let mut recorder: Box<dyn Recorder<_, _>> = Box::new(NullRecorder::new());
let mut evaluator = DefaultEvaluator::<TestEnv>::new(&env_config_eval, 0, 1).unwrap();

border_async_trainer::util::train_async::<TestAgent2, _, _, StepProcessor>(
    &agent_config(),
    &agent_configs,
    &env_config_train,
    &env_config_eval,
    &step_proc_config,
    &replay_buffer_config,
    &actor_man_config,
    &async_trainer_config,
    &mut recorder,
    &mut evaluator,
);
```
The training process consists of the following two components:

- `ActorManager` manages `Actor`s, each of which runs a thread in which an `Agent` interacts with an `Env` to collect samples. Those samples are sent to the replay buffer in `AsyncTrainer`.
- `AsyncTrainer` is responsible for training the agent. It also runs a thread that pushes samples received from `ActorManager` into a replay buffer.
The `Agent` must implement the `SyncModel` trait in order to synchronize the model of the agent in each `Actor` with the trained agent in `AsyncTrainer`. The trait provides the ability to export and import the information of the model as `SyncModel::ModelInfo`.
The `Agent` in `AsyncTrainer` is responsible for training, typically with a GPU, while the `Agent`s in the `Actor`s of `ActorManager` are responsible for sampling using the CPU.
Both `AsyncTrainer` and `ActorManager` run on the same machine and communicate through channels.
Modules§

- util - Utility function.
Structs§

- Actor - Generates transitions by running an `Agent` in an `Env`.
- ActorManager - Manages `Actor`s.
- ActorManagerConfig - Configuration of `ActorManager`.
- ActorStat - Stats of the sampling process in an `Actor`.
- AsyncTrainStat - Stats of `AsyncTrainer::train()`.
- AsyncTrainer - Manages the asynchronous training loop on a single machine.
- AsyncTrainerConfig - Configuration of `AsyncTrainer`.
- PushedItemMessage - Message containing a `ReplayBufferBase::Item`.
- ReplayBufferProxy - A wrapper of a replay buffer for the asynchronous trainer.
- ReplayBufferProxyConfig - Configuration of `ReplayBufferProxy`.
Traits§

- SyncModel - Synchronizes the model of the agent in asynchronous training.

Functions§

- actor_stats_fmt - Returns a formatted string of a set of `ActorStat`s for reporting.