Trait ParallelEnvironment 

pub trait ParallelEnvironment {
    type AgentId: Eq + Hash + Clone + Send + Sync + 'static;
    type Observation: Clone + Send + Sync + 'static;
    type Action: Clone + Send + Sync + 'static;
    type Info: Default + Clone + Send + Sync + 'static;

    // Required methods
    fn possible_agents(&self) -> &[Self::AgentId];
    fn agents(&self) -> &[Self::AgentId];
    fn step(
        &mut self,
        actions: HashMap<Self::AgentId, Self::Action>,
    ) -> HashMap<Self::AgentId, StepResult<Self::Observation, Self::Info>>;
    fn reset(
        &mut self,
        seed: Option<u64>,
    ) -> HashMap<Self::AgentId, (Self::Observation, Self::Info)>;
    fn sample_action(
        &self,
        agent: &Self::AgentId,
        rng: &mut impl Rng,
    ) -> Self::Action;

    // Provided methods
    fn state(&self) -> Option<Self::Observation> { ... }
    fn is_done(&self) -> bool { ... }
    fn num_agents(&self) -> usize { ... }
    fn max_num_agents(&self) -> usize { ... }
}

A multi-agent environment where all agents act simultaneously each step.

Mirrors the semantics of PettingZoo’s Parallel API, adapted for Rust’s type system. Use this when all agents observe and act at every step — cooperative navigation, competitive games, mixed-team tasks.

§Design principles

  • possible_agents vs agents: possible_agents() is the fixed universe of all agent IDs. agents() is the live subset for the current episode. After reset(), agents == possible_agents. Agents are removed from agents when their episode ends; the episode is over when agents is empty.

  • Joint step: step() takes exactly one action per agent in agents() and returns one StepResult per active agent. Supplying actions for terminated agents, or omitting an action for an active agent, violates the contract: implementations may panic or return unspecified results (this is a logic error, not memory unsafety).

  • Homogeneous agents: all agents share Observation, Action, and Info types. Heterogeneous agents can be modelled with enum wrappers over the per-type variants.

  • Bevy-compatible by design: AgentId: Eq + Hash + Clone + Send + Sync + 'static means Bevy Entity is a valid agent ID directly, enabling free ECS-based parallelisation across agents in bevy-gym.

  • No render(): visualisation is bevy-gym’s concern.

  • No close(): implement Drop if your environment holds resources.
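
The enum-wrapper pattern for heterogeneous agents can be sketched as follows. This is a minimal, hypothetical example: the Obs and Act enums and the scout/gunner roles are invented for illustration, not part of rl_traits.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Obs {
    Scout { pos: (f32, f32) },
    Gunner { angle: f32 },
}

#[derive(Clone, Debug, PartialEq)]
enum Act {
    Move(f32, f32),
    Aim(f32),
}

// Each agent kind matches only on its own observation variant, so the
// shared Observation/Action types cost one enum discriminant per step.
fn act(obs: &Obs) -> Act {
    match obs {
        Obs::Scout { pos } => Act::Move(-pos.0, -pos.1), // head toward the origin
        Obs::Gunner { angle } => Act::Aim(angle + 1.0),  // sweep the turret
    }
}

fn main() {
    assert_eq!(act(&Obs::Scout { pos: (1.0, 2.0) }), Act::Move(-1.0, -2.0));
    assert_eq!(act(&Obs::Gunner { angle: 0.5 }), Act::Aim(1.5));
}
```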

§Example

use std::collections::HashMap;
use rl_traits::{ParallelEnvironment, StepResult, EpisodeStatus};
use rand::Rng;

struct CoopGame {
    active: Vec<usize>,
}

impl ParallelEnvironment for CoopGame {
    type AgentId = usize;
    type Observation = f32;
    type Action = bool;   // cooperate or defect
    type Info = ();

    fn possible_agents(&self) -> &[usize] { &[0, 1] }
    fn agents(&self) -> &[usize] { &self.active }

    fn step(&mut self, _actions: HashMap<usize, bool>)
        -> HashMap<usize, StepResult<f32, ()>>
    {
        self.active.iter()
            .map(|&id| (id, StepResult::new(0.0_f32, 1.0, EpisodeStatus::Continuing, ())))
            .collect()
    }

    fn reset(&mut self, _seed: Option<u64>) -> HashMap<usize, (f32, ())> {
        self.active = vec![0, 1];
        self.active.iter().map(|&id| (id, (0.0_f32, ()))).collect()
    }

    fn sample_action(&self, _agent: &usize, rng: &mut impl Rng) -> bool {
        rng.gen()
    }
}

Required Associated Types§

type AgentId: Eq + Hash + Clone + Send + Sync + 'static

Identifier for each agent.

Common choices: usize (index), &'static str (name), or a Bevy Entity for direct ECS integration without an extra lookup.

type Observation: Clone + Send + Sync + 'static

The observation type produced by step() and reset().

Send + Sync + 'static are required for Bevy ECS compatibility.

type Action: Clone + Send + Sync + 'static

The action type consumed by step().

type Info: Default + Clone + Send + Sync + 'static

Auxiliary information returned alongside observations.

Use () if you don’t need it — Default is implemented for ().

Required Methods§

fn possible_agents(&self) -> &[Self::AgentId]

The complete, fixed set of agent IDs for this environment.

Does not change between episodes or as agents terminate mid-episode. Use agents() for the currently live set.

fn agents(&self) -> &[Self::AgentId]

The agents currently active in this episode.

Starts equal to possible_agents() after reset(). Shrinks as agents terminate or are truncated; never grows. Empty when the episode is over.

fn step( &mut self, actions: HashMap<Self::AgentId, Self::Action>, ) -> HashMap<Self::AgentId, StepResult<Self::Observation, Self::Info>>

Advance the environment by one step using joint actions.

actions must contain exactly one entry per agent in self.agents(). After this call, agents whose StepResult carries a terminal status are removed from agents().
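
The caller-side half of this contract can be sketched with plain std types. Here usize agent IDs and bool actions are placeholders, and the even/odd rule stands in for a real policy or sample_action call:

```rust
use std::collections::HashMap;

// Build exactly one action per currently active agent: no entries for
// terminated agents, and no active agent left out.
fn joint_actions(active: &[usize]) -> HashMap<usize, bool> {
    active.iter().map(|&id| (id, id % 2 == 0)).collect()
}

fn main() {
    let active = vec![0, 2, 5];
    let actions = joint_actions(&active);
    assert_eq!(actions.len(), active.len()); // one entry per active agent
    assert!(active.iter().all(|id| actions.contains_key(id)));
}
```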

fn reset( &mut self, seed: Option<u64>, ) -> HashMap<Self::AgentId, (Self::Observation, Self::Info)>

Reset the environment to an initial state, starting a new episode.

If seed is Some, the environment should use the value to seed its internal RNG so that episodes can be reproduced deterministically. Returns the initial observation and info for every agent.
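
A minimal sketch of the seeding contract, using std only. The splitmix64 step below stands in for whatever PRNG a real environment keeps internally; the point is only that the same seed must reproduce the same episode randomness.

```rust
// Hypothetical environment holding nothing but its RNG state.
struct Env { rng_state: u64 }

impl Env {
    fn reset(&mut self, seed: Option<u64>) {
        // Some(seed) => deterministic episodes; None falls back to a fixed
        // default here (a real environment might use OS entropy instead).
        self.rng_state = seed.unwrap_or(0x9E37_79B9_7F4A_7C15);
    }

    fn next_u64(&mut self) -> u64 {
        // One splitmix64 step: the output stream is fully determined by the seed.
        self.rng_state = self.rng_state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.rng_state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

fn main() {
    let (mut a, mut b) = (Env { rng_state: 0 }, Env { rng_state: 0 });
    a.reset(Some(42));
    b.reset(Some(42));
    assert_eq!(a.next_u64(), b.next_u64()); // same seed, same randomness
}
```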

fn sample_action( &self, agent: &Self::AgentId, rng: &mut impl Rng, ) -> Self::Action

Sample a random action for the given agent.

The rng is caller-supplied so exploration randomness can be seeded and tracked independently from environment randomness.

Provided Methods§

fn state(&self) -> Option<Self::Observation>

A global state observation of the full environment.

Used by centralised-training / decentralised-execution algorithms (e.g. MADDPG, QMIX) that condition a centralised critic on the full state while individual policies see only local observations. Returns None by default; override if your environment supports it.
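
One common way to build such a global state, sketched here with hypothetical Vec<f32> observations (not the trait's actual types), is to concatenate each agent's local observation in a fixed agent order:

```rust
// Sort by agent ID so the centralised critic always sees the same input
// layout, regardless of the order observations were collected in.
fn global_state(mut local: Vec<(usize, Vec<f32>)>) -> Vec<f32> {
    local.sort_by_key(|(id, _)| *id);
    local.into_iter().flat_map(|(_, obs)| obs).collect()
}

fn main() {
    let local = vec![(1, vec![3.0, 4.0]), (0, vec![1.0, 2.0])];
    assert_eq!(global_state(local), vec![1.0, 2.0, 3.0, 4.0]);
}
```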

fn is_done(&self) -> bool

Returns true when all agents have finished (active set is empty).

Examples found in repository
examples/pursuit.rs (line 132)
126fn run_episode(env: &mut Pursuit, rng: &mut SmallRng) -> ([f64; 2], EpisodeStatus, usize) {
127    env.reset(None);
128    let mut returns = [0.0_f64; 2];
129    let mut steps = 0;
130    let mut outcome = EpisodeStatus::Continuing;
131
132    while !env.is_done() {
133        let actions = env.agents().iter()
134            .map(|&id| (id, env.sample_action(&id, rng)))
135            .collect();
136
137        let results = env.step(actions);
138        steps += 1;
139
140        for (id, result) in &results {
141            returns[*id as usize] += result.reward;
142            if result.status.is_done() {
143                outcome = result.status.clone();
144            }
145        }
146    }
147
148    (returns, outcome, steps)
149}
fn num_agents(&self) -> usize

Number of currently active agents.

fn max_num_agents(&self) -> usize

Maximum number of agents that could ever be active simultaneously.

Dyn Compatibility§

This trait is not dyn compatible.

In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.

Implementors§