mctrust 0.4.0

Universal search & planning toolkit — MCTS, bandit search, pluggable evaluators, tree reuse, DAG transpositions, root parallelism. Define an Environment, search handles the rest.
Documentation

mctrust

Production-grade Monte Carlo Tree Search for any sequential decision problem. Portfolio optimization, security scan planning, hyperparameter tuning, game AI, molecular design, compiler optimization, robot path planning — anything with a state space to explore. Implement the Environment trait and the search handles the rest.

UCT, PUCT, Gumbel MuZero, Thompson Sampling, RAVE, pluggable evaluators, multi-agent negamax, tree reuse, DAG transpositions, root parallelism, progressive widening, node & time budgets, tree visualization, checkpoint/restore.

Crates.io License: MIT

Quick Start

use mctrust::{Environment, Outcome, SearchConfig, TreeSearch, Reward};

#[derive(Clone)]
struct Optimizer { value: f64, target: f64 }

#[derive(Clone, Debug, PartialEq)]
enum Action { Increase, Decrease }

impl Environment for Optimizer {
    type Action = Action;

    fn legal_actions(&self) -> Vec<Action> {
        vec![Action::Increase, Action::Decrease]
    }

    fn apply(&mut self, action: &Action) {
        match action {
            Action::Increase => self.value += 0.1,
            Action::Decrease => self.value -= 0.1,
        }
    }

    fn evaluate(&self) -> Outcome {
        if (self.value - self.target).abs() < 0.05 {
            Outcome::Success(Reward::WIN)
        } else if self.value.abs() > 10.0 {
            Outcome::Failure
        } else {
            Outcome::Ongoing
        }
    }
}

let state = Optimizer { value: 0.0, target: 3.0 };
let config = SearchConfig::builder().iterations(10_000).build();
let mut search = TreeSearch::new(state, config);
let best = search.run(); // Some(Action::Increase)

Two Search Engines

Environment-based (TreeSearch)

Full tree-search MCTS. Define an Environment with actions, transitions, and terminal conditions. The engine handles selection, expansion, simulation, and backpropagation.

Use for: optimization, planning, scheduling, game AI, theorem proving.

Bandit-based (BanditSearch)

Two-level MCTS for explore/exploit problems with a flat set of options grouped by category. Uses UCT+RAVE to decide which option to evaluate next.

Use for: fuzzing, A/B testing, hyperparameter search, scan strategy optimization.

use mctrust::{BanditSearch, BanditConfig};

let mut search = BanditSearch::new(BanditConfig::default());
for category in 0..5u64 {
    for option in 0..20u64 {
        search.add_arm(category * 20 + option, category as u32);
    }
}
while let Some(option_id) = search.next_arm() {
    let reward = evaluate(option_id);
    search.observe(option_id, reward);
}

Tree Policies

Policy Use Case Exploration Constant?
Uct General purpose Yes
Puct Neural-network-guided search (AlphaZero-style) Yes
Gumbel Hyperparameter-free search (MuZero-style) No
ThompsonSampling Stochastic exploration Temperature only
use mctrust::{SearchConfig, TreePolicy};

// Gumbel MuZero: same quality in ~16x fewer simulations, zero tuning
let config = SearchConfig::builder()
    .iterations(1_000)
    .tree_policy(TreePolicy::Gumbel {
        sampled_actions: 16,
        max_completions_coeff: 50.0,
    })
    .build();

Pluggable Evaluator

Replace random rollouts with neural networks, domain heuristics, or any custom evaluation:

use mctrust::{Evaluator, Environment, Reward};

struct ValueNetwork { /* your model */ }

impl<E: Environment> Evaluator<E> for ValueNetwork {
    fn evaluate(&self, env: &E) -> Reward {
        // Pass state through your model, return value estimate
        Reward::new(0.7)
    }
}

// search.with_evaluator(Arc::new(ValueNetwork { ... }));

Multi-Agent Environments

For adversarial or multi-agent environments, implement current_player() and num_players():

fn current_player(&self) -> usize {
    self.turn % self.num_agents
}
fn num_players(&self) -> usize { 2 }

The engine automatically applies negamax-style reward flipping during backpropagation.

Search Control

Time Budgets

use std::time::Duration;
let config = SearchConfig::builder()
    .iterations(10_000_000)  // high ceiling
    .time_budget(Duration::from_millis(50))  // but stop after 50ms
    .build();

Arbitrary Stop Conditions (run_until)

search.run_until(|s| {
    s.best_root_reward().unwrap_or(0.0) > 0.95
    || s.total_simulations() >= 50_000
});

Node Budget (Memory Safety)

search.with_max_nodes(1_000_000); // never allocate more than 1M nodes
search.run();

Stepped Iteration

for _ in 0..100 {
    search.run_step();
}
if search.best_root_reward().unwrap_or(0.0) > 0.9 {
    // Early exit — already confident
}

Tree Reuse (Warm Start Between Decisions)

After choosing an action, re-root the tree at the chosen child's subtree instead of discarding everything. Preserves all accumulated statistics — the next search starts with a massive head start.

loop {
    if let Some(best) = search.run() {
        apply_action(best.clone());
        search.advance_to_action(&best); // preserves subtree
    }
}

No other Rust MCTS crate supports tree reuse.

DAG Transposition Tables

When your environment has multiple paths to the same state, enable DAG mode to merge duplicate nodes:

search.enable_dag();
search.run();

Requires implementing state_hash() on your Environment.

Root Parallelism

Lock-free linear scaling across CPU cores:

let best = search.run_parallel(num_cpus::get());

Each thread runs an independent search tree; results are merged at the root.

Diagnostics & Visualization

// Best line of actions through the tree:
let pv = search.principal_variation(); // Vec<Action>

// Full intermediate state sequence along the PV:
let pv_states = search.principal_variation_states(); // Vec<Environment>

// Per-action visit counts and rewards at root:
let stats = search.root_stats(); // Vec<(Action, NodeStats)>

// Best action's average reward:
let reward = search.best_root_reward(); // Option<f64>

// Graphviz DOT export for visual inspection:
let dot = search.export_dot(5);
std::fs::write("tree.dot", dot)?;
// Then: dot -Tsvg tree.dot -o tree.svg

TOML Configuration

iterations = 50000
exploration_constant = 1.41
max_depth = 100

[tree_policy]
kind = "gumbel"
sampled_actions = 16
max_completions_coeff = 50.0

[rave]
enabled = true
bias = 300.0
let config = SearchConfig::from_toml_str(toml_string)?;
let config = SearchConfig::from_toml_file("search.toml")?;

Feature Flags

Feature Default Description
std Enables rand/std — disable for #![no_std] targets
parallel Root-parallel MCTS via Rayon
dag Transposition tables via hashbrown
toml TOML config parsing

Use Cases

MCTS generalizes to any sequential decision problem:

  • Finance — portfolio rebalancing, options hedging, order execution strategies
  • Security — scan strategy optimization, fuzzing prioritization, attack path planning
  • Operations — scheduling, resource allocation, supply chain optimization
  • Science — molecular design, protein folding, experiment design
  • Engineering — compiler optimization, chip placement, network routing
  • Games — board games, real-time strategy, puzzle solving

The Environment trait is fully generic. If you can define states, actions, and outcomes, mctrust can search it.

Contributing

Pull requests welcome. Open a PR if you find a bug, a better API, or a rough edge.

License

MIT. Copyright 2026 CORUM COLLECTIVE LLC.