# mctrust
Production-grade Monte Carlo Tree Search for any sequential decision problem. Portfolio optimization, security scan planning, hyperparameter tuning, game AI, molecular design, compiler optimization, robot path planning — anything with a state space to explore. Implement the `Environment` trait and the search handles the rest.
UCT, PUCT, Gumbel MuZero, Thompson Sampling, RAVE, pluggable evaluators, multi-agent negamax, tree reuse, DAG transpositions, root parallelism, progressive widening, node & time budgets, tree visualization, checkpoint/restore.
[](https://crates.io/crates/mctrust)
[](LICENSE)
## Quick Start
```rust
use mctrust::{Environment, Outcome, SearchConfig, TreeSearch, Reward};
#[derive(Clone)]
struct Optimizer { value: f64, target: f64 }
#[derive(Clone, Debug, PartialEq)]
enum Action { Increase, Decrease }
impl Environment for Optimizer {
type Action = Action;
fn legal_actions(&self) -> Vec<Action> {
vec![Action::Increase, Action::Decrease]
}
fn apply(&mut self, action: &Action) {
match action {
Action::Increase => self.value += 0.1,
Action::Decrease => self.value -= 0.1,
}
}
fn evaluate(&self) -> Outcome {
if (self.value - self.target).abs() < 0.05 {
Outcome::Success(Reward::WIN)
} else if self.value.abs() > 10.0 {
Outcome::Failure
} else {
Outcome::Ongoing
}
}
}
let state = Optimizer { value: 0.0, target: 3.0 };
let config = SearchConfig::builder().iterations(10_000).build();
let mut search = TreeSearch::new(state, config);
let best = search.run(); // Some(Action::Increase)
```
## Two Search Engines
### Environment-based (`TreeSearch`)
Full tree-search MCTS. Define an `Environment` with actions, transitions, and terminal conditions. The engine handles selection, expansion, simulation, and backpropagation.
Use for: optimization, planning, scheduling, game AI, theorem proving.
### Bandit-based (`BanditSearch`)
Two-level MCTS for explore/exploit problems with a flat set of options grouped by category. Uses UCT+RAVE to decide which option to evaluate next.
Use for: fuzzing, A/B testing, hyperparameter search, scan strategy optimization.
```rust
use mctrust::{BanditSearch, BanditConfig};
let mut search = BanditSearch::new(BanditConfig::default());
for category in 0..5u64 {
for option in 0..20u64 {
search.add_arm(category * 20 + option, category as u32);
}
}
while let Some(option_id) = search.next_arm() {
let reward = evaluate(option_id);
search.observe(option_id, reward);
}
```
## Tree Policies
| `Uct` | General purpose | Yes |
| `Puct` | Neural-network-guided search (AlphaZero-style) | Yes |
| `Gumbel` | Hyperparameter-free search (MuZero-style) | **No** |
| `ThompsonSampling` | Stochastic exploration | Temperature only |
```rust
use mctrust::{SearchConfig, TreePolicy};
// Gumbel MuZero: same quality in ~16x fewer simulations, zero tuning
let config = SearchConfig::builder()
.iterations(1_000)
.tree_policy(TreePolicy::Gumbel {
sampled_actions: 16,
max_completions_coeff: 50.0,
})
.build();
```
## Pluggable Evaluator
Replace random rollouts with neural networks, domain heuristics, or any custom evaluation:
```rust
use mctrust::{Evaluator, Environment, Reward};
struct ValueNetwork { /* your model */ }
impl<E: Environment> Evaluator<E> for ValueNetwork {
fn evaluate(&self, env: &E) -> Reward {
// Pass state through your model, return value estimate
Reward::new(0.7)
}
}
// search.with_evaluator(Arc::new(ValueNetwork { ... }));
```
## Multi-Agent Environments
For adversarial or multi-agent environments, implement `current_player()` and `num_players()`:
```rust
fn current_player(&self) -> usize {
self.turn % self.num_agents
}
fn num_players(&self) -> usize { 2 }
```
The engine automatically applies negamax-style reward flipping during backpropagation.
## Search Control
### Time Budgets
```rust
use std::time::Duration;
let config = SearchConfig::builder()
.iterations(10_000_000) // high ceiling
.time_budget(Duration::from_millis(50)) // but stop after 50ms
.build();
```
### Arbitrary Stop Conditions (`run_until`)
```rust
search.run_until(|s| {
s.best_root_reward().unwrap_or(0.0) > 0.95
|| s.total_simulations() >= 50_000
});
```
### Node Budget (Memory Safety)
```rust
search.with_max_nodes(1_000_000); // never allocate more than 1M nodes
search.run();
```
### Stepped Iteration
```rust
for _ in 0..100 {
search.run_step();
}
if search.best_root_reward().unwrap_or(0.0) > 0.9 {
// Early exit — already confident
}
```
## Tree Reuse (Warm Start Between Decisions)
After choosing an action, re-root the tree at the chosen child's subtree instead of discarding everything. Preserves all accumulated statistics — the next search starts with a massive head start.
```rust
loop {
if let Some(best) = search.run() {
apply_action(best.clone());
search.advance_to_action(&best); // preserves subtree
}
}
```
No other Rust MCTS crate supports tree reuse.
## DAG Transposition Tables
When your environment has multiple paths to the same state, enable DAG mode to merge duplicate nodes:
```rust
search.enable_dag();
search.run();
```
Requires implementing `state_hash()` on your `Environment`.
## Root Parallelism
Lock-free linear scaling across CPU cores:
```rust
let best = search.run_parallel(num_cpus::get());
```
Each thread runs an independent search tree; results are merged at the root.
## Diagnostics & Visualization
```rust
// Best line of actions through the tree:
let pv = search.principal_variation(); // Vec<Action>
// Full intermediate state sequence along the PV:
let pv_states = search.principal_variation_states(); // Vec<Environment>
// Per-action visit counts and rewards at root:
let stats = search.root_stats(); // Vec<(Action, NodeStats)>
// Best action's average reward:
let reward = search.best_root_reward(); // Option<f64>
// Graphviz DOT export for visual inspection:
let dot = search.export_dot(5);
std::fs::write("tree.dot", dot)?;
// Then: dot -Tsvg tree.dot -o tree.svg
```
## TOML Configuration
```toml
iterations = 50000
exploration_constant = 1.41
max_depth = 100
[tree_policy]
kind = "gumbel"
sampled_actions = 16
max_completions_coeff = 50.0
[rave]
enabled = true
bias = 300.0
```
```rust
let config = SearchConfig::from_toml_str(toml_string)?;
let config = SearchConfig::from_toml_file("search.toml")?;
```
## Feature Flags
| `std` | ✅ | Enables `rand/std` — disable for `#![no_std]` targets |
| `parallel` | ✅ | Root-parallel MCTS via Rayon |
| `dag` | ✅ | Transposition tables via `hashbrown` |
| `toml` | ✅ | TOML config parsing |
## Use Cases
MCTS generalizes to any sequential decision problem:
- **Finance** — portfolio rebalancing, options hedging, order execution strategies
- **Security** — scan strategy optimization, fuzzing prioritization, attack path planning
- **Operations** — scheduling, resource allocation, supply chain optimization
- **Science** — molecular design, protein folding, experiment design
- **Engineering** — compiler optimization, chip placement, network routing
- **Games** — board games, real-time strategy, puzzle solving
The `Environment` trait is fully generic. If you can define states, actions, and outcomes, mctrust can search it.
## Contributing
Pull requests welcome. Open a PR if you find a bug, a better API, or a rough edge.
## License
MIT. Copyright 2026 CORUM COLLECTIVE LLC.