mctrust 0.4.0

Universal search & planning toolkit — MCTS, bandit search, pluggable evaluators, tree reuse, DAG transpositions, root parallelism. Define an Environment, search handles the rest.
# Changelog

All notable changes to `mctrust` will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0] — 2026-03-27

### Added

- **Pluggable `Evaluator` trait** — replace random rollouts with neural networks, domain
  heuristics, or any custom evaluation function. This is the integration point for
  AlphaZero / MuZero-style search.
- **Multi-agent support**: `Environment::current_player()` and `num_players()` enable
  negamax-style reward flipping for adversarial environments.
- **Tree reuse**: `advance_to_action()` re-roots the search tree at the chosen child,
  preserving the entire subtree and all accumulated statistics. MCTS engines such as
  Leela Chess Zero and KataGo use the same technique.
- **`run_until(predicate)`** — arbitrary stop conditions: target reward, convergence
  detection, external cancellation, wall-clock limits, etc.
- **Node budget**: `with_max_nodes(limit)` bounds memory usage. Expansion stops once the
  limit is reached; the engine continues selection and simulation on existing nodes.
- **`principal_variation_states()`** — replays the PV through the environment, returning
  the full sequence of intermediate states for debugging and visualization.
- **Graphviz DOT export**: `export_dot(depth)` produces a Graphviz representation of
  the search tree. Pipe to `dot -Tsvg` for visual inspection.
- 8 new unit tests covering evaluator, node budget, PV states,
  `run_until`, DOT export, and tree reuse.
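
The pluggable evaluator entry above can be pictured with a small sketch. This is not the crate's actual trait: the `Evaluator<S>` signature, the `Counter` state, `DistanceHeuristic`, and `evaluate_leaf` are all hypothetical names chosen for illustration. The point is only that the search depends on a trait, so a heuristic or a neural network can stand in for random rollouts.

```rust
/// Hypothetical stand-in for a pluggable evaluator; the crate's real
/// `Evaluator` trait may use different method names and types.
trait Evaluator<S> {
    /// Value estimate for `state`; higher is better for the searching agent.
    fn evaluate(&self, state: &S) -> f64;
}

/// Toy state: a counter that should reach `target`.
struct Counter {
    value: i32,
    target: i32,
}

/// Domain heuristic: 1.0 at the target, decaying with distance from it.
struct DistanceHeuristic;

impl Evaluator<Counter> for DistanceHeuristic {
    fn evaluate(&self, state: &Counter) -> f64 {
        let distance = (state.target - state.value).abs() as f64;
        1.0 / (1.0 + distance)
    }
}

/// The leaf-evaluation step only depends on the trait, so any evaluator
/// (heuristic, network wrapper, rollout) can be swapped in.
fn evaluate_leaf<S, E: Evaluator<S>>(evaluator: &E, leaf: &S) -> f64 {
    evaluator.evaluate(leaf)
}

fn main() {
    let leaf = Counter { value: 2, target: 3 };
    println!("leaf value = {}", evaluate_leaf(&DistanceHeuristic, &leaf));
}
```

A network-backed evaluator would implement the same trait and run inference inside `evaluate`.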

### Changed

- **Domain-neutral API renaming** — no public users, clean break before v1:
  - `GameSearch` → `TreeSearch`
  - `GameSearchCheckpoint` → `TreeSearchCheckpoint`
  - `GameState` → `Outcome`
  - `Outcome::Win` → `Outcome::Success`
  - `Outcome::Loss` → `Outcome::Failure`
  - `Outcome::Draw` → `Outcome::Neutral`
- `BanditConfig::sanitize()` now returns `Vec<String>` (matching `SearchConfig::sanitize()`),
  eliminating the library's last remaining `eprintln!` call.
- `BanditSearch::observe()` now rejects non-finite rewards (NaN, ±Infinity) to prevent
  silent score poisoning.
- `pick_best_child()` no longer calls `legal_actions()` when the tree policy is not PUCT,
  eliminating a hot-path waste in UCT/Thompson/Gumbel modes.
- Removed redundant `.max()` call in `progressive_limit()`.
- Extracted `best_root_child_id()` helper to eliminate duplicated max-visits-child logic.
- Parallel sub-searchers now get independent fresh DAG tables instead of cloning the
  parent's table, reducing memory from O(threads × entries) to O(threads + entries).
- `principal_variation()` is now cycle-safe in DAG mode (tracks visited node IDs).
- Eliminated last production `unwrap()` call in `principal_variation_states()`.
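
To illustrate why rejecting non-finite rewards matters, here is a minimal sketch of a guarded `observe`. The `ArmStats` type and the `Result<(), String>` return type are assumptions made for this example; the changelog does not say how `BanditSearch::observe()` reports a rejection.

```rust
/// Running statistics for one bandit arm (hypothetical type for illustration).
#[derive(Debug, Default)]
struct ArmStats {
    pulls: u64,
    total_reward: f64,
}

impl ArmStats {
    /// Records a reward, rejecting NaN and ±Infinity so that a single bad
    /// observation cannot poison the running mean.
    fn observe(&mut self, reward: f64) -> Result<(), String> {
        if !reward.is_finite() {
            return Err(format!("rejected non-finite reward: {}", reward));
        }
        self.pulls += 1;
        self.total_reward += reward;
        Ok(())
    }

    /// Mean reward over accepted observations; 0.0 before any pull.
    fn mean(&self) -> f64 {
        if self.pulls == 0 {
            0.0
        } else {
            self.total_reward / self.pulls as f64
        }
    }
}

fn main() {
    let mut arm = ArmStats::default();
    arm.observe(1.0).unwrap();
    // The NaN is refused and leaves the statistics untouched.
    assert!(arm.observe(f64::NAN).is_err());
    arm.observe(0.0).unwrap();
    println!("mean = {}", arm.mean());
}
```

Without the `is_finite` guard, one NaN added to `total_reward` would make every later mean NaN as well.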

### Fixed

- 6 adversarial tests updated to match sanitize/reject semantics for NaN/Infinity inputs.

## [0.3.0] — 2026-03-27

### Added

- **Gumbel MuZero tree policy** — hyperparameter-free exploration via Sequential Halving.
  Achieves equivalent search quality in ~16x fewer simulations vs. standard PUCT.
- **DAG transposition tables** — merge identical states reached through different paths,
  compressing the search tree and accelerating convergence. Gated behind `feature = "dag"`.
- **Root parallelism** — lock-free linear scaling across CPU cores via Rayon.
  Gated behind `feature = "parallel"`.
- **Time budgets**: `SearchConfig::time_budget` for wall-clock deadline support
  in real-time systems (game servers, web handlers, security probes).
- **Stepped iteration**: `TreeSearch::run_step()` executes exactly one MCTS iteration
  for fine-grained control, async interleaving, and progress reporting.
- **Principal variation extraction**: `TreeSearch::principal_variation()` returns the
  engine's optimal line of play through the tree.
- **Best root reward**: `TreeSearch::best_root_reward()` returns the average reward
  of the most-visited root action.
- **DAG convenience API**: `enable_dag()`, `disable_dag()`, `dag_hit_count()`.
- **Cycle-safe DAG descent** — automatic detection of graph cycles created by
  transposition reuse, preventing infinite loops in selection.
- **`state_hash()` on `Environment` trait** — optional method for DAG deduplication.
- **`RaveConfig` re-export** from the crate root.
- **`SearchConfigLoadError` conditional export** (behind `feature = "toml"`).
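
The transposition idea above (merging identical states reached through different paths) can be sketched with a hash map keyed by a state hash. Everything here is hypothetical and standalone: `TranspositionTable`, `get_or_insert`, and the free function `state_hash` are illustrative names, not the crate's internals.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Minimal transposition table: maps a state hash to the node that already
/// represents that state. All names here are hypothetical.
#[derive(Default)]
struct TranspositionTable {
    nodes_by_hash: HashMap<u64, usize>,
    hits: u64,
}

impl TranspositionTable {
    /// Returns the existing node for `state_hash`, or registers `new_node_id`
    /// as the node for that state and returns it.
    fn get_or_insert(&mut self, state_hash: u64, new_node_id: usize) -> usize {
        if let Some(&existing) = self.nodes_by_hash.get(&state_hash) {
            self.hits += 1;
            return existing;
        }
        self.nodes_by_hash.insert(state_hash, new_node_id);
        new_node_id
    }
}

/// Toy state hash for a grid position; a real environment would hash
/// whatever fully identifies its state.
fn state_hash(position: (i32, i32)) -> u64 {
    let mut hasher = DefaultHasher::new();
    position.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let mut table = TranspositionTable::default();
    // (1, 1) reached via Right then Up, proposed as node 7.
    let first = table.get_or_insert(state_hash((1, 1)), 7);
    // (1, 1) reached via Up then Right, proposed as node 9: merged into node 7.
    let second = table.get_or_insert(state_hash((1, 1)), 9);
    assert_eq!(first, second);
    println!("node = {}, hits = {}", first, table.hits);
}
```

Both paths end up sharing one node, so statistics gathered along either path accumulate in the same place.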

### Changed

- **RNG engine**: Switched from `rand::rngs::StdRng` to `rand_chacha::ChaCha8Rng`
  across both `TreeSearch` and `BanditSearch` for deterministic, faster, `no_std`-compatible
  rollout simulations.
- **`sanitize()` returns `Vec<String>`** instead of writing to stderr via `eprintln!`.
  Libraries should never write to stderr; callers now receive structured warnings.
- **`BanditConfigBuilder::build()` now calls `sanitize()`**, matching `SearchConfigBuilder`.
- **`toml` is now an optional dependency** behind `feature = "toml"`. Users who construct
  configs programmatically avoid pulling in 5+ transitive dependencies.
- **Removed `async` and `simd` feature stubs** — these were declared but contained zero
  code, violating the no-stubs principle.
- **Version bump to 0.3.0** reflecting the new public API surface.
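
The `sanitize()` change above follows a common library pattern: repair invalid values and hand the warnings back to the caller instead of printing them. A minimal sketch, assuming a made-up `ToyConfig` with illustrative fields and fallback values (none of these are taken from the crate):

```rust
/// Hypothetical config with two tunables; field names and defaults are
/// illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct ToyConfig {
    iterations: u32,
    exploration_constant: f64,
}

impl ToyConfig {
    /// Repairs invalid values in place and returns one warning per repair.
    /// Nothing is written to stderr; the caller decides how to report.
    fn sanitize(&mut self) -> Vec<String> {
        let mut warnings = Vec::new();
        if self.iterations == 0 {
            warnings.push("iterations was 0; reset to 1".to_string());
            self.iterations = 1;
        }
        if !self.exploration_constant.is_finite() || self.exploration_constant < 0.0 {
            warnings.push(format!(
                "exploration_constant {} is invalid; reset to 1.414",
                self.exploration_constant
            ));
            self.exploration_constant = 1.414;
        }
        warnings
    }
}

fn main() {
    let mut config = ToyConfig { iterations: 0, exploration_constant: f64::NAN };
    for warning in config.sanitize() {
        // An application might forward these to its own logger instead.
        println!("config warning: {}", warning);
    }
    println!("sanitized: {:?}", config);
}
```

Returning `Vec<String>` keeps the library silent while still letting an application log, surface, or ignore each warning.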

### Fixed

- Infinite loop when DAG transposition reuse creates graph cycles in reversible environments
  (e.g., `Inc → Dec → Inc...`). Selection now tracks visited node IDs per descent.
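
The fix above can be sketched as a descent that remembers which node IDs it has already visited. This is a simplified stand-in, not the crate's selection code: `descend` is a hypothetical function, and "first unvisited child" replaces the real tree policy.

```rust
use std::collections::HashSet;

/// Descends from `root`, always moving to the first child that has not been
/// visited on this descent. `children[id]` lists the child node IDs of `id`.
/// Choosing the first unvisited child is a placeholder for a real tree policy.
fn descend(children: &[Vec<usize>], root: usize) -> Vec<usize> {
    let mut visited = HashSet::new();
    let mut path = Vec::new();
    let mut current = root;
    loop {
        visited.insert(current);
        path.push(current);
        // Skipping already-visited children is what breaks graph cycles.
        match children[current]
            .iter()
            .copied()
            .find(|child| !visited.contains(child))
        {
            Some(next) => current = next,
            None => return path,
        }
    }
}

fn main() {
    // Node 0 --Inc--> node 1 --Dec--> node 0 again (merged by transposition).
    let cyclic = vec![vec![1], vec![0]];
    println!("{:?}", descend(&cyclic, 0));
}
```

On the two-node cycle the descent visits node 0, then node 1, and stops instead of looping back to node 0 forever.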

## [0.2.1] — 2026-03-26

### Added

- Initial release with UCT, PUCT, Thompson Sampling, RAVE, progressive widening,
  checkpoint/restore, bandit search, TOML config parsing, and comprehensive test suite.

[0.3.0]: https://github.com/santhsecurity/mctrust/compare/v0.2.1...v0.3.0
[0.2.1]: https://github.com/santhsecurity/mctrust/releases/tag/v0.2.1