# Changelog
All notable changes to `mctrust` will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.4.0] — 2026-03-27
### Added
- **Pluggable `Evaluator` trait** — replace random rollouts with neural networks, domain
heuristics, or any custom evaluation function. This is the integration point for
AlphaZero / MuZero-style search.
- **Multi-agent support** — `Environment::current_player()` and `num_players()` enable
negamax-style reward flipping for adversarial environments.
- **Tree reuse** — `advance_to_action()` re-roots the search tree at the chosen child,
preserving the entire subtree and all accumulated statistics. MCTS engines such as
Leela Chess Zero and KataGo use the same technique.
- **`run_until(predicate)`** — arbitrary stop conditions: target reward, convergence
detection, external cancellation, wall-clock limits, etc.
- **Node budget** — `with_max_nodes(limit)` bounds memory usage. Expansion stops after the
limit; the engine continues selection + simulation on existing nodes.
- **`principal_variation_states()`** — replays the PV through the environment, returning
the full sequence of intermediate states for debugging and visualization.
- **Graphviz DOT export** — `export_dot(depth)` produces a Graphviz representation of
the search tree. Pipe to `dot -Tsvg` for visual inspection.
- 8 new unit tests covering evaluator, node budget, PV states,
`run_until`, DOT export, and tree reuse.
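
To illustrate the tree-reuse entry above, here is a minimal sketch of re-rooting on a toy owned tree. The `Node` type and its fields are hypothetical and are not `mctrust`'s internal representation; only the method name mirrors the entry.

```rust
// Hypothetical node type for illustration only; not mctrust's internals.
#[derive(Debug)]
struct Node {
    action: Option<u32>, // action that led to this node (None at the root)
    visits: u32,
    total_reward: f64,
    children: Vec<Node>,
}

impl Node {
    /// Consume the tree and return the child reached by `action` as the new
    /// root. The child's subtree and statistics are kept as they are; sibling
    /// subtrees are dropped. Returns `None` if no child matches.
    fn advance_to_action(self, action: u32) -> Option<Node> {
        let mut child = self
            .children
            .into_iter()
            .find(|c| c.action == Some(action))?;
        child.action = None; // the new root has no incoming action
        Some(child)
    }
}
```

Taking `self` by value makes the old root unusable after re-rooting, so stale references to discarded siblings cannot survive the call.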
### Changed
- **Domain-neutral API renaming** (no public users yet, so this is a clean break before v1):
  - `GameSearch` → `TreeSearch`
  - `GameSearchCheckpoint` → `TreeSearchCheckpoint`
  - `GameState` → `Outcome`
  - `Outcome::Win` → `Outcome::Success`
  - `Outcome::Loss` → `Outcome::Failure`
  - `Outcome::Draw` → `Outcome::Neutral`
- `BanditConfig::sanitize()` now returns `Vec<String>` (matching `SearchConfig::sanitize()`),
eliminating the library's last remaining `eprintln!` call.
- `BanditSearch::observe()` now rejects non-finite rewards (NaN, ±Infinity) to prevent
silent score poisoning.
- `pick_best_child()` no longer calls `legal_actions()` when the tree policy is not PUCT,
removing wasted work on the hot path in UCT, Thompson, and Gumbel modes.
- Removed redundant `.max()` call in `progressive_limit()`.
- Extracted `best_root_child_id()` helper to eliminate duplicated max-visits-child logic.
- Parallel sub-searchers now get independent fresh DAG tables instead of cloning the
parent's table, reducing memory from O(threads × entries) to O(threads + entries).
- `principal_variation()` is now cycle-safe in DAG mode (tracks visited node IDs).
- Eliminated the last production `unwrap()` call in `principal_variation_states()`.
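
The non-finite reward check described above can be sketched as follows. `ArmStats` is a hypothetical running-mean accumulator, not `mctrust`'s `BanditSearch`, and returning a `Result` is an assumption; the library may signal rejection differently.

```rust
/// Hypothetical per-arm statistic for illustration; not mctrust's type.
#[derive(Debug, Default)]
struct ArmStats {
    pulls: u64,
    mean: f64,
}

impl ArmStats {
    /// Record a reward. NaN and ±Infinity are rejected before they touch the
    /// running mean, because a single such value would corrupt it for good.
    /// (Returning `Result` here is an assumption made for this sketch.)
    fn observe(&mut self, reward: f64) -> Result<(), String> {
        if !reward.is_finite() {
            return Err(format!("non-finite reward rejected: {reward}"));
        }
        self.pulls += 1;
        // Incremental mean update: mean += (x - mean) / n.
        self.mean += (reward - self.mean) / self.pulls as f64;
        Ok(())
    }
}
```

Checking before the update matters: once a NaN enters an incremental mean, every later update also yields NaN.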
### Fixed
- 6 adversarial tests updated to match sanitize/reject semantics for NaN/Infinity inputs.
## [0.3.0] — 2026-03-27
### Added
- **Gumbel MuZero tree policy** — hyperparameter-free exploration via Sequential Halving.
Can reach comparable search quality with roughly 16x fewer simulations than standard PUCT.
- **DAG transposition tables** — merge identical states reached through different paths,
compressing the search tree and accelerating convergence. Gated behind `feature = "dag"`.
- **Root parallelism** — lock-free linear scaling across CPU cores via Rayon.
Gated behind `feature = "parallel"`.
- **Time budgets** — `SearchConfig::time_budget` for wall-clock deadline support
in real-time systems (game servers, web handlers, security probes).
- **Stepped iteration** — `TreeSearch::run_step()` executes exactly one MCTS iteration
for fine-grained control, async interleaving, and progress reporting.
- **Principal variation extraction** — `TreeSearch::principal_variation()` returns the
line of play the engine currently rates best through the tree.
- **Best root reward** — `TreeSearch::best_root_reward()` returns the average reward
of the most-visited root action.
- **DAG convenience API** — `enable_dag()`, `disable_dag()`, `dag_hit_count()`.
- **Cycle-safe DAG descent** — automatic detection of graph cycles created by
transposition reuse, preventing infinite loops in selection.
- **`state_hash()` on `Environment` trait** — optional method for DAG deduplication.
- **`RaveConfig` re-export** from the crate root.
- **`SearchConfigLoadError` conditional export** (behind `feature = "toml"`).
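
As a rough illustration of how a state hash enables DAG deduplication, the sketch below maps hashes to node IDs and counts hits. The `Transpositions` type, the free function `state_hash`, and the `Counter` state are all hypothetical; the actual `Environment::state_hash()` signature and table layout may differ.

```rust
use std::collections::hash_map::{DefaultHasher, Entry};
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Toy state used only for this example.
#[derive(Hash, Clone, PartialEq, Eq, Debug)]
struct Counter {
    value: i32,
}

/// Hash any `Hash` state to a `u64` key (hypothetical helper).
fn state_hash<S: Hash>(state: &S) -> u64 {
    let mut hasher = DefaultHasher::new();
    state.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical transposition table: state hash -> node id.
#[derive(Default)]
struct Transpositions {
    table: HashMap<u64, usize>,
    hits: u64,
}

impl Transpositions {
    /// Return the node id already registered for this state, or register
    /// `fresh_id` and return it. Two states with equal hashes are merged.
    fn lookup_or_insert<S: Hash>(&mut self, state: &S, fresh_id: usize) -> usize {
        match self.table.entry(state_hash(state)) {
            Entry::Occupied(e) => {
                self.hits += 1;
                *e.get()
            }
            Entry::Vacant(e) => {
                e.insert(fresh_id);
                fresh_id
            }
        }
    }
}
```

Because the table is keyed on the hash alone, a collision would merge two distinct states; a production table would usually guard against that, for example by also comparing the states.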
### Changed
- **RNG engine**: Switched from `rand::rngs::StdRng` to `rand_chacha::ChaCha8Rng`
across both `TreeSearch` and `BanditSearch` for deterministic, faster, `no_std`-compatible
rollout simulations.
- **`sanitize()` returns `Vec<String>`** instead of writing to stderr via `eprintln!`.
Libraries should never write to stderr; callers now receive structured warnings.
- **`BanditConfigBuilder::build()` now calls `sanitize()`**, matching `SearchConfigBuilder`.
- **`toml` is now an optional dependency** behind `feature = "toml"`. Users who construct
configs programmatically avoid pulling in 5+ transitive dependencies.
- **Removed `async` and `simd` feature stubs** — these were declared but contained zero
code, violating the no-stubs principle.
- **Version bump to 0.3.0** reflecting the new public API surface.
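
The `sanitize()` change above follows a common library pattern: repair invalid values and hand the warnings back to the caller instead of printing them. A minimal sketch, assuming a hypothetical `Config` with made-up fields and defaults (not `mctrust`'s `SearchConfig`):

```rust
/// Hypothetical config for illustration; field names and defaults are made up.
#[derive(Debug, Clone)]
struct Config {
    exploration: f64,
    iterations: u32,
}

impl Config {
    /// Clamp invalid fields to safe defaults and return one warning per fix.
    /// Nothing is written to stderr; the caller decides how to report.
    fn sanitize(&mut self) -> Vec<String> {
        let mut warnings = Vec::new();
        if !self.exploration.is_finite() || self.exploration < 0.0 {
            warnings.push(format!(
                "exploration {} is invalid; reset to 1.0",
                self.exploration
            ));
            self.exploration = 1.0;
        }
        if self.iterations == 0 {
            warnings.push("iterations was 0; reset to 1".to_string());
            self.iterations = 1;
        }
        warnings
    }
}
```

A caller can forward the returned strings to its own logger, or ignore them entirely.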
### Fixed
- Infinite loop when DAG transposition reuse creates graph cycles in reversible environments
(e.g., `Inc → Dec → Inc...`). Selection now tracks visited node IDs per descent.
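
The per-descent visited-set idea can be sketched on a toy graph. Here `best_child[i]` is a hypothetical lookup giving the child selected at node `i`; the real selection logic is not shown.

```rust
use std::collections::HashSet;

/// Follow best-child links from `root`, stopping at a leaf or at the first
/// node that was already visited on this descent (which indicates a cycle).
/// `best_child` is a hypothetical stand-in for the engine's selection step.
fn descend(best_child: &[Option<usize>], root: usize) -> Vec<usize> {
    let mut path = Vec::new();
    let mut visited = HashSet::new();
    let mut current = Some(root);
    while let Some(id) = current {
        // `insert` returns false if `id` was already in the set.
        if !visited.insert(id) {
            break;
        }
        path.push(id);
        current = best_child.get(id).copied().flatten();
    }
    path
}
```

With links `0 → 1 → 2 → 1`, the descent returns `[0, 1, 2]` and stops instead of looping. The visited set is rebuilt on every descent, so revisiting a node on a later iteration is still allowed.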
## [0.2.1] — 2026-03-26
### Added
- Initial release with UCT, PUCT, Thompson Sampling, RAVE, progressive widening,
checkpoint/restore, bandit search, TOML config parsing, and comprehensive test suite.

[0.4.0]: https://github.com/santhsecurity/mctrust/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/santhsecurity/mctrust/compare/v0.2.1...v0.3.0
[0.2.1]: https://github.com/santhsecurity/mctrust/releases/tag/v0.2.1