mctrust 0.4.0 - Docs.rs

#![cfg_attr(
    not(test),
    deny(
        clippy::unwrap_used,
        clippy::expect_used,
        clippy::todo,
        clippy::unimplemented,
        clippy::panic
    )
)]
#![allow(
    clippy::module_name_repetitions,
    clippy::must_use_candidate,
    clippy::missing_errors_doc
)]
//! # mctrust — Production-grade Monte Carlo Tree Search
//!
//! A generic, zero-domain-assumption MCTS engine for any sequential
//! decision-making problem: game AI, portfolio optimization, security
//! scan planning, hyperparameter tuning, molecular design, compiler
//! optimization, robot path planning, or anything with a state space
//! to explore.
//!
//! ## Design
//!
//! This crate provides two complementary search interfaces:
//!
//! 1. **Environment-based** ([`TreeSearch`]) — Full tree-search MCTS. You define
//!    an [`Environment`] with actions, state transitions, and terminal conditions.
//!    The engine handles selection, expansion, simulation, and backpropagation.
//!    Use for: optimization, planning, game AI, theorem proving, scheduling.
//!
//! 2. **Bandit-based** ([`BanditSearch`]) — Two-level MCTS for explore/exploit
//!    problems where you have a flat set of options grouped by category.
//!    The engine uses UCT+RAVE to decide which option to try next, and you
//!    feed back rewards. Use for: fuzzing, A/B testing, hyperparameter search,
//!    scan strategy optimization, any problem with pre-enumerated choices.
//!
//! ## Feature Flags
//!
//! | Feature    | Default | Description |
//! |------------|---------|-------------|
//! | `std`      | ✅      | Enables `rand/std` and `StdRng` — disable for `#![no_std]` targets |
//! | `parallel` | ✅      | Root-parallel MCTS via Rayon — linear scaling across CPU cores |
//! | `dag`      | ✅      | Transposition Table (DAG) via `hashbrown` — compresses identical states |
//! | `toml`     | ✅      | Enables TOML config parsing (`from_toml_str`, `from_toml_file`) |
//!
//! ## Quick Start (Environment-based)
//!
//! ```rust
//! use mctrust::{Environment, Outcome, TreeSearch, SearchConfig, Reward};
//!
//! #[derive(Clone)]
//! struct Optimizer { value: f64 }
//!
//! #[derive(Clone, Debug, PartialEq)]
//! enum Action { Increase, Decrease }
//!
//! impl Environment for Optimizer {
//!     type Action = Action;
//!
//!     fn legal_actions(&self) -> Vec<Action> {
//!         vec![Action::Increase, Action::Decrease]
//!     }
//!
//!     fn apply(&mut self, action: &Action) {
//!         match action {
//!             Action::Increase => self.value += 0.1,
//!             Action::Decrease => self.value -= 0.1,
//!         }
//!     }
//!
//!     fn evaluate(&self) -> Outcome {
//!         if (self.value - 1.0).abs() < 0.01 {
//!             Outcome::Terminal(Reward::WIN)
//!         } else if self.value < -5.0 || self.value > 5.0 {
//!             Outcome::Terminal(Reward::LOSS)
//!         } else {
//!             Outcome::Ongoing
//!         }
//!     }
//! }
//!
//! let state = Optimizer { value: 0.0 };
//! let config = SearchConfig::default();
//! let mut search = TreeSearch::new(state, config);
//! let best_action = search.run();
//! ```
//!
//! ## Quick Start (Bandit-based)
//!
//! ```rust
//! use mctrust::{BanditSearch, BanditConfig};
//!
//! let mut search = BanditSearch::new(BanditConfig::default());
//!
//! // Register 100 options across 5 categories.
//! for category in 0..5u64 {
//!     for option in 0..20u64 {
//!         search.add_arm(category * 20 + option, category as u32);
//!     }
//! }
//!
//! // Explore and feed rewards.
//! while let Some(option_id) = search.next_arm() {
//!     let reward = 0.5; // your evaluation
//!     search.observe(option_id, reward);
//! }
//! ```
//!
//! ## Capabilities
//!
//! - **UCT selection** with configurable exploration constant
//! - **PUCT** (AlphaZero-style) with pluggable action priors
//! - **Gumbel `MuZero`** — hyperparameter-free Sequential Halving
//! - **Thompson Sampling** for stochastic exploration
//! - **RAVE** for rapid cross-branch value estimation
//! - **Pluggable `Evaluator` trait** — replace random rollout with neural networks or domain heuristics
//! - **Multi-agent support** — negamax-style backpropagation for adversarial environments
//! - **Tree reuse** — re-root the tree after each decision, preserving subtree statistics
//! - **DAG transposition tables** — merge identical states, compress the tree
//! - **Root parallelism** via Rayon — lock-free linear scaling across cores
//! - **Node budget** — bound memory usage with `with_max_nodes()`
//! - **`run_until(predicate)`** — arbitrary stop conditions (reward threshold, convergence, cancellation)
//! - **Principal variation extraction** — follow the best line through the tree (actions + states)
//! - **Graphviz DOT export** — visualize the search tree at any depth
//! - **Checkpoint/restore** — pause and resume searches across processes
//! - **Progressive widening** for continuous or massive action spaces
//! - **Stepped iteration** — `run_step()` for fine-grained control, async interleaving
//! - **Time and iteration budgets** — wall-clock deadline support for real-time systems
//! - **Deterministic RNG** via `ChaCha8Rng`
//! - **Zero domain assumptions** — works for any sequential decision problem
//!
//! ## References
//!
//! - Kocsis & Szepesvári, "Bandit-based Monte Carlo Planning" (2006)
//! - Gelly & Silver, "Monte-Carlo tree search and rapid action value estimation
//!   in computer Go" (2011)
//! - Browne et al., "A Survey of Monte Carlo Tree Search Methods" (2012)
//! - Danihelka et al., "Policy Improvement by Planning with Gumbel" (2022)

#![warn(clippy::pedantic)]
#![forbid(unsafe_code)]
#![warn(missing_docs)]
#![allow(clippy::must_use_candidate)]
#![allow(clippy::needless_pass_by_value)]

mod bandit;
mod config;
mod environment;
mod game_search;
mod node;
mod reward;

#[cfg(test)]
mod adversarial_tests;

pub use bandit::{BanditConfig, BanditSearch, BanditSearchCheckpoint, GroupStats, Scalarizer};
#[cfg(feature = "toml")]
pub use config::SearchConfigLoadError;
pub use config::{ProgressiveWideningConfig, RaveConfig, SearchConfig, TreePolicy};
pub use environment::Evaluator;
pub use environment::Heuristic;
pub use environment::{Environment, Outcome};
pub use game_search::{TreeSearch, TreeSearchCheckpoint};
pub use node::NodeStats;
pub use reward::Reward;