treant-gumbel 0.1.0

Gumbel MuZero search: Sequential Halving with Gumbel noise for MCTS. Built on the treant crate.

treant-gumbel

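Gumbel MuZero search for Rust: Sequential Halving with Gumbel noise for Monte Carlo Tree Search. The search is designed for policy improvement: per the paper cited below, the resulting policy is guaranteed to improve on the prior when action values are evaluated correctly, including at small simulation budgets.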

Built on top of the treant crate, reusing its GameState trait so any game works with both standard MCTS and Gumbel search.

Based on Danihelka et al., "Policy improvement by planning with Gumbel" (ICLR 2022).
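To make "Sequential Halving with Gumbel noise" concrete, here is a minimal root-level sketch of the idea as described in the paper. It does not use the treant-gumbel API: every name in it (XorShift, gumbel_sequential_halving, the evaluate callback, and the C_VISIT and C_SCALE constants) is hypothetical and chosen for illustration. It also treats each action as a bandit arm with a caller-supplied value estimate instead of growing a search tree.

```rust
// Illustrative constants for the Q-value transform (assumed, not taken from the crate).
const C_VISIT: f64 = 50.0;
const C_SCALE: f64 = 1.0;

/// Tiny xorshift64* generator so the sketch needs no external crates.
/// The seed must be non-zero.
struct XorShift(u64);

impl XorShift {
    /// Uniform sample strictly inside (0, 1), built from 52 random bits.
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 >> 12;
        self.0 ^= self.0 << 25;
        self.0 ^= self.0 >> 27;
        let bits = self.0.wrapping_mul(0x2545_F491_4F6C_DD1D) >> 12;
        (bits as f64 + 0.5) / (1u64 << 52) as f64
    }

    /// Standard Gumbel(0, 1) sample: -ln(-ln(u)).
    fn gumbel(&mut self) -> f64 {
        -(-self.next_f64().ln()).ln()
    }
}

/// Picks a root action. `logits` are the prior policy logits, `budget` is the
/// total number of evaluations, `m` is how many actions to consider, and
/// `evaluate(a)` returns one value estimate for action `a` (hypothetical callback).
fn gumbel_sequential_halving(
    logits: &[f64],
    budget: usize,
    m: usize,
    rng: &mut XorShift,
    mut evaluate: impl FnMut(usize) -> f64,
) -> Option<usize> {
    let k = logits.len();
    // One Gumbel sample per action, reused across all phases.
    let g: Vec<f64> = (0..k).map(|_| rng.gumbel()).collect();

    // Gumbel-top-m: the m largest g + logits sample m actions without replacement.
    let mut cand: Vec<usize> = (0..k).collect();
    cand.sort_by(|&a, &b| (g[b] + logits[b]).total_cmp(&(g[a] + logits[a])));
    cand.truncate(m.min(k).max(1));

    let phases = (cand.len() as f64).log2().ceil().max(1.0) as usize;
    let mut q_sum = vec![0.0; k];
    let mut visits = vec![0usize; k];

    while cand.len() > 1 {
        // Split the budget evenly over phases and over surviving candidates.
        let per_action = (budget / (phases * cand.len())).max(1);
        for &a in &cand {
            for _ in 0..per_action {
                q_sum[a] += evaluate(a);
                visits[a] += 1;
            }
        }
        // Rank by g + logits + sigma(q) and keep the better half.
        let max_visits = visits.iter().copied().max().unwrap_or(0) as f64;
        let scale = (C_VISIT + max_visits) * C_SCALE;
        let score = |a: usize| g[a] + logits[a] + scale * (q_sum[a] / visits[a] as f64);
        cand.sort_by(|&a, &b| score(b).total_cmp(&score(a)));
        let keep = (cand.len() + 1) / 2;
        cand.truncate(keep);
    }
    cand.first().copied()
}

fn main() {
    let mut rng = XorShift(0x9E37_79B9_7F4A_7C15);
    let q = [0.0, 1.0, 0.0, 0.0]; // toy per-action values
    let best = gumbel_sequential_halving(&[0.0; 4], 32, 4, &mut rng, |a| q[a]);
    println!("selected action: {:?}", best);
}
```

The same Gumbel samples are reused in every phase, so the only thing that changes between phases is the value estimate. Each halving step can therefore only replace a candidate with one whose value-adjusted score is higher.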

When to use

  • Self-play training with guaranteed policy improvement
  • Distilling search into a neural network
  • Low simulation budgets where PUCT degrades
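For the distillation use case, the paper trains the network toward an improved policy of the form softmax(logits + sigma(completed Q)), where unvisited actions fall back to a value estimate. The sketch below illustrates that target under simplifying assumptions: the function names, the constants, and the single value_estimate fallback are hypothetical and are not the treant-gumbel API.

```rust
// Illustrative constants for the Q-value transform (assumed, not taken from the crate).
const C_VISIT: f64 = 50.0;
const C_SCALE: f64 = 1.0;

/// Numerically stable softmax; returns an empty vector for empty input.
fn softmax(xs: &[f64]) -> Vec<f64> {
    let max = xs.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = xs.iter().map(|x| (x - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Improved policy target: softmax(logits + sigma(completed_q)).
/// `q[a]` is `None` for actions the search never visited; those are
/// "completed" with `value_estimate` (a simplification of the paper's
/// mixed value estimate).
fn improved_policy(
    logits: &[f64],
    q: &[Option<f64>],
    visits: &[u32],
    value_estimate: f64,
) -> Vec<f64> {
    let max_visits = visits.iter().copied().max().unwrap_or(0) as f64;
    // sigma is a monotonically increasing transform that grows with visit count.
    let scale = (C_VISIT + max_visits) * C_SCALE;
    let scores: Vec<f64> = logits
        .iter()
        .zip(q)
        .map(|(&logit, &q_a)| logit + scale * q_a.unwrap_or(value_estimate))
        .collect();
    softmax(&scores)
}

fn main() {
    let logits = [0.0, 0.0, 0.0];
    let q = [Some(0.2), Some(0.8), None]; // third action was never visited
    let visits = [4, 8, 0];
    let target = improved_policy(&logits, &q, &visits, 0.2);
    println!("training target: {:?}", target);
}
```

A network trained with a cross-entropy loss against this target is pulled toward actions whose searched value beats the fallback estimate, while unvisited actions keep their prior ordering relative to each other.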

Example

use treant_gumbel::{GumbelConfig, GumbelSearch};
// See the docs for a complete example.

License

MIT