# muxer
Multi-objective bandit routing.
Select among K arms (models, endpoints, backends) using per-call outcomes. Handles non-stationary reward distributions.
See the API docs and `examples/EXPERIMENTS.md` for derivations and failure modes.
## Usage

```toml
[dependencies]
muxer = "0.5.0"
```

Deterministic core only (no stochastic bandits):

```toml
[dependencies]
muxer = { version = "0.5", default-features = false }
```
## Quickstart

A minimal routing loop. The import path, type names, and `observe` call below are illustrative placeholders; see the API docs for the exact signatures:

```rust
use muxer::Router; // assumed path

let arms = vec!["model-a", "model-b", "model-c"];
let mut router = Router::new(&arms).unwrap();
loop {
    let arm = router.select();
    let outcome = call_backend(arm); // your per-call outcome
    router.observe(arm, outcome);
}
```

For larger arm counts, pass k > 1 to batch exploration:

```rust
let cfg = Config::default().with_coverage(3); // `Config` name assumed
let d = router.select(); // K=30, k=3 -> coverage in ~10 rounds
```
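The round count in the comment above follows from round-robin coverage: with K arms and k selections per round, touching every arm once takes ⌈K/k⌉ rounds. A quick check:

```rust
// Ceiling division: rounds needed to touch every one of `k_arms` arms once,
// selecting `batch` arms per round.
fn coverage_rounds(k_arms: u32, batch: u32) -> u32 {
    (k_arms + batch - 1) / batch
}

fn main() {
    // K = 30 arms, k = 3 per round -> 10 rounds.
    println!("{}", coverage_rounds(30, 3));
}
```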
## Examples

### Deterministic multi-objective selection

The per-arm summary values and the `select_mab` arguments below are illustrative placeholders:

```rust
use muxer::select_mab; // assumed path
use std::collections::BTreeMap;

let arms = vec!["a", "b"];
let mut summaries = BTreeMap::new();
summaries.insert("a", /* summary with junk rate 0.10 */);
summaries.insert("b", /* summary with junk rate 0.02 */);
let sel = select_mab(&arms, &summaries); // arguments assumed
assert_eq!(sel, "b"); // lower junk rate wins
```
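One way to read "lower junk rate wins": deterministic multi-objective selection can compare per-arm summaries lexicographically, junk rate first, with a second objective as tie-break. A generic sketch of that idea only; the `Summary` struct and `pick` function here are hypothetical, not this crate's API:

```rust
// Hypothetical per-arm outcome summary with two objectives (both "lower is better").
#[derive(Clone, Copy)]
struct Summary {
    junk_rate: f64,
    latency_ms: f64,
}

// Pick the arm whose (junk_rate, latency_ms) tuple is lexicographically
// smallest: junk rate decides first, latency breaks ties.
fn pick<'a>(arms: &[(&'a str, Summary)]) -> &'a str {
    arms.iter()
        .min_by(|x, y| {
            (x.1.junk_rate, x.1.latency_ms)
                .partial_cmp(&(y.1.junk_rate, y.1.latency_ms))
                .unwrap()
        })
        .unwrap()
        .0
}

fn main() {
    let arms = [
        ("a", Summary { junk_rate: 0.10, latency_ms: 20.0 }),
        ("b", Summary { junk_rate: 0.02, latency_ms: 35.0 }),
    ];
    // "b" wins on junk rate despite its higher latency.
    println!("{}", pick(&arms));
}
```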
### Detect-then-triage

A monitored session observes per-call outcomes and surfaces alarmed arms for triage. The type name, constructor, and `observe` arguments below are illustrative placeholders:

```rust
use muxer::MonitoredSession; // assumed path

let arms = vec!["a", "b"];
let mut session = MonitoredSession::new(&arms).unwrap(); // constructor assumed
session.observe(/* arm, outcome */);
session.observe(/* arm, outcome */);
let alarmed = session.alarmed_arms();
let bins = session.tracker.active_bins();
let cells = session.top_alarmed_cells();
```
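Alarm-style change detection of this kind is often a one-sided CUSUM in the spirit of Page (1954) (see References). A self-contained sketch of that detector, not muxer's internal tracker; all names and thresholds here are made up:

```rust
// One-sided CUSUM (Page 1954): accumulate positive drift of (x - reference)
// beyond a slack allowance, and alarm when the statistic crosses a threshold.
struct Cusum {
    reference: f64, // expected level under "no change"
    slack: f64,     // allowance before drift accumulates
    threshold: f64, // alarm level h
    stat: f64,      // running statistic, clipped at zero
}

impl Cusum {
    fn new(reference: f64, slack: f64, threshold: f64) -> Self {
        Self { reference, slack, threshold, stat: 0.0 }
    }

    /// Feed one observation; returns true when the detector alarms.
    fn observe(&mut self, x: f64) -> bool {
        self.stat = (self.stat + (x - self.reference - self.slack)).max(0.0);
        self.stat > self.threshold
    }
}

fn main() {
    let mut d = Cusum::new(0.1, 0.05, 1.0);
    // In-control observations keep the statistic pinned at zero...
    for _ in 0..20 {
        assert!(!d.observe(0.1));
    }
    // ...then a shift in the junk rate trips the alarm within a few calls.
    let mut alarmed = false;
    for _ in 0..10 {
        if d.observe(0.6) {
            alarmed = true;
            break;
        }
    }
    println!("alarmed: {alarmed}");
}
```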
## Runnable examples

Start here:

- Algorithm variants: `deterministic_router`, `thompson_router`, `exp3ix_router`, `contextual_router` (requires the `contextual` feature), `sticky_mab_router`, `monitored_router`.
- Domain harnesses simulate realistic routing with injected drift: NLP (`matrix_harness`), network security (`pcap_triage_harness`), ad ranking, fraud scoring, clinical triage, search ranking.

See `examples/` for 25+ examples and `examples/EXPERIMENTS.md` for mini-experiments on trade-offs and failure modes.
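The injected drift the harnesses simulate can be pictured as a piecewise-stationary arm whose mean reward shifts at a change point (a deterministic toy; the change point and means here are made up):

```rust
// Toy piecewise-stationary arm: mean reward drops from 0.8 to 0.3 at t = 50,
// the kind of drift that non-stationary routing must detect and adapt to.
fn mean_reward(t: usize) -> f64 {
    if t < 50 { 0.8 } else { 0.3 }
}

fn main() {
    let before: f64 = (0..50).map(mean_reward).sum::<f64>() / 50.0;
    let after: f64 = (50..100).map(mean_reward).sum::<f64>() / 50.0;
    // A router that never re-explores keeps paying the pre-shift estimate.
    println!("mean before: {before:.1}, after: {after:.1}");
}
```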
## Development
Quickstart guide | API docs | Changelog
## References
- P. Auer, N. Cesa-Bianchi, and P. Fischer. "Finite-time analysis of the multiarmed bandit problem." Machine Learning, 47(2-3):235--256, 2002.
- S. Agrawal and N. Goyal. "Analysis of Thompson Sampling for the Multi-armed Bandit Problem." COLT, 2012.
- P. Auer, N. Cesa-Bianchi, Y. Freund, and R. E. Schapire. "The Nonstochastic Multiarmed Bandit Problem." SIAM J. Comput., 32(1):48--77, 2002.
- W. Chu, L. Li, L. Reyzin, and R. Schapire. "Contextual Bandits with Linear Payoff Functions." AISTATS, 2011.
- E. S. Page. "Continuous inspection schemes." Biometrika, 41(1-2):100--115, 1954.
- A. Garivier and E. Moulines. "On Upper-Confidence Bound Policies for Switching Bandit Problems." ALT, 2011.
- M. M. Drugan and A. Nowé. "Designing multi-objective multi-armed bandits algorithms: A study." IJCNN, 2013.
- L. Besson, E. Kaufmann, O.-A. Maillard, and J. Seznec. "Efficient Change-Point Detection for Tackling Piecewise-Stationary Bandits." arXiv:1902.01575, 2019.
- M. Ehrgott and S. Nickel. "On the number of criteria needed to decide Pareto optimality." Math. Meth. Oper. Res., 55:329--345, 2002.
- T. Banerjee and V. V. Veeravalli. "Data-efficient quickest change detection." arXiv:1211.3729, 2012.
- V. Hadad, D. A. Hirshberg, R. Zhan, S. Wager, and S. Athey. "Confidence Intervals for Policy Evaluation in Adaptive Experiments." arXiv:1911.02768, 2021.
## License
Licensed under MIT or Apache-2.0 (LICENSE-MIT, LICENSE-APACHE).