A Rust port of Jason Hutchens’ 1998 MegaHAL chatbot: a bidirectional Markov-chain engine that learns from text and replies to prompts.
§Quick start
use megahal::MegaHal;
use rand::{rngs::SmallRng, SeedableRng};
let mut hal = MegaHal::new(5, SmallRng::seed_from_u64(42));
hal.learn("the cat sat on the mat");
hal.learn("the dog chased the cat around the yard");
let reply = hal.respond("tell me about the cat");
println!("{reply}");
§Concepts
- Order: The n-gram depth. A model of order N considers up to N preceding tokens. The default is 5.
- Tokens: Words, whitespace, and punctuation. Sentences of fewer than order + 1 tokens are not learned.
- Keywords: Alphanumeric tokens that the model has seen before (excluding banned tokens). Generation biases random walks toward these keywords.
- Generation limits: Stop conditions for response generation, configurable via GenerationLimit.
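The minimum-length rule for tokens can be illustrated with a small standalone sketch. The tokenizer here is an assumption made for illustration: it splits text into alternating runs of alphanumeric and non-alphanumeric characters, which may not match the crate's actual tokenizer. The names tokenize and is_learnable are hypothetical and are not part of the megahal API.

```rust
/// Hypothetical tokenizer (an assumption, not the crate's implementation):
/// splits text into alternating runs of alphanumeric and
/// non-alphanumeric characters, so whitespace and punctuation
/// become tokens of their own.
fn tokenize(text: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = 0;
    let mut prev: Option<bool> = None;
    for (i, ch) in text.char_indices() {
        let is_word = ch.is_alphanumeric();
        if let Some(p) = prev {
            // A change of character class closes the current token.
            if p != is_word {
                tokens.push(&text[start..i]);
                start = i;
            }
        }
        prev = Some(is_word);
    }
    // Push the final run, if the input was not empty.
    if start < text.len() {
        tokens.push(&text[start..]);
    }
    tokens
}

/// The length rule from the Concepts list: inputs with fewer than
/// `order + 1` tokens are skipped.
fn is_learnable(text: &str, order: usize) -> bool {
    tokenize(text).len() >= order + 1
}
```

Under this assumed tokenizer, "the cat sat" yields five tokens (three words and two spaces), so is_learnable("the cat sat", 5) is false, while "the cat sat on the mat" yields eleven tokens and passes.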
§Thread safety
MegaHal<R> is Send + Sync if R: Send + Sync. The type does not perform
internal synchronization.
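Because the type does no locking of its own, callers that share one model across threads need to add synchronization themselves. The following is a minimal sketch of that pattern using Arc and Mutex from the standard library. Model is a hypothetical stand-in so the example compiles without the crate, and it assumes that learning requires exclusive (&mut) access, as the let mut binding in the quick start suggests.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a model (not the real MegaHal type);
/// it only counts the lines it has been given.
#[derive(Default)]
struct Model {
    lines_learned: usize,
}

impl Model {
    /// Assumed to need `&mut self`, like a learning call would.
    fn learn(&mut self, _line: &str) {
        self.lines_learned += 1;
    }
}

/// Feeds each line to a shared model from its own thread and
/// returns how many lines the model saw.
fn learn_concurrently(lines: Vec<String>) -> usize {
    let shared = Arc::new(Mutex::new(Model::default()));
    let handles: Vec<_> = lines
        .into_iter()
        .map(|line| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // The mutex guard provides the exclusive access.
                shared.lock().expect("mutex poisoned").learn(&line);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let count = shared.lock().expect("mutex poisoned").lines_learned;
    count
}
```

The same wrapping should apply to a real MegaHal<R> when R: Send. An RwLock would only help if some operations take &self, which this page does not state.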
§Brain file format
Brain files start with MHALRUST followed by a one-byte version and a
bincode-encoded model. The format is not compatible with the original
C MegaHAL’s .brn files.
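The header layout described above can be checked before handing data to the loader. This is a sketch based only on that description (eight magic bytes, then one version byte, then the payload). The function name split_brain_header is hypothetical, and since the valid version values are not stated here, the version byte is returned without being interpreted.

```rust
/// Magic bytes at the start of a brain file, as described above.
const MAGIC: &[u8] = b"MHALRUST";

/// Hypothetical helper (not part of the megahal API): returns the
/// version byte and the remaining bincode payload, or `None` if the
/// magic is missing or the data ends before the version byte.
fn split_brain_header(bytes: &[u8]) -> Option<(u8, &[u8])> {
    // Reject anything that does not begin with the magic bytes.
    let rest = bytes.strip_prefix(MAGIC)?;
    // The next byte is the format version; the rest is the payload.
    let (&version, payload) = rest.split_first()?;
    Some((version, payload))
}
```

A check like this gives an early, clear failure when a file from the original C MegaHAL is supplied by mistake, since those files would not be expected to begin with this magic.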
§Shipping a pre-trained brain
To embed a trained model in your binary, serialize it during a build step and load the serialized bytes at runtime:
use megahal::MegaHal;
use rand::{rngs::SmallRng, SeedableRng};
use std::io::Cursor;
const BRAIN: &[u8] = include_bytes!("../assets/bot.brn");
let mut hal = MegaHal::new(5, SmallRng::seed_from_u64(42));
hal.load_brain_from_reader(&mut Cursor::new(BRAIN))
.expect("valid brain data");
§See also
- The examples directory in the repository source.
- The megahal-cli crate for the megahal command-line binary.
Structs§
- KeywordConfig: Configuration for keyword extraction.
- MegaHal: The MegaHAL conversational engine.
- MegaHalSymbol: The MegaHAL symbol type: a case-insensitive byte string.
- SwapTable: Perspective-swapping substitution table.
Enums§
- GenerationLimit: Controls how many candidate replies are generated before selecting the best.
- MegaHalError: Errors returned by brain serialization APIs.
Functions§
- extract_keywords: Extract keywords from tokenized input per the MegaHAL two-pass algorithm.
- load_swap_file: Load a swap file (space/tab-separated pairs, one per line).
- load_word_list: Load a keyword list file (one word per line, comments with #).