
Crate megahal


A Rust port of Jason Hutchens’ 1998 MegaHAL chatbot: a bidirectional Markov-chain engine that learns from text and replies to prompts.

§Quick start

use megahal::MegaHal;
use rand::{rngs::SmallRng, SeedableRng};

// Order-5 model driven by a fixed-seed RNG.
let mut hal = MegaHal::new(5, SmallRng::seed_from_u64(42));
// Train on a couple of sentences, then ask for a reply.
hal.learn("the cat sat on the mat");
hal.learn("the dog chased the cat around the yard");
let reply = hal.respond("tell me about the cat");
println!("{reply}");

§Concepts

  • Order: The n-gram depth. A model of order N considers up to N preceding tokens. The default is 5.
  • Tokens: Words, whitespace, and punctuation. Sentences with fewer than order + 1 tokens are not learned.
  • Keywords: Alphanumeric tokens that the model has seen before (excluding banned tokens). Generation biases random walks toward these keywords.
  • Generation limits: Stop conditions for response generation, configurable via GenerationLimit.
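To make the order + 1 rule concrete, here is a minimal sketch that uses only the standard library. It assumes a simplified tokenizer that splits text into alternating runs of alphanumeric and non-alphanumeric characters; the crate's real tokenizer may differ, and the names tokenize and long_enough_to_learn are hypothetical helpers, not part of the megahal API.

```rust
/// Illustrative tokenizer (an assumption, not the crate's implementation):
/// splits `text` into alternating runs of alphanumeric and
/// non-alphanumeric characters.
fn tokenize(text: &str) -> Vec<&str> {
    let mut tokens = Vec::new();
    let mut start = 0;
    let mut prev_is_word: Option<bool> = None;
    for (i, ch) in text.char_indices() {
        let is_word = ch.is_alphanumeric();
        if let Some(prev) = prev_is_word {
            // A change of character class closes the current token.
            if prev != is_word {
                tokens.push(&text[start..i]);
                start = i;
            }
        }
        prev_is_word = Some(is_word);
    }
    // Push the final run, if any.
    if start < text.len() {
        tokens.push(&text[start..]);
    }
    tokens
}

/// Returns true when the sentence has at least `order + 1` tokens,
/// the minimum length described above for a sentence to be learned.
fn long_enough_to_learn(text: &str, order: usize) -> bool {
    tokenize(text).len() >= order + 1
}
```

Under this simplified tokenizer, "the cat sat" yields five tokens (three words and two spaces), so it would fall short of the six tokens an order-5 model needs, while "the cat sat on the mat" yields eleven and would qualify.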

§Thread safety

MegaHal<R> is Send + Sync if R: Send + Sync. The type does not perform internal synchronization, so mutating one instance from several threads requires external locking, such as a Mutex.
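One common way to share a value that has no internal synchronization is to wrap it in Arc<Mutex<_>>. The sketch below shows that pattern with a hypothetical stand-in type, Engine, so that it compiles without the crate. It assumes the real engine's learn method takes &mut self, as the Quick start suggests; Engine and learn_from_threads are illustrative names only.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an engine whose methods take `&mut self`.
struct Engine {
    learned: Vec<String>,
}

impl Engine {
    fn learn(&mut self, sentence: &str) {
        self.learned.push(sentence.to_string());
    }
}

/// Feeds each line to a shared engine from its own thread and
/// returns how many lines were learned.
fn learn_from_threads(lines: Vec<String>) -> usize {
    let engine = Arc::new(Mutex::new(Engine { learned: Vec::new() }));
    let handles: Vec<_> = lines
        .into_iter()
        .map(|line| {
            let engine = Arc::clone(&engine);
            thread::spawn(move || {
                // The lock gives this thread exclusive access for one call.
                engine.lock().expect("mutex not poisoned").learn(&line);
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    let count = engine.lock().expect("mutex not poisoned").learned.len();
    count
}
```

With a real MegaHal<R>, the same wrapper would apply as long as R satisfies the Send bound that Mutex requires.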

§Brain file format

Brain files start with MHALRUST followed by a one-byte version and a bincode-encoded model. The format is not compatible with the original C MegaHAL’s .brn files.
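The header layout described above can be checked with a few lines of standard-library code. This is a sketch of a standalone validator, not the crate's own loader: read_brain_header is a hypothetical helper, it does not decode the bincode payload, and it makes no assumption about which version values are valid.

```rust
use std::io::{self, Read};

/// The 8-byte magic string that opens a brain file.
const MAGIC: &[u8; 8] = b"MHALRUST";

/// Reads the magic string and the one-byte version from `reader`.
/// Returns the version byte, or an error if the input is too short
/// or does not start with the magic string.
fn read_brain_header<R: Read>(reader: &mut R) -> io::Result<u8> {
    // 8 bytes of magic followed by 1 byte of version.
    let mut header = [0u8; 9];
    reader.read_exact(&mut header)?;
    if !header.starts_with(MAGIC) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidData,
            "missing MHALRUST magic",
        ));
    }
    Ok(header[8])
}
```

A check like this can reject an original C MegaHAL .brn file, or any other unrelated file, before the model bytes are handed to a decoder.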

§Shipping a pre-trained brain

To embed a trained model in your binary, serialize it during a build step and load the serialized bytes at runtime:

use megahal::MegaHal;
use rand::{rngs::SmallRng, SeedableRng};
use std::io::Cursor;

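// include_bytes! resolves the path relative to this source file.
const BRAIN: &[u8] = include_bytes!("../assets/bot.brn");

let mut hal = MegaHal::new(5, SmallRng::seed_from_u64(42));
// Wrap the embedded bytes in a Cursor so they can be read like a file.
hal.load_brain_from_reader(&mut Cursor::new(BRAIN))
    .expect("valid brain data");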

§See also

  • The examples directory in the repository source.
  • The megahal-cli crate for the megahal command-line binary.

Structs§

KeywordConfig
Configuration for keyword extraction.
MegaHal
The MegaHAL conversational engine.
MegaHalSymbol
The MegaHAL symbol type: a case-insensitive byte string.
SwapTable
Perspective-swapping substitution table.

Enums§

GenerationLimit
Controls how many candidate replies are generated before selecting the best.
MegaHalError
Errors returned by brain serialization APIs.

Functions§

extract_keywords
Extract keywords from tokenized input per the MegaHAL two-pass algorithm.
load_swap_file
Load a swap file (space/tab-separated pairs, one per line).
load_word_list
Load a keyword list file (one word per line, comments with #).