airust 0.1.5

Trainable, modular AI engine in Rust with compile-time knowledge

airust

🧠 airust is a modular, trainable AI library written in Rust.
It supports compile-time knowledge through JSON files and provides prediction engines for natural language input based on exact, fuzzy, and TF-IDF/BM25 matching.


🚀 AiRust Capabilities

✅ What You Can Concretely Do:

🧠 1. Build Your Own AI Agents

  • Train agents with examples (Question → Answer)
  • Supported Agent Types:
    • Exact Match – precise matching
    • Fuzzy Match – tolerant to typos (Levenshtein)
    • TF-IDF/BM25 – semantic similarity
    • ContextAgent – remembers previous dialogues
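
The typo tolerance of fuzzy matching rests on Levenshtein edit distance. The function below is a minimal, self-contained sketch of that metric using only the standard library. It is an illustration of the technique, not airust's internal implementation, and the name `levenshtein` is chosen here for the example.

```rust
/// Levenshtein edit distance between two strings, counted in Unicode scalar values.
/// Illustrative sketch only; not airust's internal implementation.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();

    // `prev[j]` is the distance between the first `i` chars of `a`
    // and the first `j` chars of `b`, taken from the previous row.
    let mut prev: Vec<usize> = (0..=b.len()).collect();

    for (i, &ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}
```

For instance, `levenshtein("kitten", "sitting")` evaluates to 3, so a fuzzy agent that tolerates a distance of 3 or more would treat the two inputs as a match.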

💬 2. Manage Your Own Knowledge Database

  • Save/load training data (train.json)
  • Weighting and metadata per entry
  • Legacy data can be imported

🧪 3. Text Analysis

  • Tokenization, stop words, N-grams
  • Similarity measures: Levenshtein, Jaccard
  • Text normalization
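
To make the text analysis steps concrete, here is a small sketch that tokenizes input, removes stop words, and compares two token lists with Jaccard similarity. The function names and the tiny stop word list are assumptions made for this example; airust's own utilities may behave differently.

```rust
use std::collections::HashSet;

/// Lowercases the input and splits it on anything that is not alphanumeric.
fn tokenize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(String::from)
        .collect()
}

/// Drops tokens found in a hypothetical, deliberately tiny stop word list.
fn remove_stop_words(tokens: Vec<String>) -> Vec<String> {
    const STOP_WORDS: [&str; 5] = ["a", "an", "the", "is", "what"];
    tokens
        .into_iter()
        .filter(|t| !STOP_WORDS.contains(&t.as_str()))
        .collect()
}

/// Jaccard similarity: |A ∩ B| / |A ∪ B| over the two token sets.
fn jaccard(a: &[String], b: &[String]) -> f64 {
    let a: HashSet<&String> = a.iter().collect();
    let b: HashSet<&String> = b.iter().collect();
    let union = a.union(&b).count();
    if union == 0 {
        return 1.0; // two empty inputs are treated as identical
    }
    a.intersection(&b).count() as f64 / union as f64
}
```

With these definitions, "rust ai library" and "rust ai engine" share two of four distinct tokens, so their Jaccard similarity is 0.5.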

🧰 4. Custom CLI Tools

  • Launch airust CLI for:
    • Interactive sessions with an agent
    • Knowledge base management
    • Quick data testing

🌐 5. Integration into Other Projects

  • Use airust as a Rust library in your own applications (Web, CLI, Desktop, IoT)

🔧 Example Application Ideas:

  • 🤖 FAQ Bot for your website
  • 📚 Intelligent document search
  • 🧾 Customer support via terminal
  • 🗣️ Voice assistant with context understanding
  • 🔎 Similarity search for text databases
  • 🛠 Local assistance tool for developer documentation

🚀 Advanced Features

  • 🧩 Modular Architecture with Unified Traits:

    • Agent – Base trait for all agents with enhanced prediction capabilities
    • TrainableAgent – For agents that can be trained with examples
    • ContextualAgent – For context-aware conversational agents
    • ConfidenceAgent – New trait for agents that can provide prediction confidence
  • 🧠 Intelligent Agent Implementations:

    • MatchAgent – Advanced matching with configurable strategies
      • Exact matching
      • Fuzzy matching with dynamic thresholds
      • Configurable Levenshtein distance options
    • TfidfAgent – Sophisticated similarity detection using BM25 algorithm
      • Customizable term frequency scaling
      • Document length normalization
    • ContextAgent<A> – Flexible context-aware wrapper
      • Multiple context formatting strategies
      • Configurable context history size
  • 📝 Enhanced Response Handling:

    • ResponseFormat with support for:
      • Plain text
      • Markdown
      • JSON
    • Metadata and confidence tracking
    • Seamless type conversions
  • 💾 Intelligent Knowledge Base:

    • Compile-time knowledge via train.json
    • Runtime knowledge expansion
    • Backward compatibility with legacy formats
    • Weighted training examples
    • Optional metadata support
  • 🔍 Advanced Text Processing:

    • Tokenization with Unicode support
    • Stopword removal
    • Text normalization
    • N-gram generation
    • Advanced string similarity metrics
      • Levenshtein distance
      • Jaccard similarity
  • 🛠️ Unified CLI Tool:

    • Interactive mode
    • Multiple agent type selection
    • Knowledge base management
    • Flexible querying
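
The "term frequency scaling" and "document length normalization" listed for TfidfAgent correspond to the two standard BM25 parameters, usually called k1 and b. The sketch below scores one tokenized document against a query with the common Okapi BM25 formula. It is a self-contained illustration with a hypothetical function name, not the code TfidfAgent runs.

```rust
use std::collections::HashMap;

/// BM25 score of one tokenized document for a tokenized query.
/// `k1` scales term frequency saturation; `b` controls length normalization.
/// Illustrative sketch only; not airust's internal implementation.
fn bm25_score(query: &[&str], doc: &[&str], corpus: &[Vec<&str>], k1: f64, b: f64) -> f64 {
    let total_len: usize = corpus.iter().map(|d| d.len()).sum();
    if total_len == 0 {
        return 0.0; // empty corpus or only empty documents: nothing to score
    }
    let n_docs = corpus.len() as f64;
    let avg_len = total_len as f64 / n_docs;

    // Term frequencies within the scored document.
    let mut tf: HashMap<&str, f64> = HashMap::new();
    for &term in doc {
        *tf.entry(term).or_insert(0.0) += 1.0;
    }

    query
        .iter()
        .map(|&term| {
            let f = tf.get(term).copied().unwrap_or(0.0);
            // Number of corpus documents that contain the term.
            let df = corpus.iter().filter(|d| d.contains(&term)).count() as f64;
            let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
            let norm = 1.0 - b + b * doc.len() as f64 / avg_len;
            idf * f * (k1 + 1.0) / (f + k1 * norm)
        })
        .sum()
}
```

A larger k1 lets repeated terms keep adding to the score for longer, while b closer to 1.0 penalizes documents that are longer than the corpus average more strongly.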

🔧 Usage

Integration in other projects

[dependencies]
airust = "0.1.4"

Sample Code (Updated)

use airust::{Agent, TrainableAgent, MatchAgent, ResponseFormat, KnowledgeBase};

fn main() {
    // Load embedded knowledge base
    let kb = KnowledgeBase::from_embedded();

    // Create and train agent
    let mut agent = MatchAgent::new_exact();
    agent.train(kb.get_examples());

    // Ask a question
    let answer = agent.predict("What is airust?");

    // Print the response (converted from ResponseFormat to String)
    println!("Answer: {}", String::from(answer));
}

📂 Training Data Format

The training file knowledge/train.json accepts both the legacy format and the new one. The new format looks like this:

[
  {
    "input": "What is airust?",
    "output": {
      "Text": "A modular AI library in Rust."
    },
    "weight": 2.0
  },
  {
    "input": "What agents are available?",
    "output": {
      "Markdown": "- **MatchAgent** (exact & fuzzy)\n- **TfidfAgent** (BM25)\n- **ContextAgent** (context-aware)"
    },
    "weight": 1.0
  }
]

Legacy format is still supported for backward compatibility.


🖥️ CLI Usage

# Simple query
airust query simple "What is airust?"
airust query fuzzy "What is airust?"
airust query tfidf "Explain airust"

# Interactive mode
airust interactive

# Knowledge base management
airust knowledge

📊 Advanced Usage – Context Agent

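// ContextFormat is assumed to be re-exported at the crate root; adjust the path if it lives in a submodule.
use airust::{Agent, TrainableAgent, ContextualAgent, TfidfAgent, ContextAgent, ContextFormat, KnowledgeBase};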

fn main() {
    // Load embedded knowledge base
    let kb = KnowledgeBase::from_embedded();

    // Create and train base agent
    let mut base_agent = TfidfAgent::new()
        .with_bm25_params(1.5, 0.8);  // Custom BM25 tuning
    base_agent.train(kb.get_examples());

    // Wrap in a context-aware agent (remembering 3 turns)
    let mut agent = ContextAgent::new(base_agent, 3)
        .with_context_format(ContextFormat::List);

    // First question
    let answer1 = agent.predict("What is airust?");
    println!("A1: {}", String::from(answer1.clone()));

    // Add to context history
    agent.add_context("What is airust?".to_string(), answer1);

    // Follow-up question
    let answer2 = agent.predict("What features does it provide?");
    println!("A2: {}", String::from(answer2));
}
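
The idea behind a context-aware wrapper can be shown without the crate: keep a bounded history of question/answer pairs and prepend it to each new input before delegating to the wrapped agent. Everything in the sketch below (the simplified string-based `Agent` trait, `ContextWrapper`, `Echo`, and the "Q: ... A: ..." formatting) is hypothetical and only mirrors the pattern; airust's ContextAgent and its context formats may work differently.

```rust
use std::collections::VecDeque;

/// Hypothetical minimal agent trait, reduced to plain strings for brevity.
trait Agent {
    fn predict(&self, input: &str) -> String;
}

/// Wraps any agent and keeps the last `max_turns` question/answer pairs.
struct ContextWrapper<A: Agent> {
    inner: A,
    history: VecDeque<(String, String)>,
    max_turns: usize,
}

impl<A: Agent> ContextWrapper<A> {
    fn new(inner: A, max_turns: usize) -> Self {
        Self { inner, history: VecDeque::new(), max_turns }
    }

    /// Records one turn, evicting the oldest turns once the window is full.
    fn add_context(&mut self, question: String, answer: String) {
        self.history.push_back((question, answer));
        while self.history.len() > self.max_turns {
            self.history.pop_front();
        }
    }

    /// Prefixes the input with the remembered turns before delegating.
    fn predict(&self, input: &str) -> String {
        let context: Vec<String> = self
            .history
            .iter()
            .map(|(q, a)| format!("Q: {} A: {}", q, a))
            .collect();
        let prompt = if context.is_empty() {
            input.to_string()
        } else {
            format!("{} | {}", context.join(" | "), input)
        };
        self.inner.predict(&prompt)
    }
}

/// Toy agent that echoes its input, which makes the added context visible.
struct Echo;

impl Agent for Echo {
    fn predict(&self, input: &str) -> String {
        input.to_string()
    }
}
```

Using a `VecDeque` keeps eviction of the oldest turn cheap, which suits a fixed-size history window such as the three turns used in the example above.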

🚀 New in Version 0.1.4

Matching Strategies

// Configurable fuzzy matching
let agent = MatchAgent::new(MatchingStrategy::Fuzzy(FuzzyOptions {
    max_distance: Some(5),      // Maximum Levenshtein distance
    threshold_factor: Some(0.2) // Dynamic length-based threshold
}));
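
One plausible reading of these two options is that the length-based threshold scales with the input and `max_distance` acts as a hard cap. The sketch below encodes exactly that assumption. The local `FuzzyOptions` struct only mirrors the field names shown above, and both `allowed_distance` and the combination rule are hypothetical, so check the crate source for the actual behavior.

```rust
/// Hypothetical local mirror of the `FuzzyOptions` fields shown above.
struct FuzzyOptions {
    max_distance: Option<usize>,
    threshold_factor: Option<f32>,
}

/// Largest edit distance accepted for an input of `input_len` characters.
/// Assumed rule: the length-based limit applies, capped by `max_distance`.
fn allowed_distance(opts: &FuzzyOptions, input_len: usize) -> usize {
    let by_length = opts
        .threshold_factor
        .map(|factor| (input_len as f32 * factor).floor() as usize);
    match (opts.max_distance, by_length) {
        (Some(max), Some(len)) => max.min(len),
        (Some(max), None) => max,
        (None, Some(len)) => len,
        (None, None) => 0, // no tolerance configured: exact match only
    }
}
```

Under this assumption, short inputs get a tight tolerance (a 12-character question allows 2 edits with a factor of 0.2), while long inputs never exceed the configured maximum of 5.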

Context Formatting

// Multiple context representation strategies
let context_agent = ContextAgent::new(base_agent, 3)
    .with_context_format(ContextFormat::List);
    // Other formats: QAPairs, Sentence, Custom

Advanced Text Utilities

// Text processing capabilities
let text = "Hello, world!";
let tokens = text_utils::tokenize(text);
let unique_terms = text_utils::unique_terms(text);
let ngrams = text_utils::create_ngrams(text, 2);
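
For readers unfamiliar with n-grams, the following sketch shows one way such helpers could work, using word-level n-grams and a sorted set of lowercased terms. The function names echo the calls above, but the bodies are assumptions written for this example; the real text_utils functions may, for instance, operate on characters rather than words or return different types.

```rust
use std::collections::BTreeSet;

/// Word-level n-grams: every run of `n` consecutive whitespace-separated words.
/// Illustrative sketch only; not airust's `text_utils` implementation.
fn create_ngrams(text: &str, n: usize) -> Vec<String> {
    let words: Vec<&str> = text.split_whitespace().collect();
    if n == 0 || words.len() < n {
        return Vec::new(); // `windows(0)` would panic, so guard explicitly
    }
    words.windows(n).map(|w| w.join(" ")).collect()
}

/// Distinct lowercased terms, returned in sorted order.
fn unique_terms(text: &str) -> BTreeSet<String> {
    text.split_whitespace().map(|w| w.to_lowercase()).collect()
}
```

Calling `create_ngrams("the quick brown fox", 2)` yields the three bigrams "the quick", "quick brown", and "brown fox".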

📃 License

MIT

Built with ❤️ in Rust.
Contributions and extensions are welcome!


🛠 Migration Guide for airust 0.1.4

This guide helps you migrate from earlier 0.1.x releases of airust to 0.1.4.

1. Trait and Type Changes

New Trait Hierarchy

trait Agent {
    fn predict(&self, input: &str) -> ResponseFormat;
}

trait TrainableAgent: Agent {
    fn train(&mut self, data: &[TrainingExample]);
}

trait ContextualAgent: Agent {
    fn add_context(&mut self, question: String, answer: ResponseFormat);
}
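
To show how a custom agent plugs into this hierarchy, the sketch below implements Agent and TrainableAgent for a toy lookup agent. The trait and struct definitions are local copies so the example compiles on its own; `LookupAgent`, its "highest weight wins" rule, and the fallback message are hypothetical and not part of airust.

```rust
// Local stand-ins for the airust types, so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
enum ResponseFormat {
    Text(String),
}

struct TrainingExample {
    input: String,
    output: ResponseFormat,
    weight: f32,
}

trait Agent {
    fn predict(&self, input: &str) -> ResponseFormat;
}

trait TrainableAgent: Agent {
    fn train(&mut self, data: &[TrainingExample]);
}

/// Toy agent: returns the stored output whose input matches exactly,
/// preferring the highest weight when several examples share an input.
#[derive(Default)]
struct LookupAgent {
    memory: Vec<(String, ResponseFormat, f32)>,
}

impl Agent for LookupAgent {
    fn predict(&self, input: &str) -> ResponseFormat {
        self.memory
            .iter()
            .filter(|(question, _, _)| question.as_str() == input)
            .max_by(|a, b| a.2.total_cmp(&b.2))
            .map(|(_, answer, _)| answer.clone())
            .unwrap_or_else(|| ResponseFormat::Text("No match found".to_string()))
    }
}

impl TrainableAgent for LookupAgent {
    fn train(&mut self, data: &[TrainingExample]) {
        for example in data {
            self.memory
                .push((example.input.clone(), example.output.clone(), example.weight));
        }
    }
}
```

Because TrainableAgent has Agent as a supertrait, any code that accepts `&dyn Agent` or `impl Agent` can use a trained agent without knowing how it was trained.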

New Response Format

let answer: ResponseFormat = agent.predict("Question");
let answer_string: String = String::from(answer);
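
The `String::from(answer)` call works because a `From<ResponseFormat>` conversion exists for `String`. A minimal sketch of how such a conversion can be written is shown below. The enum here is a hypothetical stand-in: the `Text` and `Markdown` variant names follow the keys used in train.json, while the `Json` variant and its `String` payload are simplifying assumptions.

```rust
/// Hypothetical stand-in for airust's `ResponseFormat`.
/// `Text` and `Markdown` follow the train.json keys; `Json(String)` is assumed.
#[derive(Debug, Clone, PartialEq)]
enum ResponseFormat {
    Text(String),
    Markdown(String),
    Json(String),
}

impl From<ResponseFormat> for String {
    /// Unwraps the payload regardless of which format carried it.
    fn from(response: ResponseFormat) -> Self {
        match response {
            ResponseFormat::Text(s)
            | ResponseFormat::Markdown(s)
            | ResponseFormat::Json(s) => s,
        }
    }
}
```

Implementing `From` also provides `Into` for free, so `let s: String = answer.into();` is an equivalent spelling of the conversion.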

Updated TrainingExample Struct

struct TrainingExample {
    input: String,
    output: ResponseFormat,
    weight: f32,
}

2. Agent Replacements

SimpleAgent and FuzzyAgent → MatchAgent

let mut agent = MatchAgent::new_exact(); // replaces SimpleAgent
let mut agent = MatchAgent::new_fuzzy(); // replaces FuzzyAgent

With options:

let mut agent = MatchAgent::new(MatchingStrategy::Fuzzy(FuzzyOptions {
    max_distance: Some(5),
    threshold_factor: Some(0.2),
}));

ContextAgent is Now Generic

let mut base_agent = TfidfAgent::new();
base_agent.train(&data);
let mut agent = ContextAgent::new(base_agent, 5);

StructuredAgent Removed (use ResponseFormat)


3. Knowledge Base Changes

let kb = KnowledgeBase::from_embedded();
let data = kb.get_examples();

let mut kb = KnowledgeBase::new();
kb.add_example("Question".to_string(), "Answer".to_string(), 1.0);

4. CLI Tool Migration

cargo run --bin airust -- query simple "What is airust?"
cargo run --bin airust -- interactive
cargo run --bin airust -- knowledge

5. Recommendations

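  • Upgrade the airust dependency in your Cargo.toml to 0.1.4
  • Import types through the crate-root re-exports in lib.rs, for example use airust::{Agent, MatchAgent};
  • Re-run your tests after migrating, since agent types and return values have changed
  • Try the new context formatting options (ContextFormat)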