§Awful Jade - Local-First LLM Client with Semantic Memory
Awful Jade is a Rust library and CLI for interacting with OpenAI-compatible language model APIs, with advanced features for memory management, RAG (Retrieval-Augmented Generation), and conversation persistence.
§Key Features
- OpenAI-Compatible API Client: Works with local LLMs (Ollama, LM Studio, vLLM) and cloud providers (OpenAI, Anthropic, etc.)
- Semantic Memory System: HNSW vector indexing with sentence embeddings for intelligent context retrieval
- RAG Support: Document chunking, embedding, and retrieval for grounded responses
- Session Persistence: SQLite database for conversation history and continuity
- Token Budgeting: Intelligent context window management with FIFO eviction
- Streaming Responses: Real-time token-by-token output
- Pretty Printing: Markdown rendering and syntax highlighting for code blocks
- Template System: YAML-based prompt engineering with system prompts and message seeds
§Architecture Overview
```text
┌─────────────────────────────────────────────────────────────┐
│ awful_aj Library │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Commands │ │ API Client │ │ Template │ │
│ │ (CLI Args) │ │ (OpenAI) │ │ (YAML) │ │
│ └──────┬───────┘ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │ │
│ ┌──────┴──────────────────┴──────────────────┴───────────┐ │
│ │ Brain (Working Memory) │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌──────────────┐ │ │
│ │ │ Preamble │ │ RAG Context │ │ Memories │ │ │
│ │ └─────────────┘ └─────────────┘ └──────────────┘ │ │
│ └──────────────────────────────────────────────────────┬─┘ │
│ │ │
│ ┌───────────────────────────────────────────────────────┴─┐ │
│ │ Long-Term Memory (Vector Store + SQLite) │ │
│ │ ┌──────────────┐ ┌──────────────┐ │ │
│ │ │ HNSW Index │ │ Sessions │ │ │
│ │ │ (Semantic) │ │ (Messages) │ │ │
│ │ └──────────────┘ └──────────────┘ │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```

§Core Modules
| Module | Purpose | Key Types |
|---|---|---|
| `api` | OpenAI API client and orchestration | `ask()`, `stream_response()`, `fetch_response()` |
| `brain` | Working memory with token budgeting | `Brain`, `Memory` |
| `vector_store` | HNSW semantic search | `VectorStore`, `SentenceEmbeddingsModel` |
| `session_messages` | Conversation persistence | `SessionMessages` |
| `template` | YAML prompt templates | `ChatTemplate` |
| `commands` | CLI argument parsing | `Cli`, `Commands` |
| `config` | Configuration management | `AwfulJadeConfig` |
| `models` | Database ORM models | `Session`, `Message` |
| `schema` | Diesel schema definitions | `sessions`, `messages` tables |
| `pretty` | Terminal formatting | `print_pretty()`, `PrettyPrinter` |
§Quick Start
§As a Library
```rust
use awful_aj::{config::load_config, brain::Brain, template::ChatTemplate, api};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration
    let config = load_config("config.yaml")?;

    // Create template
    let template = ChatTemplate {
        system_prompt: "You are a helpful assistant.".into(),
        messages: vec![],
        response_format: None,
        pre_user_message_content: None,
        post_user_message_content: None,
    };

    // Ask a question
    let response = api::ask(
        &config,
        "What is HNSW?".into(),
        &template,
        None,  // no vector store
        None,  // no brain
        false, // not pretty
    )
    .await?;

    println!("{}", response);
    Ok(())
}
```

§As a CLI
```sh
# Initialize configuration
aj init

# Ask a question
aj ask "What is HNSW indexing?"

# Interactive session
aj interactive -s my-project

# RAG with documents
aj ask -r "docs/*.txt" -k 5 "Summarize the documentation"
```

§Embedding Model
The sentence embedding model (all-MiniLM-L6-v2) is automatically downloaded from
HuggingFace Hub by the Candle framework when first used. It produces 384-dimensional
embeddings suitable for semantic search.
Model Details:
- Architecture: Sentence Transformer (BERT-based)
- Dimensions: 384
- Size: ~90MB
- Cache Location: Standard HuggingFace cache directory
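An embedding is just a fixed-length vector of floats, so relevance checks reduce to a distance computation. Below is a minimal, illustrative sketch of the Euclidean comparison described in the Memory Management section; the crate's `VectorStore` performs this internally, and the function and values here are stand-ins:

```rust
/// Euclidean (L2) distance between two embedding vectors.
/// Illustrative stand-in: the crate's `VectorStore` handles this internally.
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

fn main() {
    // Stand-ins for real 384-dimensional all-MiniLM-L6-v2 outputs.
    let query = vec![0.10_f32; 384];
    let memory = vec![0.12_f32; 384];

    let dist = euclidean_distance(&query, &memory);
    // Per the Memory Management section, distances below 1.0 count as relevant.
    println!("distance = {dist:.3}, relevant = {}", dist < 1.0);
}
```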
§Configuration
Configuration is loaded from platform-specific directories:
- macOS: ~/Library/Application Support/com.awful-sec.aj/config.yaml
- Linux: ~/.config/aj/config.yaml
- Windows: %APPDATA%\com.awful-sec\aj\config.yaml
See config::AwfulJadeConfig for available settings.
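These locations match the `ProjectDirs` convention from the `directories` crate. The sketch below shows how such a path can be resolved; this is an assumption about the derivation, and the crate's own `config_dir` function (listed under Functions below) remains the authoritative source:

```rust
use directories::ProjectDirs;
use std::path::PathBuf;

/// Resolve a config path matching the platform-specific locations above.
/// Assumption: the crate derives its directory the same way; prefer the
/// crate's `config_dir()` helper in real code.
fn config_path() -> Option<PathBuf> {
    let dirs = ProjectDirs::from("com", "awful-sec", "aj")?;
    Some(dirs.config_dir().join("config.yaml"))
}
```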
§Memory Management
Awful Jade uses a two-tier memory system:
1. Working Memory (Brain): Token-budgeted FIFO queue
   - Preamble (system prompt, always included)
   - RAG context (document chunks)
   - Recent memories (conversation turns)
   - Eviction when context exceeds context_max_tokens
2. Long-Term Memory (VectorStore): Semantic search
   - HNSW index for fast approximate nearest neighbor search
   - Euclidean distance similarity (threshold < 1.0)
   - Automatic embedding of evicted memories
See brain and vector_store modules for details.
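A minimal sketch of the working-memory tier, assuming a simplified per-turn token count (the real `Brain` uses the configured `context_max_tokens` and proper tokenizer counts, and hands evicted turns to the `VectorStore` for embedding):

```rust
use std::collections::VecDeque;

/// Simplified stand-in for one conversation turn held in working memory.
struct Turn {
    text: String,
    tokens: usize,
}

/// Token-budgeted FIFO queue, sketching the `Brain` eviction behaviour.
struct WorkingMemory {
    budget: usize, // corresponds to `context_max_tokens`
    used: usize,
    queue: VecDeque<Turn>,
}

impl WorkingMemory {
    /// Push a new turn; return the turns evicted to fit the budget.
    fn push(&mut self, turn: Turn) -> Vec<Turn> {
        self.used += turn.tokens;
        self.queue.push_back(turn);

        // Evict the oldest turns until the budget is satisfied again;
        // in Awful Jade, evicted turns go to long-term memory.
        let mut evicted = Vec::new();
        while self.used > self.budget {
            match self.queue.pop_front() {
                Some(old) => {
                    self.used -= old.tokens;
                    evicted.push(old);
                }
                None => break,
            }
        }
        evicted
    }
}

fn main() {
    let mut wm = WorkingMemory { budget: 8, used: 0, queue: VecDeque::new() };
    wm.push(Turn { text: "hello".into(), tokens: 5 });
    let evicted = wm.push(Turn { text: "world".into(), tokens: 5 });
    for t in &evicted {
        println!("evicted: {} ({} tokens)", t.text, t.tokens);
    }
}
```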
§RAG Pipeline
Retrieval-Augmented Generation workflow:
1. Document Loading: Read text files from specified paths
2. Chunking: Split into overlapping segments (512 tokens, 128 overlap)
3. Embedding: Encode chunks with the sentence transformer model
4. Indexing: Build HNSW index for fast retrieval
5. Retrieval: Query the index with the user prompt, fetch top-k chunks
6. Injection: Add retrieved chunks to the brain's preamble
7. Generation: LLM generates a response with grounded context

See api::process_rag_documents for implementation details.
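The chunking step (2) amounts to a sliding window whose stride is the chunk size minus the overlap, so consecutive 512-token chunks share 128 tokens of context. A sketch of that scheme, not the crate's actual implementation:

```rust
/// Split a token sequence into overlapping chunks.
/// With size = 512 and overlap = 128, the stride is 384 and each pair of
/// consecutive chunks shares 128 tokens.
fn chunk_tokens(tokens: &[u32], size: usize, overlap: usize) -> Vec<Vec<u32>> {
    assert!(overlap < size, "overlap must be smaller than chunk size");
    let stride = size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < tokens.len() {
        let end = usize::min(start + size, tokens.len());
        chunks.push(tokens[start..end].to_vec());
        if end == tokens.len() {
            break;
        }
        start += stride;
    }
    chunks
}

fn main() {
    let tokens: Vec<u32> = (0..1000).collect();
    let chunks = chunk_tokens(&tokens, 512, 128);
    // 1000 tokens -> chunks starting at offsets 0, 384, and 768.
    assert_eq!(chunks.len(), 3);
    println!("{} chunks", chunks.len());
}
```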
§Examples
See the examples/ directory and commands module documentation for comprehensive
usage examples.
Modules§
- api: API Module
- brain: Working Memory and Preamble Generation
- commands: Command-line interface
- config: Configuration Management for Awful Jade
- models: Database ORM Models for Awful Jade
- pretty: Pretty Printing - Markdown Rendering and Syntax Highlighting
- schema: Database Schema Definitions
- session_messages: Session Messages - Conversation Persistence & Lifecycle Management
- template: Chat Template System for Awful Jade
- vector_store: Semantic Memory with HNSW Vector Search
Functions§
- config_dir: Returns the platform-specific configuration directory for Awful Jade.