paladin-ai 0.5.1

# Memory Management Guide

This guide covers how to use the Garrison memory system to give your Paladins conversation context, long-term knowledge, and semantic search capabilities.

## Table of Contents

- [Overview](#overview)
- [Garrison Architecture](#garrison-architecture)
- [In-Memory Garrison](#in-memory-garrison)
- [Persistent Garrison](#persistent-garrison)
- [Memory Windowing](#memory-windowing)
- [Semantic Search](#semantic-search)
- [Memory Types](#memory-types)
- [Best Practices](#best-practices)
- [Advanced Patterns](#advanced-patterns)
- [Troubleshooting](#troubleshooting)
- [Testing](#testing)
- [Examples](#examples)
- [Next Steps](#next-steps)
- [Related Resources](#related-resources)

## Overview

The Garrison system provides Paladins with:
- **Conversation Context**: Maintain multi-turn dialogue history
- **Memory Windowing**: Manage token limits intelligently
- **Persistence**: Save and restore sessions across restarts
- **Semantic Search**: Retrieve relevant memories by meaning, not just keywords
- **Embeddings**: Vector-based similarity for long-term memory

**Key Concepts:**
- **Garrison**: Memory storage system for a Paladin
- **GarrisonEntry**: Single memory record (message, observation, fact)
- **ConversationHistory**: Ordered sequence of interactions
- **Memory Window**: Limited context size respecting token limits
- **Long-Term Memory**: Persistent storage with semantic retrieval

## Garrison Architecture

### Core Components

```rust,ignore
// Single memory entry
pub struct GarrisonEntry {
    pub id: Uuid,
    pub role: ConversationRole,
    pub content: String,
    pub timestamp: DateTime<Utc>,
    pub metadata: HashMap<String, String>,
    pub token_count: Option<u32>,
}

// Conversation roles
pub enum ConversationRole {
    System,    // System prompts
    User,      // User messages
    Assistant, // Paladin responses
    Tool,      // Tool execution results
}

// Memory interface
#[async_trait]
pub trait GarrisonPort: Send + Sync {
    async fn remember(&self, entry: GarrisonEntry) -> Result<(), GarrisonError>;
    async fn recall_recent(&self, limit: usize) -> Result<Vec<GarrisonEntry>, GarrisonError>;
    async fn search(&self, query: &str, limit: usize) -> Result<Vec<GarrisonEntry>, GarrisonError>;
    async fn forget_all(&self) -> Result<(), GarrisonError>;
    async fn stats(&self) -> Result<GarrisonStats, GarrisonError>;
}

// Extended port for long-term memory
#[async_trait]
pub trait LongTermGarrisonPort: GarrisonPort {
    async fn remember_with_embedding(
        &self,
        entry: GarrisonEntry,
        embedding: Vec<f32>
    ) -> Result<(), GarrisonError>;

    async fn search_similar(
        &self,
        query_embedding: Vec<f32>,
        limit: usize
    ) -> Result<Vec<(GarrisonEntry, f32)>, GarrisonError>;
}
```

### Memory Flow

```
User Input → Garrison adds User entry
    ↓
Paladin retrieves relevant history (window or search)
    ↓
LLM generates response with full context
    ↓
Garrison adds Assistant entry
    ↓
(Optional) Tool calls → Garrison adds Tool entries
    ↓
Repeat for next interaction
```
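
The flow above can be sketched without any crate dependencies. This is a minimal illustration only: `Role`, `Entry`, and `run_turn` are hypothetical stand-ins rather than paladin-ai types, and the `generate` closure takes the place of the LLM call.

```rust
// Simplified stand-ins for the guide's types; not the paladin-ai API.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct Entry {
    role: Role,
    content: String,
}

/// Runs one interaction: record the user input, build a context window
/// from the most recent entries, generate a reply, and record the reply.
fn run_turn<F>(history: &mut Vec<Entry>, input: &str, window: usize, generate: F) -> String
where
    F: Fn(&[Entry]) -> String,
{
    history.push(Entry { role: Role::User, content: input.to_string() });

    // Keep only the last `window` entries as context for the model.
    let start = history.len().saturating_sub(window);
    let reply = generate(&history[start..]);

    history.push(Entry { role: Role::Assistant, content: reply.clone() });
    reply
}
```

Each call appends two entries (user, then assistant), so the context passed to `generate` on the second turn already contains the first exchange.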

## In-Memory Garrison

Fastest option for short-lived sessions where persistence isn't needed.

### Basic Usage

```rust,ignore
use paladin_memory::garrison::InMemoryGarrison;
use paladin_core::platform::container::garrison::{GarrisonEntry, ConversationRole, GarrisonConfig};
use paladin::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let llm_adapter = Arc::new(OpenAiAdapter::new().build()?);

    // Create in-memory garrison
    let garrison = Arc::new(InMemoryGarrison::new(
        GarrisonConfig::default()
            .with_max_entries(100)
            .with_max_tokens(4000)
    ));

    // Build Paladin with memory
    let paladin = PaladinBuilder::new(llm_adapter)
        .name("ChatBot")
        .system_prompt("You are a helpful assistant with memory of our conversation.")
        .with_garrison(garrison.clone())
        .build()?;

    // First interaction
    let response1 = paladin.execute("My name is Alice").await?;
    println!("Bot: {}", response1.content);

    // Second interaction - Paladin remembers
    let response2 = paladin.execute("What's my name?").await?;
    println!("Bot: {}", response2.content);  // Should say "Alice"

    // Check garrison statistics
    let stats = garrison.stats().await?;
    println!("Total memories: {}", stats.total_entries);
    println!("Total tokens: {}", stats.total_tokens);

    Ok(())
}
```

### Configuration Options

```rust,ignore
let garrison = InMemoryGarrison::new(
    GarrisonConfig::default()
        // Maximum number of entries to retain
        .with_max_entries(100)

        // Maximum total tokens across all entries
        .with_max_tokens(4000)

        // Token estimation strategy
        .with_token_counter(TokenCounter::Gpt4)

        // Eviction policy when limits reached
        .with_eviction_policy(EvictionPolicy::Fifo)  // First-in-first-out
);
```

### Eviction Policies

```rust,ignore
pub enum EvictionPolicy {
    // Remove oldest entries first
    Fifo,

    // Remove least recently accessed entries
    Lru,

    // Remove entries based on importance score
    ImportanceBased,

    // Custom eviction logic
    Custom(Arc<dyn Fn(&[GarrisonEntry]) -> Vec<Uuid> + Send + Sync>),
}

// Example: Custom eviction keeping system prompts
let garrison = InMemoryGarrison::new(
    GarrisonConfig::default()
        .with_eviction_policy(EvictionPolicy::Custom(Arc::new(|entries| {
            // Never evict system prompts, evict oldest user messages
            entries.iter()
                .filter(|e| e.role == ConversationRole::User)
                .take(10)
                .map(|e| e.id)
                .collect()
        })))
);
```
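
To make the `Fifo` policy concrete, here is a small standalone sketch of first-in-first-out eviction under both an entry limit and a token limit. `FifoStore` and its `Entry` are hypothetical simplifications, not the internals of `InMemoryGarrison`, which may behave differently.

```rust
use std::collections::VecDeque;

// Simplified stand-in for a garrison entry; only the fields eviction needs.
#[derive(Debug, Clone)]
struct Entry {
    id: u64,
    tokens: u32,
}

/// Bounded store that evicts the oldest entries first (FIFO) whenever
/// either the entry limit or the token limit is exceeded.
struct FifoStore {
    entries: VecDeque<Entry>,
    max_entries: usize,
    max_tokens: u32,
    total_tokens: u32,
}

impl FifoStore {
    fn new(max_entries: usize, max_tokens: u32) -> Self {
        Self { entries: VecDeque::new(), max_entries, max_tokens, total_tokens: 0 }
    }

    /// Adds an entry and returns the ids evicted to make room for it.
    /// An entry larger than `max_tokens` on its own ends up evicting itself.
    fn remember(&mut self, entry: Entry) -> Vec<u64> {
        self.total_tokens += entry.tokens;
        self.entries.push_back(entry);

        let mut evicted = Vec::new();
        while self.entries.len() > self.max_entries || self.total_tokens > self.max_tokens {
            match self.entries.pop_front() {
                Some(old) => {
                    self.total_tokens -= old.tokens;
                    evicted.push(old.id);
                }
                None => break,
            }
        }
        evicted
    }
}
```

Returning the evicted ids lets a caller log or archive dropped entries instead of losing them silently.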

## Persistent Garrison

SQLite-backed storage for sessions that need to survive restarts.

### Setup

```rust,ignore
use paladin_memory::garrison::SqliteGarrison; // path assumed to mirror InMemoryGarrison
use paladin_core::platform::container::garrison::GarrisonConfig;
use paladin::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let llm_adapter = Arc::new(OpenAiAdapter::new().build()?);

    // Create persistent garrison
    let garrison = Arc::new(
        SqliteGarrison::new("garrison.db")
            .await?
            .with_config(GarrisonConfig::default())
    );

    let paladin = PaladinBuilder::new(llm_adapter)
        .with_garrison(garrison)
        .build()?;

    // All interactions are automatically persisted
    paladin.execute("Remember this important fact!").await?;

    Ok(())
}
```

### Session Management

```rust,ignore
// Create session-based garrison
let session_id = Uuid::new_v4();

let garrison = Arc::new(
    SqliteGarrison::new("garrison.db")
        .await?
        .with_session_id(session_id)
);

// Later, restore the same session
let garrison_restored = Arc::new(
    SqliteGarrison::new("garrison.db")
        .await?
        .with_session_id(session_id)  // Same session ID
);

// History is preserved
let history = garrison_restored.recall_recent(100).await?;
println!("Restored {} memories", history.len());
```

### Multiple Users

```rust,ignore
pub struct UserGarrison {
    db: SqliteGarrison,
    user_id: String,
}

impl UserGarrison {
    pub async fn new(db_path: &str, user_id: String) -> Result<Self> {
        let db = SqliteGarrison::new(db_path).await?;
        Ok(Self { db, user_id })
    }
}

#[async_trait]
impl GarrisonPort for UserGarrison {
    async fn remember(&self, mut entry: GarrisonEntry) -> Result<()> {
        // Tag entries with user_id
        entry.metadata.insert("user_id".to_string(), self.user_id.clone());
        self.db.remember(entry).await
    }

    async fn recall_recent(&self, limit: usize) -> Result<Vec<GarrisonEntry>> {
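        // Over-fetch, then filter by user_id. If other users dominate the most
        // recent `limit * 2` entries, fewer than `limit` entries may be returned.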
        let all_entries = self.db.recall_recent(limit * 2).await?;
        Ok(all_entries.into_iter()
            .filter(|e| e.metadata.get("user_id") == Some(&self.user_id))
            .take(limit)
            .collect())
    }

    // Implement other methods...
}

// Usage
let alice_garrison = Arc::new(UserGarrison::new("garrison.db", "alice".to_string()).await?);
let bob_garrison = Arc::new(UserGarrison::new("garrison.db", "bob".to_string()).await?);

let alice_paladin = PaladinBuilder::new(llm_adapter.clone())
    .with_garrison(alice_garrison)
    .build()?;

let bob_paladin = PaladinBuilder::new(llm_adapter)
    .with_garrison(bob_garrison)
    .build()?;
```

### Database Schema

```sql
-- migrations/001_create_garrison_tables.sql
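CREATE TABLE IF NOT EXISTS garrison_entries (
    id TEXT PRIMARY KEY,
    session_id TEXT NOT NULL,
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    timestamp INTEGER NOT NULL,
    metadata TEXT,
    token_count INTEGER,
    embedding BLOB
);

-- SQLite does not accept inline INDEX clauses, so indexes are created separately
CREATE INDEX IF NOT EXISTS idx_session_timestamp
    ON garrison_entries (session_id, timestamp);
CREATE INDEX IF NOT EXISTS idx_session_role
    ON garrison_entries (session_id, role);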

CREATE TABLE IF NOT EXISTS garrison_sessions (
    session_id TEXT PRIMARY KEY,
    user_id TEXT,
    created_at INTEGER NOT NULL,
    updated_at INTEGER NOT NULL,
    metadata TEXT
);
```

## Memory Windowing

Intelligently manage context size to respect LLM token limits.

### Token-Based Windowing

```rust,ignore
// recall_recent takes an entry count, not a token count. The token cap
// comes from GarrisonConfig::with_max_tokens (4000 in the examples above).
let window = garrison.recall_recent(100).await?;

println!("Window contains {} entries", window.len());
println!("Total tokens: {}",
    window.iter().map(|e| e.token_count.unwrap_or(0)).sum::<u32>());
```
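
When a token budget has to be enforced on the caller's side, the selection can be done over an already recalled list. The following is a minimal sketch under two assumptions: entries arrive oldest-first, and a missing `token_count` counts as zero. `Entry` and `token_window` are hypothetical names, not part of the Garrison API.

```rust
// Simplified stand-in for GarrisonEntry, keeping only what windowing needs.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    content: String,
    token_count: Option<u32>,
}

/// Returns the longest suffix of `entries` (assumed oldest-first) whose
/// token total fits in `budget`, preserving chronological order.
fn token_window(entries: &[Entry], budget: u32) -> &[Entry] {
    let mut used = 0u32;
    let mut start = entries.len();

    // Walk from newest to oldest and stop at the first entry that would overflow.
    for (i, entry) in entries.iter().enumerate().rev() {
        let tokens = entry.token_count.unwrap_or(0);
        match used.checked_add(tokens) {
            Some(sum) if sum <= budget => {
                used = sum;
                start = i;
            }
            _ => break,
        }
    }
    &entries[start..]
}
```

Stopping at the first entry that does not fit keeps the window contiguous, so the model never sees a conversation with a gap in the middle.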

### Sliding Window

```rust,ignore
pub struct SlidingWindowGarrison {
    garrison: Arc<dyn GarrisonPort>,
    window_size: usize, // maximum number of entries, matching recall_recent's `limit`
}

impl SlidingWindowGarrison {
    pub fn new(garrison: Arc<dyn GarrisonPort>, window_size: usize) -> Self {
        Self { garrison, window_size }
    }
}

#[async_trait]
impl GarrisonPort for SlidingWindowGarrison {
    async fn recall_recent(&self, _limit: usize) -> Result<Vec<GarrisonEntry>> {
        // Ignore the caller's limit and always return the fixed window
        self.garrison.recall_recent(self.window_size).await
    }

    // Forward other methods to inner garrison
    async fn remember(&self, entry: GarrisonEntry) -> Result<()> {
        self.garrison.remember(entry).await
    }

    // ... other methods
}

// Usage - Paladin always sees only the 20 most recent entries
let windowed = Arc::new(SlidingWindowGarrison::new(garrison, 20));

let paladin = PaladinBuilder::new(llm_adapter)
    .with_garrison(windowed)
    .build()?;
```

### Smart Windowing with Priorities

```rust,ignore
pub struct PriorityWindowGarrison {
    garrison: Arc<dyn GarrisonPort>,
    window_size: u32,
}

impl PriorityWindowGarrison {
    async fn get_prioritized_window(&self) -> Result<Vec<GarrisonEntry>> {
        let all_entries = self.garrison.recall_recent(1000).await?;

        // Always include system prompts
        let system_entries: Vec<_> = all_entries.iter()
            .filter(|e| e.role == ConversationRole::System)
            .cloned()
            .collect();

        // Calculate remaining token budget
        let system_tokens: u32 = system_entries.iter()
            .map(|e| e.token_count.unwrap_or(0))
            .sum();

        let remaining_budget = self.window_size.saturating_sub(system_tokens);

        // Fill with most recent non-system entries, newest first
        // (the `.rev()` assumes recall_recent returns entries oldest-first)
        let recent_entries: Vec<_> = all_entries.iter()
            .filter(|e| e.role != ConversationRole::System)
            .rev()
            .cloned()
            .collect();

        let mut token_sum = 0u32;
        let mut windowed_recent = Vec::new();

        for entry in recent_entries {
            let entry_tokens = entry.token_count.unwrap_or(0);
            if token_sum + entry_tokens <= remaining_budget {
                token_sum += entry_tokens;
                windowed_recent.push(entry);
            } else {
                break;
            }
        }

        // Combine: system + recent (chronological order)
        windowed_recent.reverse();
        let mut result = system_entries;
        result.extend(windowed_recent);

        Ok(result)
    }
}
```

### Summarization for Compression

```rust,ignore
pub struct SummarizingGarrison {
    garrison: Arc<dyn GarrisonPort>,
    summarizer: Arc<dyn LlmPort>,
    window_size: u32,
    summary_threshold: usize,
}

impl SummarizingGarrison {
    async fn maybe_summarize(&self) -> Result<()> {
        let entries = self.garrison.recall_recent(self.summary_threshold).await?;

        if entries.len() >= self.summary_threshold {
            // Create summary of old entries
            let old_entries: Vec<_> = entries.iter()
                .take(self.summary_threshold / 2)
                .collect();

            let conversation_text = old_entries.iter()
                .map(|e| format!("{:?}: {}", e.role, e.content))
                .collect::<Vec<_>>()
                .join("\n");

            let prompt = format!(
                "Summarize this conversation in 2-3 paragraphs, preserving key facts:\n\n{}",
                conversation_text
            );

            let summary = self.summarizer.generate(&prompt).await?;

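            // Replace old entries with summary.
            // Note: `remove_entry` is not part of the `GarrisonPort` trait shown
            // earlier; this sketch assumes the backing store provides it.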
            for entry in old_entries {
                self.garrison.remove_entry(entry.id).await?;
            }

            self.garrison.remember(GarrisonEntry {
                id: Uuid::new_v4(),
                role: ConversationRole::System,
                content: format!("Previous conversation summary: {}", summary),
                timestamp: Utc::now(),
                metadata: HashMap::from([
                    ("type".to_string(), "summary".to_string()),
                ]),
                token_count: None,
            }).await?;
        }

        Ok(())
    }
}
```

## Semantic Search

Retrieve relevant memories by meaning using embeddings.

### Setup with Embeddings

```rust,ignore
use paladin_memory::garrison::VectorGarrison; // path assumed to mirror InMemoryGarrison
use paladin_memory::embedding::OpenAIEmbeddingService;
use paladin::prelude::*;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create garrison with embedding support
    let embedding_service = Arc::new(OpenAIEmbeddingService::new(api_key)?);

    let garrison = Arc::new(
        VectorGarrison::new("garrison.db")
            .await?
            .with_embedding_service(embedding_service)
    );

    let paladin = PaladinBuilder::new(llm_adapter)
        .with_garrison(garrison.clone())
        .build()?;

    // Add entries - embeddings generated automatically
    paladin.execute("I love hiking in the mountains").await?;
    paladin.execute("My favorite color is blue").await?;
    paladin.execute("I work as a software engineer").await?;

    // Semantic search
    let results = garrison.semantic_search("outdoor activities", 5).await?;

    for (entry, similarity) in results {
        println!("Similarity: {:.2} - {}", similarity, entry.content);
    }
    // Output: High similarity for "hiking in the mountains"

    Ok(())
}
```
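
The similarity scores printed above are typically cosine similarities between embedding vectors. As a rough sketch of what a brute-force semantic search computes (the guide does not say how `VectorGarrison` scores entries, so treat this as an assumption), `cosine_similarity` and `top_k` below are hypothetical helpers:

```rust
/// Cosine similarity of two vectors. Returns 0.0 for mismatched lengths
/// or zero-magnitude input rather than producing NaN.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Brute-force nearest-neighbour search: score every stored embedding
/// against the query and return the `limit` best (index, score) pairs.
fn top_k(query: &[f32], stored: &[Vec<f32>], limit: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, emb)| (i, cosine_similarity(query, emb)))
        .collect();
    // total_cmp gives a total order on f32, so sorting cannot panic.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

This linear scan is fine for a few thousand entries. Beyond that, an approximate nearest neighbor index is usually the better choice.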

### Hybrid Search (Keyword + Semantic)

```rust,ignore
pub struct HybridGarrison {
    garrison: Arc<dyn LongTermGarrisonPort>,
    embedding_service: Arc<OpenAIEmbeddingService>,
}

impl HybridGarrison {
    pub async fn hybrid_search(
        &self,
        query: &str,
        limit: usize,
    ) -> Result<Vec<GarrisonEntry>> {
        // Get keyword matches
        let keyword_results = self.garrison.search(query, limit * 2).await?;

        // Get semantic matches via the LongTermGarrisonPort method
        let embedding = self.embedding_service.embed(query).await?;
        let semantic_results = self.garrison
            .search_similar(embedding, limit * 2)
            .await?;

        // Merge and deduplicate
        let mut combined: HashMap<Uuid, (GarrisonEntry, f32)> = HashMap::new();

        // Add keyword results with base score
        for entry in keyword_results {
            combined.insert(entry.id, (entry, 0.5));
        }

        // Add semantic results, boosting score if already present
        for (entry, similarity) in semantic_results {
            combined.entry(entry.id)
                .and_modify(|(_, score)| *score += similarity * 0.5)
                .or_insert((entry, similarity * 0.5));
        }

        // Sort by combined score, highest first (total_cmp cannot panic on NaN)
        let mut sorted: Vec<_> = combined.into_values().collect();
        sorted.sort_by(|a, b| b.1.total_cmp(&a.1));

        Ok(sorted.into_iter()
            .take(limit)
            .map(|(entry, _)| entry)
            .collect())
    }
}
```

### RAG (Retrieval-Augmented Generation)

```rust,ignore
pub struct RAGPaladin {
    paladin: Paladin,
    garrison: Arc<dyn LongTermGarrisonPort>,
    embedding_service: Arc<OpenAIEmbeddingService>,
}

impl RAGPaladin {
    pub async fn execute_with_rag(&self, query: &str) -> Result<PaladinResult> {
        // Retrieve relevant context from long-term memory
        let embedding = self.embedding_service.embed(query).await?;
        let relevant_memories = self.garrison
            .search_similar(embedding, 5)
            .await?;

        // Build augmented prompt
        let context = relevant_memories.iter()
            .map(|(entry, _)| entry.content.as_str())
            .collect::<Vec<_>>()
            .join("\n\n");

        let augmented_query = format!(
            "Context from previous conversations:\n{}\n\n\
             Current question: {}",
            context, query
        );

        // Execute with retrieved context
        self.paladin.execute(&augmented_query).await
    }
}

// Usage
let rag_paladin = RAGPaladin {
    paladin,
    garrison: vector_garrison,
    embedding_service,
};

let response = rag_paladin.execute_with_rag(
    "What programming languages do I know?"
).await?;
```

## Memory Types

### Episodic Memory

Memory of specific events and experiences.

```rust,ignore
// Add episodic memory
garrison.remember(GarrisonEntry {
    id: Uuid::new_v4(),
    role: ConversationRole::User,
    content: "I visited Paris last summer".to_string(),
    timestamp: Utc::now(),
    metadata: HashMap::from([
        ("memory_type".to_string(), "episodic".to_string()),
        ("event_type".to_string(), "travel".to_string()),
        ("location".to_string(), "Paris, France".to_string()),
        ("timeframe".to_string(), "summer 2023".to_string()),
    ]),
    token_count: Some(10),
}).await?;
```

### Semantic Memory

General knowledge and facts.

```rust,ignore
// Add semantic memory (facts)
garrison.remember(GarrisonEntry {
    id: Uuid::new_v4(),
    role: ConversationRole::System,
    content: "User prefers Python over JavaScript for backend development".to_string(),
    timestamp: Utc::now(),
    metadata: HashMap::from([
        ("memory_type".to_string(), "semantic".to_string()),
        ("category".to_string(), "preferences".to_string()),
        ("topic".to_string(), "programming".to_string()),
    ]),
    token_count: Some(15),
}).await?;
```

### Procedural Memory

Knowledge about how to do things.

```rust,ignore
// Add procedural memory
garrison.remember(GarrisonEntry {
    id: Uuid::new_v4(),
    role: ConversationRole::System,
    content: "To deploy this project: cargo build --release && docker build -t app .".to_string(),
    timestamp: Utc::now(),
    metadata: HashMap::from([
        ("memory_type".to_string(), "procedural".to_string()),
        ("task".to_string(), "deployment".to_string()),
    ]),
    token_count: Some(20),
}).await?;
```
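
Because the memory type lives in `metadata`, recalled entries can be filtered by it after retrieval. The sketch below uses a pared-down `Entry` and a hypothetical `by_memory_type` helper. Whether the real backends offer metadata filtering at query time is not covered in this guide.

```rust
use std::collections::HashMap;

// Simplified stand-in for GarrisonEntry with only content and metadata.
struct Entry {
    content: String,
    metadata: HashMap<String, String>,
}

/// Returns the entries whose `memory_type` metadata equals `kind`.
/// Entries without a `memory_type` key never match.
fn by_memory_type<'a>(entries: &'a [Entry], kind: &str) -> Vec<&'a Entry> {
    entries
        .iter()
        .filter(|e| e.metadata.get("memory_type").map(String::as_str) == Some(kind))
        .collect()
}
```

For example, passing `"procedural"` would select only how-to entries such as the deployment note above.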

## Best Practices

### 1. Choose the Right Garrison Type

```rust,ignore
// ✅ Use InMemoryGarrison for:
// - Temporary chatbots
// - Stateless services
// - Testing and development

let garrison = Arc::new(InMemoryGarrison::new(
    GarrisonConfig::default().with_max_tokens(4000)
));

// ✅ Use SqliteGarrison for:
// - Multi-session applications
// - User-specific contexts
// - Production services needing persistence

let garrison = Arc::new(
    SqliteGarrison::new("garrison.db").await?
        .with_session_id(session_id)
);

// ✅ Use VectorGarrison for:
// - Long-term knowledge bases
// - RAG applications
// - Semantic retrieval needs

let garrison = Arc::new(
    VectorGarrison::new("garrison.db").await?
        .with_embedding_service(embedding_service)
);
```

### 2. Set Appropriate Token Limits

```rust,ignore
// Model context windows
const GPT_4_TURBO: u32 = 128_000;
const GPT_4: u32 = 8_192;
const GPT_3_5: u32 = 16_385;
const CLAUDE_3: u32 = 200_000;

// Reserve tokens for: system prompt + response + buffer
let response_tokens = 1000;
let system_prompt_tokens = 500;
let buffer = 500;

let available_for_history = GPT_4 - response_tokens - system_prompt_tokens - buffer;

let garrison = InMemoryGarrison::new(
    GarrisonConfig::default()
        .with_max_tokens(available_for_history)  // 8_192 - 2_000 = 6_192 tokens
);
```
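
Plain subtraction on `u32` panics in debug builds if the reservations exceed the context window. A checked variant avoids that. `history_budget` below is a hypothetical helper, not part of the crate:

```rust
/// Tokens left for conversation history after reserving space for the
/// system prompt, the response, and a safety buffer. Returns `None` when
/// the reservations alone exceed the model's context window.
fn history_budget(
    context_window: u32,
    system_prompt: u32,
    response: u32,
    buffer: u32,
) -> Option<u32> {
    context_window
        .checked_sub(system_prompt)?
        .checked_sub(response)?
        .checked_sub(buffer)
}
```

With the figures above, `history_budget(8_192, 500, 1_000, 500)` yields `Some(6_192)`, while a 1,000-token window with the same reservations yields `None`.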

### 3. Add Metadata for Better Organization

```rust,ignore
garrison.remember(GarrisonEntry {
    id: Uuid::new_v4(),
    role: ConversationRole::User,
    content: message.clone(),
    timestamp: Utc::now(),
    metadata: HashMap::from([
        ("user_id".to_string(), user_id.clone()),
        ("session_id".to_string(), session_id.to_string()),
        ("channel".to_string(), "web".to_string()),
        ("language".to_string(), "en".to_string()),
        ("importance".to_string(), "high".to_string()),
    ]),
    token_count: Some(estimate_tokens(&message)),
}).await?;
```

### 4. Clean Up Old Memories

```rust,ignore
// Periodic cleanup
pub async fn cleanup_old_memories(
    garrison: &SqliteGarrison,
    days_to_keep: i64,
) -> Result<usize> {
    let cutoff = Utc::now() - chrono::Duration::days(days_to_keep);

    let removed = garrison
        .remove_before(cutoff)
        .await?;

    println!("Removed {} old memories", removed);
    Ok(removed)
}

// Scheduled cleanup
tokio::spawn(async move {
    let mut interval = tokio::time::interval(std::time::Duration::from_secs(86400)); // Daily
    loop {
        interval.tick().await;
        if let Err(e) = cleanup_old_memories(&garrison, 30).await {
            eprintln!("Cleanup failed: {}", e);
        }
    }
});
```

### 5. Implement Conversation Branching

```rust,ignore
pub struct BranchingGarrison {
    garrison: Arc<dyn GarrisonPort>,
    current_branch: RwLock<Uuid>,
}

impl BranchingGarrison {
    pub async fn create_branch(&self, from_entry: Uuid) -> Result<Uuid> {
        let branch_id = Uuid::new_v4();

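        // Collect history up to the branch point. This sketch only records a
        // branch marker below; re-storing `_branch_history` under the new
        // branch id is left out.
        let history = self.garrison.recall_recent(1000).await?;
        let _branch_history: Vec<_> = history.into_iter()
            .take_while(|e| e.id != from_entry)
            .collect();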

        // Store branch metadata
        self.garrison.remember(GarrisonEntry {
            id: Uuid::new_v4(),
            role: ConversationRole::System,
            content: format!("Branch created from entry {}", from_entry),
            timestamp: Utc::now(),
            metadata: HashMap::from([
                ("type".to_string(), "branch".to_string()),
                ("branch_id".to_string(), branch_id.to_string()),
                ("parent_entry".to_string(), from_entry.to_string()),
            ]),
            token_count: None,
        }).await?;

        *self.current_branch.write().await = branch_id;
        Ok(branch_id)
    }
}
```

## Advanced Patterns

### Memory Consolidation

```rust,ignore
pub struct ConsolidatingGarrison {
    garrison: Arc<dyn GarrisonPort>,
    llm: Arc<dyn LlmPort>,
}

impl ConsolidatingGarrison {
    pub async fn consolidate_memories(&self) -> Result<()> {
        let entries = self.garrison.recall_recent(100).await?;

        // Group by topic using LLM
        let topics = self.extract_topics(&entries).await?;

        // Create consolidated memory for each topic
        for (topic, topic_entries) in topics {
            let facts = self.extract_facts(&topic_entries).await?;

            self.garrison.remember(GarrisonEntry {
                id: Uuid::new_v4(),
                role: ConversationRole::System,
                content: format!("Consolidated facts about {}: {}", topic, facts),
                timestamp: Utc::now(),
                metadata: HashMap::from([
                    ("type".to_string(), "consolidated".to_string()),
                    ("topic".to_string(), topic),
                    ("source_count".to_string(), topic_entries.len().to_string()),
                ]),
                token_count: None,
            }).await?;
        }

        Ok(())
    }

    async fn extract_topics(&self, entries: &[GarrisonEntry]) -> Result<HashMap<String, Vec<GarrisonEntry>>> {
        // Use LLM to categorize entries by topic
        // Implementation details...
        Ok(HashMap::new())
    }

    async fn extract_facts(&self, entries: &[GarrisonEntry]) -> Result<String> {
        let conversation = entries.iter()
            .map(|e| e.content.as_str())
            .collect::<Vec<_>>()
            .join("\n");

        let prompt = format!(
            "Extract key facts from this conversation:\n\n{}",
            conversation
        );

        self.llm.generate(&prompt).await
    }
}
```

### Attention Mechanism

```rust,ignore
pub struct AttentionGarrison {
    garrison: Arc<dyn LongTermGarrisonPort>,
    embedding_service: Arc<OpenAIEmbeddingService>,
}

impl AttentionGarrison {
    pub async fn get_attended_context(
        &self,
        query: &str,
        context_size: u32,
    ) -> Result<Vec<GarrisonEntry>> {
        // Get semantic matches
        let query_embedding = self.embedding_service.embed(query).await?;
        let candidates = self.garrison
            .search_similar(query_embedding, 50)
            .await?;

        // Score each candidate using attention mechanism
        let mut scored: Vec<_> = candidates.into_iter()
            .map(|(entry, similarity)| {
                let recency_score = self.recency_score(&entry);
                let importance_score = self.importance_score(&entry);

                // Weighted combination
                let attention = similarity * 0.5 + recency_score * 0.3 + importance_score * 0.2;

                (entry, attention)
            })
            .collect();

        // Sort by attention score, highest first (total_cmp cannot panic on NaN)
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));

        // Select top entries within token budget
        let mut selected = Vec::new();
        let mut token_sum = 0u32;

        for (entry, _) in scored {
            let entry_tokens = entry.token_count.unwrap_or(0);
            if token_sum + entry_tokens <= context_size {
                token_sum += entry_tokens;
                selected.push(entry);
            }
        }

        Ok(selected)
    }

    fn recency_score(&self, entry: &GarrisonEntry) -> f32 {
        let age = (Utc::now() - entry.timestamp).num_seconds() as f32;
        let decay_rate = 0.0001;  // Adjust for desired decay speed
        (-decay_rate * age).exp()
    }

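    fn importance_score(&self, entry: &GarrisonEntry) -> f32 {
        // Accept a numeric value or a label such as the "high" used in other
        // examples in this guide. The label-to-score mapping is illustrative.
        match entry.metadata.get("importance").map(String::as_str) {
            Some("high") => 1.0,
            Some("medium") => 0.5,
            Some("low") => 0.0,
            Some(other) => other.parse::<f32>().unwrap_or(0.5),
            None => 0.5,
        }
    }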
}
```
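
The `decay_rate` of 0.0001 per second used in `recency_score` corresponds to a half-life of roughly 6,931 seconds (just under two hours), since the half-life of an exponential decay is ln 2 divided by the rate. Expressing the decay as a half-life is often easier to tune. The helpers below are a hypothetical sketch of that parameterization:

```rust
/// Exponential recency score in (0, 1]: 1.0 for a brand-new entry,
/// halving every `half_life_secs` seconds. Negative ages are clamped to zero.
fn recency_score(age_secs: f32, half_life_secs: f32) -> f32 {
    let decay_rate = std::f32::consts::LN_2 / half_life_secs;
    (-decay_rate * age_secs.max(0.0)).exp()
}

/// Half-life in seconds implied by a raw per-second decay rate.
fn half_life_secs(decay_rate: f32) -> f32 {
    std::f32::consts::LN_2 / decay_rate
}
```

A one-day half-life (`86_400.0`), for instance, keeps yesterday's memories at about half the weight of fresh ones.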

### Memory Reflection

```rust,ignore
pub struct ReflectiveGarrison {
    garrison: Arc<dyn GarrisonPort>,
    llm: Arc<dyn LlmPort>,
}

impl ReflectiveGarrison {
    pub async fn generate_reflections(&self) -> Result<()> {
        let recent_entries = self.garrison.recall_recent(50).await?;

        // Prompt LLM to reflect on conversation
        let conversation = recent_entries.iter()
            .map(|e| format!("{:?}: {}", e.role, e.content))
            .collect::<Vec<_>>()
            .join("\n");

        let prompt = format!(
            "Reflect on this conversation and extract:\n\
             1. Key insights about the user\n\
             2. Patterns in the discussion\n\
             3. Important facts to remember\n\n\
             Conversation:\n{}",
            conversation
        );

        let reflection = self.llm.generate(&prompt).await?;

        // Store reflection as high-importance memory
        self.garrison.remember(GarrisonEntry {
            id: Uuid::new_v4(),
            role: ConversationRole::System,
            content: format!("Reflection: {}", reflection),
            timestamp: Utc::now(),
            metadata: HashMap::from([
                ("type".to_string(), "reflection".to_string()),
                ("importance".to_string(), "high".to_string()),
            ]),
            token_count: None,
        }).await?;

        Ok(())
    }
}
```

## Troubleshooting

### Memory Not Persisting

**Problem**: Garrison entries disappear after restart.

**Solutions**:
1. Verify using `SqliteGarrison`, not `InMemoryGarrison`
2. Check database file path is correct and writable
3. Ensure proper async handling (`.await` on all operations)

```rust,ignore
// ❌ Won't persist
let garrison = Arc::new(InMemoryGarrison::new(config));

// ✅ Will persist
let garrison = Arc::new(SqliteGarrison::new("garrison.db").await?);
```

### Context Window Overflow

**Problem**: Errors about exceeding maximum context length.

**Solutions**:
1. Reduce `max_tokens` in `GarrisonConfig`
2. Recall a bounded window with `recall_recent(limit)` instead of loading the full history
3. Implement summarization for old memories

```rust,ignore
// Calculate safe token limit
let model_limit = 8192;  // GPT-4
let response_budget = 1000;
let system_prompt_tokens = 500;
let safety_buffer = 500;

let garrison_limit = model_limit - response_budget - system_prompt_tokens - safety_buffer;

let garrison = InMemoryGarrison::new(
    GarrisonConfig::default().with_max_tokens(garrison_limit)
);
```

### Slow Semantic Search

**Problem**: Embedding-based search is taking too long.

**Solutions**:
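1. Index the columns used to narrow candidates (a plain index on the `embedding` BLOB does not speed up similarity search)
2. Use approximate nearest neighbor (ANN) algorithms
3. Cache embeddings for frequent queries
4. Limit search scope with filters

```sql
-- A B-tree index on the embedding BLOB does not accelerate similarity search.
-- Index the columns used to pre-filter candidates instead:
CREATE INDEX IF NOT EXISTS idx_session_timestamp
    ON garrison_entries (session_id, timestamp);

-- For larger collections, consider a specialized vector store:
-- PostgreSQL with pgvector extension
-- Qdrant, Milvus, or Weaviate for production
```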
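
Caching query embeddings, as suggested in the list above, can be as simple as memoizing by query string. This is a minimal sketch: `EmbeddingCache` is a hypothetical type, the `embed` closure stands in for whatever embedding service is in use, and the cache is unbounded, so a production version would want a size limit or LRU eviction.

```rust
use std::collections::HashMap;

/// Memoizes query embeddings so repeated queries skip the embedding call.
struct EmbeddingCache<F: Fn(&str) -> Vec<f32>> {
    embed: F,
    cache: HashMap<String, Vec<f32>>,
    misses: usize,
}

impl<F: Fn(&str) -> Vec<f32>> EmbeddingCache<F> {
    fn new(embed: F) -> Self {
        Self { embed, cache: HashMap::new(), misses: 0 }
    }

    /// Returns the cached embedding, computing and storing it on first use.
    fn get(&mut self, query: &str) -> &[f32] {
        if !self.cache.contains_key(query) {
            self.misses += 1;
            let embedding = (self.embed)(query);
            self.cache.insert(query.to_string(), embedding);
        }
        &self.cache[query]
    }
}
```

The `misses` counter gives a quick way to check the hit rate before deciding whether the cache is worth keeping.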

### Memory Leaks in Long Sessions

**Problem**: Memory usage grows unbounded.

**Solutions**:
1. Set `max_entries` in config
2. Implement periodic cleanup
3. Use eviction policies
4. Monitor with `garrison.stats()`

```rust,ignore
// Periodic memory management
tokio::spawn(async move {
    let mut interval = tokio::time::interval(Duration::from_secs(3600));
    loop {
        interval.tick().await;

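        // Log failures instead of panicking inside the background task
        match garrison.stats().await {
            Ok(stats) if stats.total_entries > 1000 => {
                // Trigger cleanup
                if let Err(e) = garrison.compact().await {
                    eprintln!("Compaction failed: {}", e);
                }
            }
            Ok(_) => {}
            Err(e) => eprintln!("Stats failed: {}", e),
        }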
    }
});
```

## Testing

### Unit Testing

```rust,ignore
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_garrison_add_and_retrieve() {
        let garrison = InMemoryGarrison::new(GarrisonConfig::default());

        let entry = GarrisonEntry {
            id: Uuid::new_v4(),
            role: ConversationRole::User,
            content: "Test message".to_string(),
            timestamp: Utc::now(),
            metadata: HashMap::new(),
            token_count: Some(2),
        };

        garrison.remember(entry.clone()).await.unwrap();

        let history = garrison.recall_recent(10).await.unwrap();
        assert_eq!(history.len(), 1);
        assert_eq!(history[0].content, "Test message");
    }

    #[tokio::test]
    async fn test_token_window() {
        let garrison = InMemoryGarrison::new(
            GarrisonConfig::default().with_max_tokens(100)
        );

        // Add entries totaling 150 tokens
        for i in 0..15 {
            garrison.remember(GarrisonEntry {
                id: Uuid::new_v4(),
                role: ConversationRole::User,
                content: format!("Message {}", i),
                timestamp: Utc::now(),
                metadata: HashMap::new(),
                token_count: Some(10),
            }).await.unwrap();
        }

        // Window should respect token limit
        let window = garrison.recall_recent(100).await.unwrap();
        let total_tokens: u32 = window.iter()
            .map(|e| e.token_count.unwrap_or(0))
            .sum();

        assert!(total_tokens <= 100);
    }
}
```

## Examples

See working examples:
- `examples/garrison_in_memory.rs` - Basic in-memory usage
- `examples/garrison_persistent.rs` - SQLite persistence
- `examples/garrison_semantic_search.rs` - Embedding-based retrieval
- `examples/memory_windowing.rs` - Token management strategies

## Next Steps

- **[Tool Integration](tool-integration.md)** - Combine memory with tools
- **[Battalion Patterns](battalion-patterns.md)** - Shared memory in multi-agent systems
- **[API Reference](https://docs.rs/paladin)** - Garrison API documentation

## Related Resources

- [Token Counting Strategies](../architecture/overview.md)
- [Vector Database Integration](../user-guides/sanctum-vector-memory.md)
- [Production Deployment](../deployment/production.md)