agentmem 0.1.1

Local-first memory engine for AI coding agents
Documentation
# Architecture

## Layers

1. CLI layer
2. Query layer
3. Index layer
4. Store layer
5. In-memory engine
6. Config layer

## Cost-First Flow

1. User asks question in Codex/Claude.
2. Agent calls local `index.query(...)`.
3. Query layer enforces `top_k` and `token_budget`.
4. If confidence is sufficient, answer from local chunks only.
5. Only low-confidence queries may escalate to remote model context.

The default path must be "local retrieval only".

## Core Principles

- local-first
- explicit paths
- atomic writes
- typed interfaces
- no hidden side effects
- predictable query cost
- incremental indexing only

## Index Design

- Forward index: `chunk_id -> {path, line_start, line_end, hash, text_ref}`
- Inverted index: `token/symbol -> [chunk_id...]`
- Metadata index: language, file size, last modified hash

This avoids full-repo scans for every question.

## Performance Design

- daemon holds hot index in memory
- incremental update from file hashes and git diff
- append journal for updates; periodic snapshot compaction
- no full snapshot rewrite on each mutation

## Retrieval Strategy

- rank by lexical and symbol matches first
- cap results by bytes/tokens before model call
- include precise file/line references in output
- prefer deterministic "not enough evidence" over expensive fallback

## Telemetry (Local)

Track per-project:

- queries served from index only
- remote fallback rate
- average token payload sent to model
- index build/update durations

Primary KPI: reduce remote-token usage while preserving answer quality.