pond
Lossless storage and search for AI agent sessions, across every agentic client.
Quickstart. Install, run guided setup, and ingest your local sessions:
pond init registers pond as an MCP server for detected clients; to add it by hand:
Pond keeps every AI conversation you've ever had intact and searchable, and lets you continue any of them in any supported tool - your history, your search, your sessions, independent of the agent vendor that made them. It is one Rust binary that ingests sessions from registered agentic-client adapters into a canonical Session / Message / Part interlingua, stores them in Lance on object storage, and serves search over them via HTTP+JSON and MCP. Two deployments: a personal pond on your laptop, or a multi-tenant backend for hosted agent infrastructure. No extra database, no wrapper around Lance.
Current automatically synced agent clients:
- Claude Code CLI
- Claude desktop app (local agent mode)
- Codex CLI
- opencode CLI
- pi-coding-agent CLI
You can also import a Claude.ai data export with the claude-ai-export adapter - a manual download, so it is not auto-discovered: pond sync claude-ai-export --path <path>.
Status: pre-v1. Schemas, wire shapes, and config keys are subject to breaking change until v1. Full documentation lives at pond.locker; the contract is docs/spec.md.
Background
Every agentic CLI ships its own session format and its own search surface. Switching tools means losing history. Replaying a Claude Code session in another provider's tooling means re-translating the wire shape by hand. Hosted multi-tenant deployments rebuild the same storage layer from scratch.
Pond is the storage and retrieval layer that sits underneath. Every adapter is a bidirectional codec between a client format and one canonical schema, so any session can be restored by any adapter - it need not return to the client that produced it. Storage, search (vector or BM25 full-text, one arm per query), and provider-agnostic replay all sit on a single Lance-on-object-storage foundation.
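The "bidirectional codec" idea can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the `Adapter` trait, the pared-down `Session` struct, and the two toy line formats (`ColonLines`, `TabLines`) are hypothetical and are not pond's actual types or any real client format. It shows a session decoded by one adapter being restored through another.

```rust
// Hypothetical sketch: none of these names come from pond's source.

/// Pared-down canonical session: an id plus ordered (role, text) messages.
#[derive(Debug, Clone, PartialEq)]
struct Session {
    id: String,
    messages: Vec<(String, String)>,
}

/// An adapter is a two-way codec between one client format and the canonical schema.
trait Adapter {
    fn decode(&self, raw: &str) -> Option<Session>;
    fn encode(&self, session: &Session) -> String;
}

/// Shared toy line format: the id on the first line, then `role<sep>text` per message.
fn decode_with(raw: &str, sep: &str) -> Option<Session> {
    let mut lines = raw.lines();
    let id = lines.next()?.to_string();
    let mut messages = Vec::new();
    for line in lines {
        // A line without the separator is malformed, so the whole decode fails.
        let (role, text) = line.split_once(sep)?;
        messages.push((role.to_string(), text.to_string()));
    }
    Some(Session { id, messages })
}

fn encode_with(session: &Session, sep: &str) -> String {
    let mut out = session.id.clone();
    for (role, text) in &session.messages {
        out.push('\n');
        out.push_str(role);
        out.push_str(sep);
        out.push_str(text);
    }
    out
}

/// Toy client format A separates role and text with ": ".
struct ColonLines;
/// Toy client format B separates them with a tab.
struct TabLines;

impl Adapter for ColonLines {
    fn decode(&self, raw: &str) -> Option<Session> {
        decode_with(raw, ": ")
    }
    fn encode(&self, session: &Session) -> String {
        encode_with(session, ": ")
    }
}

impl Adapter for TabLines {
    fn decode(&self, raw: &str) -> Option<Session> {
        decode_with(raw, "\t")
    }
    fn encode(&self, session: &Session) -> String {
        encode_with(session, "\t")
    }
}

fn main() {
    // Ingest from format A, restore into format B, and confirm nothing was lost.
    let session = ColonLines
        .decode("s1\nuser: hello\nassistant: hi")
        .expect("well-formed input");
    let restored = TabLines.encode(&session);
    assert_eq!(TabLines.decode(&restored), Some(session));
    println!("{restored}");
}
```

Because both adapters meet at the same canonical struct, adding a third format would need one new `impl Adapter` rather than a converter for every pair of formats.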
The v1 surface includes: full CLI, HTTP+JSON and MCP transports, search over three Lance datasets, intfloat/multilingual-e5-small embeddings at FP16 weights (Metal on macOS, CUDA opt-in, CPU fallback), and local-FS / S3 / GCS / Azure backends through Lance's object_store integration.
Install
Linux and macOS are supported; Windows is not in v1 scope.
Package Managers (macOS and Linux):
Build from source:
For CUDA acceleration on Linux:
On macOS the Metal backend is selected automatically; on other systems the CPU fallback runs without extra features.
Usage
Set up storage, adapters, MCP registration, and an optional sync schedule in one pass (idempotent - re-run it any time to repair or update):
Then import sessions from local adapters, embed them, update indexes, and search:
Run a server:
Fetch a single session or message, or move a whole corpus:
Ask structured questions with read-only SQL (the same surface as the pond_sql_query MCP tool):
Run maintenance on demand (sync already embeds inline and folds indexes every run):
Keep pond current automatically (launchd on macOS, systemd user timers or cron on Linux):
pond status prints a per-table storage table, then indexes (text/semantic readiness), stored (sessions + searchable messages), and adapters (configured adapter count). Pass --adapters for per-project tables and per-intent index detail. pond search --explain returns Lance's analyze_plan output for each retrieval arm.
Remote storage
By default pond stores data locally under $XDG_DATA_HOME/pond. To use an object store, add credentials and switch the destination:
pond init --storage-path <url> configures a remote destination during setup and prompts for credentials inline when the destination is remote, so a bucket is one command. The s3+https://host/bucket form works for any S3-compatible store (Hetzner, R2, B2, MinIO); s3://, gs://, and az:// use the standard cloud SDK credential chain when no [creds.*] set matches. pond copy --from <local> --to <url> carries existing local data into the bucket - idempotent, never deletes the source, and on completion it rebuilds the destination indexes and verifies every row landed (exit 6 if any are missing or duplicated, so you never reconcile by hand). pond copy --verify-only --from <local> --to <url> runs that same check read-only, without copying. Full walkthrough: pond.locker.
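The post-copy check described above (every row landed, none missing or duplicated, exit 6 otherwise) might look roughly like the following. This is a sketch, not pond's implementation: `VerifyReport`, `verify`, and `exit_code` are hypothetical names, and rows are reduced to plain string ids. Only the exit code 6 comes from the text above.

```rust
use std::collections::HashMap;

/// Hypothetical result of comparing source rows against destination rows.
#[derive(Debug, PartialEq)]
struct VerifyReport {
    missing: Vec<String>,
    duplicated: Vec<String>,
}

/// Compare row ids read from the source with row ids read from the destination.
fn verify(source_ids: &[&str], dest_ids: &[&str]) -> VerifyReport {
    // Count how many times each id appears in the destination.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for id in dest_ids {
        *counts.entry(*id).or_insert(0) += 1;
    }

    // A source id that never shows up in the destination is missing.
    let mut missing: Vec<String> = Vec::new();
    for id in source_ids {
        if !counts.contains_key(*id) {
            missing.push((*id).to_string());
        }
    }

    // A destination id that shows up more than once is duplicated.
    let mut duplicated: Vec<String> = counts
        .iter()
        .filter(|(_, count)| **count > 1)
        .map(|(id, _)| id.to_string())
        .collect();

    // Sort so the report is deterministic regardless of hash order.
    missing.sort();
    duplicated.sort();
    VerifyReport { missing, duplicated }
}

/// Exit status: 0 when the copy is clean, 6 when any row is missing or duplicated.
fn exit_code(report: &VerifyReport) -> i32 {
    if report.missing.is_empty() && report.duplicated.is_empty() {
        0
    } else {
        6
    }
}

fn main() {
    let report = verify(&["a", "b", "c"], &["a", "c", "c"]);
    println!("{report:?} -> exit {}", exit_code(&report));
}
```

The check only reads both sides, which is why the same logic can back a read-only mode such as `--verify-only`.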
Configuration
pond init walks through everything below interactively and enables the adapters it finds. pond sync only ingests already-enabled adapters - enabling one is an explicit step (pond adapters enable / pond adapters discover / pond init), never a side effect of sync. Config lives under $XDG_CONFIG_HOME/pond/. Every [adapters.<name>] block needs enabled = true to be active; sections without it (or with enabled = false) are skipped.
[adapters.<name>]
enabled = true
<key> = "~/.claude/projects"

[adapters.<name>]
enabled = false # kept in config, skipped on `pond sync`
<key> = "~/.codex/sessions"
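The rule that only sections with enabled = true are active can be expressed as a small filter. This is a sketch under assumptions: `AdapterConfig`, `active_adapters`, and the adapter names `alpha`, `beta`, and `gamma` are hypothetical, and the config is modelled as an in-memory map rather than parsed from pond's real file.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory view of one adapter section.
/// `enabled` is None when the key is absent from the section.
struct AdapterConfig {
    enabled: Option<bool>,
}

/// Names of the adapters a sync pass would ingest: only those with enabled = true.
/// A missing key and an explicit `false` are both skipped.
fn active_adapters(config: &BTreeMap<String, AdapterConfig>) -> Vec<&str> {
    config
        .iter()
        .filter(|(_, section)| section.enabled == Some(true))
        .map(|(name, _)| name.as_str())
        .collect()
}

fn main() {
    let mut config = BTreeMap::new();
    config.insert("alpha".to_string(), AdapterConfig { enabled: Some(true) });
    config.insert("beta".to_string(), AdapterConfig { enabled: Some(false) });
    config.insert("gamma".to_string(), AdapterConfig { enabled: None });
    println!("{:?}", active_adapters(&config));
}
```

Treating a missing key the same as `false` keeps enabling an adapter an explicit opt-in, which matches the behaviour described for pond sync.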
Verbosity
Root-level -v / -vv / -vvv raise the tracing level (info / debug / trace); -q / -qq lower it. The default surfaces warnings only. RUST_LOG overrides the CLI flag when set; POND_LOG is no longer honored.
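The flag-to-level mapping can be sketched as a small function. The names `Level`, `level_from_flags`, and `effective_filter` are hypothetical. The README only says -q / -qq lower the level, so mapping them to error and off is an assumption, as is letting -v and -q offset each other when both are given.

```rust
/// Hypothetical level ladder, from quietest to most verbose.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Level {
    Off,
    Error,
    Warn,
    Info,
    Debug,
    Trace,
}

const LEVELS: [Level; 6] = [
    Level::Off,
    Level::Error,
    Level::Warn,
    Level::Info,
    Level::Debug,
    Level::Trace,
];

/// Start at Warn (the documented default), step up once per -v and down once per -q.
/// Assumption: -q lands on Error and -qq on Off.
fn level_from_flags(verbose: u8, quiet: u8) -> Level {
    let index = 2 + i32::from(verbose) - i32::from(quiet);
    LEVELS[index.clamp(0, 5) as usize]
}

/// RUST_LOG, when set, overrides whatever the flags asked for.
fn effective_filter(rust_log: Option<&str>, verbose: u8, quiet: u8) -> String {
    match rust_log {
        Some(value) => value.to_string(),
        None => format!("{:?}", level_from_flags(verbose, quiet)).to_lowercase(),
    }
}

fn main() {
    // Simulate `-vv` with whatever RUST_LOG the environment provides.
    let rust_log = std::env::var("RUST_LOG").ok();
    println!("{}", effective_filter(rust_log.as_deref(), 2, 0));
}
```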
Design
The full contract is in docs/spec.md. Key choices:
- Lance direct, no wrapper. The lance-format/lance crates are the only storage and search engine. No lancedb, no parallel abstraction. Storage, indexing, OCC, schema evolution, blob columns, versioning, and time-travel are all Lance. The read-only pond sql surface is DataFusion planning over the same Lance datasets - a query escape hatch, not a second engine.
- Canonical Session / Message / Part interlingua. Owned in pond, in the shape of Effect v4's Prompt-side Part union. This schema is pond's product; everything else is machinery around it.
- Three Lance datasets (sessions, messages, parts). messages carries the nullable embedding (vector + embedding_model) alongside denormalized filter columns (source_agent / project / role / timestamp) for single-stage filter pushdown.
- No-synthesis adapter seam. Adapters parse source records through extractor helpers that make "invent a value" a compile error - model-no-synthesis, model-schema-honesty, and adapter-provenance-required are structural, not review rules.
- Index lifecycle decoupled from writes. Writes commit data (embeddings included, computed inline at ingest) without folding the search indexes. pond sync runs index maintenance by default, and pond optimize --only index runs it on demand; Lance merges index results with a flat scan over unindexed fragments, so reads stay correct.
- Single-arm retrieval. Each query runs one retriever - vector (cosine, with a gentle recency tiebreaker) or fts (BM25) - chosen per query; no server-side fusion. The vector arm falls back to full-text when the store has no embeddings, and --sort-by recency returns newest-first. Results group to one summary per session, keyed on session_root.
- Language-neutral full-text. Word-level simple tokenizer with English stemming (ascii-folding on); tokens the stemmer does not recognize pass through unchanged and stay exact-matchable, so pond indexes sessions in any language alike.
- Two transports, one handler set. HTTP+JSON (axum) and MCP (rmcp) both dispatch into the same handlers. Wire ops: pond_search, pond_get, pond_ingest. MCP additionally exposes the read-only pond_sql_query tool and the schema://pond, schema://pond-sql, and stats://pond resources.
- Opaque-string multi-tenancy. Each tenant is a namespace string the integrator supplies; pond does not authenticate, authorize, or model identity. The object store's IAM is the storage boundary.
- Encryption is operational. Bucket SSE plus filesystem encryption; pond holds no keys and adds no application-level crypto.
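The single-arm retrieval rule from the list above, together with per-session grouping, can be sketched as follows. All names here (`Arm`, `choose_arm`, `Hit`, `SessionSummary`, `group_by_session`) are hypothetical; only the session_root key, the two arms, and the fallback rule come from the text. The sketch assumes hits arrive already ranked best-first and leaves out the recency tiebreaker.

```rust
/// The two retrieval arms; exactly one runs per query.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Arm {
    Vector,
    Fts,
}

/// A requested vector search falls back to full-text when the store has no embeddings.
fn choose_arm(requested: Arm, store_has_embeddings: bool) -> Arm {
    match requested {
        Arm::Vector if !store_has_embeddings => Arm::Fts,
        other => other,
    }
}

/// Hypothetical message-level hit returned by whichever arm ran.
#[derive(Debug)]
struct Hit {
    session_root: String,
    score: f32,
}

/// One summary per session, keyed on session_root.
#[derive(Debug, PartialEq)]
struct SessionSummary {
    session_root: String,
    best_score: f32,
    hits: usize,
}

/// Fold message-level hits into per-session summaries, keeping first-seen order.
fn group_by_session(hits: &[Hit]) -> Vec<SessionSummary> {
    let mut summaries: Vec<SessionSummary> = Vec::new();
    for hit in hits {
        match summaries
            .iter()
            .position(|s| s.session_root == hit.session_root)
        {
            Some(i) => {
                summaries[i].hits += 1;
                if hit.score > summaries[i].best_score {
                    summaries[i].best_score = hit.score;
                }
            }
            None => summaries.push(SessionSummary {
                session_root: hit.session_root.clone(),
                best_score: hit.score,
                hits: 1,
            }),
        }
    }
    summaries
}

fn main() {
    let arm = choose_arm(Arm::Vector, false);
    let hits = vec![
        Hit { session_root: "a".to_string(), score: 0.9 },
        Hit { session_root: "b".to_string(), score: 0.8 },
        Hit { session_root: "a".to_string(), score: 0.7 },
    ];
    println!("{arm:?} {:?}", group_by_session(&hits));
}
```

Running one arm per query means the scores inside a result set are always comparable, so grouping needs no cross-retriever normalization.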
References
The upstream schemas that shaped pond's canonical model are documented in docs/references/ (source URLs + why each matters; the vendored code itself is not redistributed). Real session captures live under tests/fixtures/adapter/.
| Source | Why it matters |
|---|---|
| Effect-TS/effect | Effect v4 Prompt/Response Part unions. Pond's canonical types copy this shape. |
| sst/opencode | Effect Schema canonical Part union; SDK types; storage schema. |
| kilo-org/kilocode | OpenCode fork. Adds editorContext, plan-followup, kilocode-specific events. |
| badlogic/pi-mono | pi-coding-agent leaf-cursor branching and cross-provider conformance test matrix. |
| open-telemetry/semantic-conventions-genai | GenAI semantic conventions. Inspiration for shape overlap; pond does not derive from OTel. |
| tests/fixtures/adapter/ | Real session captures for nine source harnesses (claude_ai_export, claude_code, claude_desktop_app, claude_managed_agents, codex_cli, nanoclaw, openclaw, opencode, pi-coding-agent). Drives adapter design and serves as adapter test fixtures. |
Contributing
Issues and pull requests are welcome. The most useful contributions right now:
- Spec feedback on docs/spec.md.
- Pointers to additional reference schemas or session samples worth documenting under docs/references/.
- Bug reports against the v1 surface (CLI verbs, wire ops, schema mismatches, OCC behavior, object-store backends).
For larger changes, open an issue first to discuss the direction. For security issues, see SECURITY.md.
License
Apache-2.0 (c) 2026 tenequm