# Lantern
Lantern is a local-first memory engine for agent activity.
It ingests text an agent has touched, chunks it deterministically, and keeps a
full provenance trail — source URI, content hash, byte ranges, ingest time —
alongside a BM25 keyword index. Everything lives in a single SQLite file under
`./.lantern/` so it is easy to inspect, back up, or wipe by hand.
## Thesis
Most memory tools are either chat-memory products, document search tools, or heavyweight agent frameworks. Lantern is narrower and more durable:
*a local memory engine for agent activity with provenance-aware search.*
Provenance comes first. Every stored chunk can answer where it came from, when it was ingested, what exact byte range it covers, and why a search result surfaced it.
## Install
### Prebuilt Binaries
Static binaries are available from the GitLab Releases:
| Platform | File |
|---|---|
| Linux x86_64 | lantern-linux-amd64 |
| Linux aarch64 | lantern-linux-arm64 |
| macOS aarch64 | lantern-macos-arm64 |
To install, download the binary for your platform, mark it executable, and move it onto your `PATH`.
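A sketch for Linux amd64, where `<releases-url>` stands in for the asset URL shown on the Releases page:

```bash
# Download and install (Linux amd64 example)
# <releases-url> is a placeholder; copy the real asset URL from GitLab Releases
curl -LO <releases-url>/lantern-linux-amd64
chmod +x lantern-linux-amd64
sudo mv lantern-linux-amd64 /usr/local/bin/lantern
```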
All Linux binaries are fully static (musl) — no libc or OpenSSL dependency. SHA256 checksums are attached to each release.
### Build from Source
Requires a recent Rust toolchain (2024 edition):
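Assuming a checkout of the repository root, the standard Cargo workflow applies:

```bash
# Build an optimized binary at target/release/lantern
cargo build --release

# Or install straight onto your PATH
cargo install --path .
```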
## Commands
| Command | Purpose |
|---|---|
| `lantern init` | Create a local store at `./.lantern/lantern.db` |
| `lantern ingest <path>` | Ingest supported files from a path; respects `.lantern-ignore` (use `--no-ignore` to bypass) |
| `lantern ingest <path> --follow` | Poll `<path>` on an interval (`--follow-interval-secs`, default 5) and re-ingest new or modified files until interrupted |
| `lantern ingest --stdin --uri <uri>` | Ingest piped content under an explicit label |
| `lantern ingest <fifo>` | Auto-detect a named pipe and read it to EOF as a streamed batch (append mode, `fifo://` URI) |
| `lantern embed` | Generate embeddings for chunks via Ollama (`--model`, `--ollama-url`, `--limit`) |
| `lantern mcp` | Run the MCP server over stdio or TCP (`--port`) |
| `lantern search <query>` | BM25 keyword search with `--kind`, `--path`, `--limit` filters |
| `lantern search --semantic <q>` | Semantic search via Ollama embeddings (cosine similarity; auto-uses sqlite-vec when eligible) |
| `lantern search --vec-semantic <q>` | Force the sqlite-vec-backed semantic path for the default model |
| `lantern search --hybrid <q>` | Hybrid keyword + semantic search via Reciprocal Rank Fusion |
| `lantern show <id>` | Full provenance + all chunks for one source (id prefix ok) |
| `lantern inspect` | Store status: schema version, counts, recent sources |
| `lantern export` | JSON dump of sources + chunks, filterable by `--path` / `--query` |
| `lantern diff [<path>]` | Compare indexed `file://` sources against the filesystem |
| `lantern forget <pattern>` | Preview matching sources; pass `--apply` to actually delete |
| `lantern reindex` | Rebuild the full-text index from the canonical chunk rows |
| `lantern stash` | Write a timestamped tar.gz snapshot under `<store>/stashes/` |
| `lantern version` / `--version` | Print the build version |
Every command that produces structured output accepts `--format text` or
`--format json`; `search` additionally defaults to a compact summary mode.
## Examples
### Index a notes tree and search it
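A plausible session; the directory and query are illustrative:

```bash
lantern init
lantern ingest ./notes
lantern search "chunk provenance" --limit 5
```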
### Capture an agent session transcript from stdin
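For example, piping a transcript into the store under an explicit label (the URI value is illustrative):

```bash
cat session.jsonl | lantern ingest --stdin --uri agent://session-001
```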
### Stream session transcripts through a named pipe
In one shell the agent writes its transcript to the pipe between turns; in another, Lantern reads to EOF, ingests the batch, and is ready for the next.
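A sketch with an illustrative pipe path and a hypothetical agent process:

```bash
# Shell A: create the pipe and start the reader; it blocks until the writer closes
mkfifo /tmp/agent.pipe
lantern ingest /tmp/agent.pipe

# Shell B: the agent (hypothetical command) writes a transcript batch, then closes
your-agent --emit-transcript > /tmp/agent.pipe
```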
Lantern auto-detects the FIFO, reads until the writer closes, and routes the
bytes through the stdin-append path. Each reader session lands as its own
source under a `fifo://<abs_path>#<suffix>` URI, so repeated batches
accumulate instead of overwriting. A `.jsonl` FIFO name still triggers the
transcript extractor, preserving role / session / turn / tool metadata.
### Watch a transcript directory for new sessions
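A plausible invocation; the directory is illustrative:

```bash
# Re-scan ./transcripts every 10 seconds until interrupted
lantern ingest ./transcripts --follow --follow-interval-secs 10
```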
Polling-based: Lantern re-scans the directory every interval and ingests any file whose content hash has changed. Unchanged files are a no-op, so this is cheap to leave running. Stop with Ctrl-C.
### Drill into a single source
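For example, by id prefix (the value is illustrative; ids come from `search` or `inspect` output):

```bash
lantern show 3f2a
```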
### See what drifted since the last ingest
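For example, limited to one previously ingested tree (path illustrative):

```bash
lantern diff ./notes
```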
### Snapshot the store before a risky change
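Per the commands table, this writes a timestamped tar.gz under the store's `stashes/` directory:

```bash
lantern stash
```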
## `.lantern-ignore`
Lantern respects `.lantern-ignore` files for excluding paths from ingestion,
similar to `.gitignore`. Place a `.lantern-ignore` file in the directory being
ingested:
```gitignore
# Ignore build artifacts
target/
dist/
build/

# Ignore dependencies
node_modules/
.venv/
vendor/

# Re-include (un-ignore) a directory matched above
!important-logs/

# Ignore specific extensions
*.log
*.tmp
```
Pattern syntax:

- `#` — comments
- `*`, `?`, `**` — glob wildcards
- `/` suffix — match directories only
- `!` prefix — negate (un-ignore)
Default ignores (applied when no `.lantern-ignore` exists):
`.git/`, `target/`, `node_modules/`, `.hermes/`, `__pycache__/`, `.venv/`, `vendor/`
Use `--no-ignore` to bypass all ignore rules:
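```bash
lantern ingest ./project --no-ignore
```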
## Data model
Two tables carry the indexed state; both are visible from `sqlite3`:

- `sources` — one row per ingested artifact. Keeps `uri`, optional filesystem `path`, `kind` (`text/markdown`, `text/plain`, `application/jsonl`), total `bytes`, `content_sha256`, and timestamps.
- `chunks` — one row per deterministic slice of a source. Keeps the parent `source_id`, `ordinal`, `byte_start`/`byte_end`, `char_count`, the chunk text, and the chunk `sha256`.
A shadow FTS5 virtual table (`chunks_fts`) is kept in sync by triggers and
supplies BM25 ranking and snippet highlighting to search.
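Since the store is a plain SQLite file, both tables can be queried directly; a sketch assuming the default store path:

```bash
sqlite3 .lantern/lantern.db \
  "SELECT uri, kind, content_sha256 FROM sources ORDER BY rowid DESC LIMIT 5;"
```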
## Development
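The standard Cargo workflow applies; nothing project-specific is assumed here:

```bash
cargo build
cargo test   # every command has integration test coverage
```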
## Status
Early but usable. The CLI is stable, the schema versions its migrations (now through v7),
and every command has integration test coverage. Keyword search (FTS5 BM25),
semantic search (Ollama embeddings with cosine similarity, auto-accelerated
with sqlite-vec for the default unfiltered path and backfilled on upgrade), hybrid search,
an opt-in `--vec-semantic` path, and an MCP server are all implemented. Ingestion supports
`.lantern-ignore` for excluding build artifacts and dependencies.
## License
Lantern is licensed under the GNU Affero General Public License v3.0 only (AGPL-3.0-only).
Copyright (C) 2026 Raphael Bitton
See LICENSE.