Module code_index

Code index host capability.

Ports the deterministic trigram/word index plus the live workspace state (agent registry, advisory locks, append-only version log, file id assignment, cached reads) that previously lived in Sources/BurinCodeIndex/ on the Swift side. The capability owns one SharedIndex cell per instance; cloning the capability shares state with every Harn VM that has been wired against it.

Surface — every builtin is locked by schemas/code_index/<method>.json:

§Workspace queries (the original 5)

Builtin                          What it does
hostlib_code_index_query         Trigram-accelerated literal substring search.
hostlib_code_index_rebuild       Walk a workspace and (re)build the in-memory index.
hostlib_code_index_stats         Count files/trigrams/words + last rebuild timestamp.
hostlib_code_index_imports_for   Imports declared by a single file (with resolutions).
hostlib_code_index_importers_of  Reverse lookup: who imports the given module/path?
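
To illustrate what "trigram-accelerated literal substring search" can mean in practice, here is a minimal sketch, not the module's actual implementation. ToyIndex, trigrams, add, and query are hypothetical names, and FileId is assumed to be a u32 purely for illustration. The idea is to intersect the posting lists of every trigram in the needle to get candidate files, then confirm each candidate with a real substring check.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical stand-in for the crate's file identifier (assumed u32).
type FileId = u32;

/// Toy index: trigram -> ids of files containing it, plus the raw file text.
#[derive(Default)]
struct ToyIndex {
    postings: HashMap<[u8; 3], BTreeSet<FileId>>,
    files: HashMap<FileId, String>,
}

/// Every distinct 3-byte window of `text`.
fn trigrams(text: &str) -> BTreeSet<[u8; 3]> {
    text.as_bytes()
        .windows(3)
        .map(|w| [w[0], w[1], w[2]])
        .collect()
}

impl ToyIndex {
    fn add(&mut self, id: FileId, text: &str) {
        for t in trigrams(text) {
            self.postings.entry(t).or_default().insert(id);
        }
        self.files.insert(id, text.to_string());
    }

    /// Literal substring search: intersect posting lists, then verify.
    fn query(&self, needle: &str) -> Vec<FileId> {
        let grams = trigrams(needle);
        let candidates: BTreeSet<FileId> = if grams.is_empty() {
            // Needles shorter than 3 bytes yield no trigram, so scan every file.
            self.files.keys().copied().collect()
        } else {
            let mut iter = grams.iter();
            let first = iter.next().unwrap();
            let mut acc = self.postings.get(first).cloned().unwrap_or_default();
            for g in iter {
                match self.postings.get(g) {
                    Some(set) => acc = acc.intersection(set).copied().collect(),
                    // A trigram that appears nowhere rules out every file.
                    None => return Vec::new(),
                }
            }
            acc
        };
        // Trigram overlap is necessary but not sufficient, so confirm each hit.
        candidates
            .into_iter()
            .filter(|id| self.files[id].contains(needle))
            .collect()
    }
}
```

The final contains check matters: a file can hold all of the needle's trigrams in a different order, so the posting lists only narrow the candidate set.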

§Live workspace state (added in #776)

  • Agents: agent_register, agent_heartbeat, agent_unregister, current_agent_id, status.
  • Locks: lock_try, lock_release.
  • Change log: current_seq, changes_since, version_record.
  • File table: path_to_id, id_to_path, file_ids, file_meta, file_hash.
  • Cached reads: read_range, reindex_file, trigram_query, extract_trigrams, word_get, deps_get, outline_get.
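
As an illustration of the advisory lock pair above (lock_try, lock_release), the following sketch shows one plausible shape for a per-file lock table. It is an assumption-laden toy, not the registry's real behaviour: ToyLockTable is a hypothetical type, AgentId is assumed to be a u64, re-acquiring a lock you already hold is assumed to succeed, and heartbeat-driven expiry is left out entirely.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's agent identifier (assumed u64).
type AgentId = u64;

/// Toy advisory lock table: path -> agent currently holding the lock.
#[derive(Default)]
struct ToyLockTable {
    held: HashMap<String, AgentId>,
}

impl ToyLockTable {
    /// Returns true if `agent` holds the lock after the call.
    /// Assumption: re-acquiring your own lock succeeds.
    fn lock_try(&mut self, path: &str, agent: AgentId) -> bool {
        if let Some(&owner) = self.held.get(path) {
            return owner == agent;
        }
        self.held.insert(path.to_string(), agent);
        true
    }

    /// Releases the lock only when `agent` is the current holder.
    fn lock_release(&mut self, path: &str, agent: AgentId) -> bool {
        if self.held.get(path) == Some(&agent) {
            self.held.remove(path);
            true
        } else {
            false
        }
    }
}
```

Because the locks are advisory, nothing in this sketch stops an agent from editing a file it has not locked; the table only records who has claimed what.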

§Concurrency model

All ops serialise through a single Arc<Mutex<Option<IndexState>>>. That matches the Swift actor: the IDE editor, eval, and live agent all see one consistent view. The capability is Send + Sync so embedders can share it across threads, but the mutex still serialises actual work.
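
The model described above can be sketched with the standard library alone. In this hedged example, ToyState and ToyCapability are hypothetical stand-ins for IndexState and the capability handle, and their methods are illustrative rather than the crate's API. It shows the two properties the paragraph relies on: cloning the handle shares one cell, and the mutex serialises mutating work across threads.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real index state.
#[derive(Default)]
struct ToyState {
    seq: u64,
}

/// Same shape as the cell described above: at most one live index.
type ToyShared = Arc<Mutex<Option<ToyState>>>;

/// Hypothetical capability handle; cloning it clones the Arc, not the state.
#[derive(Clone, Default)]
struct ToyCapability {
    index: ToyShared,
}

impl ToyCapability {
    /// Replaces the slot wholesale, as a rebuild would.
    fn rebuild(&self) {
        *self.index.lock().unwrap() = Some(ToyState::default());
    }

    /// Mutating op: needs exclusive access; returns None when no index is loaded.
    fn record_edit(&self) -> Option<u64> {
        let mut guard = self.index.lock().unwrap();
        let state = guard.as_mut()?;
        state.seq += 1;
        Some(state.seq)
    }

    fn current_seq(&self) -> Option<u64> {
        self.index.lock().unwrap().as_ref().map(|s| s.seq)
    }
}

/// Four threads share one handle; every increment is observed exactly once.
fn demo() -> Option<u64> {
    let cap = ToyCapability::default();
    cap.rebuild();
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = cap.clone();
            thread::spawn(move || {
                for _ in 0..100 {
                    c.record_edit();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    cap.current_seq()
}
```

Calling demo() returns Some(400): each of the 400 increments takes the lock in turn, so none are lost.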

Structs§

AgentInfo
One row in the registry. Public so embedders that want to surface a status panel can read the lifecycle state without going through the host builtins.
AgentRegistry
Per-workspace agent registry plus advisory per-file lock table.
BuildOutcome
Summary returned from IndexState::build_from_root.
ChangeRecord
Public denormalised form returned by changes_since.
CodeIndexCapability
Code-index capability handle.
CodeIndexSnapshot
Persistent on-disk form of the entire workspace index.
DepGraph
Forward + reverse import graph plus the side-table of unresolved import strings (raw text we couldn’t map back to a known file).
IndexState
In-memory index for one workspace. Composed from the per-file table, the trigram + word sub-indexes, the dep graph, the append-only version log, and the agent registry.
IndexedFile
Per-file metadata persisted in the index.
IndexedSymbol
Outline-style symbol entry. Reserved for AST integration; the code-index importer leaves IndexedFile::symbols empty, but the shape is kept stable so storage upgrades won’t have to re-key.
RegistryConfig
Registry config — defaults match the Swift actor on the burin-code side so the cross-repo schema-drift tests stay aligned.
SnapshotMeta
On-disk metadata header. Small and cheap to read so embedders can peek at a snapshot without parsing the whole thing.
TrigramIndex
Trigram posting list: trigram -> set of file ids that contain it, plus a per-file reverse map for cheap re-indexing.
VersionEntry
One entry in the per-file history.
VersionLog
Append-only log keyed by path. Both forward query patterns — “everything since X” and “the latest entry for this path” — are served from the same map.
WordHit
Single occurrence of an identifier-shaped token: which file it landed in and on which 1-based line number.
WordIndex
Inverted word index keyed on identifier-shaped tokens.
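
The TrigramIndex entry above mentions a per-file reverse map for cheap re-indexing. The sketch below shows why such a map helps, under stated assumptions: ToyTrigramIndex and its methods are hypothetical, FileId is assumed to be a u32, and trigrams are taken as raw 3-byte windows. With the reverse map, dropping a file's old postings touches only the trigrams that file actually contained, rather than every posting list.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-ins for the crate's types.
type FileId = u32;
type Trigram = [u8; 3];

#[derive(Default)]
struct ToyTrigramIndex {
    /// Forward map: trigram -> files containing it.
    postings: HashMap<Trigram, HashSet<FileId>>,
    /// Reverse map: file -> trigrams it contributed.
    by_file: HashMap<FileId, HashSet<Trigram>>,
}

impl ToyTrigramIndex {
    /// Removes a file by visiting only the trigrams recorded for it.
    fn remove_file(&mut self, id: FileId) {
        if let Some(old) = self.by_file.remove(&id) {
            for t in old {
                let now_empty = match self.postings.get_mut(&t) {
                    Some(set) => {
                        set.remove(&id);
                        set.is_empty()
                    }
                    None => false,
                };
                if now_empty {
                    self.postings.remove(&t);
                }
            }
        }
    }

    /// Re-indexing is "remove old postings, then insert new ones".
    fn reindex_file(&mut self, id: FileId, text: &str) {
        self.remove_file(id);
        let grams: HashSet<Trigram> = text
            .as_bytes()
            .windows(3)
            .map(|w| [w[0], w[1], w[2]])
            .collect();
        for t in &grams {
            self.postings.entry(*t).or_default().insert(id);
        }
        self.by_file.insert(id, grams);
    }

    /// Sorted ids of the files that contain `t`.
    fn files_with(&self, t: &Trigram) -> Vec<FileId> {
        let mut ids: Vec<FileId> = self
            .postings
            .get(t)
            .map(|set| set.iter().copied().collect())
            .unwrap_or_default();
        ids.sort_unstable();
        ids
    }

    fn trigram_count(&self) -> usize {
        self.postings.len()
    }
}
```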

Enums§

AgentState
Lifecycle state of one tracked agent.
EditOp
Edit-classification for one record. The string forms ride out to Harn scripts and the cross-repo schema so callers can switch on them.

Constants§

HISTORY_LIMIT
Maximum number of entries kept per path. Older entries roll off the front in FIFO order — matches the Swift constant.
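
A small sketch of the FIFO roll-off described above, combined with the two query patterns VersionLog is said to serve. Everything here is illustrative: ToyVersionLog and ToyEntry are hypothetical types, the op field is a plain string, and the limit of 3 is an assumed value chosen to keep the example short (the real constant's value is not stated in this page).

```rust
use std::collections::{HashMap, VecDeque};

/// Assumed value for illustration only; not the crate's real limit.
const TOY_HISTORY_LIMIT: usize = 3;

#[derive(Clone, Debug, PartialEq)]
struct ToyEntry {
    seq: u64,
    op: &'static str,
}

/// Toy append-only log keyed by path, with a global sequence counter.
#[derive(Default)]
struct ToyVersionLog {
    next_seq: u64,
    by_path: HashMap<String, VecDeque<ToyEntry>>,
}

impl ToyVersionLog {
    /// Appends an entry and drops the oldest ones once the per-path cap is hit.
    fn record(&mut self, path: &str, op: &'static str) -> u64 {
        self.next_seq += 1;
        let seq = self.next_seq;
        let history = self.by_path.entry(path.to_string()).or_default();
        history.push_back(ToyEntry { seq, op });
        while history.len() > TOY_HISTORY_LIMIT {
            history.pop_front();
        }
        seq
    }

    /// "The latest entry for this path".
    fn latest(&self, path: &str) -> Option<&ToyEntry> {
        self.by_path.get(path)?.back()
    }

    /// "Everything since X", across all paths, ordered by sequence number.
    fn changes_since(&self, seq: u64) -> Vec<(String, ToyEntry)> {
        let mut out = Vec::new();
        for (path, history) in &self.by_path {
            for e in history.iter().filter(|e| e.seq > seq) {
                out.push((path.clone(), e.clone()));
            }
        }
        out.sort_by_key(|(_, e)| e.seq);
        out
    }

    fn history_len(&self, path: &str) -> usize {
        self.by_path.get(path).map_or(0, |h| h.len())
    }
}
```

One consequence of the cap worth noting: a caller asking for changes since a very old sequence number may miss entries that have already rolled off the front of a busy path's history.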

Type Aliases§

AgentId
Stable identifier for an agent in the registry.
FileId
Monotonically-assigned identifier for a file in the index. Stable across re-indexes of the same path so sub-indexes can key on FileId without invalidating string keys.
SharedIndex
Shared, mutable cell carrying the (at most one) live workspace index. Mutex rather than RwLock because rebuilds flip the slot wholesale and every mutating op (record_edit, agent_register, lock_try, etc.) needs exclusive access. Single-threaded VM scripts pay no real cost from the choice; embedders that fan out across threads are still safe because the mutex serialises everyone.
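
The FileId alias above is described as monotonically assigned and stable across re-indexes of the same path. A minimal sketch of a table with those two properties follows; ToyFileTable and assign are hypothetical names, FileId is assumed to be a u32, and the lookups merely mirror the path_to_id and id_to_path builtins in spirit.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the crate's alias (assumed u32).
type FileId = u32;

/// Toy file table: ids only ever grow, and a known path keeps its id.
#[derive(Default)]
struct ToyFileTable {
    next: FileId,
    by_path: HashMap<String, FileId>,
    by_id: HashMap<FileId, String>,
}

impl ToyFileTable {
    /// Returns the existing id for a known path, else assigns the next one.
    fn assign(&mut self, path: &str) -> FileId {
        if let Some(&id) = self.by_path.get(path) {
            return id;
        }
        self.next += 1;
        let id = self.next;
        self.by_path.insert(path.to_string(), id);
        self.by_id.insert(id, path.to_string());
        id
    }

    fn path_to_id(&self, path: &str) -> Option<FileId> {
        self.by_path.get(path).copied()
    }

    fn id_to_path(&self, id: FileId) -> Option<&str> {
        self.by_id.get(&id).map(String::as_str)
    }
}
```

Because assign returns the same id when a path is seen again, sub-indexes keyed on FileId stay valid when that file is re-indexed.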