pub struct Database { /* private fields */ }
Implementations
impl Database
pub fn get_metadata(&self, key: &str) -> Result<Option<String>>
Retrieve a metadata value by key.
pub fn set_metadata(&self, key: &str, value: &str) -> Result<()>
Store a metadata key-value pair (upserts on conflict).
tx-safe: single statement — see Self::begin_indexing_tx.
pub fn reconcile_embedding_fingerprint(
    &self,
    fp: &EmbeddingFingerprint,
) -> Result<()>
Reconcile the stored embedding fingerprint with the one currently in
use. Call right after Database::open from any code path that owns
an EmbeddingProvider (indexer, watcher, MCP serve, rag index).
Behavior:
- All three fields (provider, model, dimension) match stored → no-op, zero writes.
- Any field differs → drop symbol_vec, clear symbol_embedding_map, recreate the vector table at the new dimension, update all three metadata keys. The user must run cartog rag index to repopulate.
- DB has dimension but no provider/model (older cartog versions) → backfill provider+model without wiping. The stored vectors stay valid against whatever stack produced them; we just record the identity going forward.
Writes use retry_busy so a concurrent writer on the same DB does
not crash this caller with SQLITE_BUSY.
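The three cases above can be sketched as a pure decision function. This is a minimal illustration, not the crate's implementation: the Fingerprint, Stored, and Action types are hypothetical stand-ins, and treating a partially populated or empty metadata table as a rebuild is an assumption the text does not confirm.

```rust
/// Hypothetical stand-in for the fingerprint currently in use.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Fingerprint {
    provider: String,
    model: String,
    dimension: usize,
}

/// Hypothetical view of the three metadata keys; each may be absent.
#[derive(Debug, Default)]
struct Stored {
    provider: Option<String>,
    model: Option<String>,
    dimension: Option<usize>,
}

#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// All three fields match: zero writes.
    NoOp,
    /// Legacy DB: record provider and model, keep the vectors.
    Backfill,
    /// Drop and recreate the vector table, update all three keys.
    Rebuild,
}

fn reconcile(stored: &Stored, current: &Fingerprint) -> Action {
    match (&stored.provider, &stored.model, stored.dimension) {
        (Some(p), Some(m), Some(d))
            if *p == current.provider && *m == current.model && d == current.dimension =>
        {
            Action::NoOp
        }
        // Dimension present but no identity recorded (older versions).
        (None, None, Some(_)) => Action::Backfill,
        // Assumption: every other combination is treated as a mismatch.
        _ => Action::Rebuild,
    }
}
```

Keeping the decision separate from the writes makes the zero-write guarantee of the matching case easy to check in isolation.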
pub fn upsert_file(&self, file: &FileInfo) -> Result<()>
Insert or update file metadata.
tx-safe: single statement — see Self::begin_indexing_tx.
pub fn get_file(&self, path: &str) -> Result<Option<FileInfo>>
Look up stored metadata for a file.
pub fn clear_edges_for_file(&self, path: &str) -> Result<()>
Remove only the edges for a file (used by the Merkle diff, which updates symbols surgically).
tx-safe: single statement — see Self::begin_indexing_tx.
pub fn clear_file_data(&self, path: &str) -> Result<()>
Remove all symbols, edges, and RAG data for a file (before re-indexing it).
pub fn clear_file_data_in_tx(&self, path: &str) -> Result<()>
Like Self::clear_file_data but assumes the caller already holds an
open transaction. Used by cartog-indexer to wrap the entire Phase 3
pipeline atomically.
pub fn remove_file(&self, path: &str) -> Result<()>
Remove a file and all its symbols and edges from the index.
pub fn remove_file_in_tx(&self, path: &str) -> Result<()>
Like Self::remove_file but assumes the caller already holds an
open transaction.
pub fn insert_symbol(&self, sym: &Symbol) -> Result<()>
Insert or replace a single symbol.
pub fn insert_symbols(&self, symbols: &[Symbol]) -> Result<()>
Insert or replace multiple symbols in a single transaction.
pub fn insert_symbols_in_tx(&self, symbols: &[Symbol]) -> Result<()>
Like Self::insert_symbols but assumes the caller already holds an
open transaction.
pub fn get_symbol_hashes_for_file(
    &self,
    file_path: &str,
) -> Result<Vec<(String, Option<String>, Option<String>)>>
Get stored symbol hashes for a file (for Merkle diff).
Returns (id, content_hash, subtree_hash) tuples.
pub fn update_symbol_position(
    &self,
    id: &str,
    start_line: u32,
    end_line: u32,
    start_byte: u32,
    end_byte: u32,
) -> Result<()>
Update only the position fields of a symbol (for moved-but-unchanged symbols).
pub fn delete_symbols(&self, ids: &[String]) -> Result<()>
Delete multiple symbols and cascade (edges, content, embeddings) in a single transaction.
pub fn delete_symbols_in_tx(&self, ids: &[String]) -> Result<()>
Like Self::delete_symbols but assumes the caller already holds an
open transaction.
pub fn delete_symbol(&self, id: &str) -> Result<()>
Delete a single symbol and cascade to edges, content, and embeddings.
pub fn insert_edge(&self, edge: &Edge) -> Result<()>
Insert a single edge.
pub fn insert_edges(&self, edges: &[Edge]) -> Result<()>
Insert multiple edges in a single transaction.
pub fn insert_edges_in_tx(&self, edges: &[Edge]) -> Result<()>
Like Self::insert_edges but assumes the caller already holds an
open transaction.
impl Database
pub fn open(path: impl AsRef<Path>, embedding_dim: usize) -> DbResult<Self>
Open or create the database at the given path.
embedding_dim sets the vector dimension for the sqlite-vec table.
If the stored dimension differs from the requested one, the vector index
is cleared and recreated (a re-index via cartog rag index is needed).
pub fn open_existing_rw(path: impl AsRef<Path>) -> DbResult<Self>
Open an existing on-disk database in read-write mode without running schema migrations or the embedding-fingerprint reconcile. Used by the Phase 5 promoter: a secondary that detected its primary died and validated the on-disk schema/fingerprint against its pinned snapshot before claiming the slot. Re-running the migration would re-trigger the SQLITE_BUSY race that the election was meant to prevent.
Verifies that schema_version still matches SCHEMA_VERSION to
guard against a race where another writer upgraded the schema
between the secondary’s attach and this promotion. Returns
DbError::SchemaDrift in that case so the promoter aborts and
the MCP process exits cleanly.
pub fn open_readonly(path: impl AsRef<Path>) -> DbResult<Self>
Open an existing on-disk database in read-only mode for a secondary cartog process (Phase 4 read-only attach). Skips schema migrations and the embedding-fingerprint reconcile — the primary writer owns those.
Behaviour:
- Opens with SQLITE_OPEN_READ_ONLY so write attempts surface as SQLITE_READONLY errors at runtime (a defense-in-depth backup for the higher-level tool gating).
- Reads the metadata snapshot (schema version + embedding fingerprint) and stores it on the returned Database so the promoter (Phase 5) can compare against the on-disk values later.
- Returns DbError::SchemaDrift if the stored schema_version doesn’t match this binary’s expected version: the primary upgraded cartog underneath us and queries can’t be trusted.
pub fn is_read_only(&self) -> bool
True when this Database was opened via Self::open_readonly.
MCP tool gating (Phase 4) consults this to refuse the 2 write tools.
pub fn pinned_attach(&self) -> Option<&PinnedAttach>
Snapshot captured at attach time when Self::open_readonly was
used. None for read-write opens.
pub fn begin_indexing_tx(&self) -> Result<Transaction<'_>>
Open a single SQLite transaction that the caller is expected to wrap around a multi-step indexing pipeline.
Dropping the transaction without calling commit() rolls it back, so a
panic mid-pipeline leaves the DB in its prior state.
§Calling conventions inside the transaction
Helpers fall into two categories:
- Batched writers must use the _in_tx variant. Their non-_in_tx wrapper issues its own BEGIN and would error out at runtime (cannot start a transaction within a transaction). Examples: Self::insert_symbols_in_tx, Self::delete_symbols_in_tx, Self::insert_edges_in_tx, Self::insert_symbol_contents_in_tx, Self::clear_file_data_in_tx, Self::remove_file_in_tx, Self::resolve_edges_in_tx, Self::resolve_edges_scoped_in_tx.
- Single-statement helpers can be called directly. They issue one self.conn.execute(...) and participate transparently in the active transaction. Examples used by cartog-indexer’s Phase 3 today: Self::upsert_file, Self::clear_edges_for_file, Self::set_metadata, Self::compute_in_degrees, Self::compute_in_degrees_scoped, Self::invalidate_edges_targeting. These are tagged with // tx-safe: single statement so the contract survives drive-by edits.
§Why unchecked_transaction rather than rusqlite::Connection::transaction
transaction() requires &mut Connection, which would force every
caller of Database to hold a mutable borrow for the entire pipeline.
unchecked_transaction() works through &Connection and produces an
equivalent rusqlite::Transaction with the same DropBehavior::Rollback
default — only borrow-check ergonomics differ.
§Errors
Returns an error if SQLite cannot begin a transaction — typically because another transaction is already active on this connection.
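The rollback-on-drop contract can be illustrated without SQLite. The sketch below is an in-memory analogy only: MockStore and MockTx are hypothetical names, and the real guard is the rusqlite::Transaction returned by this method. It shows why a pipeline that exits early (an error return or a panic) leaves the committed state untouched.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a connection: committed rows plus rows staged
/// by the currently open transaction.
#[derive(Default)]
struct MockStore {
    committed: RefCell<Vec<String>>,
    staged: RefCell<Vec<String>>,
}

/// Guard that discards staged rows on drop unless `commit` ran first.
struct MockTx<'a> {
    store: &'a MockStore,
    done: bool,
}

impl MockStore {
    fn begin(&self) -> MockTx<'_> {
        MockTx { store: self, done: false }
    }

    /// Single-statement style helper: joins whatever transaction is open.
    fn upsert(&self, row: &str) {
        self.staged.borrow_mut().push(row.to_string());
    }

    fn rows(&self) -> Vec<String> {
        self.committed.borrow().clone()
    }
}

impl MockTx<'_> {
    fn commit(mut self) {
        {
            let mut staged = self.store.staged.borrow_mut();
            self.store.committed.borrow_mut().append(&mut staged);
        }
        self.done = true;
    }
}

impl Drop for MockTx<'_> {
    fn drop(&mut self) {
        if !self.done {
            // Equivalent of ROLLBACK: forget everything staged so far.
            self.store.staged.borrow_mut().clear();
        }
    }
}
```

Calling store.begin(), staging rows, and letting the guard fall out of scope leaves rows() empty; only an explicit commit() publishes the staged rows.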
pub fn optimize(&self) -> Result<()>
Refresh the query planner’s statistics via PRAGMA optimize.
SQLite picks join order and index use from sqlite_stat1; without it,
the planner guesses from index shape alone and can mis-plan (the tier-2
resolution misplan in #110 was one such case). PRAGMA optimize runs
ANALYZE only on tables whose row counts have drifted since the last
analyze, so it is a cheap no-op when nothing changed — unlike a bare
ANALYZE, which would re-scan every index on each call and reintroduce
a per-index O(repo) cost.
Call AFTER committing the indexing transaction, not inside it: a stats rebuild bundled into the big write tx would bloat it. No-op-safe to call when nothing was indexed, but the indexer skips it on no-op runs anyway.
impl Database
pub fn unresolved_edges(&self) -> Result<Vec<UnresolvedEdge>>
Return edges still waiting for resolution (resolution_state = 0).
Edges marked state = 2 (unresolvable), state = 3 (external), or
state = 4 (heuristic-exhausted) are excluded so a dirty reindex
doesn’t re-query the language server for edges it already classified.
All three are sticky and re-enter the unresolved set only via
Self::reset_unresolvable_for_names when a matching symbol is added,
Self::reset_all_unresolvable on --force, or (state 4 only)
Self::reopen_heuristic_exhausted before an LSP-enabled reindex.
(Self::invalidate_edges_targeting only touches state=1 rows because
it filters on target_id IS NOT NULL, and state {2, 3, 4} rows always
have target_id NULL.)
tx-safe: read-only single statement — see note above the section header.
pub fn find_symbol_at_location(
    &self,
    file_path: &str,
    line: u32,
) -> Result<Option<String>>
Find the tightest-enclosing symbol at a given file + line.
tx-safe: read-only single statement — see the LSP-section header note.
pub fn update_edge_target(&self, edge_id: i64, target_id: &str) -> Result<()>
Update a single edge’s target_id and flip it to resolution_state = 1.
tx-safe: single statement — see the LSP-section header note. If you
ever batch this internally with unchecked_transaction(), also update
index_directory so it does not call lsp_resolve_edges inside its
outer transaction.
pub fn is_edge_unresolvable(&self, edge_id: i64) -> Result<bool>
Test-only inspector: returns true when the edge is at resolution_state = 2.
Exposed because the column is otherwise crate-private, and downstream integration tests (cartog-indexer) need a read-only way to assert the marker state without snapshotting raw SQL.
pub fn edge_resolution_state(&self, edge_id: i64) -> Result<i64>
Test-only inspector: returns the raw resolution_state value for an edge.
0=unresolved, 1=resolved, 2=unresolvable, 3=external, 4=heuristic-exhausted.
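For test code that asserts on the raw value, a typed wrapper over the legend above can make intent clearer. This is a hypothetical helper, not part of the crate; only the five numeric values come from the documentation.

```rust
/// Hypothetical typed view of the raw `resolution_state` column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResolutionState {
    Unresolved = 0,
    Resolved = 1,
    Unresolvable = 2,
    External = 3,
    HeuristicExhausted = 4,
}

impl ResolutionState {
    /// Map a raw column value to a state; unknown values yield `None`.
    fn from_raw(raw: i64) -> Option<Self> {
        match raw {
            0 => Some(Self::Unresolved),
            1 => Some(Self::Resolved),
            2 => Some(Self::Unresolvable),
            3 => Some(Self::External),
            4 => Some(Self::HeuristicExhausted),
            _ => None,
        }
    }

    /// States 2, 3, and 4 stay out of the unresolved set until reopened.
    fn is_sticky(self) -> bool {
        matches!(
            self,
            Self::Unresolvable | Self::External | Self::HeuristicExhausted
        )
    }
}
```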
pub fn has_heuristic_exhausted(&self) -> Result<bool>
Whether any edge sits at resolution_state = 4 (heuristic-exhausted).
Gates the MCP warm-LSP catch-up on no-op reindexes.
tx-safe: read-only single statement — see the LSP-section header note.
pub fn count_edges_in_state(&self, state: i64) -> Result<u32>
Test-only inspector: count edges at a given raw resolution_state
(see Self::edge_resolution_state for the state legend).
tx-safe: read-only single statement — see the LSP-section header note.
pub fn reset_all_unresolvable(&self) -> Result<u32>
Reset every edge at resolution_state IN (2, 3, 4) back to 0. Used by
cartog index --force to honor the “retry everything” contract:
without this, the heuristic + LSP would still skip permanently-marked
edges (unresolvable, external, or heuristic-exhausted) even on a forced
re-index.
tx-safe: single statement — see the LSP-section header note.
pub fn reopen_heuristic_exhausted(&self) -> Result<u32>
Reset resolution_state from 4 (heuristic-exhausted) → 0 so a
later LSP-enabled reindex retries them.
State 4 is written by Self::mark_heuristic_exhausted_in_tx only in
LSP-disabled runs (--no-lsp, watch). When a subsequent cartog index does run LSP, those edges have never seen the language server, so
the indexer reopens them here before the LSP pass. Distinct from
Self::reset_all_unresolvable: this leaves the genuine LSP verdicts
(state {2, 3}) sealed.
tx-safe: single statement — see the LSP-section header note.
pub fn mark_edge_unresolvable(&self, edge_id: i64) -> Result<()>
Mark a single edge as resolution_state = 2 (LSP definitively gave up).
Callers MUST only invoke this after a definitive negative answer from
the language server. Never call from a transient-error branch (server
crash, didOpen failure, half-loaded warmup) — the marker is sticky
across runs until Self::reset_unresolvable_for_names reopens it
(on a matching new symbol) or Self::reset_all_unresolvable runs
(--force).
The WHERE resolution_state = 0 guard preserves the invariant that
state {2, 3} rows have target_id IS NULL — without it an accidental
call on a state=1 (resolved) edge would silently flip the state while
keeping the stale target, hiding a corrupted edge from
Self::unresolved_edges.
tx-safe: single statement — see the LSP-section header note.
pub fn mark_edge_external(&self, edge_id: i64) -> Result<()>
Mark a single edge as resolution_state = 3 (LSP located the target
outside the indexed root — stdlib, third-party deps, node_modules).
Same stickiness contract as Self::mark_edge_unresolvable: only call
after a definitive positive answer naming an out-of-root URI;
reopened by the same name-keyed and force-reset paths. The
WHERE resolution_state = 0 guard preserves the target_id IS NULL
invariant for state=3 rows.
tx-safe: single statement — see the LSP-section header note.
pub fn reset_unresolvable_for_names(&self, names: &[String]) -> Result<u32>
Reset resolution_state from {2, 3, 4} → 0 for edges whose target_name is in names.
Called from the indexer when new symbols are added: an edge previously
“unresolvable” (no symbol with this name), “external” (target outside
the index, now vendored in-tree), or “heuristic-exhausted” may now
resolve against the freshly-added target. Returns edges reopened; no-op
when names is empty.
tx-safe: single statement. Names are batched to honor SQLite’s default 999-parameter limit; only rows at state {2, 3, 4} are touched so the write set stays tiny even on a large rename.
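The batching described above might look like the following sketch, which only builds the SQL text and the parameter list for each batch. The table name `edges` and the exact statement shape are assumptions for illustration; only the column names, the state values, and the 999 figure come from the text.

```rust
/// Parameter limit quoted in the documentation above.
const MAX_PARAMS: usize = 999;

/// Build one UPDATE per batch of names, each with at most MAX_PARAMS
/// placeholders. Returns (sql, parameters) pairs; empty input yields none.
fn reset_statements(names: &[String]) -> Vec<(String, Vec<&str>)> {
    names
        .chunks(MAX_PARAMS)
        .map(|batch| {
            let placeholders = vec!["?"; batch.len()].join(", ");
            // Hypothetical statement shape; the real query may differ.
            let sql = format!(
                "UPDATE edges SET resolution_state = 0 \
                 WHERE resolution_state IN (2, 3, 4) AND target_name IN ({})",
                placeholders
            );
            (sql, batch.iter().map(String::as_str).collect())
        })
        .collect()
}
```

With 2000 names this produces three statements carrying 999, 999, and 2 parameters.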
impl Database
pub fn search(
    &self,
    query: &str,
    kind_filter: Option<SymbolKind>,
    file_filter: Option<&str>,
    limit: u32,
) -> Result<Vec<Symbol>>
Search for symbols by name — case-insensitive, prefix match ranks before substring.
% and _ in query are treated as literals, not LIKE wildcards.
Note: LOWER() in SQLite is ASCII-only, which is acceptable for code identifiers.
Returns an error if query is empty or limit is zero.
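A rough sketch of the two behaviors described above: escaping LIKE metacharacters and ranking prefix matches ahead of substring matches. Both helpers are hypothetical, and the backslash escape character (paired with an ESCAPE clause in SQL) is an assumption; the actual query may be written differently.

```rust
/// Escape `%`, `_`, and the escape character itself so they match literally
/// in a pattern used with `LIKE ? ESCAPE '\'` (assumed escape character).
fn escape_like(query: &str) -> String {
    let mut out = String::with_capacity(query.len());
    for ch in query.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Rank a candidate name: 0 for a prefix match, 1 for a substring match,
/// `None` for no match. Comparison is ASCII case-insensitive, mirroring
/// the ASCII-only behavior of SQLite's LOWER().
fn match_rank(name: &str, query: &str) -> Option<u8> {
    let name = name.to_ascii_lowercase();
    let query = query.to_ascii_lowercase();
    if name.starts_with(&query) {
        Some(0)
    } else if name.contains(&query) {
        Some(1)
    } else {
        None
    }
}
```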
pub fn outline(&self, file_path: &str) -> Result<Vec<Symbol>>
Outline: all symbols in a file, ordered by line.
pub fn callees(&self, name: &str) -> Result<Vec<Edge>>
Find what a symbol calls (edges originating from symbols matching the name).
pub fn callee_ids_of(&self, source_id: &str) -> Result<Vec<String>>
Resolved symbol ids that the symbol source_id calls. Only edges with a
resolved target_id are returned — keyed on the exact source id (not a
name), so an overloaded source resolves to the right callees.
pub fn caller_ids_of(&self, target_id: &str) -> Result<Vec<String>>
Source symbol ids that call the symbol target_id (resolved incoming
calls edges). Keyed on the exact target id, so callers of one overload
aren’t confused with another sharing its name.
pub fn refs(
    &self,
    name: &str,
    kind_filter: Option<EdgeKind>,
) -> Result<Vec<(Edge, Option<Symbol>)>>
All references to a name, with the source symbol resolved. Optionally filter by edge kind.
pub fn hierarchy(&self, class_name: &str) -> Result<Vec<(String, String)>>
Inheritance hierarchy rooted at a class.
Like Self::refs, matches the resolved target symbol’s name too:
qualified target_names (PHP App\Auth\BaseService) never equal the
class’s short name, so children would otherwise be invisible. The
parent column reports the resolved short name when available.
pub fn file_deps(&self, file_path: &str) -> Result<Vec<Edge>>
File-level dependencies (imports from a file).
pub fn impact(&self, name: &str, max_depth: u32) -> Result<Vec<(Edge, u32)>>
Transitive impact analysis: everything reachable within max_depth hops.
Evaluated as one recursive CTE rather than iterating refs() per
frontier node — saves N round-trips. The recursive step is split into
two complementary arms (a literal target_name match and a resolved
target_id match) so each seeks an index instead of full-scanning
edges; their UNION must equal the old single OR-join edge set — keep
the two arms jointly exhaustive when editing. Each unique edge is
returned once, labeled with the minimum depth at which it was reached.
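The minimum-depth labeling can be pictured as a breadth-first walk over an in-memory graph. This sketch is an analogy for the result shape, not the recursive CTE itself: the `callers_of` map is a hypothetical stand-in for the edges table and is assumed to hold no duplicate entries.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Walk "who references X" edges breadth-first from `start`, returning each
/// (source, target, depth) edge once, labeled with the depth at which it is
/// first reached. `callers_of` maps a target name to its referencing symbols.
fn impact<'a>(
    callers_of: &HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
    max_depth: u32,
) -> Vec<(&'a str, &'a str, u32)> {
    let mut out = Vec::new();
    let mut visited: HashSet<&str> = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0u32)]);

    while let Some((target, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // Do not expand past the hop budget.
        }
        for &source in callers_of.get(target).into_iter().flatten() {
            // Each target is expanded once, at its minimum depth, so every
            // edge is emitted once with its minimum depth.
            out.push((source, target, depth + 1));
            if visited.insert(source) {
                queue.push_back((source, depth + 1));
            }
        }
    }
    out
}
```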
pub fn trace(
    &self,
    from: &str,
    to: &str,
    max_depth: u32,
) -> Result<Option<Vec<PathHop>>>
Shortest forward calls path from from to to, or None if to is
unreachable within max_depth hops. from == to yields an empty path.
Matches the source symbol by name, so an ambiguous from follows every
match and the globally-shortest path wins. Only statically-resolved
calls edges participate (dynamic dispatch is absent).
§Errors
Returns an error if the SQLite query fails.
pub fn is_empty(&self) -> Result<bool>
True when no symbols have been indexed yet (fresh/empty DB). Cheap —
a single EXISTS(SELECT 1 FROM symbols) that stops at the first row.
Used by query commands to distinguish “no index yet” from a no-match.
pub fn stats(&self) -> Result<IndexStats>
Index statistics.
pub fn log_query(&self, tool: &str, source: &str)
Record a successful query against the index for the cartog stats --savings
/ cartog savings retention hook.
Best-effort: errors are swallowed (logged via warn!) so a failing
write never aborts the user’s actual query.
Read-only attach skips the write. Secondary MCP servers opened via
Self::open_readonly cannot write at all. As a result, queries
served by a secondary are NOT reflected in query_log — there is no
IPC that forwards them to the primary. cartog stats --savings on a
machine that runs multiple MCP servers will therefore undercount
secondary traffic, not overcount it. This is a deliberate trade-off:
the alternative would be a separate per-process file with its own
merge logic, which is more complexity than the retention hook needs.
Stored fields: tool (e.g. "search", "refs"; the MCP side already
strips the cartog_ prefix so CLI and MCP rows aggregate), source
("cli" or "mcp"), and a unix-seconds timestamp. The query payload
itself is never recorded — see the privacy banner in README.
pub fn savings_breakdown(&self) -> Result<SavingsReport>
Aggregate query_log for cartog stats --savings / cartog savings.
Safe on read-only attach (it’s a read). Returns an empty report when
the query_log table is missing (the read-only attach path skips
schema bootstrap, so a v5 DB that lost the table — manual drop, partial
snapshot restore — would otherwise surface a no such table error).
pub fn top_symbols(&self, limit: u32) -> Result<Vec<Symbol>>
Get all non-import symbols ordered by in-degree (highest first), then by file.
Used by cartog map to produce a centrality-ranked codebase summary.
pub fn has_indexed_files(&self) -> Result<bool>
Returns true if at least one file has been indexed.
Cheaper than Database::stats for the common “is the index empty?” check —
SQLite can satisfy LIMIT 1 with a single index seek rather than a full count.
pub fn symbols_for_files(
    &self,
    file_paths: &[String],
    kind_filter: Option<SymbolKind>,
) -> Result<Vec<Symbol>>
impl Database
pub fn upsert_symbol_content(
    &self,
    symbol_id: &str,
    symbol_name: &str,
    content: &str,
    header: &str,
) -> Result<()>
Insert or replace symbol content (raw source + metadata header for embedding).
symbol_name is used to compute a normalized form (camelCase/snake_case split)
stored in the FTS5 index for better keyword matching.
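One plausible shape for that normalization is sketched below. The function name and the exact splitting rules are assumptions; the documentation only states that camelCase and snake_case names are split. In this sketch, acronym runs such as HTTPServer stay as a single word.

```rust
/// Split an identifier on underscores and lower-to-upper case boundaries,
/// lowercase the pieces, and join them with spaces.
fn normalize_identifier(name: &str) -> String {
    let mut words: Vec<String> = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;

    for ch in name.chars() {
        if ch == '_' {
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // A capital after a lowercase letter or digit starts a new word.
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        current.extend(ch.to_lowercase());
        prev_lower = ch.is_lowercase() || ch.is_ascii_digit();
    }
    if !current.is_empty() {
        words.push(current);
    }
    words.join(" ")
}
```

Indexing the split form alongside the raw name lets a keyword query such as "user name" match a symbol called getUserName.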
pub fn insert_symbol_contents(
    &self,
    items: &[(String, String, String, String)],
) -> Result<()>
Insert multiple symbol contents in a single transaction.
Tuples: (symbol_id, symbol_name, content, header).
pub fn insert_symbol_contents_in_tx(
    &self,
    items: &[(String, String, String, String)],
) -> Result<()>
Like Self::insert_symbol_contents but assumes the caller already
holds an open transaction.
pub fn clear_symbol_content_for_file(&self, file_path: &str) -> Result<()>
Remove symbol content for all symbols in a file.
pub fn get_symbol_content(
    &self,
    symbol_id: &str,
) -> Result<Option<(String, String)>>
Get the content + header for a symbol.
pub fn get_symbol_contents_batch(
    &self,
    symbol_ids: &[String],
) -> Result<HashMap<String, (String, String)>>
Batch fetch content + header for multiple symbols.
Returns a map of symbol_id → (content, header) for all found symbols.
pub fn fts5_search(&self, query: &str, limit: u32) -> Result<Vec<String>>
Full-text search over symbol names and content using BM25 ranking.
Returns symbol IDs ordered by relevance (best match first).
pub fn fts5_search_kinded(
    &self,
    query: &str,
    limit: u32,
    scope: KindScope,
) -> Result<Vec<String>>
Like Self::fts5_search but filters by kind in SQL, so a prose query
doesn’t spend the whole limit budget on Document (markdown) hits.
pub fn get_or_create_embedding_id(&self, symbol_id: &str) -> Result<i64>
Get or create an integer ID for a symbol in the embedding map.
Returns the id (integer rowid) used as key in the vec0 virtual table.
pub fn symbol_id_for_embedding(
    &self,
    embedding_id: i64,
) -> Result<Option<String>>
Look up the symbol ID for an embedding map rowid.
pub fn symbol_ids_for_embeddings(
    &self,
    embedding_ids: &[i64],
) -> Result<Vec<(i64, String)>>
Batch look up symbol IDs for multiple embedding map rowids.
pub fn upsert_embedding(
    &self,
    embedding_id: i64,
    embedding: &[u8],
) -> Result<()>
Insert or replace an embedding vector for a symbol.
embedding_id is the integer key from symbol_embedding_map.
embedding is a 384-dim f32 vector serialized as little-endian bytes.
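A caller holding a Vec<f32> needs to produce that byte layout before calling this method. The helpers below are a minimal sketch with hypothetical names; they assume nothing beyond the little-endian f32 encoding stated above, so a 384-dim vector becomes 1536 bytes.

```rust
/// Serialize an f32 vector as consecutive little-endian 4-byte values.
fn embedding_to_bytes(vector: &[f32]) -> Vec<u8> {
    vector.iter().flat_map(|x| x.to_le_bytes()).collect()
}

/// Inverse of `embedding_to_bytes`; returns `None` when the byte length is
/// not a multiple of four.
fn bytes_to_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```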
pub fn insert_embeddings(&self, items: &[(i64, Vec<u8>)]) -> Result<()>
Insert multiple embeddings in a single transaction.
pub fn vector_search(
    &self,
    query_embedding: &[u8],
    limit: u32,
) -> Result<Vec<(i64, f64)>>
KNN vector search: find the limit nearest neighbors to query_embedding.
Returns (embedding_id, distance) pairs ordered by distance (ascending).
pub fn embedding_count(&self) -> Result<u32>
Count usable embeddings: map rows that have a matching symbol_vec row.
Orphan map rows (from a partially-failed embed) are excluded so callers
that gate on “repo has embeddings” don’t trip on non-functional rows.
pub fn has_embedding(&self, symbol_id: &str) -> Result<bool>
Check if a symbol already has an embedding.
pub fn clear_rag_data_for_file(&self, file_path: &str) -> Result<()>
Remove all RAG data (content, FTS, embeddings, embedding map) for symbols in a file.
pub fn clear_embeddings_for_symbols_in_tx(&self, ids: &[String]) -> Result<()>
Drop the embedding (vector + map row) for each id so it re-enters
Self::symbols_needing_embeddings. Used on incremental re-index when a
symbol’s content changed but its stable id stayed the same — its content
row is rewritten elsewhere; only the now-drifted vector must be cleared.
Leaves symbol_content untouched. Assumes an open transaction.
pub fn clear_content_for_symbols_in_tx(&self, ids: &[String]) -> Result<()>
Delete the stored content (and via the FTS trigger, the FTS row) for each id. Used on incremental re-index for a modified symbol whose new body no longer yields embeddable content, so its pre-edit content row doesn’t linger and re-embed stale text. Assumes an open transaction.
pub fn get_symbols_by_ids(&self, ids: &[String]) -> Result<Vec<Symbol>>
Get multiple symbols by their IDs, preserving order.
pub fn symbols_needing_embeddings(&self) -> Result<Vec<String>>
Get all symbol IDs that have content stored but no embedding yet.
Variables are excluded — they are too numerous and low-signal for embedding.
pub fn symbol_content_count(&self) -> Result<u32>
Count symbols that have content stored.
pub fn all_content_symbol_ids(&self) -> Result<Vec<String>>
Get all symbol IDs that have content stored (excluding variables and imports).
pub fn clear_all_embeddings(&self) -> Result<()>
Clear all embedding data (for force re-embed).
impl Database
pub fn resolve_edges(&self) -> Result<u32>
Resolve target_name → target_id for all unresolved edges.
Runs two passes so that import edges resolved in pass 1 enable import-path resolution (tier 2) for non-import edges in pass 2.
6-tier priority resolution (per pass):
1. Same file: symbol with matching name in the same file
2. Import-path: follow resolved imports to find the target in the imported file
3. Same directory: symbol in a file in the same directory
4. Parent scope preference: when multiple global matches, prefer same parent scope
5. Unique project-wide match: exactly one symbol with that name globally
6. Class over constructor: when exactly 2 matches and one is a class, prefer class
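The priority order above can be pictured as a cascade in which the first tier that yields a match wins. The sketch below works on in-memory candidates rather than SQL. Candidate, EdgeContext, and every tie-breaking detail (such as taking the first match within a tier) are assumptions made for illustration, not the crate's actual queries.

```rust
use std::path::Path;

/// Hypothetical candidate: a symbol whose name equals the edge's target_name.
struct Candidate<'a> {
    id: &'a str,
    file: &'a str,
    parent: Option<&'a str>,
    is_class: bool,
}

/// Hypothetical context of the edge being resolved.
struct EdgeContext<'a> {
    source_file: &'a str,
    source_parent: Option<&'a str>,
    imported_files: &'a [&'a str],
}

/// Return (tier, symbol id) for the first tier that produces a match.
fn resolve<'a>(ctx: &EdgeContext<'_>, candidates: &[Candidate<'a>]) -> Option<(u8, &'a str)> {
    // Tier 1: same file.
    if let Some(c) = candidates.iter().find(|c| c.file == ctx.source_file) {
        return Some((1, c.id));
    }
    // Tier 2: a file reached through a resolved import.
    if let Some(c) = candidates.iter().find(|c| ctx.imported_files.contains(&c.file)) {
        return Some((2, c.id));
    }
    // Tier 3: same directory.
    let source_dir = Path::new(ctx.source_file).parent();
    if let Some(c) = candidates.iter().find(|c| Path::new(c.file).parent() == source_dir) {
        return Some((3, c.id));
    }
    // Tier 4: several global matches, exactly one in the caller's parent scope.
    if candidates.len() > 1 && ctx.source_parent.is_some() {
        let mut same_scope = candidates.iter().filter(|c| c.parent == ctx.source_parent);
        if let (Some(c), None) = (same_scope.next(), same_scope.next()) {
            return Some((4, c.id));
        }
    }
    // Tier 5: unique project-wide match.
    if let [only] = candidates {
        return Some((5, only.id));
    }
    // Tier 6: exactly two matches, one of them a class.
    if let [a, b] = candidates {
        match (a.is_class, b.is_class) {
            (true, false) => return Some((6, a.id)),
            (false, true) => return Some((6, b.id)),
            _ => {}
        }
    }
    None
}
```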
pub fn resolve_edges_in_tx(&self) -> Result<u32>
Like Self::resolve_edges but assumes the caller already holds an
open transaction.
pub fn compute_in_degrees(&self) -> Result<u32>
Compute and store in-degree centrality for all symbols.
In-degree = number of resolved incoming edges (calls, imports, inherits, etc.). Higher in-degree means the symbol is referenced more across the codebase. Resets all in-degree values to 0 first, then batch-updates from the edges table.
tx-safe: two unconditional statements participate in any active outer
transaction — see Self::begin_indexing_tx.
pub fn invalidate_edges_targeting(
    &self,
    dirty_files: &HashSet<String>,
) -> Result<u32>
Invalidate resolved edges that point into any of the dirty files.
When a file is re-indexed, its symbols may have been renamed/removed. Edges from unchanged files that previously resolved to those symbols must be cleared so they can be re-resolved against the new symbol set.
tx-safe: single statement — see Self::begin_indexing_tx.
pub fn resolve_edges_scoped(&self, dirty_files: &HashSet<String>) -> Result<u32>
Resolve edges scoped to dirty files only.
Processes edges originating from dirty files (freshly extracted)
and edges whose target was just invalidated (target_id set to NULL).
Uses the same 6-tier heuristic as resolve_edges.
Intended to run after scoped invalidation: once invalidate_edges_targeting
has cleared target_ids for edges pointing into dirty files, this re-resolves
the currently unresolved edges, which are fewer than in a first-time full resolve.
pub fn resolve_edges_scoped_in_tx(
    &self,
    dirty_files: &HashSet<String>,
) -> Result<u32>
Like Self::resolve_edges_scoped but assumes the caller already
holds an open transaction.
pub fn mark_heuristic_exhausted_in_tx(&self) -> Result<u32>
Mark every still-unresolved edge as resolution_state = 4
(heuristic-exhausted, no LSP pass ran).
Call ONLY at the end of an index run that did not (and will not) run the
LSP pass — i.e. --no-lsp, cartog watch, or a feature-lsp-off build.
In those runs the 6-tier heuristic is the only resolver, so a leftover
state=0 edge is permanently unresolvable until the symbol graph changes.
Without this marker every incremental re-index re-walks the whole
state=0 backlog (the watch-mode amplification from #109): the
resolution_state = 0 scan in resolve_edges_pass grows with the
permanent-failure set, not just the dirty edges.
State 4 is sticky like {2, 3} and re-enters the unresolved set the same
way: Self::reset_unresolvable_for_names when a matching symbol is
added, Self::reset_all_unresolvable on --force, or
Self::reopen_heuristic_exhausted before a later LSP-enabled reindex.
Returns the number of edges marked.
tx-safe: single statement — see Self::begin_indexing_tx.
pub fn compute_in_degrees_scoped(
    &self,
    dirty_files: &HashSet<String>,
) -> Result<u32>
Recompute in-degree centrality after an incremental re-index.
Scoping the reset to dirty files cannot be correct: a symbol in an
unchanged file that lost its incoming edge is unfindable once the
dirty file’s old edges are deleted. Instead, correct every symbol
whose stored value disagrees with the actual edge count — a full
ground-truth pass that writes only the rows that changed (unlike
Self::compute_in_degrees, which rewrites all rows).
tx-safe: every internal statement participates in any active outer
transaction — see Self::begin_indexing_tx. Does NOT open one of
its own, unlike the batched *_in_tx helpers; outside an outer
transaction the zeroing is not atomic with the recompute.
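The "full ground-truth pass that writes only the rows that changed" idea can be sketched in memory. The function and its inputs are hypothetical: `stored` stands in for the in-degree column and `edge_targets` for the target_id of every resolved edge. The real method does this in SQL.

```rust
use std::collections::HashMap;

/// Recount incoming resolved edges for every symbol and return only the
/// (symbol id, corrected in-degree) pairs whose stored value is stale.
fn stale_in_degrees(stored: &HashMap<String, u32>, edge_targets: &[&str]) -> Vec<(String, u32)> {
    // Ground truth: count edges per target across the whole graph.
    let mut actual: HashMap<&str, u32> = HashMap::new();
    for target in edge_targets {
        *actual.entry(*target).or_insert(0) += 1;
    }

    // Keep only the rows that disagree with the recount.
    let mut updates: Vec<(String, u32)> = stored
        .iter()
        .filter_map(|(id, &old)| {
            let new = actual.get(id.as_str()).copied().unwrap_or(0);
            (new != old).then(|| (id.clone(), new))
        })
        .collect();
    updates.sort();
    updates
}
```

A symbol in an unchanged file that lost its only incoming edge shows up here with a corrected value of 0, which is the case a dirty-file-scoped reset would miss.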