Crate code2graph

§code2graph

Source files → structural facts. A purpose-neutral, language-agnostic code-graph extraction library: it turns source code into symbols, references, and cross-file edges as plain data — no storage, no scoring, no embeddings, no judgement. See README.md for the design boundary.

§Pipeline

source ──[extract]──▶ FileFacts (symbols + references) ──[resolve]──▶ CodeGraph (symbols + edges)

use code2graph::{extract_path, resolve::{Resolver, SymbolTableResolver}};

let a = extract_path("src/util.rs", "pub fn helper() {}").unwrap();
let b = extract_path("src/main.rs", "pub fn run() { helper() }").unwrap();
let graph = SymbolTableResolver.resolve(&[a, b]);
assert_eq!(graph.edges.len(), 1); // run --calls--> helper
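
As a rough mental model of the two stages above, the sketch below rebuilds the same `run --calls--> helper` edge using only the standard library. Everything in it (`ToyFacts`, `ToyEdge`, `resolve_by_name`, `sample`) is a hypothetical stand-in invented for illustration; it is not code2graph's data model, and the real extraction step parses source text rather than taking hand-written facts.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration only; NOT code2graph's real types.
#[derive(Debug, Clone, PartialEq)]
struct ToyFacts {
    file: String,
    defines: Vec<String>,         // names of symbols defined in this file
    calls: Vec<(String, String)>, // (caller name, callee name) pairs
}

#[derive(Debug, Clone, PartialEq)]
struct ToyEdge {
    from: String,
    to: String,
}

/// Link every call whose callee name matches a definition in any file.
/// This is plain name matching: if two files define the same name, the
/// later one silently wins, which a real resolver would have to handle.
fn resolve_by_name(files: &[ToyFacts]) -> Vec<ToyEdge> {
    // Symbol table: defined name -> defining file.
    let mut table: HashMap<&str, &str> = HashMap::new();
    for f in files {
        for d in &f.defines {
            table.insert(d.as_str(), f.file.as_str());
        }
    }

    // Walk every call site and emit an edge when the callee is known.
    let mut edges = Vec::new();
    for f in files {
        for (caller, callee) in &f.calls {
            if let Some(def_file) = table.get(callee.as_str()) {
                edges.push(ToyEdge {
                    from: format!("{}::{}", f.file, caller),
                    to: format!("{}::{}", def_file, callee),
                });
            }
        }
    }
    edges
}

/// Hand-written facts mirroring the two files in the example above.
fn sample() -> Vec<ToyFacts> {
    vec![
        ToyFacts {
            file: "src/util.rs".to_string(),
            defines: vec!["helper".to_string()],
            calls: vec![],
        },
        ToyFacts {
            file: "src/main.rs".to_string(),
            defines: vec!["run".to_string()],
            calls: vec![("run".to_string(), "helper".to_string())],
        },
    ]
}

fn main() {
    for e in resolve_by_name(&sample()) {
        println!("{} --calls--> {}", e.from, e.to);
    }
}
```

The point of the sketch is the shape of the data flow: per-file facts go in, and cross-file edges come out as plain values that the caller is free to store or discard.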

§Design

  • Identity (symbol) is SCIP-aligned: a symbol is a descriptor path rendering to a stable, human-readable string, so cross-file matching is string equality.
  • Resolution (resolve) is a tier seam: the fast recall-first SymbolTableResolver (name matching, all languages, NameOnly edges) and the precise scope-aware ScopeGraphResolver (lexical-scope + import + qualified-path resolution, Scoped/Exact edges, Rust/Python/TypeScript) emit the same schema, tagging every edge with a graph::Confidence and a graph::Provenance (which analysis derived it, orthogonal to confidence). A consumer picks the tier; the output shape is identical.
  • Cross-language bridges (FfiBridgeResolver) link call sites to FFI exports (Rust #[no_mangle] → C, today) across a language boundary, deterministically and with honest confidence — composable on top of any tier.
  • Incremental maintenance (IncrementalGraph) keeps a resolved graph current as files change: each file is resolved in isolation and cross-file edges are stitched on demand, so re-extracting one file rebuilds only that file’s subgraph — never the whole workspace.
  • No storage, no source bodies: graph::Symbols carry a byte span; consumers slice what they need.
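
The last point can be illustrated with a minimal sketch of how a consumer might slice a symbol's text out of source it already holds. `ToySpan` and its `start`/`end` fields are assumptions made for this example; the actual layout of graph::ByteSpan is not shown on this page and may differ.

```rust
// Hypothetical stand-in; the field names are assumptions, not the
// definition of code2graph's graph::ByteSpan.
#[derive(Debug, Clone, Copy)]
struct ToySpan {
    start: usize, // inclusive byte offset
    end: usize,   // exclusive byte offset
}

/// Return the text covered by `span`, or `None` when the span is out of
/// range, inverted, or does not fall on UTF-8 character boundaries.
fn slice_span(source: &str, span: ToySpan) -> Option<&str> {
    source.get(span.start..span.end)
}

fn main() {
    let source = "pub fn helper() {}\npub fn run() { helper() }";
    let span = ToySpan { start: 0, end: 18 };

    match slice_span(source, span) {
        Some(body) => println!("symbol text: {body}"),
        None => println!("span does not fit this source"),
    }
}
```

Using `str::get` rather than direct indexing keeps a stale span (for example, one computed before the file was edited) from panicking; the consumer gets `None` and can re-extract.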

§Coverage

All 23 languages (lang::Language) are implemented end-to-end, each behind the extract::Extractor trait.

Re-exports§

pub use error::CodegraphError;
pub use error::Result;
pub use extract::Extractor;
pub use extract::extract_file;
pub use extract::extract_path;
pub use graph::Binding;
pub use graph::BindingKind;
pub use graph::BindingTarget;
pub use graph::ByteSpan;
pub use graph::CodeGraph;
pub use graph::Confidence;
pub use graph::Edge;
pub use graph::EntryPoint;
pub use graph::FfiAbi;
pub use graph::FfiExport;
pub use graph::FileFacts;
pub use graph::Occurrence;
pub use graph::Provenance;
pub use graph::RefRole;
pub use graph::Reference;
pub use graph::Scope;
pub use graph::ScopeId;
pub use graph::ScopeKind;
pub use graph::Symbol;
pub use graph::SymbolKind;
pub use graph::TypeRefContext;
pub use graph::Visibility;
pub use lang::Language;
pub use resolve::FfiBridgeResolver;
pub use resolve::FileSubgraph;
pub use resolve::IncrementalGraph;
pub use resolve::LayeredResolver;
pub use resolve::Resolver;
pub use resolve::ScopeGraphResolver;
pub use resolve::SymbolTableResolver;
pub use symbol::Descriptor;
pub use symbol::Package;
pub use symbol::SymbolId;

Modules§

error
CodegraphError — errors surfaced by extraction and resolution.
extract
Extraction: one tree-sitter pass per language → neutral FileFacts.
grammar
Grammar chokepoint — the sole importer of every tree_sitter_* grammar crate.
graph
Neutral graph data model — the facts code2graph produces.
lang
The set of languages code2graph can parse, plus extension dispatch.
package
Optional package enrichment: stamp Package identity onto extracted facts.
resolve
Resolution: link references to definitions, producing cross-file edges.
symbol
SCIP-aligned symbol identity: descriptors and SymbolId.