AST and semantic extraction engine for graphify.
Implements a two-pass extraction pipeline ported from the Python extract.py:
- Pass 1 (deterministic): regex-based AST extraction of functions, classes, imports, and call relationships from source code.
- Pass 2 (semantic): Claude API–based extraction of higher-level concepts from documents, papers, and images.
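As a rough illustration of what a deterministic Pass 1 might produce, the sketch below scans Python source line by line and records function, class, and import names. It is not the crate's implementation: the `Item` enum, `ident`, and `extract_python_items` are hypothetical names, and plain prefix matching from the standard library stands in for the regular expressions mentioned above so that the example needs no external crate.

```rust
/// Hypothetical record of one extracted item (not the crate's real node type).
#[derive(Debug, PartialEq)]
enum Item {
    Function(String),
    Class(String),
    Import(String),
}

/// Return the leading run of identifier characters (dots allowed so that
/// dotted module paths such as `os.path` stay in one piece).
fn ident(s: &str) -> String {
    s.trim_start()
        .chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_' || *c == '.')
        .collect()
}

/// Scan Python source one line at a time. Prefix matching is a stand-in
/// for the regex patterns a real Pass 1 would use.
fn extract_python_items(source: &str) -> Vec<Item> {
    let mut items = Vec::new();
    for line in source.lines() {
        let line = line.trim_start();
        if let Some(rest) = line.strip_prefix("def ") {
            items.push(Item::Function(ident(rest)));
        } else if let Some(rest) = line.strip_prefix("class ") {
            items.push(Item::Class(ident(rest)));
        } else if let Some(rest) = line.strip_prefix("import ") {
            items.push(Item::Import(ident(rest)));
        }
    }
    items
}
```

Because the scan depends only on the input text, running it twice on the same file yields the same items, which is the property that makes this pass deterministic.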
Modules§
- ast_extract - Regex-based AST extraction engine.
- dedup - Deduplication of extracted nodes and edges.
- lang_config - Per-language configuration for tree-sitter–based extraction.
- parser - Parser trait for pluggable extraction backends.
- semantic - Semantic extraction via Claude API (Pass 2).
- treesitter - Tree-sitter based AST extraction engine.
Constants§
- DISPATCH - Maps file extensions to language identifiers used by the extraction engine.
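A table like this could be modeled as a constant slice of pairs. The sketch below is only an assumption about its shape: `DISPATCH_SKETCH` and `lookup_language` are hypothetical names, and every entry except ".py" → "python" (the example given in these docs) is invented for illustration.

```rust
/// Hypothetical stand-in for DISPATCH: (file extension, language identifier).
/// Only the ".py" entry comes from the documentation; the rest are assumed.
const DISPATCH_SKETCH: &[(&str, &str)] = &[
    (".py", "python"),
    (".rs", "rust"),
    (".js", "javascript"),
];

/// Look up the language identifier for an extension (leading dot included).
/// Matching is ASCII case-insensitive, which is a design choice of this
/// sketch, not documented behavior.
fn lookup_language(ext: &str) -> Option<&'static str> {
    DISPATCH_SKETCH
        .iter()
        .find(|(known, _)| known.eq_ignore_ascii_case(ext))
        .map(|(_, lang)| *lang)
}
```

A linear scan is adequate for a table of a few dozen extensions; a larger table might justify a map built once at startup.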
Functions§
- collect_files - Recursively collect all supported source files under target.
- extract - Run Pass 1 extraction on a set of file paths.
- language_for_path - Return the language name for a file extension (e.g. ".py" → "python").
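The recursive collection step can be sketched with the standard library alone. The real signatures are not shown on this page, so `collect_files_sketch`, `walk`, `is_supported`, and the `SUPPORTED` list below are hypothetical, as is the choice to return sorted paths.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical list of supported extensions (without the leading dot).
const SUPPORTED: &[&str] = &["py", "rs", "js"];

/// True if the path has one of the assumed supported extensions.
fn is_supported(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| SUPPORTED.iter().any(|known| *known == ext))
        .unwrap_or(false)
}

/// Depth-first walk. `DirEntry::file_type` does not follow symlinks, so
/// symlinked directories are skipped and cannot cause cycles.
fn walk(dir: &Path, found: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            walk(&path, found)?;
        } else if file_type.is_file() && is_supported(&path) {
            found.push(path);
        }
    }
    Ok(())
}

/// Collect every supported file under `target`, sorted for stable output.
fn collect_files_sketch(target: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    walk(target, &mut found)?;
    found.sort();
    Ok(found)
}
```

Sorting the result keeps later stages independent of the order in which the operating system lists directory entries.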