ImpactSense Parser
A multi-language static analysis tool written in Rust that parses source code using Tree-Sitter, extracts structural symbols (files, classes, functions, API endpoints), and builds a dependency graph in Neo4j for impact analysis.
Given a codebase, it answers questions like "If I change this class, which functions and files are affected?" by constructing a queryable graph of code relationships.
Supported Languages
| Language | Parsing | Classes | Call Graph | File Dependencies | API Endpoints |
|---|---|---|---|---|---|
| Java | Full AST | Yes | Partial | Yes (imports) | Spring |
| C# | Full AST | Yes | Partial | Yes (using) | ASP.NET |
| Go | Full AST | Yes | Partial | Yes (import) | Chi/Gin/Echo |
| Erlang | AST | Module | Yes | Yes | Cowboy |
| JavaScript | Full AST | No | Intra-file | Yes (imports) | No |
| TypeScript | Full AST | No | Intra-file | Yes (imports) | No |
| Python | Full AST | No | Intra-file | Yes (imports) | No |
| Rust | Full AST | No | Intra-file | Yes (use) | No |
Architecture
                    ┌──────────────────────┐
                    │    CLI (main.rs)     │
                    │   clap arg parsing   │
                    └──────────┬───────────┘
                               │
                               ▼
                    ┌──────────────────────┐
                    │      scanner.rs      │
                    │   walkdir + rayon    │
                    │ parallel file parse  │
                    └──────────┬───────────┘
                               │
                        Vec<ParsedFile>
                               │
              ┌────────────────┼────────────────┐
              ▼                                 ▼
    ┌───────────────────┐            ┌────────────────────┐
    │    JSON output    │            │      graph.rs      │
    │  (--output-json)  │            │ Neo4j persistence  │
    │   AST summaries   │            │ (--push-to-neo4j)  │
    └───────────────────┘            └────────────────────┘
- Scan: Recursively walks the target directory, identifies source files by extension, and filters by max file size.
- Parse: Each file is parsed in parallel (via Rayon) using Tree-Sitter grammars, producing an AST per file.
- Extract: Language-specific extractors pull out classes, functions, imports, call sites, API endpoints, and external API references.
- Persist: Extracted symbols and relationships are written to Neo4j as a labeled property graph. Relationships are batched (3000 edges per flush) to reduce round-trips.
- Post-process: SAME_API edges are created between internal ApiEndpoint nodes and ExternalApi nodes that share a normalized path.
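
The batching in the Persist step can be sketched as follows. This is a minimal illustration, not the code in graph.rs: the `Edge` struct and the `flush_in_batches` helper are hypothetical names, and the `flush` callback stands in for whatever single round-trip the real tool makes to Neo4j. Only the batch size of 3000 comes from the description above.

```rust
/// Hypothetical edge record, used here for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Edge {
    pub from: String,
    pub to: String,
    pub rel: &'static str,
}

/// Batch size quoted in the Persist step.
pub const EDGES_PER_FLUSH: usize = 3000;

/// Calls `flush` once per batch of at most `EDGES_PER_FLUSH` edges and
/// returns how many flushes were performed.
pub fn flush_in_batches<F>(edges: &[Edge], mut flush: F) -> usize
where
    F: FnMut(&[Edge]),
{
    let mut flushes = 0;
    for batch in edges.chunks(EDGES_PER_FLUSH) {
        // Assumption: one call here corresponds to one database round-trip.
        flush(batch);
        flushes += 1;
    }
    flushes
}
```

With 7000 edges this performs three flushes (3000, 3000, 1000) instead of 7000 individual writes.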
Prerequisites
- Rust (edition 2024): install via rustup
- Neo4j 5: run via Docker (see below)
- C compiler: required by build.rs to compile the vendored Erlang Tree-Sitter grammar
Installation
The build step compiles the vendored Erlang grammar from vendor/tree-sitter-erlang/ via build.rs.
Neo4j Setup
Start a Neo4j 5 instance with Docker:
The Neo4j Browser will be available at http://localhost:7474/.
Usage
Basic — parse and output JSON
Parse and push to Neo4j
Full options with custom Neo4j credentials
CLI Reference
| Argument | Type | Default | Description |
|---|---|---|---|
| ROOT | path | (required) | Root directory to scan |
| --output-json | path | (none) | Write AST summaries to a JSON file |
| --push-to-neo4j | flag | false | Push the parsed graph into Neo4j |
| --clean | flag | false | Delete all existing nodes before pushing |
| --neo4j-uri | string | bolt://localhost:7688 | Neo4j Bolt URI |
| --neo4j-user | string | neo4j | Neo4j username |
| --neo4j-password | string | parser1234 | Neo4j password |
| --follow-symlinks | flag | false | Follow symbolic links during traversal |
| --max-file-size | bytes | 2 MiB | Skip files larger than this |
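
To make the effect of --max-file-size concrete, here is a rough sketch of the kind of filter the scan step applies. The function names and the extension list are assumptions derived from the Supported Languages table, not the actual contents of scanner.rs; only the 2 MiB default and the "skip files larger than this" rule come from the table above.

```rust
use std::path::Path;

/// Default for --max-file-size: 2 MiB, expressed in bytes.
pub const DEFAULT_MAX_FILE_SIZE: u64 = 2 * 1024 * 1024;

/// Maps a file extension to a language name.
/// Assumption: the extension list is illustrative, not the tool's registry.
pub fn language_for(path: &Path) -> Option<&'static str> {
    match path.extension()?.to_str()? {
        "java" => Some("java"),
        "cs" => Some("csharp"),
        "go" => Some("go"),
        "erl" => Some("erlang"),
        "js" => Some("javascript"),
        "ts" => Some("typescript"),
        "py" => Some("python"),
        "rs" => Some("rust"),
        _ => None,
    }
}

/// A file is parsed only if its language is recognized and it is not
/// larger than the configured limit.
pub fn should_parse(path: &Path, size_bytes: u64, max_file_size: u64) -> bool {
    language_for(path).is_some() && size_bytes <= max_file_size
}
```

In a real walker, `size_bytes` would typically come from `std::fs::metadata(path)?.len()`.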
Graph Schema
Node Types
| Label | Key Properties |
|---|---|
| File | path, language, framework?, project_name?, is_test? |
| Module | name, path, language (Erlang modules) |
| Class | name, fqn, path, language?, project_name? |
| Function | name, fqn, path, language, arity?, return_type?, param_count? |
| ApiEndpoint | methods[], path, norm_path?, framework? |
| ExternalApi | name, base_url?, path?, norm_path?, provider? |
Relationships
(:File)-[:DECLARES_MODULE]->(:Module)
(:File)-[:DECLARES_CLASS]->(:Class)
(:File)-[:DECLARES_FUNCTION]->(:Function)
(:Class)-[:DECLARES_FUNCTION]->(:Function)
(:Module)-[:DECLARES_FUNCTION]->(:Function)
(:File)-[:DEPENDS_ON_FILE]->(:File)
(:Function)-[:CALLS_FUNCTION]->(:Function)
(:Function)-[:USES_CLASS]->(:Class)
(:ApiEndpoint)-[:HANDLED_BY]->(:Function)
(:Function)-[:CALLS_EXTERNAL_API]->(:ExternalApi)
(:ApiEndpoint)-[:SAME_API]->(:ExternalApi)
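
The SAME_API relationship links an internal endpoint to an external API reference when their norm_path values match. The sketch below shows one way such matching could work. The normalization rules (lowercasing, dropping empty segments, collapsing `{id}` and `:id` placeholders) are assumptions chosen for illustration, and `normalize_path` and `same_api_pairs` are hypothetical names, not the parser's actual functions.

```rust
use std::collections::HashMap;

/// Illustrative normalization only: lowercase literal segments, drop empty
/// segments, and collapse `{id}` / `:id` style placeholders to `{}`.
pub fn normalize_path(path: &str) -> String {
    let segments: Vec<String> = path
        .split('/')
        .filter(|s| !s.is_empty())
        .map(|s| {
            let is_param = s.starts_with(':') || (s.starts_with('{') && s.ends_with('}'));
            if is_param { "{}".to_string() } else { s.to_lowercase() }
        })
        .collect();
    format!("/{}", segments.join("/"))
}

/// Returns (endpoint index, external API index) pairs whose normalized
/// paths are equal; each pair would become one SAME_API edge.
pub fn same_api_pairs(endpoints: &[&str], external: &[&str]) -> Vec<(usize, usize)> {
    // Index external paths by normalized form so matching is a hash lookup.
    let mut by_norm: HashMap<String, Vec<usize>> = HashMap::new();
    for (i, p) in external.iter().enumerate() {
        by_norm.entry(normalize_path(p)).or_default().push(i);
    }
    let mut pairs = Vec::new();
    for (e, p) in endpoints.iter().enumerate() {
        if let Some(matches) = by_norm.get(&normalize_path(p)) {
            pairs.extend(matches.iter().map(|&x| (e, x)));
        }
    }
    pairs
}
```

Under these assumed rules, `/orders/{id}` and `/Orders/:id/` normalize to the same string and would be linked.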
Example Queries
Once the graph is in Neo4j, you can run Cypher queries for impact analysis:
// Which functions call OrderDetail.setAmenities?
MATCH (caller:Function)-[:CALLS_FUNCTION]->(target:Function {name: "setAmenities"})
WHERE target.fqn CONTAINS "OrderDetail"
RETURN caller.fqn, caller.path
// Which files depend on OrderDetail.java?
MATCH (f:File)-[:DEPENDS_ON_FILE]->(dep:File)
WHERE dep.path CONTAINS "OrderDetail.java"
RETURN f.path
// All functions reachable within 3 hops from a given function
MATCH path = (start:Function {name: "processOrder"})-[:CALLS_FUNCTION*1..3]->(downstream:Function)
RETURN downstream.fqn, length(path) AS depth
// API endpoints and their handler functions
MATCH (ep:ApiEndpoint)-[:HANDLED_BY]->(fn:Function)
RETURN ep.path, ep.methods, fn.fqn
MCP Server Integration
The parser ships with a FastMCP server so it can be invoked as a tool from Cursor IDE or any MCP-compatible client.
Setup
The MCP server exposes a parse_repository tool with parameters matching the CLI arguments. It runs cargo run as a subprocess, pipes progress logs to stderr (to keep the JSON-RPC stdout channel clean), and returns the parse results.
Tool: parse_repository
| Parameter | Type | Description |
|---|---|---|
| root_path | string | Directory to parse |
| follow_symlinks | bool | Follow symlinks |
| max_file_size | int | Max file size in bytes |
| push_to_neo4j | bool | Push graph to Neo4j |
| neo4j_uri | string | Neo4j Bolt URI |
| neo4j_user | string | Neo4j username |
| neo4j_password | string | Neo4j password |
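
Because the tool parameters mirror the CLI arguments, the translation from a tool call to a command line is mechanical. The sketch below illustrates that mapping. The actual server is Python (FastMCP); Rust is used here only to stay in the project's main language. `ParseRequest` and `to_cli_args` are hypothetical names, and treating the size and Neo4j parameters as optional is an assumption. The flag names are taken from the CLI Reference.

```rust
/// Hypothetical request type mirroring the parse_repository parameters.
pub struct ParseRequest {
    pub root_path: String,
    pub follow_symlinks: bool,
    pub max_file_size: Option<u64>,
    pub push_to_neo4j: bool,
    pub neo4j_uri: Option<String>,
    pub neo4j_user: Option<String>,
    pub neo4j_password: Option<String>,
}

/// Builds the argument list that would follow the binary name
/// (or `cargo run --`) on the command line.
pub fn to_cli_args(req: &ParseRequest) -> Vec<String> {
    let mut args = vec![req.root_path.clone()];
    if req.follow_symlinks {
        args.push("--follow-symlinks".into());
    }
    if let Some(n) = req.max_file_size {
        args.push("--max-file-size".into());
        args.push(n.to_string());
    }
    if req.push_to_neo4j {
        args.push("--push-to-neo4j".into());
    }
    // Unset values are omitted so the CLI defaults apply.
    for (flag, value) in [
        ("--neo4j-uri", &req.neo4j_uri),
        ("--neo4j-user", &req.neo4j_user),
        ("--neo4j-password", &req.neo4j_password),
    ] {
        if let Some(v) = value {
            args.push(flag.into());
            args.push(v.clone());
        }
    }
    args
}
```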
Project Structure
parser/
├── Cargo.toml # Rust dependencies and build config
├── build.rs # Compiles vendored Erlang grammar (C → .a)
├── graph_schema.md # Neo4j node/relationship schema reference
├── src/
│ ├── main.rs # CLI entry point (clap)
│ ├── lib.rs # Language registry and Tree-Sitter wrapper
│ ├── scanner.rs # Directory walker + parallel parser
│ ├── graph.rs # Symbol extraction + Neo4j persistence
│ ├── edge.rs # Relationship type enum
│ ├── schema.rs # Node labels and property constants
│ ├── ir.rs # Intermediate representation for serialization
│ └── erlang.rs # FFI binding for vendored Erlang grammar
├── vendor/
│ └── tree-sitter-erlang/ # Vendored Erlang Tree-Sitter grammar (C source)
├── mcp/
│ ├── main.py # MCP server entry point
│ ├── app.py # FastMCP app definition
│ ├── services/
│ │ └── parser_service.py # Subprocess runner for cargo
│ ├── tools/
│ │ └── parser_tools.py # parse_repository tool definition
│ └── requirements.txt # Python dependencies
└── prompts/ # Prompt templates for MCP tool usage
Known Limitations
- Java imports are filtered to com.redbus.genai.* by default; other internal packages are not tracked.
- C# and Go lack file-level dependency edges (DEPENDS_ON_FILE).
- Erlang uses regex-based text parsing instead of the Tree-Sitter AST for function extraction.
- JS, TS, Python, Rust only extract top-level functions (no classes); call graphs are intra-file and file dependencies are best-effort from imports/use.
- Class inheritance (extends/implements) is not tracked for any language.
- Neo4j writes are sequential per file, which can be slow for large codebases (10k+ files).
- No incremental parsing in CLI: the full codebase is re-parsed on every CLI run (MCP server supports incremental file-watcher updates).
See shortcomings.txt for a detailed analysis.
Client-side library (in-memory graph)
Add from crates.io:
[dependencies]
impactsense-parser = "0.1"
The impactsense-parser crate builds an InMemoryGraph in RAM with indexed queries for IDE/MCP use. Optional RedCompressor integration stores Zstd code_bytes on symbols in ProjectIr (same HTTP API as the Neo4j server path).
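// Sketch only: import paths are assumed to be crate-root re-exports,
// and call arguments are omitted.
use impactsense_parser::{parse_project, CompressorConfig, GraphStore, ScanOptions};

let scan = ScanOptions::default(); // compression on by default (RedCompressor HTTP API)
let graph = parse_project(/* arguments omitted */)?;
let callers = graph.callers(/* arguments omitted */);
let impact = graph.impact(/* arguments omitted */);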
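
To illustrate what "indexed queries" can mean for an in-memory graph, here is a minimal sketch of a call index that answers direct callers and callees lookups with one hash lookup each. `CallIndex` is a hypothetical type written for this example; it is not the crate's InMemoryGraph and makes no claim about its internals.

```rust
use std::collections::HashMap;

/// Hypothetical index: forward and reverse adjacency keyed by function FQN.
#[derive(Default)]
pub struct CallIndex {
    callees: HashMap<String, Vec<String>>,
    callers: HashMap<String, Vec<String>>,
}

impl CallIndex {
    /// Records one CALLS_FUNCTION edge in both directions.
    pub fn add_call(&mut self, caller: &str, callee: &str) {
        self.callees
            .entry(caller.to_string())
            .or_default()
            .push(callee.to_string());
        self.callers
            .entry(callee.to_string())
            .or_default()
            .push(caller.to_string());
    }

    /// Direct callers of `fqn`, in insertion order (empty if unknown).
    pub fn callers(&self, fqn: &str) -> &[String] {
        self.callers.get(fqn).map(|v| v.as_slice()).unwrap_or(&[])
    }

    /// Direct callees of `fqn`, in insertion order (empty if unknown).
    pub fn callees(&self, fqn: &str) -> &[String] {
        self.callees.get(fqn).map(|v| v.as_slice()).unwrap_or(&[])
    }
}
```

Keeping the reverse map alongside the forward one doubles the memory used for edges but avoids scanning every edge to answer "who calls this?".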
Export IR as JSON from the CLI:
Cargo features
| Feature | Default | Description |
|---|---|---|
| neo4j | yes | Neo4j persistence (--push-to-neo4j, webhook) |
| compressor | no | Feature flag placeholder (compressor is always available via CompressorConfig) |
Cursor MCP setup
One install gives you both the CLI and the MCP server:
Binaries are placed in ~/.cargo/bin/:
- impactsense-parser: CLI
- impactsense-mcp: MCP server for Cursor
Create .cursor/mcp.json in your project:
Replace YOUR_USER with your username, or run which impactsense-mcp after install to get the exact path.
Restart Cursor. The server parses your open workspace once at startup, then keeps the graph updated as you edit files.
Compression is on by default for MCP and the library. Disable with IMPACTSENSE_COMPRESS_CODEBLOCKS=0 or CLI --no-compress-codeblocks. Override the API URL with REDCOMPRESSOR_URL (default http://10.166.1.220:8787).
MCP tools
| Tool | Description |
|---|---|
| find_symbol | Search by name or FQN substring |
| callers / callees | Direct call graph neighbors |
| file_dependencies | Import/file deps for a path |
| symbols_in_file | Declared symbols in one file |
| impact_analysis | Transitive callers (bounded depth) |
| graph_stats | Node/edge counts |
| explain_symbol_logic | Decompressed implementation source for a symbol FQN (use when you need to read what code does, not only who calls it) |
Use explain_symbol_logic when an agent needs the body of a function, class, module, or property. Use callers / callees / impact_analysis for dependency and blast-radius questions. Optional include_callers / include_callees on explain attach direct neighbors for functions.
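
A bounded-depth transitive-caller query such as impact_analysis can be expressed as a breadth-first search over reverse call edges. The sketch below is one plausible formulation, not the server's implementation: the function name `impact`, its signature, and the choice to report each caller once at its shortest distance are all assumptions.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// `callers_of` maps a function FQN to its direct callers.
/// Returns every transitive caller of `start` within `max_depth` hops,
/// paired with its shortest distance, in breadth-first order.
pub fn impact<'a>(
    callers_of: &HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
    max_depth: usize,
) -> Vec<(String, usize)> {
    let mut seen: HashSet<&str> = HashSet::from([start]);
    let mut queue: VecDeque<(&str, usize)> = VecDeque::from([(start, 0)]);
    let mut out = Vec::new();

    while let Some((node, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand beyond the depth bound
        }
        for &caller in callers_of.get(node).into_iter().flatten() {
            // The visited set keeps cycles (mutual recursion) from looping.
            if seen.insert(caller) {
                out.push((caller.to_string(), depth + 1));
                queue.push_back((caller, depth + 1));
            }
        }
    }
    out
}
```

Unlike a Cypher variable-length match, which returns one row per path, this reports each affected function once, which is usually what a blast-radius question needs.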
The graph lives in MCP process memory. Restart MCP/Cursor to re-bootstrap after large branch switches.