go-brrr

Token-efficient code analysis for LLMs - High-performance Rust implementation.

 ____  ____  ____  ____
| __ )|  _ \|  _ \|  _ \
|  _ \| |_) | |_) | |_) |
| |_) |  _ <|  _ <|  _ <
|____/|_| \_\_| \_\_| \_\

A fast CLI tool that extracts structured code information for use as context for Large Language Models. It achieves up to 95% token savings compared to raw source code by providing structured summaries, call graphs, and semantic search.

Features

AST Analysis

File Tree - JSON-structured directory traversal with extension filtering
Code Structure - Extract functions, classes, methods, and docstrings
Full Extraction - Complete AST analysis with type information and decorators

Control Flow Analysis

CFG - Control flow graph generation with Mermaid/DOT export
DFG - Data flow graph showing variable dependencies
Program Slicing - Backward/forward slicing to find affected code paths
PDG - Program Dependence Graph combining CFG and DFG

Call Graph Analysis

Cross-file Call Graph - Build project-wide function call relationships
Impact Analysis - Find all callers of a function (transitive)
Dead Code Detection - Identify unreachable functions
Architectural Layers - Detect entry/middle/leaf layer patterns
Import Analysis - Parse and trace module imports

Semantic Search

Embedding-based Search - Natural language code search using vector similarity
TEI Integration - gRPC client for text-embeddings-inference server
HNSW Index - Fast approximate nearest neighbor search via usearch
Multi-language Index - Unified index across all supported languages

Security Scanning

SQL Injection (CWE-89) - Detect unsafe query construction
Command Injection (CWE-78) - Find shell execution vulnerabilities
XSS (CWE-79) - Cross-site scripting detection for JS/TS
Path Traversal (CWE-22) - Directory traversal vulnerabilities
Secrets Detection (CWE-798) - Hardcoded credentials and API keys
Weak Cryptography (CWE-327) - Insecure algorithm usage
Unsafe Deserialization (CWE-502) - Pickle, YAML, ObjectInputStream
ReDoS (CWE-1333) - Regular expression denial of service
Taint Analysis - Track data flow from sources to sinks
SARIF Output - GitHub/GitLab security tab integration

Code Metrics

Cyclomatic Complexity - Branch and loop complexity measurement
Cognitive Complexity - SonarSource methodology for understandability
Halstead Metrics - Vocabulary, volume, difficulty, effort, estimated bugs
Maintainability Index - Combined metric with comment bonus
Lines of Code - Physical, logical, source, and comment LOC
Nesting Depth - Deep nesting detection with suggestions
Function Size - SLOC, parameters, variables, return points
Coupling - Afferent/efferent coupling and instability metrics
Cohesion - LCOM variants for class quality

Code Quality

Clone Detection - Textual (Type-1) and structural (Type-2/3) duplicates
God Class Detection - SRP violations with weighted scoring
Long Method Detection - Oversized functions with extraction suggestions
Circular Dependencies - Package/module/class/function level cycles
Design Pattern Detection - Singleton, Factory, Builder, Observer, etc.

Performance Optimizations

jemalloc - High-performance memory allocator
SIMD Operations - portable_simd for vectorized computations
Parallel Processing - Rayon for multi-threaded file analysis
Aho-Corasick - Multi-pattern string matching
PHF - Compile-time perfect hash functions for O(1) keyword lookups
FxHash - Fast non-cryptographic hashing
xxHash - SIMD-accelerated hashing (xxh3)

Installation

From Source

# Clone the repository
git clone https://github.com/GrigoryEvko/go-brrr
cd go-brrr

# Build with release optimizations
cargo build --release

# Install to ~/.cargo/bin
cargo install --path .

Requirements

Rust 1.70+ (nightly required for portable_simd feature)
For semantic search: TEI server or local embedding model

# Set nightly toolchain (required for SIMD)
rustup override set nightly

Quick Start

# Show file tree
brrr tree ./src --ext .rs

# Extract code structure
brrr structure . --lang rust

# Full file analysis
brrr extract src/main.rs

# Build call graph
brrr calls ./src

# Find dead code
brrr dead ./src

# Control flow graph
brrr cfg src/main.rs main

# Security scan
brrr security scan ./src

# Code metrics report
brrr metrics report ./src

# Semantic search (requires index)
brrr semantic index .
brrr semantic search "authentication handler" .

Commands Reference

File Operations

tree

Display directory structure in JSON format.

brrr tree ./src                     # Default tree
brrr tree ./src --ext .rs .toml     # Filter by extensions
brrr tree ./src --show-hidden       # Include dotfiles
brrr tree ./src --max-depth 3       # Limit depth

structure

Extract functions, classes, and methods from source files.

brrr structure .                    # Current directory
brrr structure ./src --lang python  # Specific language
brrr structure ./src --limit 100    # Limit files analyzed

extract

Full AST extraction from a single file.

brrr extract src/main.py
brrr extract src/api.py --class UserController
brrr extract src/api.py --function process_data
brrr extract src/api.py --method UserController.get_user

search

Regex pattern search with context lines.

brrr search "def process" ./src
brrr search "async fn" ./src --ext .rs
brrr search "TODO" . -C 2 --max 50

Flow Analysis

cfg

Generate control flow graph for a function.

brrr cfg src/main.py process_data
brrr cfg src/main.rs handle_request --format mermaid
brrr cfg src/main.go Process --format dot

Output formats: json (default), mermaid, dot

dfg

Generate data flow graph showing variable dependencies.

brrr dfg src/processor.py process_data
brrr dfg src/handler.rs handle --lang rust

slice

Compute a program slice: the lines that affect, or are affected by, a target line.

# Backward slice: what affects line 42?
brrr slice src/main.py process 42

# Forward slice: what does line 10 affect?
brrr slice src/main.py process 10 --direction forward

# Track specific variable
brrr slice src/main.py process 42 --var result

# Extended output with metrics
brrr slice src/main.py process 42 --extended
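
A backward slice can be read as reachability over dependence edges. The sketch below is only an illustration of that idea, not brrr's implementation: the `deps` map (line number to the lines it directly depends on) and the function name `backward_slice` are assumptions made for this example.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Backward slice: the target line plus every line it transitively depends on.
/// `deps` maps a line to the lines it directly depends on (hypothetical input).
fn backward_slice(deps: &HashMap<u32, Vec<u32>>, target: u32) -> BTreeSet<u32> {
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(line) = queue.pop_front() {
        // `insert` returns false when the line was already visited.
        if !seen.insert(line) {
            continue;
        }
        if let Some(preds) = deps.get(&line) {
            queue.extend(preds.iter().copied());
        }
    }
    seen
}
```

A forward slice is the same traversal over the reversed edges (line to the lines that depend on it).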

Call Graph Analysis

calls

Build cross-file call graph.

brrr calls ./src
brrr calls ./src --lang python
brrr calls ./src --extended      # Include call line numbers

impact

Find all callers of a function (reverse call graph).

brrr impact process_data ./src
brrr impact get_user ./src --depth 5
brrr impact critical_func ./src --file api
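
Conceptually, impact analysis is a breadth-first walk over the reversed call graph. The following is a minimal sketch under assumptions: the `(caller, callee)` edge list, the function name `transitive_callers`, and the reading of `--depth` as a hop limit are all illustrative and not taken from brrr's source.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Collect callers of `target` within `max_depth` hops of the reverse call graph.
/// `calls` holds (caller, callee) pairs (hypothetical input format).
fn transitive_callers(calls: &[(&str, &str)], target: &str, max_depth: usize) -> BTreeSet<String> {
    // Invert the edges: callee -> list of callers.
    let mut callers_of: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in calls {
        callers_of.entry(callee).or_default().push(caller);
    }
    let mut found = BTreeSet::new();
    let mut queue = VecDeque::from([(target, 0usize)]);
    while let Some((func, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand beyond the hop limit
        }
        for &caller in callers_of.get(func).into_iter().flatten() {
            // Skip the target itself (recursion) and callers already recorded.
            if caller != target && found.insert(caller.to_string()) {
                queue.push_back((caller, depth + 1));
            }
        }
    }
    found
}
```

With a hop limit of 1 this returns only direct callers; larger limits add callers of callers.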

dead

Find unreachable (dead) code.

brrr dead ./src
brrr dead ./src --entry main cli   # Additional entry points
brrr dead ./src --lang python
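
Dead code detection of this kind amounts to marking everything reachable from the entry points and reporting the rest. The sketch below assumes a flat list of function names and `(caller, callee)` pairs; `dead_functions` is a hypothetical name, and how brrr chooses its default entry points is not specified here.

```rust
use std::collections::{BTreeSet, HashMap};

/// Return the functions in `all` that no entry point can reach.
fn dead_functions(all: &[&str], calls: &[(&str, &str)], entries: &[&str]) -> Vec<String> {
    let mut callees: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in calls {
        callees.entry(caller).or_default().push(callee);
    }
    // Depth-first walk from every entry point marks live functions.
    let mut live: BTreeSet<&str> = BTreeSet::new();
    let mut stack: Vec<&str> = entries.to_vec();
    while let Some(func) = stack.pop() {
        if live.insert(func) {
            if let Some(next) = callees.get(func) {
                stack.extend(next.iter().copied());
            }
        }
    }
    // Everything that was never marked is reported, in input order.
    all.iter()
        .copied()
        .filter(|f| !live.contains(f))
        .map(str::to_string)
        .collect()
}
```

Passing extra names in `entries` plays the role of `--entry main cli` above: functions reachable only from those names are no longer reported.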

arch

Detect architectural layers from call patterns.

brrr arch ./src
brrr arch ./src --lang typescript

Import Analysis

imports

Parse import statements from a source file.

brrr imports src/main.py
brrr imports src/index.ts --lang typescript

importers

Find all files that import a module.

brrr importers json ./src
brrr importers UserController ./src --lang python

change-impact

Find tests affected by changed files.

brrr change-impact src/api.py           # Specific files
brrr change-impact --git                 # Use git diff
brrr change-impact --session             # Use session-modified files
brrr change-impact --run                 # Actually run affected tests

Semantic Search

semantic index

Build semantic index for a project.

brrr semantic index .
brrr semantic index ./src --lang python
brrr semantic index . --model all-MiniLM-L6-v2   # Smaller model (80MB)
brrr semantic index . --backend tei              # Use TEI server

semantic search

Search code using natural language queries.

brrr semantic search "authentication logic" .
brrr semantic search "database connection" ./src --k 10
brrr semantic search "error handling" . --expand   # Include call graph
brrr semantic search "user validation" . --task code_retrieval

semantic cache

Manage the semantic index cache.

brrr semantic cache stats       # Show cache statistics
brrr semantic cache clear       # Clear all cached indexes
brrr semantic cache invalidate  # Invalidate specific project

semantic device

Show compute device and backend info.

brrr semantic device

Security Scanning

security scan

Run all security analyzers.

brrr security scan ./src
brrr security scan ./src --severity high      # Only high/critical
brrr security scan ./src --format sarif       # SARIF output for CI
brrr security scan ./src --category injection # Only injection issues
brrr security scan ./src --fail-on high       # Exit 1 if high+ found

Individual Scanners

brrr security sql-injection ./src
brrr security command-injection ./src
brrr security xss ./src
brrr security path-traversal ./src
brrr security secrets ./src
brrr security crypto ./src
brrr security deserialization ./src
brrr security redos ./src

Code Metrics

metrics report

Generate comprehensive metrics report.

brrr metrics report ./src
brrr metrics ./src                           # Shorthand
brrr metrics ./src --format text
brrr metrics ./src --fail-on critical        # CI quality gate
brrr metrics ./src --thresholds strict       # Stricter thresholds

metrics complexity

Calculate cyclomatic complexity.

brrr metrics complexity ./src
brrr metrics complexity ./src --threshold 10 --sort
brrr metrics complexity ./src --violations-only
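
For reference, McCabe's cyclomatic complexity of a single connected control flow graph is M = E - N + 2, where E is the number of edges and N the number of nodes. The sketch below computes it from an edge list; the input format and function name are assumptions for illustration, and brrr may count decision points directly instead.

```rust
use std::collections::BTreeSet;

/// Cyclomatic complexity M = E - N + 2 for one connected CFG.
/// `edges` lists (from, to) basic-block ids; every block must appear in an edge.
fn cyclomatic_complexity(edges: &[(u32, u32)]) -> usize {
    // Collect the distinct nodes mentioned by any edge.
    let nodes: BTreeSet<u32> = edges.iter().flat_map(|&(a, b)| [a, b]).collect();
    (edges.len() + 2).saturating_sub(nodes.len())
}
```

A function with one `if`/`else` followed by one loop has two decision points, so its complexity is 3, well under the threshold of 10 used in the example above.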

metrics cognitive

Calculate cognitive complexity (SonarSource methodology).

brrr metrics cognitive ./src
brrr metrics cognitive ./src --breakdown     # Detailed contributions

metrics halstead

Calculate Halstead complexity metrics.

brrr metrics halstead ./src
brrr metrics halstead ./src --sort-by-difficulty
brrr metrics halstead ./src --show-tokens
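
The Halstead measures follow from four counts: distinct operators (n1), distinct operands (n2), total operators (N1), and total operands (N2). The sketch below applies the standard textbook formulas; the struct and function names are illustrative, and how brrr classifies tokens as operators or operands is not covered here.

```rust
/// Halstead measures derived from four token counts.
#[derive(Debug)]
struct Halstead {
    vocabulary: f64,
    length: f64,
    volume: f64,
    difficulty: f64,
    effort: f64,
    bugs: f64,
}

/// n1/n2: distinct operators/operands; total_ops/total_operands: all occurrences.
fn halstead(n1: u32, n2: u32, total_ops: u32, total_operands: u32) -> Halstead {
    let vocabulary = f64::from(n1 + n2);
    let length = f64::from(total_ops + total_operands);
    // Volume: program length times the bits needed per vocabulary symbol.
    let volume = if vocabulary > 0.0 { length * vocabulary.log2() } else { 0.0 };
    // Difficulty: (n1 / 2) * (N2 / n2), guarded against a missing operand set.
    let difficulty = if n2 == 0 {
        0.0
    } else {
        f64::from(n1) / 2.0 * f64::from(total_operands) / f64::from(n2)
    };
    let effort = difficulty * volume;
    // Halstead's delivered-bugs estimate: V / 3000.
    Halstead { vocabulary, length, volume, difficulty, effort, bugs: volume / 3000.0 }
}
```

For example, 4 distinct operators, 4 distinct operands, 10 operator occurrences, and 6 operand occurrences give a volume of 48 and a difficulty of 3.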

metrics maintainability

Calculate Maintainability Index.

brrr metrics maintainability ./src
brrr metrics maintainability ./src --threshold 50 --sort
brrr metrics maintainability ./src --include-comments
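
As a rough guide to what the index combines, here is a sketch of the classic three-term Maintainability Index, rescaled to 0..100. It is an assumption that brrr uses this variant; the comment bonus mentioned under Features is deliberately left out because its exact form is not stated in this document.

```rust
/// Classic Maintainability Index: 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(SLOC),
/// rescaled to the range 0..=100. Variant choice is an assumption.
fn maintainability_index(halstead_volume: f64, cyclomatic: f64, sloc: f64) -> f64 {
    // ln() is undefined at 0, so clamp the inputs to at least 1.
    let raw = 171.0
        - 5.2 * halstead_volume.max(1.0).ln()
        - 0.23 * cyclomatic
        - 16.2 * sloc.max(1.0).ln();
    (raw * 100.0 / 171.0).clamp(0.0, 100.0)
}
```

Higher values indicate code that is easier to maintain; a volume of 1000, complexity of 10, and 100 source lines score roughly 34 on this scale.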

metrics loc

Calculate lines of code metrics.

brrr metrics loc ./src
brrr metrics loc ./src --by-language
brrr metrics loc ./src --function-threshold 50 --violations-only

metrics nesting

Calculate nesting depth metrics.

brrr metrics nesting ./src
brrr metrics nesting ./src --threshold 5 --details

metrics functions

Calculate function size metrics.

brrr metrics functions ./src
brrr metrics functions ./src --sort-by sloc --violations-only
brrr metrics functions ./src --sloc-warn 30 --sloc-critical 60

metrics coupling

Calculate coupling metrics for modules.

brrr metrics coupling ./src
brrr metrics coupling ./src --level module
brrr metrics coupling ./src --show-cycles --show-edges
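
Afferent coupling (Ca) counts the modules that depend on a module, efferent coupling (Ce) counts the modules it depends on, and instability is I = Ce / (Ca + Ce). The sketch below derives all three from a dependency edge list; the input format and the name `coupling` are assumptions for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Per-module (afferent, efferent, instability) from (from, to) dependency edges.
fn coupling(edges: &[(&str, &str)]) -> BTreeMap<String, (usize, usize, f64)> {
    let mut incoming: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    let mut outgoing: BTreeMap<&str, BTreeSet<&str>> = BTreeMap::new();
    for &(from, to) in edges {
        if from == to {
            continue; // self-dependencies do not count as coupling
        }
        outgoing.entry(from).or_default().insert(to);
        incoming.entry(to).or_default().insert(from);
        // Make sure both modules have an entry in both maps.
        outgoing.entry(to).or_default();
        incoming.entry(from).or_default();
    }
    outgoing
        .iter()
        .map(|(module, out)| {
            let ca = incoming[module].len();
            let ce = out.len();
            let instability = if ca + ce == 0 { 0.0 } else { ce as f64 / (ca + ce) as f64 };
            (module.to_string(), (ca, ce, instability))
        })
        .collect()
}
```

An instability of 0 marks a module that others depend on but that depends on nothing (stable); 1 marks a module that only depends on others.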

metrics cohesion

Calculate class cohesion metrics (LCOM variants).

brrr metrics cohesion ./src
brrr metrics cohesion ./src --threshold 2 --show-components
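
One common LCOM variant, LCOM4, counts the connected components among a class's methods, where two methods are connected if they use a common field. The sketch below is a simplified illustration: it ignores method-to-method calls (which full LCOM4 also treats as connections), and the input shape is an assumption, not brrr's data model.

```rust
/// Simplified LCOM4: number of groups of methods linked by shared fields.
/// Each entry is (method name, fields the method reads or writes).
fn lcom4(method_fields: &[(&str, Vec<&str>)]) -> usize {
    let n = method_fields.len();
    // Every method starts in its own component, labelled by its index.
    let mut component: Vec<usize> = (0..n).collect();
    for i in 0..n {
        for j in (i + 1)..n {
            let shares = method_fields[i].1.iter().any(|f| method_fields[j].1.contains(f));
            if shares && component[i] != component[j] {
                // Merge j's component into i's by relabelling.
                let (from, to) = (component[j], component[i]);
                for c in component.iter_mut() {
                    if *c == from {
                        *c = to;
                    }
                }
            }
        }
    }
    component.sort_unstable();
    component.dedup();
    component.len()
}
```

A result of 1 means the class is cohesive under this measure; a result of 2 or more suggests it could be split along the component boundaries.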

Code Quality

quality clones

Detect code clones (duplicate code).

brrr quality clones ./src
brrr quality clones ./src --min-lines 10
brrr quality clones ./src --include-tests

quality structural-clones

Detect structural code clones (Type-2/Type-3).

brrr quality structural-clones ./src
brrr quality structural-clones ./src --similarity 0.8
brrr quality structural-clones ./src --type2-only

quality god-class

Detect God classes violating SRP.

brrr quality god-class ./src
brrr quality god-class ./src --threshold 15
brrr quality god-class ./src --method-threshold 20 --attribute-threshold 15

quality long-method

Detect long methods with extraction suggestions.

brrr quality long-method ./src
brrr quality long-method ./src --max-lines 30 --max-complexity 10
brrr quality long-method ./src --show-suggestions
brrr quality long-method ./src --strict   # Stricter thresholds

quality circular

Detect circular dependencies.

brrr quality circular ./src
brrr quality circular ./src --level function
brrr quality circular ./src --max-suggestions 20

quality patterns

Detect design patterns.

brrr quality patterns ./src
brrr quality patterns ./src --pattern singleton
brrr quality patterns ./src --min-confidence 0.7

Diagnostics

diagnostics

Run type checker and linter.

brrr diagnostics src/main.py
brrr diagnostics ./src --project
brrr diagnostics ./src --no-lint    # Type checker only

doctor

Check and install diagnostic tools.

brrr doctor                   # Check all tools
brrr doctor --json            # JSON output
brrr doctor --install python  # Install Python tools

Daemon Management

daemon start

Start background daemon for faster queries.

brrr daemon start
brrr daemon start -p /path/to/project

daemon stop

Stop the daemon gracefully.

brrr daemon stop

daemon status

Check daemon status.

brrr daemon status

daemon notify

Notify daemon of file changes.

brrr daemon notify src/changed_file.py

Cache Management

warm

Pre-build call graph cache.

brrr warm ./src --lang python
brrr warm ./src --lang all --background

Configuration

.brrrignore

Create a .brrrignore file in your project root to exclude files from analysis. Uses gitignore syntax.

# Dependencies
node_modules/
.venv/
vendor/
target/

# Build outputs
dist/
build/
*.pyc

# IDE files
.idea/
.vscode/

# Security - always exclude
.env
*.pem
*.key
credentials.*

# Custom patterns
large_test_fixtures/

.brrr/config.toml

Project-specific configuration (optional).

[general]
default_language = "python"

[metrics]
cyclomatic_threshold = 10
cognitive_threshold = 15
maintainability_threshold = 50

[security]
fail_on_severity = "high"
include_suppressed = false

[semantic]
model = "bge-large-en-v1.5"
backend = "auto"

Supported Languages

Language    Tree-sitter  Call Graph  Metrics  Security
Python      Yes          Yes         Yes      Full
TypeScript  Yes          Yes         Yes      Full
JavaScript  Yes          Yes         Yes      Full
Go          Yes          Yes         Yes      Full
Rust        Yes          Yes         Yes      Full
Java        Yes          Yes         Yes      Full
C           Yes          Yes         Yes      Partial
C++         Yes          Yes         Yes      Partial

Additional languages supported for structure extraction only: Ruby, PHP, Kotlin, Swift, C#, Scala, Lua, Elixir

Architecture

Core Components

src/
|-- ast/          # AST extraction, file tree, code structure
|-- callgraph/    # Cross-file call graph, impact analysis, dead code
|-- cfg/          # Control flow graph builder and rendering
|-- dfg/          # Data flow graph and program slicing
|-- pdg/          # Program Dependence Graph (CFG + DFG)
|-- embedding/    # Vector index (usearch) and TEI gRPC client
|-- semantic/     # Semantic search, code chunking, unit extraction
|-- lang/         # Language-specific tree-sitter configurations
|-- metrics/      # All complexity and quality metrics
|-- security/     # Vulnerability scanners and taint analysis
|-- quality/      # Code smells, clones, patterns
|-- simd.rs       # SIMD-accelerated operations
|-- util/         # Path validation, ignore patterns, helpers

Tree-sitter Integration

All parsing is done through tree-sitter for consistent, fast, and accurate AST extraction across languages. Language grammars are included as dependencies:

tree-sitter 0.26
tree-sitter-python 0.25
tree-sitter-typescript 0.23
tree-sitter-go 0.25
tree-sitter-rust 0.24
tree-sitter-java 0.23
tree-sitter-c 0.24
tree-sitter-cpp 0.23

Embedding Pipeline

Extract code units (functions, classes) via AST
Generate embeddings via TEI server or local model
Build HNSW index using usearch
Store metadata in JSON sidecar file
Search returns keys that map back to the extracted code units
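
The last two steps of the pipeline can be illustrated with an exact (brute-force) search standing in for the HNSW index. This is a sketch under assumptions: the `index` and `metadata` shapes and the function names are hypothetical, and usearch performs an approximate search rather than the full scan shown here.

```rust
use std::collections::HashMap;

/// Cosine similarity of two equal-length vectors (0.0 if either is all zeros).
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Exact top-k search: score every stored vector, then map keys back to
/// unit names through the metadata table. Best match comes first.
fn search<'a>(
    index: &[(u64, Vec<f32>)],
    metadata: &'a HashMap<u64, String>,
    query: &[f32],
    k: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, u64)> = index
        .iter()
        .map(|(key, vector)| (cosine(query, vector), *key))
        .collect();
    // Sort by descending similarity.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored
        .iter()
        .take(k)
        .filter_map(|(_, key)| metadata.get(key).map(String::as_str))
        .collect()
}
```

Keeping only numeric keys in the vector index and the unit details in a separate table mirrors the JSON sidecar described above.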

SIMD Optimizations

The src/simd.rs module provides portable SIMD operations:

sum_f32 / dot_product - 8x speedup for embedding similarity
count_byte / find_newlines - 32x speedup for line counting
all_equal - Fast duplicate detection
cosine_similarity - Vectorized similarity computation
find_matching_u32 - Fast edge filtering in dataflow analysis

Targets: x86_64 (SSE2/AVX2/AVX-512), aarch64 (NEON)
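
To show the idea behind a vectorized dot product without requiring nightly, the sketch below keeps eight independent partial sums on stable Rust, much as a SIMD register holds several lanes at once. It is not the code in src/simd.rs: the lane count, function bodies, and reliance on compiler auto-vectorization are assumptions for illustration.

```rust
const LANES: usize = 8;

/// Dot product with LANES independent accumulators, one per "lane".
fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must have equal length");
    let mut acc = [0.0f32; LANES];
    let (a_chunks, b_chunks) = (a.chunks_exact(LANES), b.chunks_exact(LANES));
    // Elements that do not fill a whole chunk are handled separately.
    let (a_tail, b_tail) = (a_chunks.remainder(), b_chunks.remainder());
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for lane in 0..LANES {
            acc[lane] += ca[lane] * cb[lane];
        }
    }
    let tail: f32 = a_tail.iter().zip(b_tail).map(|(x, y)| x * y).sum();
    // Horizontal reduction of the lanes, plus the scalar tail.
    acc.iter().sum::<f32>() + tail
}

/// Cosine similarity built on the dot product (0.0 if either vector is zero).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm = (dot_product(a, a) * dot_product(b, b)).sqrt();
    if norm == 0.0 { 0.0 } else { dot_product(a, b) / norm }
}
```

Because the lanes are summed in a different order than a plain left-to-right loop, results can differ from a scalar implementation in the last bits of floating-point precision.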

Performance

Key Optimizations

Optimization       Impact                    Use Case
jemalloc           10-20% faster allocation  Heavy object creation
SIMD dot product   8x throughput             Embedding similarity
SIMD byte search   32x throughput            Line counting
Rayon parallelism  Linear scaling            Multi-file analysis
FxHash             2x faster hashing         Hash maps
PHF                O(1) keyword lookup       Language detection
Aho-Corasick       Multi-pattern matching    Pattern detection
usearch HNSW       Sub-linear search         Semantic search
LRU cache          Avoid recomputation       Query embeddings

Benchmarks

Run benchmarks with:

cargo bench

Available benchmarks:

ast_parsing - Tree-sitter parse performance
ast_extraction - Full file extraction
flow_analysis - CFG/DFG construction
semantic - Embedding and search
callgraph - Call graph building
e2e - End-to-end scenarios

Output Formats

Most commands support multiple output formats:

json (default) - Structured JSON for programmatic use
text - Human-readable text output
mermaid - Mermaid diagram syntax (CFG)
dot - Graphviz DOT format (CFG)
sarif - SARIF v2.1 for security findings (CI/CD integration)
csv - CSV format for metrics export

Global Options

--no-ignore     # Do not apply .brrrignore patterns
-v, -vv, -vvv   # Verbosity levels (info, debug, trace)
--format        # Output format (json, text, mermaid, dot, sarif, csv)

Exit Codes

0 - Success
1 - Error or findings above fail threshold
2 - Invalid arguments
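
A wrapper script or build step can branch on these codes. The sketch below is one hedged way to do so from Rust: the `Outcome` enum and both function names are hypothetical, the command line reuses the `security scan ... --fail-on high` form shown earlier, and `brrr` is assumed to be on PATH.

```rust
use std::process::Command;

/// Interpretation of a brrr exit status, following the table above.
#[derive(Debug, PartialEq)]
enum Outcome {
    Clean,
    FindingsOrError,
    BadArguments,
    /// Any other code, or None when the process was killed by a signal.
    Other(Option<i32>),
}

fn classify(code: Option<i32>) -> Outcome {
    match code {
        Some(0) => Outcome::Clean,
        Some(1) => Outcome::FindingsOrError,
        Some(2) => Outcome::BadArguments,
        other => Outcome::Other(other),
    }
}

/// Run the security gate on `path` and classify the result.
fn run_gate(path: &str) -> std::io::Result<Outcome> {
    let status = Command::new("brrr")
        .args(["security", "scan", path, "--fail-on", "high"])
        .status()?;
    Ok(classify(status.code()))
}
```

Note that exit code 1 covers both a tool error and findings above the threshold, so a caller that needs to tell them apart has to inspect the command output as well.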

Environment Variables

BRRR_LOG=debug          # Set log level
BRRR_TEI_URL=...        # TEI server URL for semantic search
RUST_BACKTRACE=1        # Enable backtraces for debugging

Suppressing Security Findings

Add inline comments to suppress specific findings:

# brrr-ignore: SQLI-001
cursor.execute(query)  # Known safe

# Also supports:
# nosec
# noqa
# security-ignore
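
The matching rule can be pictured as a check for one of these markers in a comment on the flagged line or on the line directly above it. The sketch below makes several assumptions that are not confirmed by this document: it only understands `#` comments, it does not compare rule IDs such as SQLI-001, it treats any `#` (even inside a string literal) as the start of a comment, and the name `is_suppressed` is hypothetical.

```rust
const MARKERS: [&str; 4] = ["brrr-ignore", "nosec", "noqa", "security-ignore"];

/// True when the finding on `line_no` (1-based) has a suppression marker in a
/// `#` comment on the same line or on the line directly above.
fn is_suppressed(source: &str, line_no: usize) -> bool {
    let lines: Vec<&str> = source.lines().collect();
    // Check whether the 0-based line `idx` carries a marker in its comment part.
    let has_marker = |idx: usize| {
        lines
            .get(idx)
            .and_then(|line| line.split_once('#'))
            .map_or(false, |(_, comment)| MARKERS.iter().any(|m| comment.contains(*m)))
    };
    line_no >= 1 && (has_marker(line_no - 1) || (line_no >= 2 && has_marker(line_no - 2)))
}
```

Applied to the example above, the `cursor.execute(query)` line counts as suppressed because the preceding line carries `brrr-ignore`.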

License

Apache-2.0

Contributing

Contributions are welcome. Please ensure:

Code passes cargo clippy without warnings
All tests pass: cargo test
New features include tests
Documentation is updated

Related Projects

tree-sitter - Incremental parsing
usearch - Vector search
text-embeddings-inference - Embedding server

go-brrr 0.1.0
