go-brrr
Token-efficient code analysis for LLMs - High-performance Rust implementation.
 ____  ____  ____  ____
| __ )|  _ \|  _ \|  _ \
|  _ \| |_) | |_) | |_) |
| |_) |  _ <|  _ <|  _ <
|____/|_| \_\_| \_\_| \_\
A blazing-fast CLI tool that extracts structured code information for use as context for Large Language Models. It achieves up to 95% token savings compared to raw source code by providing structured summaries, call graphs, and semantic search.
Features
AST Analysis
- File Tree - JSON-structured directory traversal with extension filtering
- Code Structure - Extract functions, classes, methods, and docstrings
- Full Extraction - Complete AST analysis with type information and decorators
Control Flow Analysis
- CFG - Control flow graph generation with Mermaid/DOT export
- DFG - Data flow graph showing variable dependencies
- Program Slicing - Backward/forward slicing to find affected code paths
- PDG - Program Dependence Graph combining CFG and DFG
Call Graph Analysis
- Cross-file Call Graph - Build project-wide function call relationships
- Impact Analysis - Find all callers of a function (transitive)
- Dead Code Detection - Identify unreachable functions
- Architectural Layers - Detect entry/middle/leaf layer patterns
- Import Analysis - Parse and trace module imports
Semantic Search
- Embedding-based Search - Natural language code search using vector similarity
- TEI Integration - gRPC client for text-embeddings-inference server
- HNSW Index - Fast approximate nearest neighbor search via usearch
- Multi-language Index - Unified index across all supported languages
Security Scanning
- SQL Injection (CWE-89) - Detect unsafe query construction
- Command Injection (CWE-78) - Find shell execution vulnerabilities
- XSS (CWE-79) - Cross-site scripting detection for JS/TS
- Path Traversal (CWE-22) - Directory traversal vulnerabilities
- Secrets Detection (CWE-798) - Hardcoded credentials and API keys
- Weak Cryptography (CWE-327) - Insecure algorithm usage
- Unsafe Deserialization (CWE-502) - Pickle, YAML, ObjectInputStream
- ReDoS (CWE-1333) - Regular expression denial of service
- Taint Analysis - Track data flow from sources to sinks
- SARIF Output - GitHub/GitLab security tab integration
Code Metrics
- Cyclomatic Complexity - Branch and loop complexity measurement
- Cognitive Complexity - SonarSource methodology for understandability
- Halstead Metrics - Vocabulary, volume, difficulty, effort, bugs estimate
- Maintainability Index - Combined metric with comment bonus
- Lines of Code - Physical, logical, source, and comment LOC
- Nesting Depth - Deep nesting detection with suggestions
- Function Size - SLOC, parameters, variables, return points
- Coupling - Afferent/efferent coupling and instability metrics
- Cohesion - LCOM variants for class quality
Code Quality
- Clone Detection - Textual (Type-1) and structural (Type-2/3) duplicates
- God Class Detection - SRP violations with weighted scoring
- Long Method Detection - Oversized functions with extraction suggestions
- Circular Dependencies - Package/module/class/function level cycles
- Design Pattern Detection - Singleton, Factory, Builder, Observer, etc.
Performance Optimizations
- jemalloc - High-performance memory allocator
- SIMD Operations - portable_simd for vectorized computations
- Parallel Processing - Rayon for multi-threaded file analysis
- Aho-Corasick - Multi-pattern string matching
- PHF - Compile-time perfect hash functions for O(1) keyword lookups
- FxHash - Fast non-cryptographic hashing
- xxHash - SIMD-accelerated hashing (xxh3)
Installation
From Source
# Clone the repository
# Build with release optimizations
# Install to ~/.cargo/bin
Requirements
- Rust 1.70+ (nightly required for the portable_simd feature)
- For semantic search: TEI server or local embedding model
# Set nightly toolchain (required for SIMD)
Quick Start
# Show file tree
# Extract code structure
# Full file analysis
# Build call graph
# Find dead code
# Control flow graph
# Security scan
# Code metrics report
# Semantic search (requires index)
Commands Reference
File Operations
tree
Display directory structure in JSON format.
structure
Extract functions, classes, and methods from source files.
extract
Full AST extraction from a single file.
search
Regex pattern search with context lines.
Flow Analysis
cfg
Generate control flow graph for a function.
Output formats: json (default), mermaid, dot
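As a rough illustration of what Mermaid export involves, the sketch below renders a tiny CFG as Mermaid flowchart text. The `Cfg` struct, its fields, and `to_mermaid` are hypothetical names for this example, not the tool's actual types or output.

```rust
/// Hypothetical, minimal CFG representation used only for this sketch.
pub struct Cfg {
    /// Basic-block labels, indexed by block id.
    pub blocks: Vec<String>,
    /// Directed edges `(from, to, optional label)` between block ids.
    pub edges: Vec<(usize, usize, Option<String>)>,
}

/// Render the CFG as Mermaid flowchart text (top-down layout).
pub fn to_mermaid(cfg: &Cfg) -> String {
    let mut out = String::from("flowchart TD\n");
    for (id, label) in cfg.blocks.iter().enumerate() {
        // Quote labels so punctuation from source code does not break parsing.
        out.push_str(&format!("    B{}[\"{}\"]\n", id, label.replace('"', "'")));
    }
    for (from, to, label) in &cfg.edges {
        match label {
            Some(l) => out.push_str(&format!("    B{} -->|{}| B{}\n", from, l, to)),
            None => out.push_str(&format!("    B{} --> B{}\n", from, to)),
        }
    }
    out
}
```

The resulting text can be pasted into any Mermaid renderer; a DOT exporter would follow the same shape with `digraph` syntax.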
dfg
Generate data flow graph showing variable dependencies.
slice
Compute program slice - find lines affecting or affected by a target line.
# Backward slice: what affects line 42?
# Forward slice: what does line 10 affect?
# Track specific variable
# Extended output with metrics
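To make the idea of a backward slice concrete, here is a minimal sketch over straight-line code. It assumes no branches or loops, so it ignores the control dependencies a real slicer would track; the `Stmt` type and `backward_slice` function are hypothetical and not part of the tool.

```rust
use std::collections::BTreeSet;

/// One statement in straight-line code: the variable it defines (if any)
/// and the variables it reads. Hypothetical type for this sketch.
pub struct Stmt {
    pub line: usize,
    pub def: Option<&'static str>,
    pub uses: Vec<&'static str>,
}

/// Backward slice: the lines that can affect the values used at `target`.
/// Assumes `stmts` is in execution order with no branches or loops.
pub fn backward_slice(stmts: &[Stmt], target: usize) -> BTreeSet<usize> {
    let mut slice = BTreeSet::new();
    let mut needed: BTreeSet<&str> = BTreeSet::new();
    // Walk from the last statement towards the first.
    for s in stmts.iter().rev() {
        if s.line > target {
            continue;
        }
        let relevant = s.line == target || s.def.map_or(false, |d| needed.contains(d));
        if relevant {
            slice.insert(s.line);
            if let Some(d) = s.def {
                // This definition shadows any earlier definition of `d`.
                needed.remove(d);
            }
            needed.extend(s.uses.iter().copied());
        }
    }
    slice
}
```

A forward slice is the mirror image: walk forward from the target and collect statements that read any variable defined inside the slice so far.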
Call Graph Analysis
calls
Build cross-file call graph.
impact
Find all callers of a function (reverse call graph).
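Conceptually, transitive impact analysis is a breadth-first walk over the reversed call graph. The sketch below shows that idea on a list of `(caller, callee)` edges; the function name and edge representation are assumptions for illustration, not the tool's internals.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Collect every function that can reach `target` through the call graph.
/// `calls` holds `(caller, callee)` edges; the names are illustrative.
pub fn transitive_callers<'a>(
    calls: &[(&'a str, &'a str)],
    target: &'a str,
) -> BTreeSet<&'a str> {
    // Invert the edges: callee -> list of direct callers.
    let mut callers_of: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in calls {
        callers_of.entry(callee).or_default().push(caller);
    }
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(f) = queue.pop_front() {
        for &caller in callers_of.get(f).into_iter().flatten() {
            // `insert` returns false for already-visited callers, which
            // also keeps the walk finite on recursive call chains.
            if seen.insert(caller) {
                queue.push_back(caller);
            }
        }
    }
    seen
}
```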
dead
Find unreachable (dead) code.
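One common way to find unreachable functions is to mark everything reachable from a set of entry points and report the rest. The sketch below assumes that approach; how the tool actually chooses entry points is not described here, so `entry_points` is simply passed in.

```rust
use std::collections::{BTreeSet, HashMap};

/// Return the functions that no entry point can reach.
/// All parameter names are hypothetical; `calls` holds `(caller, callee)` edges.
pub fn unreachable_functions<'a>(
    functions: &[&'a str],
    calls: &[(&'a str, &'a str)],
    entry_points: &[&'a str],
) -> BTreeSet<&'a str> {
    let mut callees_of: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in calls {
        callees_of.entry(caller).or_default().push(callee);
    }
    // Depth-first traversal from every entry point.
    let mut reachable: BTreeSet<&str> = BTreeSet::new();
    let mut stack: Vec<&str> = entry_points.to_vec();
    while let Some(f) = stack.pop() {
        if reachable.insert(f) {
            if let Some(callees) = callees_of.get(f) {
                stack.extend(callees.iter().copied());
            }
        }
    }
    functions
        .iter()
        .copied()
        .filter(|f| !reachable.contains(f))
        .collect()
}
```

Note that a dead function calling another dead function does not make either one live, which is why reachability is computed from entry points rather than from "has at least one caller".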
arch
Detect architectural layers from call patterns.
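A plausible reading of entry/middle/leaf layers is a classification by call direction: functions nobody calls are entries, functions that call nothing are leaves, and the rest sit in the middle. The sketch below implements that reading; it is an assumption about the heuristic, not a description of the tool's algorithm.

```rust
use std::collections::{BTreeMap, HashSet};

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Layer {
    Entry,
    Middle,
    Leaf,
}

/// Classify each function in `calls` (`(caller, callee)` edges) by whether
/// it is called and whether it calls others. Hypothetical heuristic.
pub fn classify_layers<'a>(calls: &[(&'a str, &'a str)]) -> BTreeMap<&'a str, Layer> {
    let callers: HashSet<&str> = calls.iter().map(|&(c, _)| c).collect();
    let callees: HashSet<&str> = calls.iter().map(|&(_, c)| c).collect();
    let mut layers = BTreeMap::new();
    for &f in callers.union(&callees) {
        let layer = match (callees.contains(f), callers.contains(f)) {
            (false, _) => Layer::Entry,    // never called by anything
            (true, true) => Layer::Middle, // called, and calls others
            (true, false) => Layer::Leaf,  // called, calls nothing
        };
        layers.insert(f, layer);
    }
    layers
}
```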
Import Analysis
imports
Parse import statements from a source file.
importers
Find all files that import a module.
change-impact
Find tests affected by changed files.
Semantic Search
semantic index
Build semantic index for a project.
semantic search
Search code using natural language queries.
semantic cache
Manage the semantic index cache.
semantic device
Show compute device and backend info.
Security Scanning
security scan
Run all security analyzers.
Individual Scanners
Code Metrics
metrics report
Generate comprehensive metrics report.
metrics complexity
Calculate cyclomatic complexity.
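Cyclomatic complexity is commonly computed as one plus the number of decision points. The sketch below approximates that by counting Python-style keywords in raw text; this is a deliberate simplification (it would miscount keywords inside strings or comments), whereas an AST-based tool counts real branch nodes. The function name and keyword list are assumptions.

```rust
/// Approximate cyclomatic complexity as 1 + number of decision keywords.
/// Text-based and Python-flavoured; a sketch, not an AST analysis.
pub fn cyclomatic_complexity(source: &str) -> usize {
    const DECISION_KEYWORDS: [&str; 6] = ["if", "elif", "for", "while", "and", "or"];
    let decisions = source
        // Split on anything that cannot be part of an identifier.
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|tok| DECISION_KEYWORDS.contains(tok))
        .count();
    1 + decisions
}
```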
metrics cognitive
Calculate cognitive complexity (SonarSource methodology).
metrics halstead
Calculate Halstead complexity metrics.
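For reference, the standard Halstead formulas derive every measure from four counts: distinct operators, distinct operands, total operators, and total operands. The sketch below applies the textbook definitions; the struct and function names are illustrative, and the tool's exact operator/operand classification is not specified here.

```rust
/// Derived Halstead measures. Field names are illustrative.
#[derive(Debug)]
pub struct Halstead {
    pub vocabulary: f64,
    pub length: f64,
    pub volume: f64,
    pub difficulty: f64,
    pub effort: f64,
    pub bugs: f64,
}

/// Compute Halstead measures from raw operator/operand counts.
pub fn halstead(
    distinct_operators: u32,
    distinct_operands: u32,
    total_operators: u32,
    total_operands: u32,
) -> Halstead {
    let (n1, n2) = (f64::from(distinct_operators), f64::from(distinct_operands));
    let (big_n1, big_n2) = (f64::from(total_operators), f64::from(total_operands));
    let vocabulary = n1 + n2;
    let length = big_n1 + big_n2;
    // Volume: program length scaled by the bits needed per vocabulary symbol.
    let volume = if vocabulary > 0.0 { length * vocabulary.log2() } else { 0.0 };
    // Difficulty grows with operator variety and operand reuse.
    let difficulty = if n2 > 0.0 { (n1 / 2.0) * (big_n2 / n2) } else { 0.0 };
    let effort = difficulty * volume;
    // Classic delivered-bugs estimate: V / 3000.
    let bugs = volume / 3000.0;
    Halstead { vocabulary, length, volume, difficulty, effort, bugs }
}
```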
metrics maintainability
Calculate Maintainability Index.
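The classic Maintainability Index combines Halstead volume, cyclomatic complexity, and lines of code, with an optional comment term. The sketch below uses the widely cited formula; whether the tool uses this exact variant, and whether it rescales to 0-100, is an assumption. Here `comment_ratio` is taken as a fraction between 0 and 1.

```rust
/// Maintainability Index with comment bonus (unnormalised).
/// MI = 171 - 5.2 ln(V) - 0.23 G - 16.2 ln(LOC) + 50 sin(sqrt(2.4 * comments))
/// Parameter names and the 0..1 comment ratio are assumptions of this sketch.
pub fn maintainability_index(
    volume: f64,
    cyclomatic: f64,
    sloc: f64,
    comment_ratio: f64,
) -> f64 {
    // Clamp inputs so the logarithms stay defined for empty or trivial code.
    let ln_volume = volume.max(1.0).ln();
    let ln_sloc = sloc.max(1.0).ln();
    let comment_bonus = 50.0 * (2.4 * comment_ratio.clamp(0.0, 1.0)).sqrt().sin();
    171.0 - 5.2 * ln_volume - 0.23 * cyclomatic - 16.2 * ln_sloc + comment_bonus
}
```

Higher values indicate code that is easier to maintain; larger volume, complexity, or size all pull the score down.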
metrics loc
Calculate lines of code metrics.
metrics nesting
Calculate nesting depth metrics.
metrics functions
Calculate function size metrics.
metrics coupling
Calculate coupling metrics for modules.
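Instability is conventionally defined from the two coupling counts as I = Ce / (Ca + Ce), where Ca is afferent (incoming) and Ce is efferent (outgoing) coupling. A minimal sketch, with a hypothetical function name:

```rust
/// Instability I = Ce / (Ca + Ce).
/// Returns `None` for a module with no couplings at all, where the ratio
/// is undefined. 0.0 means maximally stable, 1.0 maximally unstable.
pub fn instability(afferent: u32, efferent: u32) -> Option<f64> {
    let total = afferent + efferent;
    if total == 0 {
        None
    } else {
        Some(f64::from(efferent) / f64::from(total))
    }
}
```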
metrics cohesion
Calculate class cohesion metrics (LCOM variants).
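As one example of an LCOM variant, the Chidamber-Kemerer formulation counts method pairs that share no fields (P) against pairs that share at least one (Q), and reports P - Q floored at zero. The sketch below implements only that variant; which variants the tool reports, and the function name, are assumptions.

```rust
use std::collections::HashSet;

/// Chidamber-Kemerer style LCOM: max(P - Q, 0), where P counts method
/// pairs with disjoint field sets and Q counts pairs sharing a field.
/// `methods[i]` is the set of field names accessed by method `i`.
pub fn lcom_ck(methods: &[HashSet<&str>]) -> usize {
    let (mut disjoint, mut sharing) = (0usize, 0usize);
    for i in 0..methods.len() {
        for j in (i + 1)..methods.len() {
            if methods[i].is_disjoint(&methods[j]) {
                disjoint += 1;
            } else {
                sharing += 1;
            }
        }
    }
    disjoint.saturating_sub(sharing)
}
```

A result of zero suggests a cohesive class; larger values suggest the class bundles unrelated responsibilities.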
Code Quality
quality clones
Detect code clones (duplicate code).
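Textual (Type-1) clone detection can be sketched as a sliding window over whitespace-normalised lines, reporting any window whose content was seen before. The function below is a minimal illustration under that assumption; the name, the window parameter, and the pair-based output are hypothetical.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Report `(first_line, duplicate_line)` pairs where `window` consecutive
/// non-blank lines repeat exactly after trimming whitespace.
/// Line numbers are 1-based positions in `source`.
pub fn find_textual_clones(source: &str, window: usize) -> Vec<(usize, usize)> {
    let mut clones = Vec::new();
    if window == 0 {
        return clones;
    }
    // Keep (line number, trimmed text) for non-blank lines only.
    let lines: Vec<(usize, &str)> = source
        .lines()
        .enumerate()
        .map(|(i, l)| (i + 1, l.trim()))
        .filter(|(_, l)| !l.is_empty())
        .collect();
    let mut first_seen: HashMap<Vec<&str>, usize> = HashMap::new();
    for w in lines.windows(window) {
        let key: Vec<&str> = w.iter().map(|&(_, l)| l).collect();
        let start = w[0].0;
        match first_seen.entry(key) {
            Entry::Occupied(e) => clones.push((*e.get(), start)),
            Entry::Vacant(e) => {
                e.insert(start);
            }
        }
    }
    clones
}
```

Structural (Type-2/3) detection needs more than this: identifiers and literals have to be abstracted before comparison, typically on the AST rather than on text.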
quality structural-clones
Detect structural code clones (Type-2/Type-3).
quality god-class
Detect God classes violating SRP.
quality long-method
Detect long methods with extraction suggestions.
quality circular
Detect circular dependencies.
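Cycle detection on a dependency graph is usually a depth-first search that looks for an edge back to a node still on the current path. The sketch below returns the first cycle it finds; the edge representation and function names are assumptions for illustration, and a full detector would enumerate all cycles.

```rust
use std::collections::{BTreeMap, HashMap};

#[derive(Clone, Copy, PartialEq)]
enum Mark {
    InProgress,
    Done,
}

/// Return one dependency cycle, if any, from `(from, to)` edges.
pub fn find_cycle<'a>(deps: &[(&'a str, &'a str)]) -> Option<Vec<&'a str>> {
    // BTreeMap keeps traversal order deterministic.
    let mut graph: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(from, to) in deps {
        graph.entry(from).or_default().push(to);
        graph.entry(to).or_default();
    }
    let mut marks = HashMap::new();
    let mut path = Vec::new();
    for &start in graph.keys() {
        if visit(start, &graph, &mut marks, &mut path) {
            return Some(path);
        }
    }
    None
}

fn visit<'a>(
    node: &'a str,
    graph: &BTreeMap<&'a str, Vec<&'a str>>,
    marks: &mut HashMap<&'a str, Mark>,
    path: &mut Vec<&'a str>,
) -> bool {
    match marks.get(node) {
        Some(Mark::Done) => return false,
        Some(Mark::InProgress) => {
            // Back edge: trim the path so it starts at the repeated node.
            let start = path.iter().position(|&n| n == node).unwrap();
            path.drain(..start);
            return true;
        }
        None => {}
    }
    marks.insert(node, Mark::InProgress);
    path.push(node);
    for &next in &graph[node] {
        if visit(next, graph, marks, path) {
            return true;
        }
    }
    path.pop();
    marks.insert(node, Mark::Done);
    false
}
```

The same routine applies at any granularity (package, module, class, function); only the meaning of a node changes.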
quality patterns
Detect design patterns.
Diagnostics
diagnostics
Run type checker and linter.
doctor
Check and install diagnostic tools.
Daemon Management
daemon start
Start background daemon for faster queries.
daemon stop
Stop the daemon gracefully.
daemon status
Check daemon status.
daemon notify
Notify daemon of file changes.
Cache Management
warm
Pre-build call graph cache.
Configuration
.brrrignore
Create a .brrrignore file in your project root to exclude files from analysis. Uses gitignore syntax.
# Dependencies
node_modules/
.venv/
vendor/
target/
# Build outputs
dist/
build/
*.pyc
# IDE files
.idea/
.vscode/
# Security - always exclude
.env
*.pem
*.key
credentials.*
# Custom patterns
large_test_fixtures/
.brrr/config.toml
Project-specific configuration (optional).
[]
= "python"
[]
= 10
= 15
= 50
[]
= "high"
= false
[]
= "bge-large-en-v1.5"
= "auto"
Supported Languages
| Language | Tree-sitter | Call Graph | Metrics | Security |
|---|---|---|---|---|
| Python | Yes | Yes | Yes | Full |
| TypeScript | Yes | Yes | Yes | Full |
| JavaScript | Yes | Yes | Yes | Full |
| Go | Yes | Yes | Yes | Full |
| Rust | Yes | Yes | Yes | Full |
| Java | Yes | Yes | Yes | Full |
| C | Yes | Yes | Yes | Partial |
| C++ | Yes | Yes | Yes | Partial |
Additional languages supported for structure extraction only: Ruby, PHP, Kotlin, Swift, C#, Scala, Lua, Elixir
Architecture
Core Components
src/
|-- ast/ # AST extraction, file tree, code structure
|-- callgraph/ # Cross-file call graph, impact analysis, dead code
|-- cfg/ # Control flow graph builder and rendering
|-- dfg/ # Data flow graph and program slicing
|-- pdg/ # Program Dependence Graph (CFG + DFG)
|-- embedding/ # Vector index (usearch) and TEI gRPC client
|-- semantic/ # Semantic search, code chunking, unit extraction
|-- lang/ # Language-specific tree-sitter configurations
|-- metrics/ # All complexity and quality metrics
|-- security/ # Vulnerability scanners and taint analysis
|-- quality/ # Code smells, clones, patterns
|-- simd.rs # SIMD-accelerated operations
|-- util/ # Path validation, ignore patterns, helpers
Tree-sitter Integration
All parsing is done through tree-sitter for consistent, fast, and accurate AST extraction across languages. Language grammars are included as dependencies:
- tree-sitter 0.26
- tree-sitter-python 0.25
- tree-sitter-typescript 0.23
- tree-sitter-go 0.25
- tree-sitter-rust 0.24
- tree-sitter-java 0.23
- tree-sitter-c 0.24
- tree-sitter-cpp 0.23
Embedding Pipeline
- Extract code units (functions, classes) via AST
- Generate embeddings via TEI server or local model
- Build HNSW index using usearch
- Store metadata in JSON sidecar file
- Search returns keys mapping back to units
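The last two steps of the pipeline can be sketched with a brute-force search standing in for the HNSW index: each unit has a key, metadata, and an embedding, and a query returns the keys of the closest units. This is an exact linear scan, not usearch; `IndexedUnit` and `search` are hypothetical names.

```rust
/// Cosine similarity of two equal-length vectors (0.0 if either is zero).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// A code unit with its index key and embedding (illustrative type;
/// in the pipeline above the name would live in the metadata sidecar).
pub struct IndexedUnit {
    pub key: u64,
    pub name: String,
    pub embedding: Vec<f32>,
}

/// Return the `k` best `(key, name, score)` matches, highest score first.
/// Linear scan used here as a stand-in for an approximate HNSW lookup.
pub fn search<'a>(units: &'a [IndexedUnit], query: &[f32], k: usize) -> Vec<(u64, &'a str, f32)> {
    let mut scored: Vec<_> = units
        .iter()
        .map(|u| (u.key, u.name.as_str(), cosine_similarity(&u.embedding, query)))
        .collect();
    scored.sort_by(|a, b| b.2.total_cmp(&a.2));
    scored.truncate(k);
    scored
}
```

An HNSW index trades the exactness of this scan for sub-linear lookup time, which matters once a project has many thousands of units.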
SIMD Optimizations
The src/simd.rs module provides portable SIMD operations:
- sum_f32 / dot_product - 8x speedup for embedding similarity
- count_byte / find_newlines - 32x speedup for line counting
- all_equal - Fast duplicate detection
- cosine_similarity - Vectorized similarity computation
- find_matching_u32 - Fast edge filtering in dataflow analysis
Targets: x86_64 (SSE2/AVX2/AVX-512), aarch64 (NEON)
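For orientation, here are scalar reference versions of three of these operations, written on stable Rust without portable_simd. They reuse the function names listed above, but the signatures are assumptions and the bodies are not the module's implementation. The dot product accumulates into eight independent lanes, which is the shape a SIMD version takes and which compilers can often auto-vectorize.

```rust
/// Count occurrences of `needle` in `haystack` (scalar reference).
pub fn count_byte(haystack: &[u8], needle: u8) -> usize {
    haystack.iter().filter(|&&b| b == needle).count()
}

/// Byte offsets of every `\n` in `haystack` (scalar reference).
pub fn find_newlines(haystack: &[u8]) -> Vec<usize> {
    haystack
        .iter()
        .enumerate()
        .filter(|&(_, &b)| b == b'\n')
        .map(|(i, _)| i)
        .collect()
}

/// Dot product using 8 independent accumulators, mirroring 8 SIMD lanes.
/// Assumes `a` and `b` have the same length.
pub fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    let mut lanes = [0.0f32; 8];
    let (a_chunks, b_chunks) = (a.chunks_exact(8), b.chunks_exact(8));
    // Elements that do not fill a whole 8-wide chunk are handled separately.
    let tail: f32 = a_chunks
        .remainder()
        .iter()
        .zip(b_chunks.remainder())
        .map(|(x, y)| x * y)
        .sum();
    for (ca, cb) in a_chunks.zip(b_chunks) {
        for i in 0..8 {
            lanes[i] += ca[i] * cb[i];
        }
    }
    lanes.iter().sum::<f32>() + tail
}
```

Because lane-wise accumulation changes the order of floating-point additions, results can differ from a naive loop in the last bits; tests that compare against a scalar baseline should use a tolerance.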
Performance
Key Optimizations
| Optimization | Impact | Use Case |
|---|---|---|
| jemalloc | 10-20% faster allocation | Heavy object creation |
| SIMD dot product | 8x throughput | Embedding similarity |
| SIMD byte search | 32x throughput | Line counting |
| Rayon parallelism | Near-linear scaling with core count | Multi-file analysis |
| FxHash | 2x faster hashing | Hash maps |
| PHF | O(1) keyword lookup | Language detection |
| Aho-Corasick | Multi-pattern matching | Pattern detection |
| usearch HNSW | Sub-linear search | Semantic search |
| LRU cache | Avoid recomputation | Query embeddings |
Benchmarks
Run benchmarks with:
Available benchmarks:
- ast_parsing - Tree-sitter parse performance
- ast_extraction - Full file extraction
- flow_analysis - CFG/DFG construction
- semantic - Embedding and search
- callgraph - Call graph building
- e2e - End-to-end scenarios
Output Formats
Most commands support multiple output formats:
- json (default) - Structured JSON for programmatic use
- text - Human-readable text output
- mermaid - Mermaid diagram syntax (CFG)
- dot - Graphviz DOT format (CFG)
- sarif - SARIF v2.1.0 for security findings (CI/CD integration)
- csv - CSV format for metrics export
Global Options
Exit Codes
- 0 - Success
- 1 - Error or findings above fail threshold
- 2 - Invalid arguments
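A wrapper script or CI step might translate these codes into a decision. The sketch below maps the three documented codes to an enum; the `Outcome` type and function name are hypothetical, and any other code is kept as `Unknown` rather than guessed at.

```rust
/// Meaning of a process exit code, following the list above.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Success,
    /// Code 1 covers both runtime errors and findings above the threshold.
    ErrorOrFindings,
    InvalidArguments,
    /// Anything not documented above (for example, termination by a signal).
    Unknown(i32),
}

/// Map a raw exit code to its documented meaning.
pub fn interpret_exit_code(code: i32) -> Outcome {
    match code {
        0 => Outcome::Success,
        1 => Outcome::ErrorOrFindings,
        2 => Outcome::InvalidArguments,
        other => Outcome::Unknown(other),
    }
}
```

Since code 1 is shared by errors and findings, a pipeline that needs to tell them apart has to inspect the command's output as well.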
Environment Variables
BRRR_LOG=debug # Set log level
BRRR_TEI_URL=... # TEI server URL for semantic search
RUST_BACKTRACE=1 # Enable backtraces for debugging
Suppressing Security Findings
Add inline comments to suppress specific findings:
# brrr-ignore: SQLI-001
# Known safe
# Also supports:
# nosec
# noqa
# security-ignore
License
Apache-2.0
Contributing
Contributions are welcome. Please ensure:
- Code passes cargo clippy without warnings
- All tests pass: cargo test
- New features include tests
- Documentation is updated
Related Projects
- tree-sitter - Incremental parsing
- usearch - Vector search
- text-embeddings-inference - Embedding server