llm-brrr - Token-efficient code analysis for LLMs.
This library provides tools for extracting and analyzing code structure using tree-sitter parsers for multiple languages. It enables 95% token savings when analyzing codebases by providing structured summaries instead of raw source code.
§Architecture
The library is organized into several layers:
- AST Layer (ast): File tree generation, code structure extraction, and AST parsing
- CFG Layer (cfg): Control flow graph extraction with cyclomatic complexity
- DFG Layer (dfg): Data flow graph extraction and basic program slicing
- PDG Layer (pdg): Program Dependence Graph combining CFG + DFG for accurate slicing
- Call Graph Layer (callgraph): Cross-file call graph analysis and impact detection
- Semantic Layer (semantic): Semantic pattern detection, embedding unit extraction, and code enrichment
- Language Layer (lang): Multi-language support via tree-sitter
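The slicing idea behind the DFG and PDG layers can be pictured as reachability over dependence edges: a backward slice is every line from which the target line can be reached. The sketch below is a standalone illustration using only the standard library; `Dep` and `backward_slice` are hypothetical names invented here and are not the crate's types or its implementation.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical dependence edge: line `to` depends on line `from`,
/// either through data flow or through a controlling branch.
struct Dep {
    from: u32,
    to: u32,
}

/// Collect every line that can affect `target` by walking edges backwards.
fn backward_slice(deps: &[Dep], target: u32) -> BTreeSet<u32> {
    // Index edges by destination so predecessors are cheap to look up.
    let mut preds: HashMap<u32, Vec<u32>> = HashMap::new();
    for d in deps {
        preds.entry(d.to).or_default().push(d.from);
    }

    let mut slice = BTreeSet::new();
    let mut queue = VecDeque::from([target]);
    while let Some(line) = queue.pop_front() {
        // `insert` returns false for lines that were already visited.
        if slice.insert(line) {
            if let Some(ps) = preds.get(&line) {
                queue.extend(ps.iter().copied());
            }
        }
    }
    slice
}

fn main() {
    // 1: x = read()  2: y = 10  3: if x > 0:  4: z = x + 1  5: print(y)
    let deps = [
        Dep { from: 1, to: 3 }, // data: x is used in the condition
        Dep { from: 1, to: 4 }, // data: x is used to compute z
        Dep { from: 3, to: 4 }, // control: line 4 runs only if the branch is taken
        Dep { from: 2, to: 5 }, // data: y is used by print
    ];
    println!("{:?}", backward_slice(&deps, 4)); // prints {1, 3, 4}
}
```

Dropping the control edge (3 to 4) leaves only data dependences, which is roughly the difference between a DFG-only slice and a PDG-based one.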
§Quick Start
use go_brrr::{get_tree, get_tree_default, get_structure, extract_file, get_cfg, get_slice, get_backward_slice};
// Get file tree for a project (convenience wrapper with default options)
let tree = get_tree_default("./src", Some(".py"))?;
// Or use full API with explicit options
let tree_full = get_tree("./src", Some(".py"), true, true)?;
// Get code structure (functions, classes) summary
let structure = get_structure("./src", Some("python"), 100, true)?;
// Extract full AST from a single file (None = no base path validation)
let module = extract_file("./src/main.py", None)?;
// Get control flow graph for a function
let cfg = get_cfg("./src/main.py", "process_data")?;
// Get program slice (what affects line 42?) - using convenience wrapper
let affected_lines = get_backward_slice("./src/main.py", "process_data", 42)?;
// Or use full API with direction, variable, and language options
let forward = get_slice("./src/main.py", "process_data", 10, Some("forward"), None, None)?;
let var_slice = get_slice("./src/main.py", "process_data", 42, None, Some("x"), None)?;
§Call Graph Analysis
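As a mental model for the call graph functions in this section, impact analysis can be read as a depth-limited reverse traversal of the call graph, and dead-code detection as forward reachability from entry points. The sketch below is standalone and uses only the standard library; `callers_within`, `unreachable_from`, and the `(caller, callee)` edge list are hypothetical, and treating the numeric argument as a depth limit is an assumption rather than documented behaviour.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Functions that reach `target` through at most `max_depth` calls.
/// `edges` is a hypothetical list of (caller, callee) pairs.
fn callers_within<'a>(
    edges: &[(&'a str, &'a str)],
    target: &'a str,
    max_depth: usize,
) -> BTreeSet<&'a str> {
    // Reverse index: callee -> direct callers.
    let mut reverse: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in edges {
        reverse.entry(callee).or_default().push(caller);
    }
    let mut found = BTreeSet::new();
    let mut queue = VecDeque::from([(target, 0usize)]);
    while let Some((name, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not expand beyond the requested depth
        }
        for &caller in reverse.get(name).into_iter().flatten() {
            if found.insert(caller) {
                queue.push_back((caller, depth + 1));
            }
        }
    }
    found
}

/// Functions in `all` that no entry point can reach.
fn unreachable_from<'a>(
    edges: &[(&'a str, &'a str)],
    all: &[&'a str],
    entries: &[&'a str],
) -> BTreeSet<&'a str> {
    // Forward index: caller -> direct callees.
    let mut forward: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in edges {
        forward.entry(caller).or_default().push(callee);
    }
    let mut live: BTreeSet<&str> = BTreeSet::new();
    let mut stack: Vec<&str> = entries.to_vec();
    while let Some(name) = stack.pop() {
        if live.insert(name) {
            stack.extend(forward.get(name).into_iter().flatten().copied());
        }
    }
    all.iter().copied().filter(|f| !live.contains(f)).collect()
}

fn main() {
    let edges = [
        ("main", "load"),
        ("main", "run"),
        ("run", "parse"),
        ("load", "parse"),
        ("old_helper", "parse"),
    ];
    let all = ["main", "load", "run", "parse", "old_helper"];
    println!("{:?}", callers_within(&edges, "parse", 1)); // {"load", "old_helper", "run"}
    println!("{:?}", unreachable_from(&edges, &all, &["main"])); // {"old_helper"}
}
```

Raising the depth to 2 also pulls in `main`, which is why a small depth limit keeps impact reports focused on nearby callers.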
use go_brrr::{build_callgraph, get_impact, find_dead_code, get_context};
// Build project-wide call graph
let graph = build_callgraph("./src")?;
// Find all callers of a function (impact analysis)
let callers = get_impact("./src", "critical_function", 3)?;
// Find unreachable/dead code
let dead = find_dead_code("./src")?;
// Get LLM-ready context for an entry point
let context = get_context("./src", "main", 2)?;
§Project Scanning
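As a rough picture of the per-file decision that extension-based scanning has to make, the sketch below filters a list of paths by extension and by excluded directory names. It is a standalone, standard-library illustration: `keep_path` is a hypothetical helper, and it does not model `.gitignore` or `.brrrignore` handling or the crate's glob-style excludes.

```rust
use std::path::Path;

/// Keep `path` if its extension is listed and none of its components is excluded.
/// Extensions are written with a leading dot, mirroring the `&[".rs", ".toml"]` style.
fn keep_path(path: &Path, extensions: &[&str], excluded_dirs: &[&str]) -> bool {
    let ext_ok = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| extensions.iter().any(|want| want.trim_start_matches('.') == e))
        .unwrap_or(false);
    let excluded = path
        .components()
        .filter_map(|c| c.as_os_str().to_str())
        .any(|c| excluded_dirs.iter().any(|d| *d == c));
    ext_ok && !excluded
}

fn main() {
    let paths = ["src/lib.rs", "Cargo.toml", "target/debug/build.rs", "README.md"];
    let kept: Vec<&str> = paths
        .iter()
        .copied()
        .filter(|p| keep_path(Path::new(p), &[".rs", ".toml"], &["target"]))
        .collect();
    println!("{:?}", kept); // ["src/lib.rs", "Cargo.toml"]
}
```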
use go_brrr::{scan_project_files, scan_extensions, get_project_metadata, ScanConfig, scan_with_config};
// Scan all source files (respects .gitignore and .brrrignore)
let result = scan_project_files("./project", None, true)?;
println!("Found {} files", result.files.len());
// Scan only Python files
let py_result = scan_project_files("./project", Some("python"), true)?;
// Scan by file extension
let rs_files = scan_extensions("./project", &[".rs", ".toml"])?;
// Get file metadata (size, modification time, language)
let metadata = get_project_metadata("./project", None)?;
for meta in &metadata {
println!("{}: {} bytes", meta.path.display(), meta.size);
}
// Advanced: custom scan configuration
let config = ScanConfig::for_language("python")
.with_excludes(&["**/test/**"])
.with_metadata();
let result = scan_with_config("./project", &config)?;
§Semantic Pattern Detection
Automatically detect semantic patterns in code for enriched embeddings:
use go_brrr::{detect_semantic_patterns, SemanticPattern, SEMANTIC_PATTERNS};
// Detect patterns in code
let code = "def validate_user(user): assert user is not None";
let patterns = detect_semantic_patterns(code);
assert!(patterns.contains(&"validation".to_string()));
// Access all available patterns
for pattern in SEMANTIC_PATTERNS {
println!("Pattern: {} - {}", pattern.name, pattern.pattern);
}
Detected patterns include: crud, validation, transform, error_handling, async_ops, iteration, api_endpoint, database, auth, cache, test, logging, and config.
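To make the idea of pattern detection concrete, the sketch below tags code with pattern names using naive keyword matching. It is a standalone approximation: the `PATTERNS` table, its keywords, and the `detect` function are invented for illustration and do not reflect the contents of `SEMANTIC_PATTERNS` or how `detect_semantic_patterns` matches.

```rust
/// Hypothetical pattern table: a pattern name plus keywords that hint at it.
/// The names come from the list above; the keywords are made up for this sketch.
const PATTERNS: &[(&str, &[&str])] = &[
    ("validation", &["validate", "assert", "check"]),
    ("error_handling", &["try", "except", "raise"]),
    ("database", &["select ", "insert ", "cursor"]),
];

/// Return the names of all patterns whose keywords occur in `code`.
fn detect(code: &str) -> Vec<String> {
    let lower = code.to_lowercase();
    PATTERNS
        .iter()
        .filter(|(_, keywords)| keywords.iter().any(|k| lower.contains(*k)))
        .map(|(name, _)| name.to_string())
        .collect()
}

fn main() {
    let code = "def validate_user(user): assert user is not None";
    println!("{:?}", detect(code)); // ["validation"]
}
```

Plain substring matching is deliberately crude (for example, `try` also matches `entry`), so a real detector would more plausibly use word-boundary or regular-expression matching.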
Re-exports§
pub use error::Result;
pub use error::BrrrError;
pub use ast::CallGraphInfo;
pub use ast::ClassInfo;
pub use ast::ClassSummary;
pub use ast::CodeStructure;
pub use ast::FieldInfo;
pub use ast::FileTreeEntry;
pub use ast::FunctionInfo;
pub use ast::FunctionSummary;
pub use ast::ImportInfo;
pub use ast::ModuleInfo;
pub use ast::AstExtractor;
pub use ast::clear_parser_cache;
pub use ast::clear_query_cache;
pub use ast::extract_imports;
pub use cfg::BlockId;
pub use cfg::BlockType;
pub use cfg::CFGBlock;
pub use cfg::CFGEdge;
pub use cfg::CFGError;
pub use cfg::CFGInfo;
pub use cfg::EdgeType;
pub use cfg::to_ascii as cfg_to_ascii;
pub use cfg::to_dot as cfg_to_dot;
pub use cfg::to_json as cfg_to_json;
pub use cfg::to_json_compact as cfg_to_json_compact;
pub use cfg::to_mermaid as cfg_to_mermaid;
pub use dfg::DFGInfo;
pub use dfg::DataflowEdge;
pub use dfg::DataflowKind;
pub use pdg::BranchType;
pub use pdg::ControlDependence;
pub use pdg::PDGInfo;
pub use pdg::SliceCriteria;
pub use pdg::SliceMetrics;
pub use pdg::SliceResult;
pub use pdg::backward_slice as pdg_backward_slice;
pub use pdg::forward_slice as pdg_forward_slice;
pub use metrics::analyze_complexity;
pub use metrics::analyze_file_complexity;
pub use metrics::ComplexityAnalysis;
pub use metrics::ComplexityStats;
pub use metrics::CyclomaticComplexity;
pub use metrics::FunctionComplexity;
pub use metrics::RiskLevel;
pub use metrics::analyze_nesting;
pub use metrics::analyze_file_nesting;
pub use metrics::NestingMetrics;
pub use metrics::NestingAnalysis;
pub use metrics::NestingStats;
pub use metrics::FunctionNesting;
pub use metrics::NestingDepthLevel;
pub use metrics::NestingConstruct;
pub use metrics::DeepNesting;
pub use metrics::NestingAnalysisError;
pub use callgraph::CallEdge;
pub use callgraph::CallGraph;
pub use callgraph::FunctionRef;
pub use callgraph::FunctionDef;
pub use callgraph::FunctionIndex;
pub use callgraph::IndexStats;
pub use callgraph::scanner::ErrorHandling;
pub use callgraph::scanner::FileMetadata;
pub use callgraph::scanner::ProjectScanner;
pub use callgraph::scanner::ScanConfig;
pub use callgraph::scanner::ScanError;
pub use callgraph::scanner::ScanErrorKind;
pub use callgraph::scanner::ScanResult;
pub use callgraph::analyze_dead_code;
pub use callgraph::analyze_dead_code_with_config;
pub use callgraph::DeadCodeConfig;
pub use callgraph::DeadCodeResult;
pub use callgraph::DeadCodeStats;
pub use callgraph::DeadFunction;
pub use callgraph::DeadReason;
pub use callgraph::classify_entry_point;
pub use callgraph::detect_entry_points_with_config;
pub use callgraph::EntryPointKind;
pub use callgraph::analyze_impact;
pub use callgraph::CallerInfo;
pub use callgraph::ImpactConfig;
pub use callgraph::ImpactResult;
pub use callgraph::analyze_architecture;
pub use callgraph::ArchAnalysis;
pub use callgraph::ArchStats;
pub use callgraph::CycleDependency;
pub use callgraph::get_cache_dir;
pub use callgraph::get_cache_file;
pub use callgraph::get_or_build_graph_with_config;
pub use callgraph::invalidate_cache;
pub use callgraph::warm_cache_with_config;
pub use callgraph::CachedCallGraph;
pub use callgraph::CachedEdge;
pub use lang::BoxedLanguage;
pub use lang::Language;
pub use lang::LanguageRegistry;
pub use semantic::ChunkInfo;
pub use semantic::CodeComplexity;
pub use semantic::CodeLocation;
pub use semantic::ContentHashedIndex;
pub use semantic::EmbeddingUnit;
pub use semantic::SearchResult;
pub use semantic::SemanticPattern;
pub use semantic::UnitKind;
pub use semantic::CHUNK_OVERLAP_TOKENS;
pub use semantic::MAX_CODE_PREVIEW_TOKENS;
pub use semantic::MAX_EMBEDDING_TOKENS;
pub use semantic::SEMANTIC_PATTERNS;
pub use embedding::distances_to_scores;
pub use embedding::distances_to_scores_for_metric;
pub use embedding::is_normalized;
pub use embedding::normalize_vector;
pub use embedding::IndexConfig;
pub use embedding::Metric;
pub use embedding::Quantization;
pub use embedding::VectorIndex;
pub use security::injection::command::CommandInjectionFinding;
pub use security::injection::command::CommandSink;
pub use security::injection::command::Confidence;
pub use security::injection::command::InjectionKind;
pub use security::injection::command::Severity as CommandSeverity;
pub use security::injection::command::SourceLocation;
pub use security::injection::command::TaintSource;
pub use security::injection::command::TaintSourceKind;
pub use security::injection::command::scan_command_injection;
pub use security::injection::command::scan_file_command_injection;
pub use security::injection::sql::Location as SqlLocation;
pub use security::injection::sql::SQLInjectionFinding;
pub use security::injection::sql::ScanResult as SqlScanResult;
pub use security::injection::sql::Severity as SqlSeverity;
pub use security::injection::sql::SqlInjectionDetector;
pub use security::injection::sql::SqlSinkType;
pub use security::injection::sql::UnsafePattern;
pub use security::injection::path_traversal::Confidence as PathTraversalConfidence;
pub use security::injection::path_traversal::FileOperationType;
pub use security::injection::path_traversal::FileSink;
pub use security::injection::path_traversal::PathTraversalFinding;
pub use security::injection::path_traversal::ScanResult as PathTraversalScanResult;
pub use security::injection::path_traversal::Severity as PathTraversalSeverity;
pub use security::injection::path_traversal::SourceLocation as PathTraversalLocation;
pub use security::injection::path_traversal::VulnerablePattern as PathTraversalPattern;
pub use security::injection::path_traversal::scan_path_traversal;
pub use security::injection::path_traversal::scan_file_path_traversal;
pub use security::injection::path_traversal::get_file_sinks;
pub use security::crypto::Algorithm as CryptoAlgorithm;
pub use security::crypto::Confidence as CryptoConfidence;
pub use security::crypto::Location as CryptoLocation;
pub use security::crypto::ScanResult as CryptoScanResult;
pub use security::crypto::Severity as CryptoSeverity;
pub use security::crypto::UsageContext as CryptoUsageContext;
pub use security::crypto::WeakCryptoDetector;
pub use security::crypto::WeakCryptoFinding;
pub use security::crypto::WeakCryptoIssue;
pub use security::crypto::scan_weak_crypto;
pub use security::crypto::scan_file_weak_crypto;
pub use security::scan_security;
pub use security::Confidence as UnifiedConfidence;
pub use security::InjectionType;
pub use security::Location as UnifiedLocation;
pub use security::ScanSummary;
pub use security::SecurityCategory;
pub use security::SecurityConfig;
pub use security::SecurityFinding;
pub use security::SecurityReport;
pub use security::Severity as UnifiedSeverity;
pub use security::check_suppression;
pub use security::is_suppressed;
pub use security::sarif::SarifLog;
pub use quality::clones::detect_clones;
pub use quality::clones::format_clone_summary;
pub use quality::clones::Clone;
pub use quality::clones::CloneAnalysis;
pub use quality::clones::CloneConfig;
pub use quality::clones::CloneError;
pub use quality::clones::CloneInstance;
pub use quality::clones::CloneStats;
pub use quality::clones::CloneType;
pub use quality::clones::TextualCloneDetector;
pub use patterns::detect_patterns;
pub use patterns::format_pattern_summary;
pub use patterns::DesignPattern;
pub use patterns::Location as PatternLocation;
pub use patterns::PatternAnalysis;
pub use patterns::PatternCategory;
pub use patterns::PatternConfig;
pub use patterns::PatternDetector;
pub use patterns::PatternError;
pub use patterns::PatternMatch;
pub use patterns::PatternStats;
Modules§
- ast: AST extraction and code structure analysis.
- callgraph: Cross-file call graph analysis.
- cfg: Control flow graph extraction and rendering.
- dfg: Data flow graph extraction and program slicing.
- embedding: Embedding and vector index support for semantic search.
- error: Central error types for go-brrr.
- lang: Language support module with implementations for all supported languages.
- metrics: Code metrics calculation for software quality analysis.
- patterns: Design pattern detection module.
- pdg: Program Dependence Graph (PDG) extraction and slicing.
- quality: Code quality analysis module.
- security: Security analysis module for detecting vulnerabilities in source code.
- semantic: Semantic search and embedding support.
- simd: SIMD-accelerated numeric and byte operations using portable_simd.
- util: Utility modules for go-brrr.
Structs§
- FunctionContext: Context about a function including its code and metadata.
- ImporterInfo: Result of an importer search, i.e. a file that imports a given module.
- IndexingConfig: Configuration for function indexing.
- IntraFileCall: Detailed information about a single intra-file function call.
- RelevantContext: Relevant context for LLM consumption with call graph information.
Enums§
- SourceInput: Input that can be either a file path or source code string.
Functions§
- build_callgraph: Build the cross-file call graph for a project.
- build_embedding_text: Build embedding text from a semantic unit.
- build_function_index: Build a function index for a project.
- build_function_index_with_config: Build a function index with custom configuration.
- count_tokens: Count tokens in text using tiktoken (cl100k_base).
- detect_semantic_patterns: Detect semantic patterns in code.
- estimate_file_count: Estimate the number of source files in a project.
- extract_file: Extract complete AST information from a source file with optional path containment validation.
- extract_file_unchecked: Extract complete AST information from a source file without path validation.
- extract_file_units: Extract semantic units from a single file.
- extract_from_source: Extract file information from a source code string.
- extract_semantic_units: Extract semantic units from a project for embedding.
- extract_semantic_units_with_callgraph: Extract semantic units with call graph information.
- find_dead_code: Find dead (unreachable) code in a project.
- get_backward_slice: Get backward program slice (convenience wrapper).
- get_cfg: Get control flow graph for a function.
- get_cfg_ascii: Get CFG as ASCII art string.
- get_cfg_auto: Get control flow graph with auto-detected language (convenience wrapper).
- get_cfg_blocks: Get CFG blocks for a function.
- get_cfg_dot: Get CFG as DOT (Graphviz) string.
- get_cfg_edges: Get CFG edges for a function.
- get_cfg_from_source: Get control flow graph from a source code string.
- get_cfg_json: Get CFG as JSON value.
- get_cfg_mermaid: Get CFG as Mermaid diagram string.
- get_context: Get LLM-ready context for a function entry point.
- get_def_use_chains: Get def-use chains for a specific variable.
- get_dfg: Get data flow graph for a function.
- get_dfg_auto: Get data flow graph with auto-detected language (convenience wrapper).
- get_dfg_edges: Get DFG edges for a function.
- get_dfg_from_source: Get data flow graph from a source code string.
- get_dfg_variables: Get variables tracked in DFG.
- get_forward_slice: Get forward program slice for a line of code.
- get_impact: Find all callers of a function (impact analysis).
- get_importers: Find all files that import a given module.
- get_imports: Extract import statements from a source file.
- get_intra_file_calls: Get function call relationships within a single file.
- get_intra_file_calls_detailed: Get detailed intra-file call information with line and column numbers.
- get_pdg: Get program dependence graph (PDG) for a function.
- get_pdg_auto: Get PDG with auto-detected language (convenience wrapper).
- get_pdg_slice: Get PDG-based slice with direction parameter.
- get_project_metadata: Get file metadata for a project directory.
- get_slice: Get program slice for a line of code.
- get_slice_dfg_only: Get backward program slice using DFG-only (data flow only).
- get_slice_from_source: Get program slice from a source code string.
- get_structure: Get code structure summary for a project.
- get_structure_default: Get code structure with default options.
- get_tree: Get the file tree for a directory.
- get_tree_default: Get file tree with default options (convenience wrapper).
- query: Query project for LLM-ready context string.
- scan_extensions: Scan project for files with specific extensions.
- scan_project_files: Scan project for source files of a specific language.
- scan_with_config: Scan project files with custom configuration.
- warm_callgraph: Warm the call graph cache for a project.