Crate go_brrr


llm-brrr - Token-efficient code analysis for LLMs.

This library provides tools for extracting and analyzing code structure using tree-sitter parsers for multiple languages. It enables 95% token savings when analyzing codebases by providing structured summaries instead of raw source code.

§Architecture

The library is organized into several layers:

  • AST Layer (ast): File tree generation, code structure extraction, and AST parsing
  • CFG Layer (cfg): Control flow graph extraction with cyclomatic complexity
  • DFG Layer (dfg): Data flow graph extraction and basic program slicing
  • PDG Layer (pdg): Program Dependence Graph combining CFG + DFG for accurate slicing
  • Call Graph Layer (callgraph): Cross-file call graph analysis and impact detection
  • Semantic Layer (semantic): Semantic pattern detection, embedding unit extraction, and code enrichment
  • Language Layer (lang): Multi-language support via tree-sitter
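
The CFG layer reports cyclomatic complexity alongside the graph. As a rough illustration of what that number measures, the sketch below applies McCabe's formula M = E - N + 2 to a toy graph. `ToyCfg` and `cyclomatic_complexity` are hypothetical names introduced here, not part of go_brrr's API, and the crate's own computation may differ.

```rust
/// Hypothetical toy CFG: blocks are numbered 0..block_count and
/// edges are (from, to) pairs. This is NOT go_brrr's CFGInfo type.
struct ToyCfg {
    block_count: usize,
    edges: Vec<(usize, usize)>,
}

/// McCabe's cyclomatic complexity for one connected function graph:
/// M = E - N + 2, where E is the edge count and N the block count.
fn cyclomatic_complexity(cfg: &ToyCfg) -> usize {
    // Saturating subtraction keeps a degenerate graph from underflowing.
    (cfg.edges.len() + 2).saturating_sub(cfg.block_count)
}
```

A single if/else (entry block, two branches, one join block) has four blocks and four edges, so the sketch yields 2: one decision point plus one.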

§Quick Start

use go_brrr::{get_tree, get_tree_default, get_structure, extract_file, get_cfg, get_slice, get_backward_slice};

// Get file tree for a project (convenience wrapper with default options)
let tree = get_tree_default("./src", Some(".py"))?;

// Or use full API with explicit options
let tree_full = get_tree("./src", Some(".py"), true, true)?;

// Get code structure (functions, classes) summary
let structure = get_structure("./src", Some("python"), 100, true)?;

// Extract full AST from a single file (None = no base path validation)
let module = extract_file("./src/main.py", None)?;

// Get control flow graph for a function
let cfg = get_cfg("./src/main.py", "process_data")?;

// Get program slice (what affects line 42?) - using convenience wrapper
let affected_lines = get_backward_slice("./src/main.py", "process_data", 42)?;

// Or use full API with direction, variable, and language options
let forward = get_slice("./src/main.py", "process_data", 10, Some("forward"), None, None)?;
let var_slice = get_slice("./src/main.py", "process_data", 42, None, Some("x"), None)?;
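
To make "what affects line 42?" concrete, here is a minimal sketch of a backward slice as a transitive walk over dependency edges. The `Deps` map and `backward_slice` function are hypothetical stand-ins written for this example. They do not reproduce go_brrr's PDG types or its slicing algorithm.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical dependency map: each line maps to the lines it depends on
/// (data or control dependencies). This is NOT go_brrr's PDGInfo type.
type Deps = HashMap<u32, Vec<u32>>;

/// Backward slice: every line that can affect `target`, including `target`.
fn backward_slice(deps: &Deps, target: u32) -> BTreeSet<u32> {
    let mut slice = BTreeSet::new();
    let mut worklist = vec![target];
    while let Some(line) = worklist.pop() {
        // `insert` returns false for lines already visited,
        // which also guards against dependency cycles.
        if slice.insert(line) {
            if let Some(preds) = deps.get(&line) {
                worklist.extend(preds.iter().copied());
            }
        }
    }
    slice
}
```

A forward slice is the same walk over the reversed edges: start from a line and follow everything that depends on it.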

§Call Graph Analysis

use go_brrr::{build_callgraph, get_impact, find_dead_code, get_context};

// Build project-wide call graph
let graph = build_callgraph("./src")?;

// Find all callers of a function (impact analysis)
let callers = get_impact("./src", "critical_function", 3)?;

// Find unreachable/dead code
let dead = find_dead_code("./src")?;

// Get LLM-ready context for an entry point
let context = get_context("./src", "main", 2)?;
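
Impact analysis of this kind can be pictured as a breadth-first walk over reversed call edges. The sketch below assumes a depth limit similar in spirit to the numeric argument passed to get_impact above, although this page does not state what that argument means. `callers_within` and its `(caller, callee)` edge pairs are hypothetical and are not go_brrr's CallGraph or CallEdge types.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Collect transitive callers of `target`, up to `max_depth` hops away.
/// `edges` holds hypothetical (caller, callee) pairs.
fn callers_within(edges: &[(&str, &str)], target: &str, max_depth: usize) -> BTreeSet<String> {
    // Reverse index: callee -> direct callers.
    let mut reverse: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(caller, callee) in edges {
        reverse.entry(callee).or_default().push(caller);
    }

    let mut found = BTreeSet::new();
    let mut queue = VecDeque::from([(target, 0usize)]);
    while let Some((name, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // Depth budget used up on this path.
        }
        for &caller in reverse.get(name).into_iter().flatten() {
            // Enqueue each caller only once so recursive calls cannot loop forever.
            if caller != target && found.insert(caller.to_string()) {
                queue.push_back((caller, depth + 1));
            }
        }
    }
    found
}
```

With the edges main → handler → validate → critical and a depth of 2, the sketch returns handler and validate but not main, which is three hops away.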

§Project Scanning

use go_brrr::{scan_project_files, scan_extensions, get_project_metadata, ScanConfig, scan_with_config};

// Scan all source files (respects .gitignore and .brrrignore)
let result = scan_project_files("./project", None, true)?;
println!("Found {} files", result.files.len());

// Scan only Python files
let py_result = scan_project_files("./project", Some("python"), true)?;

// Scan by file extension
let rs_files = scan_extensions("./project", &[".rs", ".toml"])?;

// Get file metadata (size, modification time, language)
let metadata = get_project_metadata("./project", None)?;
for meta in &metadata {
    println!("{}: {} bytes", meta.path.display(), meta.size);
}

// Advanced: custom scan configuration
let config = ScanConfig::for_language("python")
    .with_excludes(&["**/test/**"])
    .with_metadata();
let result = scan_with_config("./project", &config)?;
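
For intuition about extension-based scanning, the following sketch filters a list of paths by extension using only the standard library. It is an assumption-laden illustration: `filter_by_extension` and `has_extension` are hypothetical helpers, and the sketch ignores .gitignore and .brrrignore handling, which the real scanner is documented to respect.

```rust
use std::path::{Path, PathBuf};

/// Keep only paths whose extension matches one of `extensions`.
/// Extensions may be written with or without a leading dot (".rs" or "rs").
fn filter_by_extension(paths: &[PathBuf], extensions: &[&str]) -> Vec<PathBuf> {
    paths
        .iter()
        .filter(|path| has_extension(path, extensions))
        .cloned()
        .collect()
}

/// True when `path` has an extension equal to any entry in `extensions`.
fn has_extension(path: &Path, extensions: &[&str]) -> bool {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some(ext) => extensions
            .iter()
            .any(|wanted| wanted.trim_start_matches('.') == ext),
        // Files such as "Makefile" have no extension and never match.
        None => false,
    }
}
```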

§Semantic Pattern Detection

Automatically detect semantic patterns in code for enriched embeddings:

use go_brrr::{detect_semantic_patterns, SemanticPattern, SEMANTIC_PATTERNS};

// Detect patterns in code
let code = "def validate_user(user): assert user is not None";
let patterns = detect_semantic_patterns(code);
assert!(patterns.contains(&"validation".to_string()));

// Access all available patterns
for pattern in SEMANTIC_PATTERNS {
    println!("Pattern: {} - {}", pattern.name, pattern.pattern);
}

Detected patterns include: crud, validation, transform, error_handling, async_ops, iteration, api_endpoint, database, auth, cache, test, logging, and config.
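
One simple way to picture pattern detection is keyword matching against the source text. The sketch below is a hypothetical approximation: `TOY_PATTERNS` and `detect_toy_patterns` are names invented for this example, the keyword lists are assumptions, and the actual contents and matching rules of SEMANTIC_PATTERNS are not shown on this page.

```rust
/// Hypothetical keyword table, used only for illustration.
/// These are NOT the entries of go_brrr's SEMANTIC_PATTERNS.
const TOY_PATTERNS: &[(&str, &[&str])] = &[
    ("validation", &["validate", "assert", "check"]),
    ("error_handling", &["try", "except", "raise"]),
    ("logging", &["logger", "log."]),
];

/// Return the names of all toy patterns with at least one keyword in `code`.
fn detect_toy_patterns(code: &str) -> Vec<String> {
    // Lowercase once so matching is case-insensitive.
    let lowered = code.to_lowercase();
    TOY_PATTERNS
        .iter()
        .filter(|(_, keywords)| keywords.iter().any(|kw| lowered.contains(*kw)))
        .map(|(name, _)| name.to_string())
        .collect()
}
```

Plain substring matching is cheap but prone to false positives (for example, "try" inside "retry"), so a real detector would likely need word boundaries or syntax-aware matching.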

Re-exports§

pub use error::Result;
pub use error::BrrrError;
pub use ast::CallGraphInfo;
pub use ast::ClassInfo;
pub use ast::ClassSummary;
pub use ast::CodeStructure;
pub use ast::FieldInfo;
pub use ast::FileTreeEntry;
pub use ast::FunctionInfo;
pub use ast::FunctionSummary;
pub use ast::ImportInfo;
pub use ast::ModuleInfo;
pub use ast::AstExtractor;
pub use ast::clear_parser_cache;
pub use ast::clear_query_cache;
pub use ast::extract_imports;
pub use cfg::BlockId;
pub use cfg::BlockType;
pub use cfg::CFGBlock;
pub use cfg::CFGEdge;
pub use cfg::CFGError;
pub use cfg::CFGInfo;
pub use cfg::EdgeType;
pub use cfg::to_ascii as cfg_to_ascii;
pub use cfg::to_dot as cfg_to_dot;
pub use cfg::to_json as cfg_to_json;
pub use cfg::to_json_compact as cfg_to_json_compact;
pub use cfg::to_mermaid as cfg_to_mermaid;
pub use dfg::DFGInfo;
pub use dfg::DataflowEdge;
pub use dfg::DataflowKind;
pub use pdg::BranchType;
pub use pdg::ControlDependence;
pub use pdg::PDGInfo;
pub use pdg::SliceCriteria;
pub use pdg::SliceMetrics;
pub use pdg::SliceResult;
pub use pdg::backward_slice as pdg_backward_slice;
pub use pdg::forward_slice as pdg_forward_slice;
pub use metrics::analyze_complexity;
pub use metrics::analyze_file_complexity;
pub use metrics::ComplexityAnalysis;
pub use metrics::ComplexityStats;
pub use metrics::CyclomaticComplexity;
pub use metrics::FunctionComplexity;
pub use metrics::RiskLevel;
pub use metrics::analyze_nesting;
pub use metrics::analyze_file_nesting;
pub use metrics::NestingMetrics;
pub use metrics::NestingAnalysis;
pub use metrics::NestingStats;
pub use metrics::FunctionNesting;
pub use metrics::NestingDepthLevel;
pub use metrics::NestingConstruct;
pub use metrics::DeepNesting;
pub use metrics::NestingAnalysisError;
pub use callgraph::CallEdge;
pub use callgraph::CallGraph;
pub use callgraph::FunctionRef;
pub use callgraph::FunctionDef;
pub use callgraph::FunctionIndex;
pub use callgraph::IndexStats;
pub use callgraph::scanner::ErrorHandling;
pub use callgraph::scanner::FileMetadata;
pub use callgraph::scanner::ProjectScanner;
pub use callgraph::scanner::ScanConfig;
pub use callgraph::scanner::ScanError;
pub use callgraph::scanner::ScanErrorKind;
pub use callgraph::scanner::ScanResult;
pub use callgraph::analyze_dead_code;
pub use callgraph::analyze_dead_code_with_config;
pub use callgraph::DeadCodeConfig;
pub use callgraph::DeadCodeResult;
pub use callgraph::DeadCodeStats;
pub use callgraph::DeadFunction;
pub use callgraph::DeadReason;
pub use callgraph::classify_entry_point;
pub use callgraph::detect_entry_points_with_config;
pub use callgraph::EntryPointKind;
pub use callgraph::analyze_impact;
pub use callgraph::CallerInfo;
pub use callgraph::ImpactConfig;
pub use callgraph::ImpactResult;
pub use callgraph::analyze_architecture;
pub use callgraph::ArchAnalysis;
pub use callgraph::ArchStats;
pub use callgraph::CycleDependency;
pub use callgraph::get_cache_dir;
pub use callgraph::get_cache_file;
pub use callgraph::get_or_build_graph_with_config;
pub use callgraph::invalidate_cache;
pub use callgraph::warm_cache_with_config;
pub use callgraph::CachedCallGraph;
pub use callgraph::CachedEdge;
pub use lang::BoxedLanguage;
pub use lang::Language;
pub use lang::LanguageRegistry;
pub use semantic::ChunkInfo;
pub use semantic::CodeComplexity;
pub use semantic::CodeLocation;
pub use semantic::ContentHashedIndex;
pub use semantic::EmbeddingUnit;
pub use semantic::SearchResult;
pub use semantic::SemanticPattern;
pub use semantic::UnitKind;
pub use semantic::CHUNK_OVERLAP_TOKENS;
pub use semantic::MAX_CODE_PREVIEW_TOKENS;
pub use semantic::MAX_EMBEDDING_TOKENS;
pub use semantic::SEMANTIC_PATTERNS;
pub use embedding::distances_to_scores;
pub use embedding::distances_to_scores_for_metric;
pub use embedding::is_normalized;
pub use embedding::normalize_vector;
pub use embedding::IndexConfig;
pub use embedding::Metric;
pub use embedding::Quantization;
pub use embedding::VectorIndex;
pub use security::injection::command::CommandInjectionFinding;
pub use security::injection::command::CommandSink;
pub use security::injection::command::Confidence;
pub use security::injection::command::InjectionKind;
pub use security::injection::command::Severity as CommandSeverity;
pub use security::injection::command::SourceLocation;
pub use security::injection::command::TaintSource;
pub use security::injection::command::TaintSourceKind;
pub use security::injection::command::scan_command_injection;
pub use security::injection::command::scan_file_command_injection;
pub use security::injection::sql::Location as SqlLocation;
pub use security::injection::sql::SQLInjectionFinding;
pub use security::injection::sql::ScanResult as SqlScanResult;
pub use security::injection::sql::Severity as SqlSeverity;
pub use security::injection::sql::SqlInjectionDetector;
pub use security::injection::sql::SqlSinkType;
pub use security::injection::sql::UnsafePattern;
pub use security::injection::path_traversal::Confidence as PathTraversalConfidence;
pub use security::injection::path_traversal::FileOperationType;
pub use security::injection::path_traversal::FileSink;
pub use security::injection::path_traversal::PathTraversalFinding;
pub use security::injection::path_traversal::ScanResult as PathTraversalScanResult;
pub use security::injection::path_traversal::Severity as PathTraversalSeverity;
pub use security::injection::path_traversal::SourceLocation as PathTraversalLocation;
pub use security::injection::path_traversal::VulnerablePattern as PathTraversalPattern;
pub use security::injection::path_traversal::scan_path_traversal;
pub use security::injection::path_traversal::scan_file_path_traversal;
pub use security::injection::path_traversal::get_file_sinks;
pub use security::crypto::Algorithm as CryptoAlgorithm;
pub use security::crypto::Confidence as CryptoConfidence;
pub use security::crypto::Location as CryptoLocation;
pub use security::crypto::ScanResult as CryptoScanResult;
pub use security::crypto::Severity as CryptoSeverity;
pub use security::crypto::UsageContext as CryptoUsageContext;
pub use security::crypto::WeakCryptoDetector;
pub use security::crypto::WeakCryptoFinding;
pub use security::crypto::WeakCryptoIssue;
pub use security::crypto::scan_weak_crypto;
pub use security::crypto::scan_file_weak_crypto;
pub use security::scan_security;
pub use security::Confidence as UnifiedConfidence;
pub use security::InjectionType;
pub use security::Location as UnifiedLocation;
pub use security::ScanSummary;
pub use security::SecurityCategory;
pub use security::SecurityConfig;
pub use security::SecurityFinding;
pub use security::SecurityReport;
pub use security::Severity as UnifiedSeverity;
pub use security::check_suppression;
pub use security::is_suppressed;
pub use security::sarif::SarifLog;
pub use quality::clones::detect_clones;
pub use quality::clones::format_clone_summary;
pub use quality::clones::Clone;
pub use quality::clones::CloneAnalysis;
pub use quality::clones::CloneConfig;
pub use quality::clones::CloneError;
pub use quality::clones::CloneInstance;
pub use quality::clones::CloneStats;
pub use quality::clones::CloneType;
pub use quality::clones::TextualCloneDetector;
pub use patterns::detect_patterns;
pub use patterns::format_pattern_summary;
pub use patterns::DesignPattern;
pub use patterns::Location as PatternLocation;
pub use patterns::PatternAnalysis;
pub use patterns::PatternCategory;
pub use patterns::PatternConfig;
pub use patterns::PatternDetector;
pub use patterns::PatternError;
pub use patterns::PatternMatch;
pub use patterns::PatternStats;

Modules§

ast
AST extraction and code structure analysis.
callgraph
Cross-file call graph analysis.
cfg
Control flow graph extraction and rendering.
dfg
Data flow graph extraction and program slicing.
embedding
Embedding and vector index support for semantic search.
error
Central error types for go-brrr.
lang
Language support module with implementations for all supported languages.
metrics
Code metrics calculation for software quality analysis.
patterns
Design pattern detection module.
pdg
Program Dependence Graph (PDG) extraction and slicing.
quality
Code quality analysis module.
security
Security analysis module for detecting vulnerabilities in source code.
semantic
Semantic search and embedding support.
simd
SIMD-accelerated numeric and byte operations using portable_simd.
util
Utility modules for go-brrr.

Structs§

FunctionContext
Context about a function including its code and metadata.
ImporterInfo
Result of an importer search - a file that imports a given module.
IndexingConfig
Configuration for function indexing.
IntraFileCall
Detailed information about a single intra-file function call.
RelevantContext
Relevant context for LLM consumption with call graph information.

Enums§

SourceInput
Input that can be either a file path or source code string.

Functions§

build_callgraph
Build the cross-file call graph for a project.
build_embedding_text
Build embedding text from a semantic unit.
build_function_index
Build a function index for a project.
build_function_index_with_config
Build a function index with custom configuration.
count_tokens
Count tokens in text using tiktoken (cl100k_base).
detect_semantic_patterns
Detect semantic patterns in code.
estimate_file_count
Estimate the number of source files in a project.
extract_file
Extract complete AST information from a source file with optional path containment validation.
extract_file_unchecked
Extract complete AST information from a source file without path validation.
extract_file_units
Extract semantic units from a single file.
extract_from_source
Extract file information from a source code string.
extract_semantic_units
Extract semantic units from a project for embedding.
extract_semantic_units_with_callgraph
Extract semantic units with call graph information.
find_dead_code
Find dead (unreachable) code in a project.
get_backward_slice
Get backward program slice (convenience wrapper).
get_cfg
Get control flow graph for a function.
get_cfg_ascii
Get CFG as ASCII art string.
get_cfg_auto
Get control flow graph with auto-detected language (convenience wrapper).
get_cfg_blocks
Get CFG blocks for a function.
get_cfg_dot
Get CFG as DOT (Graphviz) string.
get_cfg_edges
Get CFG edges for a function.
get_cfg_from_source
Get control flow graph from a source code string.
get_cfg_json
Get CFG as JSON value.
get_cfg_mermaid
Get CFG as Mermaid diagram string.
get_context
Get LLM-ready context for a function entry point.
get_def_use_chains
Get def-use chains for a specific variable.
get_dfg
Get data flow graph for a function.
get_dfg_auto
Get data flow graph with auto-detected language (convenience wrapper).
get_dfg_edges
Get DFG edges for a function.
get_dfg_from_source
Get data flow graph from a source code string.
get_dfg_variables
Get variables tracked in DFG.
get_forward_slice
Get forward program slice for a line of code.
get_impact
Find all callers of a function (impact analysis).
get_importers
Find all files that import a given module.
get_imports
Extract import statements from a source file.
get_intra_file_calls
Get function call relationships within a single file.
get_intra_file_calls_detailed
Get detailed intra-file call information with line and column numbers.
get_pdg
Get program dependence graph (PDG) for a function.
get_pdg_auto
Get PDG with auto-detected language (convenience wrapper).
get_pdg_slice
Get PDG-based slice with direction parameter.
get_project_metadata
Get file metadata for a project directory.
get_slice
Get program slice for a line of code.
get_slice_dfg_only
Get backward program slice using only the DFG (data flow only).
get_slice_from_source
Get program slice from a source code string.
get_structure
Get code structure summary for a project.
get_structure_default
Get code structure with default options.
get_tree
Get the file tree for a directory.
get_tree_default
Get file tree with default options (convenience wrapper).
query
Query project for LLM-ready context string.
scan_extensions
Scan project for files with specific extensions.
scan_project_files
Scan project for source files of a specific language.
scan_with_config
Scan project files with custom configuration.
warm_callgraph
Warm the call graph cache for a project.