# Comprehensive Code Metrics Implementation
## Overview
Complete implementation of code complexity, size, and maintainability metrics for comprehensive code quality analysis.
---
## 1. Code Complexity Metrics
### Cyclomatic Complexity
**Definition**: Counts the number of linearly independent paths through the code based on control flow.
**Formula**: `CC = E - N + 2P`
- E = edges in control flow graph
- N = nodes in control flow graph
- P = connected components
**Simplified Calculation**: `CC = 1 + decision_points`
**Interpretation**:
- **1-10**: Simple, low risk
- **11-20**: Moderate complexity, moderate risk
- **21-50**: Complex, high risk
- **>50**: Very complex, very high risk
**Use Cases**:
- Identify functions needing refactoring
- Estimate testing effort
- Assess code maintainability
---
### Halstead Metrics
**Definition**: Measures code complexity based on operators and operands.
**Components**:
1. **n1**: Unique operators (e.g., +, -, *, /, =, ==, if, for)
2. **n2**: Unique operands (e.g., variables, constants)
3. **N1**: Total operators
4. **N2**: Total operands
5. **Vocabulary**: `n = n1 + n2`
6. **Length**: `N = N1 + N2`
7. **Volume**: `V = N * log2(n)` - Information content
8. **Difficulty**: `D = (n1/2) * (N2/n2)` - Mental effort
9. **Effort**: `E = D * V` - Total mental effort
10. **Time**: `T = E / 18` - Estimated time (seconds)
11. **Bugs**: `B = V / 3000` - Estimated delivered bugs
**Interpretation**:
- **Volume < 1000**: Simple code
- **Volume 1000-8000**: Moderate complexity
- **Volume > 8000**: Complex code
- **Bugs**: Estimated number of bugs in delivered code
**Use Cases**:
- Estimate development time
- Predict bug density
- Assess code understandability
---
### Essential Complexity (EC)
**Definition**: Measures complexity of program structure by identifying unstructured constructs.
**Calculation**:
- Base complexity: 1
- Add 1 for each `goto` statement
- Add 1 for each nesting level > 3
**Interpretation**:
- **EC = 1**: Structured code (ideal)
- **EC > 1**: Contains unstructured elements
- **EC > 5**: Highly unstructured (needs refactoring)
**Use Cases**:
- Identify code needing restructuring
- Assess code readability
- Guide refactoring priorities
---
### NPath Complexity
**Definition**: Measures the number of possible execution paths through a function.
**Formula**: Product of decision points (exponential growth)
**Approximation**: `NPath ≈ 2^decision_count`
**Interpretation**:
- **< 200**: Simple function
- **200-1000**: Moderate complexity
- **> 1000**: Overly complex function
**Use Cases**:
- Detect overly complex functions
- Estimate test case requirements
- Guide function decomposition
---
## 2. Code Size Metrics
### Lines of Code (LOC)
**Variants**:
1. **Total Lines**: All lines including blanks and comments
2. **SLOC (Source Lines of Code)**: Actual code lines (excluding comments and blanks)
3. **LLOC (Logical Lines of Code)**: Executable statements
**Interpretation**:
- **SLOC < 100**: Small file (easy to maintain)
- **SLOC 100-500**: Medium file
- **SLOC > 500**: Large file (consider splitting)
---
### Number of Classes, Methods, Functions
**Metrics**:
- Total classes/structs/enums
- Total methods/functions
- Average methods per class
**Interpretation**:
- **Methods per class < 10**: Good modularity
- **Methods per class > 20**: Consider splitting class
---
### Code Density
**Formula**: `Code Density = SLOC / Total Lines`
**Interpretation**:
- **< 0.5**: Low density (many comments/blanks)
- **0.5-0.8**: Balanced
- **> 0.8**: High density (may need more comments)
---
### Comment-to-Code Ratio
**Formula**: `Comment Ratio = Comment Lines / SLOC`
**Interpretation**:
- **< 0.1**: Under-documented
- **0.1-0.3**: Well-documented
- **> 0.3**: Possibly over-documented
---
## 3. Maintainability Metrics
### Maintainability Index (MI)
**Definition**: Composite metric combining cyclomatic complexity, LOC, and Halstead volume.
**Formula**: `MI = 171 - 5.2*ln(V) - 0.23*G - 16.2*ln(LOC)`
**Normalized**: 0-100 scale
**Interpretation**:
- **80-100**: Excellent maintainability
- **60-79**: Good maintainability
- **40-59**: Moderate maintainability
- **20-39**: Poor maintainability
- **0-19**: Very poor maintainability
**Use Cases**:
- Overall code quality assessment
- Prioritize refactoring efforts
- Track quality trends over time
---
### Code Churn
**Definition**: Measures how often code changes.
**Metrics**:
- Lines added
- Lines deleted
- Lines modified
- Churn rate
**Interpretation**:
- **High churn**: Unstable code, frequent changes
- **Low churn**: Stable code
- **Sudden spikes**: Potential issues or refactoring
**Use Cases**:
- Identify unstable modules
- Predict bug-prone areas
- Assess code stability
---
### Depth of Inheritance Tree (DIT)
**Definition**: Maximum depth of class inheritance hierarchy.
**Interpretation**:
- **DIT ≤ 3**: Good design
- **DIT 4-5**: Acceptable
- **DIT > 5**: Deep hierarchy (hard to understand)
**Use Cases**:
- Assess OOP design quality
- Identify over-engineered hierarchies
- Guide refactoring
---
### Coupling Between Objects (CBO)
**Definition**: Number of other classes a class is coupled to.
**Interpretation**:
- **CBO < 5**: Low coupling (good)
- **CBO 5-10**: Moderate coupling
- **CBO > 10**: High coupling (poor modularity)
**Use Cases**:
- Assess module independence
- Identify tightly coupled code
- Guide decoupling efforts
---
### Lack of Cohesion in Methods (LCOM)
**Definition**: Measures how closely related methods in a class are.
**Formula**: Ratio of method pairs that don't share fields
**Interpretation**:
- **LCOM < 0.3**: High cohesion (good)
- **LCOM 0.3-0.7**: Moderate cohesion
- **LCOM > 0.7**: Low cohesion (poor class design)
**Use Cases**:
- Identify classes needing splitting
- Assess class design quality
- Guide refactoring
---
## Implementation Details
### Module: `src/codemetrics.rs` (850 LOC)
**Data Structures**:
```rust
pub struct FileMetrics {
pub complexity: ComplexityMetrics,
pub size: SizeMetrics,
pub maintainability: MaintainabilityMetrics,
}
pub struct ComplexityMetrics {
pub cyclomatic_complexity: usize,
pub halstead: HalsteadMetrics,
pub essential_complexity: usize,
pub npath_complexity: u64,
}
pub struct SizeMetrics {
pub total_lines: usize,
pub source_lines: usize,
pub logical_lines: usize,
pub comment_lines: usize,
pub code_density: f64,
pub comment_ratio: f64,
}
pub struct MaintainabilityMetrics {
pub maintainability_index: f64,
pub code_churn: CodeChurn,
pub depth_of_inheritance: usize,
pub coupling_between_objects: usize,
pub lack_of_cohesion: f64,
}
```
---
## CLI Usage
### Basic Analysis
```bash
codesearch metrics .
```
### With Filters
```bash
# Specific extensions
codesearch metrics . -e rs,py,js
# Exclude directories
codesearch metrics . --exclude target,node_modules
# Detailed output
codesearch metrics . --detailed
# JSON output
codesearch metrics . --format json
```
---
## Output Examples
### Summary Output
```
Comprehensive Code Metrics Report
======================================================================
Project Summary:
Total files: 45
Total lines: 12,543
Source lines (SLOC): 9,234
Total functions: 234
Total classes: 56
Complexity Metrics:
Avg Cyclomatic Complexity: 5.2
Avg Halstead Volume: 342.5
Estimated Bugs (Halstead): 2.3
Maintainability:
Avg Maintainability Index: 72.5/100
Rating: Good
```
### Detailed File Output
```
File: src/main.rs
Size Metrics:
Total lines: 450
SLOC: 320
LLOC: 280
Comment lines: 85
Code density: 71.11%
Comment ratio: 26.56%
Complexity Metrics:
Cyclomatic: 15
Essential: 2
NPath: 512
Halstead Metrics:
Volume: 1250.5
Difficulty: 12.3
Effort: 15381.2
Time (seconds): 854.5
Estimated bugs: 0.42
Maintainability:
MI: 68.5/100
DIT: 2
CBO: 8
LCOM: 0.35
```
---
## Language Support
**Full Support**:
- Rust (fn, struct, impl, use)
- Python (def, class, import)
- JavaScript/TypeScript (function, class, import)
- Java/Kotlin (class, method, import)
- Go (func, struct, import)
**Partial Support**:
- C/C++ (functions, includes)
- Other languages (basic LOC counting)
---
## Metric Correlations
### Quality Indicators
**High Quality Code**:
- Low Cyclomatic Complexity (< 10)
- High Maintainability Index (> 70)
- Low LCOM (< 0.3)
- Balanced Comment Ratio (0.1-0.3)
- Low CBO (< 5)
**Low Quality Code**:
- High Cyclomatic Complexity (> 20)
- Low Maintainability Index (< 40)
- High LCOM (> 0.7)
- Very low or very high Comment Ratio
- High CBO (> 10)
### Refactoring Priorities
**Priority 1** (Critical):
- MI < 40
- Cyclomatic > 30
- NPath > 10,000
- LCOM > 0.8
**Priority 2** (High):
- MI 40-60
- Cyclomatic 20-30
- NPath 1,000-10,000
- LCOM 0.6-0.8
**Priority 3** (Medium):
- MI 60-70
- Cyclomatic 10-20
- NPath 200-1,000
- LCOM 0.4-0.6
---
## Use Cases
### 1. Code Review
```bash
codesearch metrics src/new_feature.rs --detailed
```
Check if new code meets quality standards.
### 2. Technical Debt Assessment
```bash
codesearch metrics . --format json > metrics.json
```
Track metrics over time to monitor technical debt.
### 3. Refactoring Prioritization
```bash
Identify files with lowest maintainability.
### 4. Testing Effort Estimation
```bash
Estimate test cases needed based on complexity.
### 5. Bug Prediction
```bash
Predict bug-prone areas using Halstead metrics.
---
## Integration with Other Metrics
### Combined Analysis
```bash
# Metrics + Design
codesearch metrics .
codesearch design-metrics .
# Metrics + Complexity
codesearch metrics .
codesearch complexity . --threshold 10
# Metrics + Dead Code
codesearch metrics .
codesearch deadcode .
```
### Correlation Insights
- **High Complexity + Low MI**: Urgent refactoring needed
- **High CBO + High Churn**: Unstable architecture
- **Low Comment Ratio + High Complexity**: Needs documentation
- **High LCOM + Many Methods**: Class needs splitting
---
## Performance
**Analysis Speed**:
- Small projects (<100 files): < 2 seconds
- Medium projects (100-1000 files): 2-10 seconds
- Large projects (1000-10000 files): 10-60 seconds
**Memory Usage**:
- Proportional to file count
- Typical: 20-100 MB for medium projects
---
## Best Practices
### For New Code
- Target MI > 70
- Keep Cyclomatic < 10
- Maintain Comment Ratio 0.1-0.3
- Keep LCOM < 0.5
- Limit CBO < 5
### For Legacy Code
- Gradually improve MI
- Refactor functions with Cyclomatic > 20
- Add comments where ratio < 0.1
- Split classes with LCOM > 0.7
### For Teams
- Set quality gates in CI/CD
- Track metrics trends
- Review metrics in code reviews
- Celebrate improvements
---
## Testing
**Unit Tests**: 4 comprehensive tests covering:
- Halstead calculation
- Size metrics
- Cyclomatic complexity
- Maintainability index
Run tests:
```bash
cargo test codemetrics
```
---
## Limitations
1. **Heuristic-Based**: Uses pattern matching, not full semantic analysis
2. **Language-Specific**: Best results with fully supported languages
3. **Dynamic Features**: Cannot analyze runtime behavior
4. **Approximations**: Some metrics use simplified formulas
5. **Context-Free**: Doesn't consider business logic complexity
---
## Future Enhancements
1. **Temporal Analysis**: Track metrics over time
2. **Predictive Models**: ML-based bug prediction
3. **Custom Thresholds**: Configurable quality gates
4. **IDE Integration**: Real-time metrics in editor
5. **Team Metrics**: Per-developer/team analysis
6. **Visualization**: Interactive charts and graphs
---
## References
- **McCabe, T.J.** - "A Complexity Measure" (Cyclomatic Complexity)
- **Halstead, M.H.** - "Elements of Software Science" (Halstead Metrics)
- **Oman & Hagemeister** - "Metrics for Assessing a Software System's Maintainability" (MI)
- **Chidamber & Kemerer** - "A Metrics Suite for Object-Oriented Design" (CBO, LCOM, DIT)
- **ISO/IEC 25010** - Software Quality Model
---
## Conclusion
Comprehensive code metrics provide quantitative measures of code quality across three dimensions:
✅ **Complexity Metrics**:
- Cyclomatic Complexity
- Halstead Metrics (11 sub-metrics)
- Essential Complexity
- NPath Complexity
✅ **Size Metrics**:
- Lines of Code (LOC, SLOC, LLOC)
- Number of Classes/Methods/Functions
- Code Density
- Comment-to-Code Ratio
✅ **Maintainability Metrics**:
- Maintainability Index (MI)
- Code Churn
- Depth of Inheritance Tree (DIT)
- Coupling Between Objects (CBO)
- Lack of Cohesion in Methods (LCOM)
Use these metrics to:
- Assess code quality objectively
- Prioritize refactoring efforts
- Predict bug-prone areas
- Estimate testing effort
- Track technical debt
- Improve code maintainability