Code Feature Extraction for Commit-Level Analysis.
Extracts 8-dimensional feature vectors from code commits for defect prediction and code quality analysis. Based on the benchmark methodology for software defect prediction of D’Ambros et al. (2012).
§Feature Vector
The CommitFeatures struct contains:
- defect_category - Predicted defect category (0-255)
- files_changed - Number of files modified
- lines_added - Lines of code added
- lines_deleted - Lines of code removed
- complexity_delta - Change in cyclomatic complexity
- timestamp - Unix timestamp of commit
- hour_of_day - Hour when commit was made (0-23)
- day_of_week - Day of week (0=Sunday, 6=Saturday)
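The eight fields above can be pictured as a plain struct that flattens into a fixed-size vector. The following is a minimal sketch, not the crate's definition: `CommitFeaturesSketch`, `hour_of_day_utc`, `day_of_week_utc`, and `to_array` are hypothetical names, the field types are assumptions, and deriving the two calendar fields in UTC is also an assumption (the crate may use a local time zone).

```rust
/// Hypothetical stand-in for `CommitFeatures`; field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CommitFeaturesSketch {
    pub defect_category: u8,
    pub files_changed: f64,
    pub lines_added: f64,
    pub lines_deleted: f64,
    pub complexity_delta: f64,
    pub timestamp: f64,
    pub hour_of_day: u8,
    pub day_of_week: u8,
}

/// Hour of day (0-23) of a Unix timestamp, assuming UTC.
pub fn hour_of_day_utc(ts: i64) -> u8 {
    // rem_euclid keeps the result non-negative for pre-1970 timestamps.
    (ts.rem_euclid(86_400) / 3_600) as u8
}

/// Day of week (0=Sunday, 6=Saturday) of a Unix timestamp, assuming UTC.
pub fn day_of_week_utc(ts: i64) -> u8 {
    // 1970-01-01 was a Thursday, which is index 4 when Sunday is 0.
    (ts.div_euclid(86_400) + 4).rem_euclid(7) as u8
}

impl CommitFeaturesSketch {
    /// Flattens the fields, in the order listed above, into 8 values.
    pub fn to_array(&self) -> [f64; 8] {
        [
            f64::from(self.defect_category),
            self.files_changed,
            self.lines_added,
            self.lines_deleted,
            self.complexity_delta,
            self.timestamp,
            f64::from(self.hour_of_day),
            f64::from(self.day_of_week),
        ]
    }
}
```

Under the UTC assumption, the timestamp 1700000000 falls at hour 22 on day 2 (Tuesday).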
§Example
use aprender::synthetic::code_features::{CodeFeatureExtractor, CommitFeatures, CommitDiff};
let extractor = CodeFeatureExtractor::new();
let diff = CommitDiff {
    files_changed: 3,
    lines_added: 150,
    lines_deleted: 50,
    timestamp: 1700000000,
    message: "fix: resolve memory leak".to_string(),
};
let features = extractor.extract(&diff);
assert_eq!(features.files_changed, 3.0);
§References
- D’Ambros et al. (2012). “Evaluating Defect Prediction Approaches”
§Structs
- CodeFeatureExtractor - Feature extractor for commit-level defect prediction.
- CommitDiff - Input data for feature extraction - minimal commit diff information.
- CommitFeatures - Commit-level features for defect prediction (8-dimensional).
- FeatureStats - Statistics for feature normalization.
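
To illustrate what statistics for feature normalization might look like, here is a minimal sketch of per-feature z-score scaling over 8-dimensional rows. It is an assumption, not the crate's implementation: `FeatureStatsSketch`, `fit`, and `normalize` are hypothetical names, and the real `FeatureStats` may store different fields or use a different scaling scheme (for example min-max).

```rust
/// Hypothetical per-feature statistics; not the crate's `FeatureStats`.
#[derive(Debug, Clone, PartialEq)]
pub struct FeatureStatsSketch {
    pub mean: [f64; 8],
    pub std: [f64; 8],
}

impl FeatureStatsSketch {
    /// Computes the per-column mean and population standard deviation.
    /// Returns `None` when `rows` is empty.
    pub fn fit(rows: &[[f64; 8]]) -> Option<Self> {
        if rows.is_empty() {
            return None;
        }
        let n = rows.len() as f64;

        // First pass: column means.
        let mut mean = [0.0_f64; 8];
        for row in rows {
            for (m, x) in mean.iter_mut().zip(row) {
                *m += x / n;
            }
        }

        // Second pass: column variances around those means.
        let mut var = [0.0_f64; 8];
        for row in rows {
            for ((v, x), m) in var.iter_mut().zip(row).zip(&mean) {
                *v += (x - m).powi(2) / n;
            }
        }

        Some(Self { mean, std: var.map(f64::sqrt) })
    }

    /// Z-scores one row; constant columns (std of zero) map to 0.0.
    pub fn normalize(&self, row: &[f64; 8]) -> [f64; 8] {
        let mut out = [0.0_f64; 8];
        for (i, o) in out.iter_mut().enumerate() {
            if self.std[i] > 0.0 {
                *o = (row[i] - self.mean[i]) / self.std[i];
            }
        }
        out
    }
}
```

Scaling matters here because the raw features differ by many orders of magnitude: a Unix timestamp is around 10^9, while hour_of_day never exceeds 23.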