Module code_features

Code Feature Extraction for Commit-Level Analysis.

Extracts an 8-dimensional feature vector from each code commit for defect prediction and code quality analysis, following the benchmark methodology of D’Ambros et al. (2012) for software defect prediction.

§Feature Vector

The CommitFeatures struct contains:

  1. defect_category - Predicted defect category (0-255)
  2. files_changed - Number of files modified
  3. lines_added - Lines of code added
  4. lines_deleted - Lines of code removed
  5. complexity_delta - Change in cyclomatic complexity
  6. timestamp - Unix timestamp of commit
  7. hour_of_day - Hour when commit was made (0-23)
  8. day_of_week - Day of week (0=Sunday, 6=Saturday)
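
The two calendar features (7 and 8) can be derived from the Unix timestamp (6) with integer arithmetic. The following is a minimal sketch, not the crate's implementation: the function names, the `u64` timestamp type, and the choice of UTC are assumptions made for illustration, since this page does not say whether the extractor uses UTC or local time.

```rust
// Unix time ignores leap seconds, so every day has exactly 86,400 seconds.
const SECS_PER_HOUR: u64 = 3_600;
const SECS_PER_DAY: u64 = 86_400;

/// Hour of day (0-23) in UTC for a Unix timestamp.
/// Hypothetical helper, named here for illustration only.
fn hour_of_day_utc(timestamp: u64) -> u8 {
    ((timestamp % SECS_PER_DAY) / SECS_PER_HOUR) as u8
}

/// Day of week in UTC, with 0 = Sunday and 6 = Saturday.
/// 1970-01-01 (day 0 of Unix time) was a Thursday, hence the offset of 4.
/// Hypothetical helper, named here for illustration only.
fn day_of_week_utc(timestamp: u64) -> u8 {
    ((timestamp / SECS_PER_DAY + 4) % 7) as u8
}
```

For the timestamp 1700000000 (2023-11-14 22:13:20 UTC, a Tuesday), `hour_of_day_utc` returns 22 and `day_of_week_utc` returns 2.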

§Example

use aprender::synthetic::code_features::{CodeFeatureExtractor, CommitFeatures, CommitDiff};

let extractor = CodeFeatureExtractor::new();

// Minimal description of a commit: size of the change, time, and message.
let diff = CommitDiff {
    files_changed: 3,
    lines_added: 150,
    lines_deleted: 50,
    timestamp: 1700000000, // Unix seconds: 2023-11-14 22:13:20 UTC
    message: "fix: resolve memory leak".to_string(),
};

// Counts come back as floating-point feature values.
let features = extractor.extract(&diff);
assert_eq!(features.files_changed, 3.0);

§References

  • D’Ambros et al. (2012). “Evaluating Defect Prediction Approaches”

§Structs

CodeFeatureExtractor
Feature extractor for commit-level defect prediction.
CommitDiff
Input data for feature extraction - minimal commit diff information.
CommitFeatures
Commit-level features for defect prediction (8-dimensional).
FeatureStats
Statistics for feature normalization.
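
This page does not show what FeatureStats stores or how it is applied. As one common approach to feature normalization, the sketch below computes a per-dimension mean and standard deviation and applies z-score scaling to an 8-dimensional vector. The `ZScoreStats` type and its `fit` and `normalize` methods are hypothetical names for illustration and are not the crate's API.

```rust
/// Hypothetical per-feature statistics; NOT the crate's `FeatureStats`.
#[derive(Debug, Clone, PartialEq)]
struct ZScoreStats {
    mean: [f32; 8],
    std_dev: [f32; 8],
}

impl ZScoreStats {
    /// Computes the per-dimension mean and population standard deviation.
    /// Returns `None` when there are no samples.
    fn fit(samples: &[[f32; 8]]) -> Option<Self> {
        if samples.is_empty() {
            return None;
        }
        let n = samples.len() as f32;

        // First pass: accumulate the mean of each dimension.
        let mut mean = [0.0f32; 8];
        for s in samples {
            for (m, x) in mean.iter_mut().zip(s) {
                *m += x / n;
            }
        }

        // Second pass: accumulate the variance around that mean.
        let mut var = [0.0f32; 8];
        for s in samples {
            for ((v, x), m) in var.iter_mut().zip(s).zip(&mean) {
                *v += (x - m).powi(2) / n;
            }
        }

        Some(Self { mean, std_dev: var.map(f32::sqrt) })
    }

    /// Scales each dimension to (x - mean) / std_dev.
    /// Dimensions with zero variance map to 0.0 to avoid division by zero.
    fn normalize(&self, x: &[f32; 8]) -> [f32; 8] {
        let mut out = [0.0f32; 8];
        for i in 0..8 {
            out[i] = if self.std_dev[i] > 0.0 {
                (x[i] - self.mean[i]) / self.std_dev[i]
            } else {
                0.0
            };
        }
        out
    }
}
```

Z-score scaling keeps large-magnitude dimensions such as the timestamp from dominating small ones such as hour_of_day. Min-max scaling would be an equally plausible choice.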