Module code_eda

Code-Specific EDA (Easy Data Augmentation) for source code.

Implements code augmentation techniques inspired by the EDA paper of Wei & Zou (2019), adapted for programming languages. The operations preserve syntactic validity while introducing meaningful variation for training code analysis models.

§Operations

  1. Variable Renaming (VR): Rename variables to synonyms (e.g., x -> value)
  2. Comment Insertion (CI): Insert comments or assertions
  3. Statement Reorder (SR): Reorder independent statements
  4. Dead Code Removal (DCR): Remove comments/whitespace
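
To make two of the operations above concrete, here is a minimal, standalone sketch of Variable Renaming and comment/blank-line removal using only the standard library. It is not this module's implementation: `rename_identifiers`, `strip_line_comments`, and the one-entry synonym table are hypothetical names chosen for illustration, and the text handling is deliberately naive (it does not parse Rust, so identifiers and `//` sequences inside string literals are treated like any others).

```rust
use std::collections::HashMap;

/// Toy Variable Renaming: replace whole-word identifiers found in `synonyms`.
/// Hypothetical helper, not part of the crate's API.
fn rename_identifiers(code: &str, synonyms: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(code.len());
    let mut token = String::new();
    for ch in code.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            // Still inside an identifier-like token.
            token.push(ch);
        } else {
            flush_token(&mut token, &mut out, synonyms);
            out.push(ch);
        }
    }
    flush_token(&mut token, &mut out, synonyms);
    out
}

/// Emit the pending token, substituting its synonym when one exists.
fn flush_token(token: &mut String, out: &mut String, synonyms: &HashMap<&str, &str>) {
    if token.is_empty() {
        return;
    }
    match synonyms.get(token.as_str()) {
        Some(replacement) => out.push_str(replacement),
        None => out.push_str(token.as_str()),
    }
    token.clear();
}

/// Toy comment removal: drop `//` line comments and blank lines.
/// Naive: a `//` inside a string literal is also cut.
fn strip_line_comments(code: &str) -> String {
    code.lines()
        .map(|line| match line.find("//") {
            Some(idx) => line[..idx].trim_end(),
            None => line.trim_end(),
        })
        .filter(|line| !line.trim().is_empty())
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let synonyms = HashMap::from([("x", "value")]);
    let code = "let x = 42; // answer\n\nprintln!(\"{}\", x);";

    let renamed = rename_identifiers(code, &synonyms);
    let cleaned = strip_line_comments(&renamed);
    println!("{cleaned}");
}
```

Matching whole tokens rather than substrings keeps a name such as `max` intact when `x` is renamed. A production implementation would more likely work on a token stream or syntax tree so that string literals and block comments are handled correctly.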

§Example

use aprender::synthetic::code_eda::{CodeEda, CodeEdaConfig};
use aprender::synthetic::{SyntheticGenerator, SyntheticConfig};

let config = CodeEdaConfig::default();
let generator = CodeEda::new(config);

let code = "let x = 42;\nprintln!(\"{}\", x);";
let augmented = generator.augment(code, 42);
assert!(!augmented.is_empty());

Structs§

CodeEda
Code-specific EDA generator.
CodeEdaConfig
Configuration for code-specific EDA augmentation.
VariableSynonyms
Variable synonym dictionary for code identifiers.

Enums§

CodeLanguage
Supported programming languages for code augmentation.