# ML Test Case Generators
This module provides comprehensive test case generation for machine learning models, focusing on adversarial examples, edge cases, and synthetic data generation.
## Features
### 🎯 Adversarial Examples
Generate inputs designed to fool ML models using various attack methods:
- **FGSM** (Fast Gradient Sign Method) - Single-step attack
- **PGD** (Projected Gradient Descent) - Iterative attack
- **BIM** (Basic Iterative Method) - Iterative FGSM
- **CW** (Carlini-Wagner) - Optimization-based attack
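The crate's `AdversarialGenerator` drives these attacks against any `MLModel` implementation; for intuition, the core FGSM update is a single signed gradient step. Below is a minimal, dependency-free sketch on plain slices — illustrative only, with a hypothetical function name (the real generator operates on `ndarray` tensors):

```rust
/// One FGSM step: x_adv = x + epsilon * sign(dL/dx).
/// `grad` is the gradient of the loss with respect to the input.
fn fgsm_step(input: &[f32], grad: &[f32], epsilon: f32) -> Vec<f32> {
    input
        .iter()
        .zip(grad)
        .map(|(x, g)| x + epsilon * g.signum())
        .collect()
}

fn main() {
    let input = [0.5, 0.2, 0.9];
    let grad = [0.3, -0.7, 0.4];
    // Every feature moves by exactly ±epsilon, regardless of gradient magnitude.
    let adversarial = fgsm_step(&input, &grad, 0.1);
    println!("{:?}", adversarial);
}
```

PGD and BIM repeat this step several times, projecting the result back into the allowed epsilon-ball after each iteration.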
### 🔍 Edge Cases
Test model robustness with systematic edge case generation:
- **Boundary Values** - Test extreme input ranges
- **Corner Cases** - Test combinations of extreme values
- **Equivalence Classes** - Test representative values
- **Invalid Inputs** - Test malformed or out-of-range data
- **Special Values** - NaN, infinity, zero, large numbers
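Special values in particular are cheap to enumerate and routinely expose missing input validation. A sketch of the kind of value set such a generator emits (the function name is illustrative, not part of the crate's API):

```rust
/// Special f32 values worth probing any numeric feature with.
fn special_values() -> Vec<f32> {
    vec![
        0.0,
        -0.0,
        f32::EPSILON,      // smallest step above 1.0
        f32::MIN,          // most negative finite value
        f32::MAX,          // largest finite value
        f32::INFINITY,
        f32::NEG_INFINITY,
        f32::NAN,
    ]
}

fn main() {
    for v in special_values() {
        println!("probe input: {v}");
    }
}
```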
### 🧪 Synthetic Data
Augment datasets with generated synthetic samples:
- **SMOTE** - Synthetic Minority Over-sampling Technique
- **Gaussian Copula** - Statistical modeling of dependencies
- **VAE-based** - Variational Autoencoder generation
- **Noise Augmentation** - Simple noise-based augmentation
- **GAN-based** - Generative Adversarial Network synthesis
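SMOTE's core idea is simple: synthesize a new minority-class sample by interpolating between a real sample and one of its nearest neighbours. A deterministic, dependency-free sketch of that step — real SMOTE draws both the neighbour (among the k nearest) and the interpolation fraction at random, and the function name here is illustrative:

```rust
/// One SMOTE-style synthetic sample: interpolate between `data[idx]` and its
/// nearest neighbour at fraction `t` in (0, 1).
fn smote_sample(data: &[Vec<f32>], idx: usize, t: f32) -> Vec<f32> {
    let x = &data[idx];
    // Nearest neighbour by squared Euclidean distance, excluding `x` itself.
    let neighbour = data
        .iter()
        .enumerate()
        .filter(|(i, _)| *i != idx)
        .min_by(|(_, a), (_, b)| {
            let da: f32 = a.iter().zip(x.iter()).map(|(p, q)| (p - q).powi(2)).sum();
            let db: f32 = b.iter().zip(x.iter()).map(|(p, q)| (p - q).powi(2)).sum();
            da.partial_cmp(&db).unwrap()
        })
        .map(|(_, n)| n)
        .unwrap();
    x.iter().zip(neighbour).map(|(a, b)| a + t * (b - a)).collect()
}

fn main() {
    let minority = vec![vec![1.0, 1.0], vec![2.0, 2.0], vec![10.0, 10.0]];
    // Interpolates halfway between [1, 1] and its nearest neighbour [2, 2].
    let synthetic = smote_sample(&minority, 0, 0.5);
    println!("{:?}", synthetic);
}
```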
## Quick Start
```rust
use reasonkit::ml_testing::{AdversarialGenerator, EdgeCaseGenerator, SyntheticDataGenerator};

// Generate adversarial examples
let attacker = AdversarialGenerator::fgsm(0.1);
let adversarial = attacker.generate(&model, &input, None)?;

// Generate edge cases
let edge_gen = EdgeCaseGenerator::boundary_values();
let edge_cases = edge_gen.generate(&schema, &config)?;

// Generate synthetic data
let synth_gen = SyntheticDataGenerator::smote();
let synthetic = synth_gen.generate(&training_data, 1000, &config)?;
```
## Configuration
```rust
use reasonkit::ml_testing::{GenerationConfig, AdversarialConfig, EdgeCaseConfig, AttackMethod};

// Global generation settings
let config = GenerationConfig {
    num_cases: 100,
    seed: Some(42),
    max_perturbation: 0.3,
    include_metadata: true,
    target_success_rate: 0.8,
};

// Attack-specific settings
let adv_config = AdversarialConfig {
    method: AttackMethod::PGD,
    epsilon: 0.1,
    num_iterations: 10,
    ..Default::default()
};
```
## Input Schema
Define your data structure for systematic test generation:
```rust
use reasonkit::ml_testing::{InputSchema, FeatureType, FeatureConstraint};
use std::collections::HashMap;
let mut schema = InputSchema {
    features: HashMap::new(),
    constraints: HashMap::new(),
};

// Numeric feature with range constraints
schema.features.insert("age".to_string(), FeatureType::Numeric);
schema.constraints.insert(
    "age".to_string(),
    FeatureConstraint::Range { min: 0.0, max: 100.0 },
);

// Categorical feature
schema.features.insert(
    "category".to_string(),
    FeatureType::Categorical(vec!["A".to_string(), "B".to_string(), "C".to_string()]),
);
```
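A `Range` constraint like the one above is exactly what boundary-value generation keys on: the limits themselves, plus values just inside and just outside them. A hedged sketch of the idea (the function is illustrative, not the crate's API):

```rust
/// Boundary probes induced by a numeric range constraint:
/// the limits, a value just inside each, and a value just outside each.
fn boundary_probes(min: f64, max: f64, delta: f64) -> Vec<f64> {
    vec![min - delta, min, min + delta, max - delta, max, max + delta]
}

fn main() {
    // For the `age` feature constrained to [0, 100]:
    for v in boundary_probes(0.0, 100.0, 1.0) {
        println!("age probe: {v}");
    }
}
```

The out-of-range probes (-1 and 101 here) double as invalid-input tests.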
## ML Model Interface
Implement the `MLModel` trait for your model:
```rust
use reasonkit::ml_testing::MLModel;
use ndarray::ArrayD;
struct MyModel { /* ... */ }

impl MLModel for MyModel {
    fn forward(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>> {
        // Your model's forward pass
        todo!()
    }

    fn gradient(&self, input: &ArrayD<f32>, target: Option<&ArrayD<f32>>) -> Result<ArrayD<f32>> {
        // Compute gradients for adversarial attacks
        todo!()
    }

    fn input_shape(&self) -> Vec<usize> {
        vec![784]
    }

    fn output_shape(&self) -> Vec<usize> {
        vec![10]
    }
}
```
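Gradient-based attacks (FGSM, PGD, BIM, CW) are only as good as your `gradient` implementation, so it is worth sanity-checking it against a finite-difference approximation. A scalar sketch of the check (the trait itself works on `ArrayD<f32>`; apply the same idea per component):

```rust
/// Central finite-difference approximation of df/dx at `x`.
fn numerical_gradient(f: impl Fn(f32) -> f32, x: f32, h: f32) -> f32 {
    (f(x + h) - f(x - h)) / (2.0 * h)
}

fn main() {
    // For f(x) = x * x the analytic gradient is 2x; check agreement at x = 3.
    let f = |x: f32| x * x;
    let analytic = 2.0 * 3.0;
    let approx = numerical_gradient(f, 3.0, 1e-3);
    assert!((approx - analytic).abs() < 1e-2);
    println!("analytic = {analytic}, finite difference = {approx}");
}
```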
## Use Cases
### Model Robustness Testing
- Identify adversarial vulnerabilities
- Test edge case handling
- Validate input sanitization
### Dataset Augmentation
- Balance imbalanced datasets
- Generate training data variations
- Improve model generalization
### Quality Assurance
- Automated test case generation
- Regression testing
- Continuous integration
## Performance Considerations
- **Memory Usage**: Large datasets may require streaming processing
- **Computation Time**: Adversarial attacks can be computationally intensive
- **Numerical Stability**: Monitor for NaN/inf values in gradients
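The NaN/inf point is worth automating: check every gradient (and every generated tensor) before using it, and abort or shrink the step size when a non-finite value appears. A minimal helper (illustrative name, not the crate's API):

```rust
/// Returns true if any value in the buffer is NaN or ±infinity.
fn has_nonfinite(values: &[f32]) -> bool {
    values.iter().any(|v| !v.is_finite())
}

fn main() {
    let grad = [0.1, -0.3, f32::NAN];
    if has_nonfinite(&grad) {
        eprintln!("non-finite gradient detected; skipping this attack step");
    }
}
```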
## Feature Flags
Enable the module with:
```toml
[dependencies]
reasonkit-core = { version = "0.1", features = ["ml-testing"] }
```
This includes the required `ndarray` dependency for tensor operations.
## Examples
See `examples/ml_testing_demo.rs` for a complete working example demonstrating all three types of test case generation.
## API Reference
- [`AdversarialGenerator`] - Generate adversarial examples
- [`EdgeCaseGenerator`] - Generate edge cases and boundary values
- [`SyntheticDataGenerator`] - Generate synthetic training data
- [`GenerationConfig`] - Configuration for test case generation
- [`TestCase`] - Individual generated test case with metadata
- [`MLModel`] - Trait for ML models supporting test generation