reasonkit-core 0.1.8

The Reasoning Engine: Auditable Reasoning for Production AI | Rust-Native | Turn Prompts into Protocols
# ML Test Case Generators

This module generates test cases for machine learning models. It covers three areas: adversarial examples, edge cases, and synthetic data.

## Features

### 🎯 Adversarial Examples

Generate inputs designed to fool ML models using various attack methods:

- **FGSM** (Fast Gradient Sign Method) - Single-step attack
- **PGD** (Projected Gradient Descent) - Iterative attack
- **BIM** (Basic Iterative Method) - Iterative FGSM
- **CW** (Carlini-Wagner) - Optimization-based attack
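
To make the difference between the single-step and iterative attacks concrete, here is a minimal sketch of FGSM and PGD in plain Rust. It does not use this crate's API: `LinearLoss`, `fgsm`, and `pgd` are hypothetical names, and the toy model's loss is a linear function so that its input gradient is known in closed form.

```rust
/// Toy differentiable model, for illustration only (not part of this crate).
/// The loss is the dot product w·x, so its gradient w.r.t. the input is `w`.
struct LinearLoss {
    weights: Vec<f32>,
}

impl LinearLoss {
    fn loss(&self, x: &[f32]) -> f32 {
        self.weights.iter().zip(x).map(|(w, xi)| w * xi).sum()
    }

    fn input_gradient(&self, _x: &[f32]) -> Vec<f32> {
        self.weights.clone()
    }
}

/// Sign function that maps 0.0 to 0.0 (`f32::signum` maps 0.0 to 1.0).
fn sign(v: f32) -> f32 {
    if v > 0.0 {
        1.0
    } else if v < 0.0 {
        -1.0
    } else {
        0.0
    }
}

/// FGSM: a single step of size `epsilon` along the sign of the gradient.
fn fgsm(model: &LinearLoss, x: &[f32], epsilon: f32) -> Vec<f32> {
    let grad = model.input_gradient(x);
    x.iter().zip(&grad).map(|(xi, g)| xi + epsilon * sign(*g)).collect()
}

/// PGD: repeated steps of size `step`, each followed by a projection back
/// onto the L-infinity ball of radius `epsilon` around the original input.
fn pgd(model: &LinearLoss, x: &[f32], epsilon: f32, step: f32, iterations: usize) -> Vec<f32> {
    let mut adv = x.to_vec();
    for _ in 0..iterations {
        let grad = model.input_gradient(&adv);
        for i in 0..adv.len() {
            let moved = adv[i] + step * sign(grad[i]);
            adv[i] = moved.clamp(x[i] - epsilon, x[i] + epsilon);
        }
    }
    adv
}
```

In this sketch, BIM corresponds to `pgd` started from the unperturbed input, which is what the function does; PGD as usually described also allows a random starting point inside the epsilon ball.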

### 🔍 Edge Cases

Test model robustness with systematic edge case generation:

- **Boundary Values** - Test extreme input ranges
- **Corner Cases** - Test combinations of extreme values
- **Equivalence Classes** - Test representative values
- **Invalid Inputs** - Test malformed or out-of-range data
- **Special Values** - NaN, infinity, zero, large numbers
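
The following sketch illustrates what three of these strategies might produce for numeric features. It is independent of this crate's generators: `boundary_values`, `corner_cases`, and `special_values` are hypothetical helper names, and the choice of which neighbours to include around each bound is an assumption.

```rust
/// Boundary values for a closed range [min, max]: both endpoints, the values
/// `delta` inside and outside each endpoint, and the midpoint.
fn boundary_values(min: f64, max: f64, delta: f64) -> Vec<f64> {
    vec![
        min - delta,
        min,
        min + delta,
        (min + max) / 2.0,
        max - delta,
        max,
        max + delta,
    ]
}

/// Corner cases: every combination of each feature's minimum and maximum.
/// For n features this yields 2^n input vectors.
fn corner_cases(ranges: &[(f64, f64)]) -> Vec<Vec<f64>> {
    let mut corners: Vec<Vec<f64>> = vec![Vec::new()];
    for &(min, max) in ranges {
        let mut next = Vec::with_capacity(corners.len() * 2);
        for corner in &corners {
            for &bound in &[min, max] {
                let mut extended = corner.clone();
                extended.push(bound);
                next.push(extended);
            }
        }
        corners = next;
    }
    corners
}

/// Special floating-point values that often expose unchecked arithmetic.
fn special_values() -> Vec<f64> {
    vec![
        f64::NAN,
        f64::INFINITY,
        f64::NEG_INFINITY,
        0.0,
        -0.0,
        f64::MAX,
        f64::MIN,
        f64::MIN_POSITIVE,
    ]
}
```

For a feature constrained to the range 0 to 100, `boundary_values(0.0, 100.0, 1.0)` yields both valid values (0, 1, 50, 99, 100) and invalid ones (-1, 101), so a single call exercises the accepted range and the rejection path.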

### 🧪 Synthetic Data

Augment datasets with generated synthetic samples:

- **SMOTE** - Synthetic Minority Over-sampling Technique
- **Gaussian Copula** - Statistical modeling of dependencies
- **VAE-based** - Variational Autoencoder generation
- **Noise Augmentation** - Simple noise-based augmentation
- **GAN-based** - Generative Adversarial Network synthesis
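
As an illustration of the simplest idea behind SMOTE, the sketch below creates one synthetic point by interpolating between a minority-class sample and one of its nearest minority-class neighbours. It is not this crate's implementation: `Lcg`, `nearest_neighbours`, and `smote_sample` are hypothetical names, and the tiny built-in generator only exists so that the example needs no external crate.

```rust
/// Minimal linear congruential generator, for illustration only.
/// A real implementation would normally use a crate such as `rand`.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in [0, 1).
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

fn squared_distance(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Indices of the `k` nearest neighbours of `minority[index]`, excluding itself.
fn nearest_neighbours(minority: &[Vec<f64>], index: usize, k: usize) -> Vec<usize> {
    let mut others: Vec<usize> = (0..minority.len()).filter(|&j| j != index).collect();
    others.sort_by(|&a, &b| {
        let da = squared_distance(&minority[index], &minority[a]);
        let db = squared_distance(&minority[index], &minority[b]);
        da.total_cmp(&db)
    });
    others.truncate(k);
    others
}

/// One SMOTE sample: pick a random neighbour among the `k` nearest and
/// return a random point on the segment between the sample and that neighbour.
/// Returns `None` when there is no neighbour to interpolate with.
fn smote_sample(
    minority: &[Vec<f64>],
    index: usize,
    k: usize,
    rng: &mut Lcg,
) -> Option<Vec<f64>> {
    let neighbours = nearest_neighbours(minority, index, k);
    if neighbours.is_empty() {
        return None;
    }
    let pick = neighbours[(rng.next_f64() * neighbours.len() as f64) as usize];
    let lambda = rng.next_f64();
    let base = &minority[index];
    Some(
        base.iter()
            .zip(&minority[pick])
            .map(|(b, n)| b + lambda * (n - b))
            .collect(),
    )
}
```

Because every synthetic point lies on a segment between two existing minority samples, this approach never leaves the convex hull of the minority class. The noise-based and generative methods listed above do not have that restriction.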

## Quick Start

```rust
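use reasonkit::ml_testing::{AdversarialGenerator, EdgeCaseGenerator, SyntheticDataGenerator};

// These snippets assume `model`, `input`, `schema`, `config`, and `training_data`
// are already in scope, and that the enclosing function returns a `Result`.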

// Generate adversarial examples
let attacker = AdversarialGenerator::fgsm(0.1);
let adversarial = attacker.generate(&model, &input, None)?;

// Generate edge cases
let edge_gen = EdgeCaseGenerator::boundary_values();
let edge_cases = edge_gen.generate(&schema, &config)?;

// Generate synthetic data
let synth_gen = SyntheticDataGenerator::smote();
let synthetic = synth_gen.generate(&training_data, 1000, &config)?;
```

## Configuration

```rust
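// `AttackMethod` is assumed to be exported from the same module as the config types.
use reasonkit::ml_testing::{AdversarialConfig, AttackMethod, EdgeCaseConfig, GenerationConfig};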

// Global generation settings
let config = GenerationConfig {
    num_cases: 100,
    seed: Some(42),
    max_perturbation: 0.3,
    include_metadata: true,
    target_success_rate: 0.8,
};

// Attack-specific settings
let adv_config = AdversarialConfig {
    method: AttackMethod::PGD,
    epsilon: 0.1,
    num_iterations: 10,
    ..Default::default()
};
```

## Input Schema

Define your data structure for systematic test generation:

```rust
use reasonkit::ml_testing::{InputSchema, FeatureType, FeatureConstraint};
use std::collections::HashMap;

let mut schema = InputSchema {
    features: HashMap::new(),
    constraints: HashMap::new(),
};

// Numeric feature with range constraints
schema.features.insert("age".to_string(), FeatureType::Numeric);
schema.constraints.insert(
    "age".to_string(),
    FeatureConstraint::Range { min: 0.0, max: 100.0 }
);

// Categorical feature
schema.features.insert("category".to_string(),
    FeatureType::Categorical(vec!["A".to_string(), "B".to_string(), "C".to_string()])
);
```

## ML Model Interface

Implement the `MLModel` trait for your model:

```rust
use reasonkit::ml_testing::MLModel;
use ndarray::ArrayD;

struct MyModel { /* ... */ }

impl MLModel for MyModel {
    // `Result` is not imported in this snippet; it is presumably the crate's own result alias.
    fn forward(&self, input: &ArrayD<f32>) -> Result<ArrayD<f32>> {
        // Your model's forward pass
        todo!()
    }

    fn gradient(&self, input: &ArrayD<f32>, target: Option<&ArrayD<f32>>) -> Result<ArrayD<f32>> {
        // Compute gradients for adversarial attacks
        todo!()
    }

    fn input_shape(&self) -> Vec<usize> { vec![784] }
    fn output_shape(&self) -> Vec<usize> { vec![10] }
}
```

## Use Cases

### Model Robustness Testing

- Identify adversarial vulnerabilities
- Test edge case handling
- Validate input sanitization

### Dataset Augmentation

- Balance imbalanced datasets
- Generate training data variations
- Improve model generalization

### Quality Assurance

- Automated test case generation
- Regression testing
- Continuous integration

## Performance Considerations

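- **Memory Usage**: Large datasets may not fit in memory; process them in batches or as a stream.
- **Computation Time**: Iterative attacks (PGD, BIM, CW) evaluate the model's gradient on every iteration, so their cost grows with `num_iterations`; FGSM needs only a single gradient evaluation.
- **Numerical Stability**: Check gradients for NaN and infinite values before using them to perturb inputs.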
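
A gradient check of the kind mentioned in the last point could look like the sketch below. The helper names `first_non_finite` and `sanitize_gradient` are hypothetical, and replacing bad entries with zero is only one possible policy; aborting the attack is an equally valid choice.

```rust
/// Returns the index and value of the first NaN or infinite entry, if any.
fn first_non_finite(gradient: &[f32]) -> Option<(usize, f32)> {
    gradient
        .iter()
        .copied()
        .enumerate()
        .find(|(_, v)| !v.is_finite())
}

/// Replaces non-finite entries with 0.0 and clips the rest to [-limit, limit].
/// Returns how many entries were replaced so the caller can log or abort.
fn sanitize_gradient(gradient: &mut [f32], limit: f32) -> usize {
    let mut replaced = 0;
    for v in gradient.iter_mut() {
        if v.is_finite() {
            *v = (*v).clamp(-limit, limit);
        } else {
            *v = 0.0;
            replaced += 1;
        }
    }
    replaced
}
```

Running `first_non_finite` on the output of a gradient computation before each attack step makes it possible to stop early instead of propagating NaN values into every generated test case.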

## Feature Flags

Enable the module with:

```toml
[dependencies]
reasonkit-core = { version = "0.1", features = ["ml-testing"] }
```

Enabling this feature also pulls in the `ndarray` dependency, which is required for tensor operations.

## Examples

See `examples/ml_testing_demo.rs` for a complete working example demonstrating all three types of test case generation.

## API Reference

- [`AdversarialGenerator`] - Generate adversarial examples
- [`EdgeCaseGenerator`] - Generate edge cases and boundary values
- [`SyntheticDataGenerator`] - Generate synthetic training data
- [`GenerationConfig`] - Configuration for test case generation
- [`TestCase`] - Individual generated test case with metadata
- [`MLModel`] - Trait for ML models supporting test generation