Adversarial training utilities for TensorLogic.
Provides FGSM (Fast Gradient Sign Method), PGD (Projected Gradient Descent), adversarial example generation, adversarial training loss, and robustness evaluation.
§References
- Goodfellow et al. (2014): “Explaining and Harnessing Adversarial Examples” (FGSM)
- Madry et al. (2017): “Towards Deep Learning Models Resistant to Adversarial Attacks” (PGD)
Structs§
- AdversarialExample - The result of running an adversarial attack on a single input.
- AdversarialTrainStats - Summary statistics collected during adversarial training over a batch.
- AttackConfig - Configuration for an adversarial attack.
- CrossEntropyAttackLoss - Cross-entropy loss for multi-class classification attacks.
- LinearAttackModel - A simple linear model `f(x) = W·x + b`, used primarily for testing attacks.
- MseAttackLoss - Mean-squared-error loss for regression attacks.
Enums§
- AdversarialError - Errors that can arise during adversarial attack construction or execution.
- PerturbNorm - The norm used to measure and project the adversarial perturbation.
Traits§
- AttackLoss - A differentiable loss function used by attack algorithms.
- AttackModel - A model that can be attacked (a hedged shape sketch follows this list).
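This page does not show the traits' actual signatures. The following is a minimal sketch of plausible shapes for AttackModel and AttackLoss, assuming plain `&[f64]`/`Vec<f64>` tensors; every method name and signature here is an assumption for illustration, not the crate's API.

```rust
/// Hypothetical shape for `AttackModel` (assumed): an attackable model
/// exposes a forward pass and the gradient of the attack loss with
/// respect to its input.
pub trait AttackModel {
    /// Run the model on an input and return its output (e.g. logits).
    fn forward(&self, x: &[f64]) -> Vec<f64>;
    /// Gradient of the attack loss w.r.t. the input `x`, for a given target.
    fn input_gradient(&self, x: &[f64], target: &[f64]) -> Vec<f64>;
}

/// Hypothetical shape for `AttackLoss` (assumed): a loss that reports both
/// its value and its gradient w.r.t. the model's prediction.
pub trait AttackLoss {
    fn loss(&self, prediction: &[f64], target: &[f64]) -> f64;
    fn gradient(&self, prediction: &[f64], target: &[f64]) -> Vec<f64>;
}
```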
Functions§
- adversarial_training_loss - Compute the combined adversarial training loss over a batch (a hedged sketch follows this list).
- fgsm - Fast Gradient Sign Method (Goodfellow et al., 2014); sketched below.
- pgd - Projected Gradient Descent (Madry et al., 2017); sketched below.
- project_l1 - Project `perturbation` onto the L1 ball of radius `epsilon`.
- project_l2 - Project `perturbation` onto the L2 ball of radius `epsilon` (projection sketches below).
- project_linf - Project `perturbation` onto the L∞ ball of radius `epsilon`.
- robustness_eval - Evaluate the model’s adversarial robustness on a set of samples (sketched below).
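The summary for adversarial_training_loss stops short of the formula. A common convention, used in Goodfellow et al. (2014), mixes clean and adversarial loss with a weight α; the sketch below assumes per-sample losses have already been computed and that the crate follows this weighted-mix formulation. The parameter name `alpha` is illustrative.

```rust
/// One common adversarial training objective (assumed, not necessarily
/// the crate's exact definition):
///   L_total = mean_i [ alpha * L(x_i, y_i) + (1 - alpha) * L(x_adv_i, y_i) ]
fn adversarial_training_loss(clean: &[f64], adversarial: &[f64], alpha: f64) -> f64 {
    let n = clean.len() as f64;
    clean
        .iter()
        .zip(adversarial)
        .map(|(&c, &a)| alpha * c + (1.0 - alpha) * a)
        .sum::<f64>()
        / n
}
```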
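FGSM perturbs the input by a single step of size ε in the direction of the sign of the input gradient: x_adv = x + ε · sign(∇_x L(x, y)). A minimal sketch, assuming the gradient has already been obtained from the model under attack:

```rust
/// FGSM step (Goodfellow et al., 2014): x_adv = x + epsilon * sign(grad).
/// `grad` is assumed to hold the loss gradient w.r.t. the input.
fn fgsm_step(x: &[f64], grad: &[f64], epsilon: f64) -> Vec<f64> {
    x.iter()
        .zip(grad)
        .map(|(&xi, &gi)| xi + epsilon * gi.signum())
        .collect()
}
```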
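PGD iterates FGSM-style steps of size α, projecting the running perturbation back onto the ε-ball after each step. The sketch below assumes the L∞ norm (PerturbNorm above suggests other norms are also supported) and abstracts the input-gradient computation behind a closure:

```rust
/// PGD under the L-infinity norm (Madry et al., 2017). `grad_fn` stands in
/// for the input-gradient computation that the crate presumably drives
/// through `AttackModel`/`AttackLoss`.
fn pgd_linf<F>(x0: &[f64], grad_fn: F, epsilon: f64, alpha: f64, steps: usize) -> Vec<f64>
where
    F: Fn(&[f64]) -> Vec<f64>,
{
    let mut x = x0.to_vec();
    for _ in 0..steps {
        // Gradient-sign ascent step of size alpha.
        let grad = grad_fn(&x);
        for (xi, gi) in x.iter_mut().zip(&grad) {
            *xi += alpha * gi.signum();
        }
        // Project back onto the L-infinity ball of radius epsilon around x0.
        for (xi, &ci) in x.iter_mut().zip(x0) {
            *xi = xi.clamp(ci - epsilon, ci + epsilon);
        }
    }
    x
}
```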
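The L∞ and L2 projections have simple closed forms: clamp each coordinate, or rescale onto the sphere when the perturbation leaves the ball. Sketches of both follow; the L1 projection is omitted here because it requires a sort-based simplex projection (e.g. Duchi et al., 2008):

```rust
/// L-infinity projection: clamp each coordinate to [-epsilon, epsilon].
fn project_linf(perturbation: &mut [f64], epsilon: f64) {
    for d in perturbation.iter_mut() {
        *d = d.clamp(-epsilon, epsilon);
    }
}

/// L2 projection: rescale onto the sphere if the perturbation's norm
/// exceeds epsilon; otherwise leave it unchanged.
fn project_l2(perturbation: &mut [f64], epsilon: f64) {
    let norm = perturbation.iter().map(|d| d * d).sum::<f64>().sqrt();
    if norm > epsilon {
        let scale = epsilon / norm;
        for d in perturbation.iter_mut() {
            *d *= scale;
        }
    }
}
```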
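One standard robustness metric is robust accuracy: the fraction of samples whose argmax prediction survives the attack. The sketch below assumes classification with integer labels and abstracts both the attack and the forward pass behind closures; the crate's robustness_eval may report richer statistics than this single number.

```rust
/// Robust accuracy: share of samples still classified correctly after the
/// attack. `attack` and `forward` are generic stand-ins, not the crate's
/// API. Assumes non-empty, NaN-free logits.
fn robust_accuracy<A, F>(inputs: &[Vec<f64>], labels: &[usize], attack: A, forward: F) -> f64
where
    A: Fn(&[f64]) -> Vec<f64>,
    F: Fn(&[f64]) -> Vec<f64>,
{
    let correct = inputs
        .iter()
        .zip(labels)
        .filter(|&(x, &y)| {
            let logits = forward(&attack(x));
            // Argmax over the (attacked) logits.
            let pred = logits
                .iter()
                .enumerate()
                .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
                .map(|(i, _)| i)
                .unwrap();
            pred == y
        })
        .count();
    correct as f64 / inputs.len() as f64
}
```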