# Fuzzing Integration with CI/CD
This document describes how fuzz testing is integrated into the CacheKit CI/CD pipeline.
## Overview
CacheKit uses comprehensive fuzz testing with **27 fuzz targets** covering 9 core data structures. Fuzzing runs automatically in CI/CD to catch bugs, edge cases, and security issues.
## Fuzz Target Coverage
| **ClockRing** | 3 | Arbitrary ops, insert stress, eviction patterns |
| **FixedHistory** | 3 | Arbitrary ops, record stress, property tests |
| **FrequencyBuckets** | 3 | Arbitrary ops, stress, property tests |
| **GhostList** | 3 | Arbitrary ops, LRU stress, property tests |
| **KeyInterner** | 3 | Arbitrary ops, stress, property tests |
| **IntrusiveList** | 3 | Arbitrary ops, stress, property tests |
| **LazyMinHeap** | 3 | Arbitrary ops, stress, property tests |
| **ShardSelector** | 3 | Arbitrary ops, distribution, property tests |
| **SlotArena** | 3 | Arbitrary ops, stress, property tests |
| **Total** | **27** | All core data structures |
## CI/CD Integration
### Dynamic Target Discovery
**Zero Configuration Required!** 🎉
The CI/CD pipeline automatically discovers all fuzz targets using `cargo fuzz list`. When you add a new fuzz target to `fuzz/fuzz_targets/` and register it in `fuzz/Cargo.toml`, the CI/CD pipeline will automatically:
- ✅ Include it in nightly continuous fuzzing
- ✅ Run it in PR smoke tests (if it matches naming convention)
- ✅ Generate coverage reports for it
**No workflow file updates needed!**
### Naming Convention
For best CI integration, follow this naming convention:
- `<module>_arbitrary_ops.rs` - Random operation sequences (used in PR smoke tests)
- `<module>_stress.rs` - Heavy load testing
- `<module>_property_tests.rs` - Specific invariant testing
Targets ending in `_arbitrary_ops` are automatically used for PR smoke tests as they provide the best general-purpose testing.
### 1. Pull Request Smoke Tests (`.github/workflows/ci.yml`)
**When**: Every pull request
**Duration**: ~60 seconds per `_arbitrary_ops` target
**Purpose**: Catch obvious bugs early before merging
Automatically discovers and runs all `_arbitrary_ops` targets:
```bash
# Automatic discovery
for target in $TARGETS; do
cargo fuzz run "$target" -- -max_total_time=60 -seed=$seed
done
```
**What it catches**:
- Basic crashes and panics
- Obvious logic errors
- Assertion failures
- Memory safety issues
### 2. Continuous Fuzzing (`.github/workflows/fuzz.yml`)
**When**:
- Nightly at 2 AM UTC
- On pushes to main branch
- Manual trigger with configurable duration
**Duration**: 1 hour per target by default (configurable)
**Purpose**: Deep fuzzing to find subtle bugs
**Strategy**:
- **Automatically discovers ALL fuzz targets** using `cargo fuzz list`
- Runs all targets in parallel (GitHub Actions matrix strategy)
- Each target runs independently with its own corpus
- Failures don't block other targets (`fail-fast: false`)
**How Discovery Works**:
```yaml
discover-targets:
steps:
- name: List fuzz targets
run: |
cd fuzz
TARGETS=$(cargo fuzz list | jq -R -s -c 'split("\n") | map(select(length > 0))')
echo "targets=$TARGETS" >> $GITHUB_OUTPUT
fuzz-continuous:
needs: discover-targets
strategy:
matrix:
target: ${{ fromJson(needs.discover-targets.outputs.targets) }}
```
**Features**:
- **Corpus Management**: Corpora are cached and restored between runs
- **Corpus Minimization**: Automatically minimizes corpus after each run
- **Crash Detection**: Uploads crash artifacts for reproduction
- **Issue Creation**: Automatically creates GitHub issues for crashes found during nightly runs
- **Coverage Reporting**: Generates coverage reports for representative targets
### 3. Corpus Management
Fuzzing corpora (interesting test inputs) are preserved across runs:
```yaml
- name: Restore corpus
uses: actions/cache@v4
with:
path: fuzz/corpus/${{ matrix.target }}
key: fuzz-corpus-${{ matrix.target }}-${{ github.sha }}
restore-keys: |
fuzz-corpus-${{ matrix.target }}-
```
This ensures:
- Accumulated fuzzing progress is preserved
- Previously discovered interesting inputs are retained
- Each run builds on previous fuzzing efforts
### 4. Crash Handling
When crashes are detected:
1. **Artifacts Upload**: Crash inputs are uploaded for 90 days
```yaml
name: fuzz-crashes-${{ matrix.target }}-${{ github.sha }}
path: fuzz/artifacts/${{ matrix.target }}/
retention-days: 90
```
2. **GitHub Issue Creation** (nightly runs only):
- Creates issue with `bug`, `fuzzing`, `security` labels
- Includes reproduction instructions
- Links to artifact downloads
- Provides debugging guidance
3. **Workflow Failure**: Pipeline fails to alert maintainers
### 5. Coverage Reporting
For nightly runs, coverage is generated for representative targets:
- One arbitrary_ops target per data structure
- Generates JSON coverage reports
- Uploaded as artifacts for analysis
## Running Fuzzing Locally
### Quick Smoke Test
Run all smoke tests (60 seconds each):
```bash
cd fuzz
./run_smoke_tests.sh
```
Or manually:
```bash
cd fuzz
cargo fuzz run clock_ring_arbitrary_ops -- -max_total_time=60 -seed=1
cargo fuzz run fixed_history_arbitrary_ops -- -max_total_time=60 -seed=2
# ... etc
```
### Deep Fuzzing
Run a single target for extended duration:
```bash
cd fuzz
cargo fuzz run clock_ring_arbitrary_ops -- -max_total_time=3600
```
Run all targets (from fuzz/README.md):
```bash
cd fuzz
for target in $(cargo fuzz list); do
echo "Fuzzing $target..."
cargo fuzz run $target -- -max_total_time=300 -jobs=4
done
```
### Reproducing Crashes
If CI finds a crash:
1. Download the crash artifact from the GitHub Actions run
2. Reproduce locally:
```bash
cd fuzz
cargo fuzz run <target> fuzz/artifacts/<target>/<crash-file>
```
3. Debug with full backtrace:
```bash
RUST_BACKTRACE=full cargo fuzz run <target> <crash-file>
```
## Monitoring and Maintenance
### Monitoring Fuzzing Health
Check the following regularly:
1. **GitHub Issues**: Look for `[FUZZ]` prefix and `fuzzing` label
2. **Actions Tab**: Review nightly fuzzing workflow results
3. **Corpus Growth**: Ensure corpora are growing over time
4. **Coverage Reports**: Check coverage artifacts for code coverage
### Corpus Maintenance
Periodically minimize corpora to remove redundant inputs:
```bash
cd fuzz
for target in $(cargo fuzz list); do
cargo fuzz cmin $target
done
```
This is automatically done in CI after each run.
### Adding New Fuzz Targets
When adding new data structures or major features:
1. **Create fuzz target** in `fuzz/fuzz_targets/`:
```rust
#![no_main]
use libfuzzer_sys::fuzz_target;
```
2. **Register in `fuzz/Cargo.toml`**:
```toml
[[bin]]
name = "my_module_arbitrary_ops"
path = "fuzz_targets/my_module_arbitrary_ops.rs"
test = false
doc = false
```
3. **Document in `fuzz/README.md`** (optional but recommended)
**That's it!** ✅ The CI/CD pipeline will automatically discover and run your new target.
**Naming Tips**:
- End with `_arbitrary_ops` to include in PR smoke tests
- End with `_stress` or `_property_tests` for specialized testing (runs in nightly only)
- Use descriptive names like `<module>_<test_type>` for clarity
## Best Practices
### For Contributors
1. **Run smoke tests before pushing**:
```bash
cd fuzz && ./run_smoke_tests.sh
```
2. **Fix fuzzing failures immediately**: Fuzzing bugs often indicate real issues
3. **Don't disable fuzz tests**: If a fuzz test is flaky, fix the test or the code
### For Maintainers
1. **Review fuzzing issues promptly**: They often reveal security issues
2. **Keep corpora**: Don't delete cached corpora without good reason
3. **Monitor nightly runs**: Check for patterns in failures
4. **Update fuzz targets**: When APIs change, update fuzz targets accordingly
## Configuration
### Workflow Configuration
Edit `.github/workflows/fuzz.yml` to adjust:
- `duration`: Default 3600 seconds (1 hour), configurable via manual trigger
- `timeout-minutes`: Maximum 120 minutes per target
- `rss_limit_mb`: Memory limit 4096 MB
- `schedule`: Currently runs at 2 AM UTC daily
### Resource Limits
Current limits:
- **Memory**: 4 GB per target
- **Time**: 2 hours maximum (including setup)
- **Parallelism**: All 27 targets run in parallel (GitHub Actions manages)
## Troubleshooting
### Fuzzing Timeouts
If targets timeout:
- Check if the fuzz target is too slow
- Consider reducing complexity or using faster operations
- Adjust `timeout-minutes` in workflow
### Corpus Cache Misses
If corpora aren't being restored:
- Check cache key format in workflow
- Verify artifact upload/download steps
- Check GitHub Actions cache limits (10 GB per repo)
### False Positives
If fuzz tests find "issues" that aren't real bugs:
- Review the fuzz target logic
- Add proper preconditions (prop_assume, input validation)
- Consider if the behavior is actually a bug
## Resources
- [Fuzz Target Documentation](../../fuzz/README.md)
- [libFuzzer Documentation](https://llvm.org/docs/LibFuzzer.html)
- [cargo-fuzz Book](https://rust-fuzz.github.io/book/cargo-fuzz.html)
- [GitHub Actions Workflow Syntax](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions)