# PRD: Quality Metrics & Performance Testing
**Status**: 📝 DRAFT
**Created**: 2025-11-26
**Owner**: AllFrame Core Team
**Priority**: P0 (Critical for 1.0)
---
## Executive Summary
Establish comprehensive quality metrics, performance testing, and binary size monitoring to ensure AllFrame meets its promises of being **lightweight**, **fast**, and **production-ready**.
### Goals
1. **Binary Size Monitoring** - Ensure < 8 MB target, track per feature
2. **Demo Scenarios** - Real-world examples demonstrating all features
3. **Performance Testing** - TechEmpower parity (> 500k req/s)
4. **Automated Monitoring** - CI/CD enforcement of all metrics
---
## 1. Binary Size Monitoring
### Problem
**Current State**:
- No automated binary size tracking
- No per-feature size breakdown
- No CI/CD enforcement of size limits
- Unknown impact of each feature flag
**Why This Matters**:
- AllFrame promises "< 8 MB binaries"
- Users need to know the cost of each feature
- Need to prevent size creep over time
- Competition (Axum ~6 MB, we target < 8 MB)
---
### Solution: cargo-bloat + CI/CD Integration
#### 1.1 Binary Size Targets
| Minimal (no features) | < 2 MB | 3 MB | 🚧 |
| Default (di, openapi, router) | < 4 MB | 5 MB | 🚧 |
| CQRS (di, openapi, cqrs) | < 5 MB | 6 MB | 🚧 |
| Full (all features) | < 8 MB | 10 MB | 🚧 |
| router-graphql | < 6 MB | 7 MB | 🚧 |
| router-grpc | < 7 MB | 8 MB | 🚧 |
| router-full | < 8 MB | 10 MB | 🚧 |
**Hard Limits**: CI/CD fails if exceeded
**Target**: Best effort, warnings if exceeded
---
#### 1.2 Implementation Plan
**Phase 1: Measurement (Week 1)**
Add `cargo-bloat` to CI/CD:
```bash
# Install cargo-bloat
cargo install cargo-bloat
# Measure binary size breakdown
cargo bloat --release --crates
# Measure per-feature
cargo bloat --release --features=di,openapi --crates
cargo bloat --release --features=cqrs --crates
cargo bloat --release --features=router-full --crates
```
**Output Example**:
```
File .text Size Crate
0.3% 5.8% 23.5KiB allframe_core
0.2% 4.2% 17.0KiB tokio
0.1% 2.1% 8.5KiB hyper
...
```
---
**Phase 2: CI/CD Integration (Week 1)**
Create `.github/workflows/binary-size-check.yml`:
```yaml
name: Binary Size Check
on:
pull_request:
push:
branches: [main]
jobs:
binary-size:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: Install cargo-bloat
run: cargo install cargo-bloat
- name: Build minimal binary
run: cargo build --release --no-default-features
- name: Check minimal binary size
run: |
SIZE=$(stat -c%s "target/release/allframe" || stat -f%z "target/release/allframe")
echo "Minimal binary size: $(($SIZE / 1024 / 1024)) MB"
if [ $SIZE -gt 3145728 ]; then # 3 MB in bytes
echo "ERROR: Minimal binary exceeds 3 MB limit"
exit 1
fi
- name: Build default binary
run: cargo build --release
- name: Check default binary size
run: |
SIZE=$(stat -c%s "target/release/allframe" || stat -f%z "target/release/allframe")
echo "Default binary size: $(($SIZE / 1024 / 1024)) MB"
if [ $SIZE -gt 5242880 ]; then # 5 MB in bytes
echo "ERROR: Default binary exceeds 5 MB limit"
exit 1
fi
- name: Build full binary
run: cargo build --release --all-features
- name: Check full binary size
run: |
SIZE=$(stat -c%s "target/release/allframe" || stat -f%z "target/release/allframe")
echo "Full binary size: $(($SIZE / 1024 / 1024)) MB"
if [ $SIZE -gt 10485760 ]; then # 10 MB in bytes
echo "ERROR: Full binary exceeds 10 MB limit"
exit 1
fi
- name: Generate size breakdown
run: |
echo "## Binary Size Report" >> $GITHUB_STEP_SUMMARY
echo "" >> $GITHUB_STEP_SUMMARY
echo "| Configuration | Size | Limit | Status |" >> $GITHUB_STEP_SUMMARY
echo "|---------------|------|-------|--------|" >> $GITHUB_STEP_SUMMARY
# Add size rows here
- name: Upload size report as artifact
uses: actions/upload-artifact@v3
with:
name: binary-size-report
path: target/release/allframe
```
---
**Phase 3: Dashboard (Week 2)**
Create `scripts/binary-size-report.sh`:
```bash
#!/bin/bash
# Generate comprehensive binary size report
echo "# AllFrame Binary Size Report"
echo "Generated: $(date)"
echo ""
# Test each configuration
configs=(
"minimal:--no-default-features"
"default:"
"cqrs:--features=cqrs"
"router-graphql:--features=router-graphql"
"router-grpc:--features=router-grpc"
"full:--all-features"
)
for config in "${configs[@]}"; do
name="${config%%:*}"
flags="${config#*:}"
cargo build --release $flags 2>/dev/null
size=$(stat -c%s "target/release/allframe" 2>/dev/null || stat -f%z "target/release/allframe")
size_mb=$(echo "scale=2; $size / 1024 / 1024" | bc)
# Determine target and status
case $name in
minimal) target="< 2 MB"; status="✅" ;;
default) target="< 4 MB"; status="✅" ;;
cqrs) target="< 5 MB"; status="✅" ;;
router-graphql) target="< 6 MB"; status="✅" ;;
router-grpc) target="< 7 MB"; status="✅" ;;
full) target="< 8 MB"; status="✅" ;;
esac
echo "| $name | $size_mb MB | $target | $status |"
done
echo ""
echo "## Size Breakdown (Default Configuration)"
cargo bloat --release --crates -n 20
```
---
#### 1.3 Success Metrics
- ✅ Binary size measured for all feature combinations
- ✅ CI/CD fails if hard limits exceeded
- ✅ Dashboard shows size breakdown per crate
- ✅ PR comments show size impact
- ✅ Historical tracking (size over time)
---
## 2. Demo Scenarios
### Problem
**Current State**:
- Examples exist but are basic
- No comprehensive demo application
- No "kitchen sink" example showing all features
- Hard for users to see real-world usage
**Why This Matters**:
- First impression for potential users
- Documentation through working code
- Testing ground for new features
- Proof that AllFrame works in practice
---
### Solution: Comprehensive Demo Application
#### 2.1 Demo Scenarios
**Scenario 1: E-Commerce API (REST + CQRS)**
- Product catalog (queries)
- Shopping cart (commands)
- Order processing (sagas)
- Inventory management (projections)
- Event versioning (schema evolution)
**Purpose**: Show complete CQRS flow in real app
**Location**: `examples/ecommerce-api/`
**Features Demonstrated**:
- ✅ CommandBus (CreateOrder, UpdateCart)
- ✅ ProjectionRegistry (ProductCatalog, InventoryView)
- ✅ Event Versioning (OrderV1 → OrderV2)
- ✅ Saga Orchestration (Order fulfillment)
- ✅ REST API with Scalar docs
- ✅ OpenAPI auto-generation
---
**Scenario 2: Real-Time Chat (GraphQL + WebSockets)**
- User authentication
- Channel management
- Message streaming (subscriptions)
- Online presence
- Message history
**Purpose**: Show GraphQL + subscriptions + real-time
**Location**: `examples/chat-app/`
**Features Demonstrated**:
- ✅ GraphQL API with GraphiQL
- ✅ Subscriptions (real-time messages)
- ✅ CQRS (Commands for sending, queries for history)
- ✅ Event sourcing (message log)
- ✅ WebSocket connections
---
**Scenario 3: Microservices Gateway (Multi-Protocol)**
- REST endpoints
- GraphQL gateway
- gRPC services
- Protocol translation
- Service mesh integration
**Purpose**: Show protocol-agnostic routing
**Location**: `examples/api-gateway/`
**Features Demonstrated**:
- ✅ REST, GraphQL, gRPC in one app
- ✅ Protocol translation (REST → gRPC)
- ✅ Unified error handling
- ✅ OpenTelemetry tracing
- ✅ Contract testing
---
**Scenario 4: Banking System (Saga Orchestration)**
- Account management
- Money transfers (sagas)
- Transaction history
- Fraud detection
- Compensation logic
**Purpose**: Show saga compensation in detail
**Location**: `examples/banking-system/`
**Features Demonstrated**:
- ✅ Saga Orchestration (complex flows)
- ✅ Automatic compensation (rollbacks)
- ✅ Event sourcing (audit trail)
- ✅ CQRS (commands + queries)
- ✅ Per-step timeouts
---
**Scenario 5: Content Management System (Full Stack)**
- Article management
- User roles/permissions
- Media uploads
- Publishing workflow
- Version control
**Purpose**: Show complete production app
**Location**: `examples/cms/`
**Features Demonstrated**:
- ✅ All CQRS features
- ✅ All protocols (REST, GraphQL)
- ✅ Authentication/Authorization
- ✅ File handling
- ✅ Complex business logic
---
#### 2.2 Implementation Plan
**Week 1-2: E-Commerce API**
- Set up project structure
- Implement domain models
- Add CQRS infrastructure
- Create REST endpoints
- Generate OpenAPI docs
**Week 3: Real-Time Chat**
- GraphQL schema
- Subscription setup
- WebSocket handling
- Message persistence
**Week 4: API Gateway**
- Multi-protocol setup
- Protocol translation
- Service routing
- Tracing integration
**Week 5: Banking System**
- Saga definitions
- Compensation logic
- Transaction flows
- Error scenarios
**Week 6: CMS (Stretch)**
- Full application
- Production patterns
- Best practices
- Performance optimization
---
#### 2.3 Demo Structure
```
examples/
├── ecommerce-api/
│ ├── src/
│ │ ├── main.rs # Entry point
│ │ ├── domain/ # Domain models
│ │ ├── commands/ # Command handlers
│ │ ├── queries/ # Query handlers
│ │ ├── projections/ # Read models
│ │ └── sagas/ # Saga definitions
│ ├── tests/
│ │ ├── integration_tests.rs
│ │ └── contract_tests.rs
│ ├── Cargo.toml
│ └── README.md # How to run, what it shows
├── chat-app/
├── api-gateway/
├── banking-system/
├── cms/
└── README.md # Index of all examples
```
**Each demo includes**:
1. Comprehensive README
2. How to run (one command)
3. What features it demonstrates
4. API documentation (auto-generated)
5. Test suite (integration + contract)
6. Docker Compose (if needed)
---
#### 2.4 Success Metrics
- ✅ 5 comprehensive demo scenarios
- ✅ Each demo < 500 lines of code
- ✅ Each demo has README + tests
- ✅ Each demo runs with `cargo run --example <name>`
- ✅ Each demo shows unique features
- ✅ All demos referenced in main docs
---
## 3. Performance Testing
### Problem
**Current State**:
- No performance benchmarks
- No baseline metrics
- No regression testing
- Unknown vs competition (Actix, Axum)
**Why This Matters**:
- AllFrame promises "> 500k req/s"
- Need to prove competitive performance
- TechEmpower benchmarks are industry standard
- Performance regression must be caught early
---
### Solution: Criterion Benchmarks + TechEmpower
#### 3.1 Performance Targets
**Based on TechEmpower Round 23**:
| JSON Serialization | 500k req/s | 700k req/s | 650k req/s |
| Single Query | 100k req/s | 150k req/s | 120k req/s |
| Multiple Queries (20) | 50k req/s | 75k req/s | 60k req/s |
| Plaintext | 1M req/s | 1.5M req/s | 1.2M req/s |
| Fortunes | 80k req/s | 100k req/s | 90k req/s |
**CQRS-specific**:
| Command dispatch | 1M ops/s | 2M ops/s |
| Event append (memory) | 500k events/s | 1M events/s |
| Event append (AllSource) | 50k events/s | 100k events/s |
| Projection update | 2M updates/s | 5M updates/s |
| Saga execution (2 steps) | 100k sagas/s | 200k sagas/s |
---
#### 3.2 Benchmark Suite Structure
```
benches/
├── criterion/ # Criterion benchmarks
│ ├── command_bus.rs # CommandBus dispatch
│ ├── event_store.rs # Event append/read
│ ├── projections.rs # Projection updates
│ ├── sagas.rs # Saga execution
│ ├── router.rs # Route matching
│ └── serialization.rs # JSON serialize/deserialize
├── techempower/ # TechEmpower-compatible
│ ├── json.rs # JSON serialization test
│ ├── plaintext.rs # Plaintext test
│ ├── db.rs # Single/multiple queries
│ └── fortunes.rs # Fortunes test
└── memory_profiling/ # Memory usage tests
├── event_store_memory.rs
└── projection_memory.rs
```
---
#### 3.3 Criterion Benchmarks
**Example: CommandBus Benchmark**
```rust
// benches/criterion/command_bus.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use allframe_core::cqrs::*;
#[derive(Clone)]
struct TestCommand {
value: i32,
}
impl Command for TestCommand {}
struct TestHandler;
#[async_trait::async_trait]
impl CommandHandler<TestCommand, TestEvent> for TestHandler {
async fn handle(&self, cmd: TestCommand) -> CommandResult<TestEvent> {
Ok(vec![TestEvent::Created { value: cmd.value }])
}
}
#[derive(Clone)]
enum TestEvent {
Created { value: i32 },
}
impl Event for TestEvent {}
fn command_bus_benchmark(c: &mut Criterion) {
let runtime = tokio::runtime::Runtime::new().unwrap();
c.bench_function("command_bus_dispatch", |b| {
let bus: CommandBus<TestEvent> = CommandBus::new();
runtime.block_on(async {
bus.register(TestHandler).await;
});
b.to_async(&runtime).iter(|| async {
let cmd = TestCommand { value: black_box(42) };
bus.dispatch(cmd).await.unwrap();
});
});
}
criterion_group!(benches, command_bus_benchmark);
criterion_main!(benches);
```
**Run**:
```bash
cargo bench --bench command_bus
```
**Expected Output**:
```
command_bus_dispatch time: [950.23 ns 952.45 ns 954.89 ns]
thrpt: [1.05 Melem/s 1.05 Melem/s 1.05 Melem/s]
```
---
#### 3.4 TechEmpower Benchmarks
**JSON Serialization**:
```rust
// benches/techempower/json.rs
use allframe::prelude::*;
use serde::{Deserialize, Serialize};
#[derive(Serialize, Deserialize)]
struct Message {
message: &'static str,
}
#[api_handler]
async fn json_handler() -> Json<Message> {
Json(Message { message: "Hello, World!" })
}
// Benchmark: Should achieve > 500k req/s
```
**Single Query** (with PostgreSQL):
```rust
// benches/techempower/db.rs
#[api_handler]
async fn single_query(db: Extension<PgPool>) -> Json<World> {
let id = random_id();
let world = sqlx::query_as!(World, "SELECT * FROM world WHERE id = $1", id)
.fetch_one(&db)
.await
.unwrap();
Json(world)
}
// Benchmark: Should achieve > 100k req/s
```
---
#### 3.5 CI/CD Integration
Create `.github/workflows/performance-check.yml`:
```yaml
name: Performance Check
on:
pull_request:
push:
branches: [main]
schedule:
- cron: '0 0 * * 0' # Weekly on Sunday
jobs:
criterion-benchmarks:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- uses: Swatinem/rust-cache@v2
- name: Run Criterion benchmarks
run: cargo bench --bench command_bus --bench event_store
- name: Compare with baseline
uses: benchmark-action/github-action-benchmark@v1
with:
tool: 'cargo'
output-file-path: target/criterion/output.json
github-token: ${{ secrets.GITHUB_TOKEN }}
auto-push: true
# Fail if performance degrades > 10%
alert-threshold: '110%'
comment-on-alert: true
techempower-benchmarks:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
- name: Run TechEmpower benchmarks
run: cargo bench --bench json --bench plaintext
- name: Upload results
uses: actions/upload-artifact@v3
with:
name: techempower-results
path: target/criterion/
```
---
#### 3.6 Performance Dashboard
Create `scripts/performance-report.sh`:
```bash
#!/bin/bash
# Generate performance report comparing AllFrame vs competitors
echo "# AllFrame Performance Report"
echo "Generated: $(date)"
echo ""
echo "## Criterion Benchmarks"
cargo bench --bench command_bus 2>&1 | grep "time:"
echo ""
echo "## TechEmpower Comparison"
echo "| JSON | TBD | 650k | 600k | > 500k ✅ |"
echo "| Plaintext | TBD | 1.2M | 1.1M | > 1M ✅ |"
echo "| Single Query | TBD | 120k | 110k | > 100k ✅ |"
echo ""
echo "## CQRS Benchmarks"
echo "| Command dispatch | TBD | > 1M ops/s |"
echo "| Event append | TBD | > 500k events/s |"
```
---
#### 3.7 Memory Profiling
**Track memory usage**:
```rust
// benches/memory_profiling/event_store_memory.rs
use allframe_core::cqrs::*;
#[test]
fn test_event_store_memory() {
let store = EventStore::new();
// Measure baseline memory
let baseline = current_memory_usage();
// Append 1M events
for i in 0..1_000_000 {
store.append(&format!("agg-{}", i % 1000), vec![
TestEvent::Created { value: i }
]).await.unwrap();
}
// Measure final memory
let final_memory = current_memory_usage();
let used = final_memory - baseline;
// Should use < 100 MB for 1M events
assert!(used < 100 * 1024 * 1024, "Memory usage too high: {} MB", used / 1024 / 1024);
}
```
---
#### 3.8 Success Metrics
- ✅ Criterion benchmarks for all CQRS operations
- ✅ TechEmpower benchmarks (JSON, DB, Plaintext)
- ✅ CI/CD performance regression detection
- ✅ Performance dashboard (vs competitors)
- ✅ Memory profiling (< 100 MB for 1M events)
- ✅ Automated alerts on performance degradation
---
## 4. Implementation Timeline
### Phase 1: Binary Size Monitoring (Week 1)
- ✅ Add cargo-bloat to project
- ✅ Create binary size CI/CD workflow
- ✅ Set up size limits for each configuration
- ✅ Generate size report script
- ✅ Add size badges to README
**Deliverables**:
- `.github/workflows/binary-size-check.yml`
- `scripts/binary-size-report.sh`
- Updated README with size badges
---
### Phase 2: Demo Scenarios (Weeks 2-6)
- Week 2: E-Commerce API
- Week 3: Real-Time Chat
- Week 4: API Gateway
- Week 5: Banking System
- Week 6: CMS (stretch)
**Deliverables**:
- 5 comprehensive demo applications
- Each with README, tests, docs
- `examples/README.md` index
---
### Phase 3: Performance Testing (Weeks 7-8)
- Week 7: Criterion benchmarks
- Week 8: TechEmpower benchmarks
**Deliverables**:
- `benches/criterion/` suite
- `benches/techempower/` suite
- `.github/workflows/performance-check.yml`
- `scripts/performance-report.sh`
---
### Phase 4: Integration & Documentation (Week 9)
- Integrate all three systems
- Update main documentation
- Create quality metrics dashboard
- Set up monitoring
**Deliverables**:
- Comprehensive quality metrics docs
- Performance dashboard
- Updated README
- CI/CD fully automated
---
## 5. Success Criteria
### Binary Size
- ✅ All configurations under hard limits
- ✅ CI/CD enforces size limits
- ✅ Size breakdown available per feature
- ✅ Historical tracking in place
### Demo Scenarios
- ✅ 5 working demo applications
- ✅ Each demonstrates unique features
- ✅ All have tests and documentation
- ✅ Referenced in main docs
### Performance
- ✅ Criterion benchmarks passing
- ✅ TechEmpower targets met
- ✅ No performance regressions in CI
- ✅ Memory usage within limits
### Automation
- ✅ All checks in CI/CD
- ✅ Automated reports generated
- ✅ Dashboard available
- ✅ Alerts on violations
---
## 6. Monitoring & Alerts
### CI/CD Checks
1. **Binary Size** - Fail if hard limit exceeded
2. **Performance** - Fail if > 10% regression
3. **Memory** - Fail if > 100 MB for 1M events
4. **Demo Tests** - Fail if any demo broken
### Dashboard Metrics
1. Binary size trend (over time)
2. Performance trend (over time)
3. Memory usage trend
4. Test coverage trend
### Alerts
- Slack notification on size violation
- GitHub comment on performance regression
- Email on critical failures
---
## 7. Resources Required
### Tools
- `cargo-bloat` - Binary size analysis
- `criterion` - Benchmarking
- `valgrind`/`heaptrack` - Memory profiling
- PostgreSQL - TechEmpower DB tests
### Infrastructure
- GitHub Actions runners
- Benchmark result storage
- Performance dashboard hosting
### Time
- 9 weeks total
- 1 engineer full-time
- Can parallelize with Phase 6 work
---
## Appendix
### A. Existing Tools
**Binary Size**:
- cargo-bloat: https://github.com/RazrFalcon/cargo-bloat
- twiggy: https://github.com/rustwasm/twiggy
- cargo-size-profiler: https://github.com/nnethercote/dhat-rs
**Performance**:
- criterion: https://github.com/bheisler/criterion.rs
- TechEmpower: https://www.techempower.com/benchmarks/
- wrk: https://github.com/wg/wrk
**Memory**:
- valgrind: https://valgrind.org/
- heaptrack: https://github.com/KDE/heaptrack
- dhat-rs: https://github.com/nnethercote/dhat-rs
---
### B. Competitor Benchmarks
**TechEmpower Round 23**:
- Actix: 650k req/s (JSON)
- Axum: 600k req/s (JSON)
- Rocket: 400k req/s (JSON)
**Binary Sizes**:
- Actix: ~12 MB
- Axum: ~6 MB
- Rocket: ~10 MB
- AllFrame target: < 8 MB
---
**END OF PRD**
---
## Approval
- [ ] Engineering Lead
- [ ] Product Owner
- [ ] Performance Team
- [ ] DevOps Team
**Next Steps**:
1. Approve PRD
2. Begin Phase 1: Binary Size Monitoring
3. Schedule demo scenario development