cleanlib-cli 0.1.0

Terminal interface to CleanLibrary — query dependency verdicts and scan package manifests for ALLOW / DENY / WARN signals from the terminal or CI pipelines.
# cleanlib-cli — performance baseline (cycle-8 §2.2 / cycle-9 C2)

**Date**: 2026-05-31
**Author**: App engineering session
**Authority**: Cycle-8 close memo §6 carry C2 + Cycle-9 entry-stance App §1.2

## §0 Headline

End-to-end p99 latency baseline for `cleanlib-cli` against the production verdict surface, plus supplementary direct-HTTP measurements for transport-only reference. Sample size N=30 per pool; release-mode binary.

## §1 Methodology

- Binary: `cleanlib 0.1.0` built via `cargo build -p cleanlib-cli --release` (5.4 MB; `target/release/cleanlib`)
- Endpoint default: `https://cleanapp.clnstrt.dev` (cleanlib-cli's compile-time config default; see `cleanlib-client/src/config.rs:42`)
- Auth: no `CLEANLIBRARY_API_KEY` set in baseline — CLI invocation returns HTTP 401 fast-fail (no App api-key is currently provisioned in SM for the test rig; see §5 carry)
- Timing: wall-clock around process spawn via `python3 -c 'import time; print(time.time())'` before/after the CLI invocation
- Test command: `cleanlib verdict npm cors 2.8.4`
- Environment: macOS arm64; warm filesystem cache; live network. No Cloud Run cold-start isolation was attempted: consecutive invocations in the same run landed within 14s wall clock, so Cloud Run instance warmth is steady through the run.
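
The timing step above can be approximated in one script. This is a minimal sketch, not the original loop scripts (their contents are not shown here); `time_command` is a hypothetical helper name. It times each fresh process spawn with `time.perf_counter()`, which is monotonic. It also keeps timestamp collection in a single interpreter, whereas taking the closing timestamp from a separate `python3 -c` call likely folds one interpreter startup into each sample.

```python
import subprocess
import time


def time_command(argv, samples=30):
    """Return wall-clock durations (ms) for `samples` fresh spawns of argv."""
    durations_ms = []
    for _ in range(samples):
        start = time.perf_counter()
        # check=False: a non-zero exit (e.g. the 401 fast-fail path)
        # is still a valid latency sample.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms
```

Assuming a release build at `target/release/cleanlib`, a call such as `time_command(["target/release/cleanlib", "verdict", "npm", "cors", "2.8.4"])` would collect one 30-sample pool.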

## §2 Results

### §2.1 cleanlib-cli end-to-end (cleanapp; HTTP 401 fast-fail path)

| Percentile | Latency |
|---|---|
| p50 | 236.7 ms |
| p90 | 530.2 ms |
| p95 | 811.7 ms |
| p99 | 861.6 ms |
| max | 861.6 ms |
| mean | 298.7 ms |

**vs Budget** (per cycle-8 §2.2 acceptance):

- Cold p99 < 800ms: **MARGINAL** (861.6 ms; +7.7% over budget)
- Warm p99 < 200ms: not separately measured here. The cleanapp `/verdicts/...` route is auth-gated, and the 401 path short-circuits before a request reaches the cache layer, so it gives no meaningful "warm-cache" signal.

These end-to-end numbers include the full CLI startup cost (binary load + config parse + TLS handshake + HTTP round-trip + error parse). The p50 of 237 ms is dominated by process startup + TLS; see §2.3 for the transport-only floor and §3 for the decomposition.
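
How the percentile rows are derived matters at N=30. The sketch below uses the nearest-rank method, which is an assumption: the original scripts do not state their method, although nearest-rank is consistent with p99 equalling max in the tables. `percentile_nearest_rank` is a hypothetical helper, and the demo data is illustrative, not the measured samples.

```python
import math


def percentile_nearest_rank(samples, p):
    """Nearest-rank percentile: value at 1-based rank ceil(p/100 * N)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Illustrative data only: 30 values, 1..30 (not the measured latencies).
demo = list(range(1, 31))
summary = {p: percentile_nearest_rank(demo, p) for p in (50, 90, 95, 99)}
print(summary)  # {50: 15, 90: 27, 95: 29, 99: 30}
```

Under this method, p99 of a 30-sample pool is always the largest sample, so each p99 figure here reflects a single observation.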

### §2.2 Direct curl — cleanlib-enrich /api/v1/remediation (HTTP 200 happy path)

Supplementary signal: this measures the canonical Tricorder remediation surface that the cycle-8 §2.2 acceptance language references.

| Percentile | Latency |
|---|---|
| p50 | 771.7 ms |
| p90 | 840.5 ms |
| p95 | 851.0 ms |
| p99 | 854.7 ms |
| max | 854.7 ms |
| mean | 775.3 ms |

cleanlib-enrich is producing a real composite remediation response (`blast_radius` + 7-block sparse payload); the latency is dominated by server-side substrate read + assembly, not transport.

### §2.3 Direct curl — cleanapp `/health` (HTTP 200; no auth; transport floor)

| Percentile | Latency |
|---|---|
| p50 | 143.1 ms |
| p90 | 166.1 ms |
| p95 | 181.2 ms |
| p99 | 193.2 ms |
| max | 193.2 ms |
| mean | 147.4 ms |

Establishes the transport-only floor (TLS + Cloud Run routing + minimal handler) at the cleanapp surface: **~143ms p50 / ~193ms p99**.

## §3 Decomposition

The cleanlib-cli p50 (237ms) vs cleanapp /health p50 (143ms) means CLI overhead is approximately:

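- **~94ms** for binary startup + config load + arg parse + 401 response decode, plus any difference between the CLI's TLS handshake and curl's (the `/health` floor already includes one TLS handshake)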

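The tail (cli p99 862ms vs /health p99 193ms) shows substantial variance, likely a mix of full TLS handshake cost on fresh connections and occasional scheduler stalls (the Rust CLI itself has no garbage collector, so any GC pause would be server-side). With N=30, p99 coincides with max in every table, so each tail figure rests on a single sample. The CLI cannot benefit from connection-pool reuse across invocations because each invocation is a fresh process.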
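
The subtraction behind these figures can be written out explicitly. This is a rough sketch using the table values from §2.1 and §2.3; the variable names are illustrative. A difference of percentiles is only an approximation of the per-invocation overhead, since the two pools were sampled independently.

```python
# Values copied from the tables in §2.1 and §2.3 (milliseconds).
CLI_P50_MS, CLI_P99_MS = 236.7, 861.6
HEALTH_P50_MS, HEALTH_P99_MS = 143.1, 193.2

# CLI-side cost above the transport floor at the median and at the tail.
cli_overhead_p50_ms = round(CLI_P50_MS - HEALTH_P50_MS, 1)
tail_gap_p99_ms = round(CLI_P99_MS - HEALTH_P99_MS, 1)

print(f"p50 overhead above transport floor: {cli_overhead_p50_ms} ms")  # 93.6 ms
print(f"p99 gap above transport floor: {tail_gap_p99_ms} ms")  # 668.4 ms
```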

## §4 Budget assessment

Per cycle-8 §2.2 acceptance:

| Target | Budget | Empirical | State |
|---|---|---|---|
| Cached verdict p99 | < 200 ms | (not measured — auth-gated path) | DEFERRED |
| Cold network verdict p99 | < 800 ms | 861.6 ms (cleanlib-cli 401 path) | **MARGINAL** (+7.7%) |
| Transport floor p99 | (informational) | 193.2 ms (cleanapp `/health`) | reference |
| Substantive remediation p99 | (informational) | 854.7 ms (cleanlib-enrich `/api/v1/remediation`) | reference |

The cold-budget MARGINAL signal is worth noting but not blocking. The +7.7% overrun could be reduced by (a) connection reuse at the CLI layer, which is not possible while each invocation is a fresh process, (b) Cloud Run min-instances tuning at the cleanapp deploy, or (c) caching at a CDN layer. None of these is a unilateral App-lane fix.
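
For reference, the +7.7% figure follows from the measured p99 and the 800 ms budget. A small sketch, with `overrun_pct` as a hypothetical helper name:

```python
def overrun_pct(measured_ms, budget_ms):
    """Percentage by which measured exceeds budget (negative = under budget)."""
    return (measured_ms - budget_ms) / budget_ms * 100.0


# cleanlib-cli p99 from §2.1 against the cold-network budget.
cold = overrun_pct(861.6, 800.0)
print(f"{cold:+.1f}%")  # +7.7%
```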

## §5 Carries / known gaps

1. **No App api-key in SM**: the current SM inventory has only `vector-verdict-api-bearer-cleanlibrary-prod-pilot` + `testpypi-api-token` + `cleanlib-enrich-customer-tokens`, with no `clk_std_*` style App customer key. The cleanapp end-to-end happy path (HTTP 200 with verdict body) is therefore unmeasured. Provisioning an App test-tier api-key in SM would unblock a true happy-path p99 measurement.

2. **Cold-vs-warm separation**: running consecutive invocations within seconds keeps Cloud Run + DNS warm. A true "cold" measurement would require waiting for Cloud Run to scale to zero between calls (idle instances can stay up for as long as 15 minutes). The numbers here are a "fresh-process / warm-server" baseline; true cold would skew higher.

3. **64-case integration matrix timing (C4)**: chains on this build; landed in cycle-9 alongside C2 (see §8).

## §6 Reproducibility

Scripts at `/tmp/qa-workspace/cli-loop.sh` + `/tmp/qa-workspace/perf-loop.sh`; raw samples at `/tmp/qa-workspace/{cli,curl-enrich,curl-health}-times.txt`. Re-run via:

```bash
cargo build -p cleanlib-cli --release
/tmp/qa-workspace/cli-loop.sh        # 30 cleanlib-cli verdict samples
/tmp/qa-workspace/perf-loop.sh       # full 90-sample sweep
```

## §7 Signature

From: App engineering session — cycle-9 day-1 fire
For: Cycle-8 §2.2 acceptance carry C2 + cycle-9 close-criterion contribution

## §8 64-case integration matrix execution timing (cycle-9 C4)

**Date**: 2026-05-31
**Source**: `cleanlib-cli/tests/integration/main.rs` (`cli_matrix::matrix_64_cases` + `wrapper_matrix::wrappers_compose_with_verdict`)
**Run command**: `cargo test -p cleanlib-cli --test integration --release -- --test-threads=1`

### §8.1 Wall-clock breakdown

| Component | Duration |
|---|---|
| Total wall-clock (cargo build incremental + run) | **1035 ms** |
| Cargo finished-check overhead | ~470 ms |
| Test-runner reported duration | **20 ms** |
| → average per-case (matrix_64_cases / 64) | **~0.3 ms** |

### §8.2 Test outcome

```
running 2 tests
test cli_matrix::matrix_64_cases ........................... ok
test wrapper_matrix::wrappers_compose_with_verdict .......... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```

The 64-case matrix is implemented as a single parametrized test function (`matrix_64_cases`) that iterates the 4 verbs × 4 wrappers × 4 status outcomes combinations inline. The reported `finished in 0.02s` covers the whole test run (both tests), so ~0.3 ms is an upper bound on the matrix's average per-case cost. That cost is dominated by clap parsing + assertion overhead, since the matrix uses mock backends (no live HTTP).
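
The case count and per-case average can be checked with a short enumeration. This sketch only illustrates the 4 × 4 × 4 shape; the labels are placeholders, since the real verb, wrapper, and status names live in the Rust test source and are not listed here.

```python
from itertools import product

# Placeholder labels (assumptions): only the 4 x 4 x 4 shape comes from the memo.
VERBS = [f"verb_{i}" for i in range(4)]
WRAPPERS = [f"wrapper_{i}" for i in range(4)]
STATUSES = [f"status_{i}" for i in range(4)]

cases = list(product(VERBS, WRAPPERS, STATUSES))
REPORTED_RUNTIME_MS = 20  # test-runner reported duration from §8.1

per_case_ms = REPORTED_RUNTIME_MS / len(cases)
print(len(cases))  # 64
print(f"{per_case_ms:.4f} ms per case")  # 0.3125 ms per case
```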

### §8.3 Reproducibility

```bash
cd /Users/biswajitde/cleanlib
cargo test -p cleanlib-cli --test integration --release -- --test-threads=1
# Expect: 2 passed; finished in ~0.02s; total wall ~1s including cargo overhead
```