parlov-analysis 0.5.0

Analysis engine trait and signal detection for parlov.
# parlov-analysis

Signal detection and oracle classification for parlov. Pure synchronous computation — no I/O, no async, no network stack.

## trait

```rust
pub enum SampleDecision {
    Complete(OracleResult),
    NeedMore,
}

pub trait Analyzer: Send + Sync {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision;
    fn oracle_class(&self) -> OracleClass;
}
```

The caller drives an adaptive loop — collect exchange pairs, call `evaluate()`, stop when `Complete`. All oracle semantics (how many samples, stability criteria, classification) live in the analyzer. The `DifferentialSet` carries technique context (vector, normative strength) so the analyzer can extract typed signals and score confidence.
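
A minimal sketch of the caller side of that loop, using stand-in types rather than the real parlov definitions. `ThreePairAnalyzer`, `run_adaptive`, the `max_pairs` cap, and the simplified `DifferentialSet` and `OracleResult` shapes are all assumptions made for illustration; only the `evaluate()` / `SampleDecision` contract comes from the trait above.

```rust
// Stand-in types for illustration only; the real ones live in the parlov crates.
#[derive(Debug, Clone, PartialEq)]
pub struct OracleResult {
    pub verdict: &'static str,
}

#[derive(Debug, Default)]
pub struct DifferentialSet {
    pub baseline: Vec<u16>, // status codes seen on the baseline side
    pub probe: Vec<u16>,    // status codes seen on the probe side
}

#[derive(Debug, PartialEq)]
pub enum SampleDecision {
    Complete(OracleResult),
    NeedMore,
}

pub trait Analyzer: Send + Sync {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision;
}

/// Hypothetical analyzer that wants three pairs before it decides.
pub struct ThreePairAnalyzer;

impl Analyzer for ThreePairAnalyzer {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision {
        if data.baseline.len() < 3 {
            return SampleDecision::NeedMore;
        }
        // Every collected pair must differ for the differential to count.
        let differs = data.baseline.iter().zip(&data.probe).all(|(b, p)| b != p);
        SampleDecision::Complete(OracleResult {
            verdict: if differs { "Confirmed" } else { "NotPresent" },
        })
    }
}

/// Collect one pair, evaluate, and stop on `Complete`.
/// `max_pairs` is an assumed safety cap, not something the trait requires.
pub fn run_adaptive<A, F>(analyzer: &A, mut collect: F, max_pairs: usize) -> Option<OracleResult>
where
    A: Analyzer,
    F: FnMut() -> (u16, u16),
{
    let mut set = DifferentialSet::default();
    for _ in 0..max_pairs {
        let (baseline, probe) = collect();
        set.baseline.push(baseline);
        set.probe.push(probe);
        if let SampleDecision::Complete(result) = analyzer.evaluate(&set) {
            return Some(result);
        }
    }
    None // the analyzer never reached a decision within the cap
}
```

The loop never inspects how many samples the analyzer wants; it only reacts to the returned `SampleDecision`, which is what keeps the sampling policy inside the analyzer.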

## use it

```rust
use parlov_analysis::existence::ExistenceAnalyzer;
use parlov_analysis::{Analyzer, SampleDecision};

let analyzer = ExistenceAnalyzer;

// feed it a growing DifferentialSet
match analyzer.evaluate(&diff_set) {
    SampleDecision::Complete(result) => {
        println!("{:?} — {:?}", result.verdict, result.severity);
        println!("signals: {:?}", result.signals);
    }
    SampleDecision::NeedMore => {
        // collect another baseline + probe exchange and call again
    }
}
```

## signal extractors

The `signals` module provides typed signal extraction from differential data:

- **`status_code`** — extracts status code differentials between baseline and probe sides.
- **`header`** — extracts header presence/absence/value differences (ETag, Last-Modified, Content-Range, WWW-Authenticate, Allow, Accept-Ranges).
- **`metadata`** — extracts response metadata signals (Content-Range total size, ETag values).
- **`body`** — extracts response body content differentials. Detects existence leakage even when status codes are identical (e.g. different error message text, different JSON error schemas).

All extractors run unconditionally on every `DifferentialSet` — the tool detects unexpected leakage, not just expected signals.
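
To make the idea of a header differential concrete, here is a rough sketch of presence/absence/value extraction. `HeaderDiff` and `diff_headers` are hypothetical names, not the crate's API, and the sketch assumes header names have already been lowercased.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical signal type; the crate's own signal types may look different.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HeaderDiff {
    OnlyInBaseline(String),
    OnlyInProbe(String),
    ValueChanged { name: String, baseline: String, probe: String },
}

/// Compare two header maps and report every header that differs.
/// Assumes header names are already normalized to lowercase.
pub fn diff_headers(
    baseline: &BTreeMap<String, String>,
    probe: &BTreeMap<String, String>,
) -> Vec<HeaderDiff> {
    // Union of header names; a BTreeSet keeps the output order deterministic.
    let names: BTreeSet<&String> = baseline.keys().chain(probe.keys()).collect();
    let mut out = Vec::new();
    for name in names {
        match (baseline.get(name), probe.get(name)) {
            (Some(b), Some(p)) if b != p => out.push(HeaderDiff::ValueChanged {
                name: name.clone(),
                baseline: b.clone(),
                probe: p.clone(),
            }),
            (Some(_), None) => out.push(HeaderDiff::OnlyInBaseline(name.clone())),
            (None, Some(_)) => out.push(HeaderDiff::OnlyInProbe(name.clone())),
            _ => {} // present on both sides with the same value: no signal
        }
    }
    out
}
```

Because the comparison walks every header rather than a fixed allow-list, an unexpected header that appears on only one side still surfaces as a signal.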

## scoring pipeline

Confidence is computed from weighted signal scoring with diminishing returns within signal families:

- **Signal families** (Range, CacheValidator, Auth, Precondition, Negotiation, ErrorBody) prevent double-counting correlated signals from the same RFC mechanism.
- **Normative weighting** per signal: `Must` x1.0, `Should` x0.9, `May` x0.75.
- **Reproducibility weighting**: 3/3 stable x1.0, 2/3 stable x0.7, 1/3 stable x0.25.
- **Verdict thresholds**: >=80 Confirmed, >=60 Likely, <60 NotPresent.
- **Severity**: highest leak impact class among validated signals, gated by confidence floor.
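
The weights and thresholds above can be sketched roughly as follows. The normative factors, reproducibility factors, and verdict thresholds are taken from the list; the per-signal `base` weight, the 0.5 decay used for diminishing returns, the cap at 100, and all type and function names are assumptions for illustration, and the severity gate is left out.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Family { Range, CacheValidator, Auth, Precondition, Negotiation, ErrorBody }

#[derive(Debug, Clone, Copy)]
pub enum Normative { Must, Should, May }

/// Hypothetical signal record; `base` is an assumed raw weight.
pub struct Signal {
    pub family: Family,
    pub base: f64,
    pub normative: Normative,
    pub stable: u8, // how many of 3 samples reproduced the signal
}

fn normative_factor(n: Normative) -> f64 {
    match n {
        Normative::Must => 1.0,
        Normative::Should => 0.9,
        Normative::May => 0.75,
    }
}

fn repro_factor(stable: u8) -> f64 {
    match stable {
        3 => 1.0,
        2 => 0.7,
        1 => 0.25,
        _ => 0.0, // assumption: never-reproduced signals contribute nothing
    }
}

pub fn confidence(signals: &[Signal]) -> f64 {
    // Group weighted scores by family so correlated signals are discounted together.
    let mut by_family: BTreeMap<Family, Vec<f64>> = BTreeMap::new();
    for s in signals {
        let score = s.base * normative_factor(s.normative) * repro_factor(s.stable);
        by_family.entry(s.family).or_default().push(score);
    }
    let mut total = 0.0;
    for scores in by_family.values_mut() {
        scores.sort_by(|a, b| b.total_cmp(a)); // strongest signal first
        let mut decay = 1.0;
        for score in scores.iter() {
            total += score * decay;
            decay *= 0.5; // assumed diminishing-returns factor
        }
    }
    total.min(100.0) // assumed cap
}

pub fn verdict(confidence: f64) -> &'static str {
    if confidence >= 80.0 {
        "Confirmed"
    } else if confidence >= 60.0 {
        "Likely"
    } else {
        "NotPresent"
    }
}
```

Under these assumed numbers, two `Range` signals with base weights 60 and 40 score 60 + 40 × 0.5 = 80 rather than 100, which is the double-counting protection the families are meant to provide.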

## existence oracle

`ExistenceAnalyzer` implements two detection layers:

- **Layer 1 (code-blind):** same status on first sample → `NotPresent` immediately. Different status → collect up to 3 pairs and check stability.
- **Layer 2 (RFC-informed):** classifies stable differentials against a 30-pattern table: `403/404` → `Confirmed/High`, `409/201` → `Confirmed/High`, `304/404` → `Confirmed/High`, etc. Each pattern carries a label, leaks description, and RFC section. Unrecognized stable differentials → `Likely/Low`.
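
The two layers can be sketched roughly like this. Only the three patterns named above are included, the order of the two status codes within a pair is assumed to match how the patterns are written, and unstable differentials are treated as `NotPresent` purely to keep the sketch short. The types and `classify` are hypothetical stand-ins, not the crate's implementation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict { Confirmed, Likely, NotPresent }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity { High, Low }

#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Complete(Verdict, Option<Severity>),
    NeedMore,
}

// Three of the patterns named above; the real table has 30 entries.
const PATTERNS: &[((u16, u16), Verdict, Severity)] = &[
    ((403, 404), Verdict::Confirmed, Severity::High),
    ((409, 201), Verdict::Confirmed, Severity::High),
    ((304, 404), Verdict::Confirmed, Severity::High),
];

/// `pairs` holds one status-code pair per collected sample.
pub fn classify(pairs: &[(u16, u16)]) -> Decision {
    let Some(&first) = pairs.first() else {
        return Decision::NeedMore;
    };
    // Layer 1: identical status on the first sample ends the run immediately.
    if first.0 == first.1 {
        return Decision::Complete(Verdict::NotPresent, None);
    }
    // A differential was seen: wait for three pairs before judging stability.
    if pairs.len() < 3 {
        return Decision::NeedMore;
    }
    // Simplification: anything short of 3/3 identical pairs counts as unstable.
    if !pairs.iter().all(|p| *p == first) {
        return Decision::Complete(Verdict::NotPresent, None);
    }
    // Layer 2: look the stable differential up in the pattern table.
    for &(pattern, verdict, severity) in PATTERNS {
        if pattern == first {
            return Decision::Complete(verdict, Some(severity));
        }
    }
    // Stable but unrecognized differential.
    Decision::Complete(Verdict::Likely, Some(Severity::Low))
}
```

For example, three `(403, 404)` pairs yield `Complete(Confirmed, Some(High))`, while three stable `(500, 404)` pairs fall through to `Likely/Low`.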

## adding an oracle

Implement `Analyzer` for a new oracle class:

```rust
pub struct TimingAnalyzer;

impl Analyzer for TimingAnalyzer {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision {
        if data.baseline.len() < 30 {
            return SampleDecision::NeedMore;
        }
        // Mann-Whitney U test on timing_ns, producing an `OracleResult` named `result`...
        SampleDecision::Complete(result)
    }

    fn oracle_class(&self) -> OracleClass {
        OracleClass::Timing // once the variant exists
    }
}
```

The binary's adaptive loop works unchanged — it just calls `evaluate()` until `Complete`, regardless of how many samples the analyzer needs.

## license

MIT OR Apache-2.0