parlov-analysis 0.7.0

Analysis engine trait and signal detection for parlov.

parlov-analysis

Signal detection and oracle classification for parlov. Pure synchronous computation — no I/O, no async, no network stack.

trait

pub enum SampleDecision {
    Complete(Box<OracleResult>, StrategyOutcome),
    NeedMore,
}

#[derive(Debug, thiserror::Error)]
pub enum AnalyzerError {
    #[error("insufficient samples — collect at least {required} before calling analyze")]
    InsufficientSamples { required: usize },
}

pub trait Analyzer: Send + Sync {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision;
    fn oracle_class(&self) -> OracleClass;

    /// One-shot wrapper. Default impl calls `evaluate` and errors on `NeedMore`.
    fn analyze(&self, data: &DifferentialSet) -> Result<OracleResult, AnalyzerError> { /* ... */ }

    /// Minimum exchange pairs required before `analyze` returns `Ok`. Default: 1.
    fn required_samples(&self) -> usize { 1 }
}

The caller drives an adaptive loop — collect exchange pairs, call evaluate(), stop when Complete. All oracle semantics (how many samples, stability criteria, classification) live in the analyzer. The DifferentialSet carries technique context (vector, normative strength) so the analyzer can extract typed signals and score confidence. Complete carries both the OracleResult and the matching StrategyOutcome (Positive / Contradictory / NoSignal / Inapplicable) so endpoint-level aggregation has full classification, not just the verdict.
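The adaptive loop described above can be sketched in isolation. This is a minimal illustration, not the crate's code: `OracleResult`, `StrategyOutcome`, `DifferentialSet` and `Analyzer` below are cut-down stand-ins for the real types, and `ThreePairs`, `drive`, `collect_pair` and `max_pairs` are hypothetical names introduced only for this example.

```rust
// Stand-in types: the real ones live in the parlov crates and carry far more.
#[derive(Debug, PartialEq)]
struct OracleResult {
    verdict: &'static str,
}

#[derive(Debug, PartialEq)]
enum StrategyOutcome {
    Positive,
    Contradictory,
    NoSignal,
    Inapplicable,
}

enum SampleDecision {
    Complete(Box<OracleResult>, StrategyOutcome),
    NeedMore,
}

// Stand-in: each pair is (baseline status, probe status).
#[derive(Default)]
struct DifferentialSet {
    pairs: Vec<(u16, u16)>,
}

trait Analyzer {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision;
}

// Hypothetical analyzer that wants three pairs before it decides.
struct ThreePairs;

impl Analyzer for ThreePairs {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision {
        if data.pairs.len() < 3 {
            return SampleDecision::NeedMore;
        }
        let result = OracleResult { verdict: "Confirmed" };
        SampleDecision::Complete(Box::new(result), StrategyOutcome::Positive)
    }
}

// Caller-side loop: collect a pair, ask the analyzer, stop on Complete.
// `max_pairs` is a safety bound added for this sketch.
fn drive(
    analyzer: &dyn Analyzer,
    mut collect_pair: impl FnMut() -> (u16, u16),
    max_pairs: usize,
) -> Option<(OracleResult, StrategyOutcome, usize)> {
    let mut set = DifferentialSet::default();
    for _ in 0..max_pairs {
        set.pairs.push(collect_pair());
        if let SampleDecision::Complete(result, outcome) = analyzer.evaluate(&set) {
            return Some((*result, outcome, set.pairs.len()));
        }
    }
    None
}
```

The loop never inspects sample counts itself; it only reacts to `NeedMore` and `Complete`, which is what keeps the oracle semantics inside the analyzer.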

use it

use parlov_analysis::existence::ExistenceAnalyzer;
use parlov_analysis::{Analyzer, SampleDecision};

let analyzer = ExistenceAnalyzer;

// feed it a growing DifferentialSet
match analyzer.evaluate(&diff_set) {
    SampleDecision::Complete(result, outcome) => {
        println!("{:?} / {:?}", result.verdict, result.severity);
        println!("signals: {:?}", result.signals);
        println!("outcome: {:?}", outcome);
    }
    SampleDecision::NeedMore => {
        // collect another baseline + probe exchange and call again
    }
}

signal extractors

The signals module provides typed signal extraction from differential data:

  • status_code — extracts status code differentials between baseline and probe sides.
  • header — extracts header presence/absence/value differences (ETag, Last-Modified, Content-Range, WWW-Authenticate, Allow, Accept-Ranges).
  • metadata — extracts response metadata signals (Content-Range total size, ETag values).
  • body — extracts response body content differentials. Detects existence leakage even when status codes are identical (e.g. different error message text, different JSON error schemas).

All extractors run unconditionally on every DifferentialSet — the tool detects unexpected leakage, not just expected signals.
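To make the header extractor's presence/absence/value idea concrete, here is a rough sketch over plain maps. It is an illustration only: `HeaderDiff`, `header_diffs` and the `WATCHED` list are hypothetical names, the real extractor works on a `DifferentialSet`, and header names are assumed to be lowercased already.

```rust
use std::collections::BTreeMap;

// Headers named in the list above, assumed lowercased for this sketch.
const WATCHED: [&str; 6] = [
    "etag",
    "last-modified",
    "content-range",
    "www-authenticate",
    "allow",
    "accept-ranges",
];

// Hypothetical signal type: which side has the header, or whether values differ.
#[derive(Debug, PartialEq)]
enum HeaderDiff {
    OnlyInBaseline(&'static str),
    OnlyInProbe(&'static str),
    ValueDiffers(&'static str),
}

fn header_diffs(
    baseline: &BTreeMap<String, String>,
    probe: &BTreeMap<String, String>,
) -> Vec<HeaderDiff> {
    let mut out = Vec::new();
    for name in WATCHED {
        match (baseline.get(name), probe.get(name)) {
            (Some(_), None) => out.push(HeaderDiff::OnlyInBaseline(name)),
            (None, Some(_)) => out.push(HeaderDiff::OnlyInProbe(name)),
            (Some(b), Some(p)) if b != p => out.push(HeaderDiff::ValueDiffers(name)),
            // Equal on both sides, or absent on both: no signal.
            _ => {}
        }
    }
    out
}
```

Results come back in `WATCHED` order, so the output is deterministic regardless of how the maps were populated.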

scoring pipeline

Confidence is computed from weighted signal scoring with diminishing returns within signal families:

  • Signal families (Range, CacheValidator, Auth, Precondition, Negotiation, ErrorBody) prevent double-counting correlated signals from the same RFC mechanism.
  • Normative weighting per signal: Must x1.0, Should x0.9, May x0.75.
  • Reproducibility weighting: 3/3 stable x1.0, 2/3 stable x0.7, 1/3 stable x0.25.
  • Verdict thresholds: >=80 Confirmed, >=60 Likely, <60 NotPresent.
  • Severity: highest leak impact class among validated signals, gated by confidence floor.
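The weights and thresholds listed above can be written out as a small sketch. The function names, the `base` score parameter, and the zero weight for "no stable runs" are assumptions made for illustration; the family-level diminishing returns are left out because the decay factor is not stated here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Verdict {
    Confirmed,
    Likely,
    NotPresent,
}

#[derive(Clone, Copy)]
enum Normative {
    Must,
    Should,
    May,
}

// Normative weighting per signal, as listed above.
fn normative_weight(n: Normative) -> f64 {
    match n {
        Normative::Must => 1.0,
        Normative::Should => 0.9,
        Normative::May => 0.75,
    }
}

// Reproducibility weighting for a signal seen stable in N of 3 runs.
fn reproducibility_weight(stable_runs: u8) -> f64 {
    match stable_runs {
        3 => 1.0,
        2 => 0.7,
        1 => 0.25,
        // Assumption: zero or out-of-range counts contribute nothing.
        _ => 0.0,
    }
}

// Hypothetical per-signal score: a base value scaled by both weights.
fn weighted_score(base: f64, n: Normative, stable_runs: u8) -> f64 {
    base * normative_weight(n) * reproducibility_weight(stable_runs)
}

// Verdict thresholds: >=80 Confirmed, >=60 Likely, <60 NotPresent.
fn verdict_for(confidence: f64) -> Verdict {
    if confidence >= 80.0 {
        Verdict::Confirmed
    } else if confidence >= 60.0 {
        Verdict::Likely
    } else {
        Verdict::NotPresent
    }
}
```

Under these assumptions, a Should-level signal with a base of 100 that is stable in 2 of 3 runs scores about 63 and would land at Likely.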

aggregation modifiers

aggregation::* gates downgrade phantom Contradictory outcomes to Inapplicable, with structured reasons, before they reach the verdict. There are three independent gates, and their confidence modifiers (EvidenceModifiers) combine multiplicatively:

  • precondition — auth-gate before technique (401 with no/wrong credential), method-gate (405 before resource lookup), parser-gate (400/422 on non-parser-relevant techniques), and per-technique applicability markers (e.g. If-None-Match requires ETag).
  • surface — mis-surfaced contradictions: when a Status-surface technique fires SameStatus but body/header divergence exceeds the threshold, the real signal is on a different surface and the contradiction is invalid.
  • control — route-mutation control destruction: when a strategy's mutated baseline fails but its unmutated canonical succeeds, the mutation broke routing.
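As a rough illustration of the precondition gate and of multiplicative modifiers, consider the sketch below. It is heavily simplified: the real gate also checks ordering ("before technique", "before resource lookup") and per-technique markers, which a bare status code cannot express. `GateReason`, `precondition_gate`, and the field layout of `EvidenceModifiers` are all assumptions for this example, not the crate's definitions.

```rust
// Hypothetical structured reason attached to an Inapplicable outcome.
#[derive(Debug, PartialEq)]
enum GateReason {
    AuthGate,
    MethodGate,
    ParserGate,
}

// Simplified precondition check based only on the probe's status code.
fn precondition_gate(
    status: u16,
    credential_valid: bool,
    parser_relevant: bool,
) -> Option<GateReason> {
    match status {
        // 401 with no or wrong credential: the auth layer answered first.
        401 if !credential_valid => Some(GateReason::AuthGate),
        // 405: the method was rejected (assumed here to precede resource lookup).
        405 => Some(GateReason::MethodGate),
        // 400/422 on a technique that does not target the parser.
        400 | 422 if !parser_relevant => Some(GateReason::ParserGate),
        _ => None,
    }
}

// Assumed shape: one factor per gate, multiplied into a single modifier.
struct EvidenceModifiers {
    precondition: f64,
    surface: f64,
    control: f64,
}

impl EvidenceModifiers {
    fn combined(&self) -> f64 {
        self.precondition * self.surface * self.control
    }
}
```

Because the factors multiply, any single gate that drives its factor toward zero suppresses the outcome regardless of what the other two report.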

aggregation::reducer is the offline log-odds aggregator. It groups events by (family, polarity), applies polarity-specific diminishing returns, and caps each group at ±0.75. The result is order-invariant: verdicts do not depend on strategy execution order.
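One way such a reducer can be made order-invariant is to sort each group before applying the decay. The sketch below assumes a geometric decay with made-up per-polarity factors (0.5 and 0.25); only the grouping key and the 0.75 cap come from the description above. `Event`, `decay` and `reduce` are hypothetical names.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Polarity {
    Positive,
    Contradictory,
}

// Hypothetical event: one technique's contribution in log-odds units.
struct Event {
    family: &'static str,
    polarity: Polarity,
    magnitude: f64,
}

const GROUP_CAP: f64 = 0.75;

// Assumed polarity-specific decay factors; the real values are not given here.
fn decay(polarity: Polarity) -> f64 {
    match polarity {
        Polarity::Positive => 0.5,
        Polarity::Contradictory => 0.25,
    }
}

fn reduce(events: &[Event]) -> f64 {
    // Group magnitudes by (family, polarity).
    let mut groups: BTreeMap<(&'static str, Polarity), Vec<f64>> = BTreeMap::new();
    for e in events {
        groups
            .entry((e.family, e.polarity))
            .or_default()
            .push(e.magnitude.abs());
    }

    let mut total = 0.0;
    for ((_, polarity), mut mags) in groups {
        // Sorting largest-first removes any dependence on input order.
        mags.sort_by(|a, b| b.total_cmp(a));
        let mut sum = 0.0;
        let mut factor = 1.0;
        for m in mags {
            sum += m * factor;
            factor *= decay(polarity);
        }
        // Cap each group, then apply the sign of its polarity.
        let capped = sum.min(GROUP_CAP);
        total += match polarity {
            Polarity::Positive => capped,
            Polarity::Contradictory => -capped,
        };
    }
    total
}
```

Iterating a `BTreeMap` also fixes the order in which groups are summed, so floating-point results are reproducible across runs.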

aggregation::auth_classifier is the two-stage auth-block classifier (Pass 5a). RFC 9110 §11.6.1 / RFC 6750 §3 challenge parsing, login-redirect heuristics, and bounded JSON/form auth-error body extraction across six families (OriginAuthentication, OriginAuthorization, ProxyAuthentication, NetworkAuthentication, LoginRedirect, AuthErrorEnvelope).

existence oracle

ExistenceAnalyzer implements two detection layers:

  • Layer 1 (code-blind): same status on first sample → outcome routed through aggregation gates; a NotPresent verdict is issued only when the coverage gate is satisfied (≥3 Contradictory techniques and total magnitude ≥ 0.20). Different status → collect up to 3 pairs and check stability.
  • Layer 2 (RFC-informed): classifies stable differentials against a 31-pattern table: 403/404 → Confirmed/High, 409/201 → Confirmed/High, 304/404 → Confirmed/High, 302/404 → Confirmed/Medium, etc. Each pattern carries a label, leaks description, and RFC section. Unrecognized stable differentials → Likely/Low. RedirectDiff probes where neither side is 3xx are dismissed before pattern classification with a "technique did not fire" signal annotation.
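The two layers can be illustrated with a toy lookup. Only the four pairs named above and the fallback are included, not the full 31-pattern table, and the sketch assumes the first status in each pair is the baseline side and the second the probe side. `classify` and `coverage_gate_satisfied` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Verdict {
    Confirmed,
    Likely,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum Severity {
    High,
    Medium,
    Low,
}

// Layer 2 sketch: map a stable (baseline, probe) status pair to a verdict.
// Assumption: pairs are written baseline/probe.
fn classify(baseline: u16, probe: u16) -> (Verdict, Severity) {
    match (baseline, probe) {
        (403, 404) | (409, 201) | (304, 404) => (Verdict::Confirmed, Severity::High),
        (302, 404) => (Verdict::Confirmed, Severity::Medium),
        // Unrecognized stable differential.
        _ => (Verdict::Likely, Severity::Low),
    }
}

// Layer 1 sketch: NotPresent is only allowed once enough techniques disagree.
fn coverage_gate_satisfied(contradictory_techniques: usize, total_magnitude: f64) -> bool {
    contradictory_techniques >= 3 && total_magnitude >= 0.20
}
```

A real table entry would also carry the label, leak description and RFC section mentioned above; those are omitted here to keep the lookup readable.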

adding an oracle

Implement Analyzer for a new oracle class:

pub struct TimingAnalyzer;

impl Analyzer for TimingAnalyzer {
    fn evaluate(&self, data: &DifferentialSet) -> SampleDecision {
        if data.baseline.len() < 30 {
            return SampleDecision::NeedMore;
        }
        // Mann-Whitney U test on timing_ns...
        // (elided: build `result: OracleResult` and `outcome: StrategyOutcome` from the test)
        SampleDecision::Complete(Box::new(result), outcome)
    }

    fn oracle_class(&self) -> OracleClass {
        OracleClass::Timing // once the variant exists
    }

    fn required_samples(&self) -> usize { 30 }
}

The binary's adaptive loop works unchanged — it just calls evaluate() until Complete, regardless of how many samples the analyzer needs.

license

MIT OR Apache-2.0