// cellos_supervisor/dns_proxy/dnssec.rs

//! SEC-21 Phase 3h.1 — SEAM-1 in-netns DNS proxy DNSSEC validation.
//!
//! Closes the dataplane gap left by Phase 3h. The supervisor-side
//! resolver-refresh path (P3h, see [`crate::resolver_refresh::dnssec`])
//! validates DNSSEC for the *observability* signal (drift events). The
//! workload-facing in-netns DNS proxy ([`super`]) — which the workload
//! actually queries — did not validate at all before P3h.1: a workload
//! asking for an allowlisted hostname would receive whatever the upstream
//! returned, including potentially-spoofed unsigned/bogus records.
//!
//! This module adds enforcement on the dataplane.
//!
//! ## Mode mapping
//!
//! The Phase 3h.1 spec uses three logical modes — `require | best_effort
//! | off`. We map onto the existing P3h two-boolean
//! [`cellos_core::DnsResolverDnssecPolicy`] (`validate`, `fail_closed`)
//! without inventing a new spec field:
//!
//! | Logical mode  | `validate` | `fail_closed` | Behaviour                              |
//! |---------------|------------|---------------|----------------------------------------|
//! | `off`         | n/a        | n/a           | `dnssec` block is `None`, OR `validate=false` — validator not constructed at all (`from_authority` returns `Ok(None)`). |
//! | `best_effort` | `true`     | `false`       | Validator runs; bogus → SERVFAIL+event; unsigned → forward (no event). |
//! | `require`     | `true`     | `true`        | Validator runs; bogus → SERVFAIL+event; unsigned → SERVFAIL+event.     |
//!
//! ## Behaviour matrix (load-bearing)
//!
//! For every workload query that passes the proxy's allowlist gate AND has
//! a validator configured:
//!
//! | Mode          | Outcome   | Workload sees     | Event emitted?                                              |
//! |---------------|-----------|-------------------|-------------------------------------------------------------|
//! | `require`     | Validated | Forwarded answer  | No                                                          |
//! | `require`     | Unsigned  | SERVFAIL          | Yes — `reason: "unsigned_in_require_mode"`                  |
//! | `require`     | Failed    | SERVFAIL          | Yes — `reason: "validation_failed"`                         |
//! | `best_effort` | Validated | Forwarded answer  | No                                                          |
//! | `best_effort` | Unsigned  | Forwarded answer  | No (explicit "tolerate unsigned zones")                     |
//! | `best_effort` | Failed    | SERVFAIL          | Yes — `reason: "validation_failed"` (bogus always rejected) |
//! | (any)         | Skip      | (see Skip below)  | (see Skip below)                                            |
//!
//! ### `Skip` outcome (post-A2: residual non-validating types)
//!
//! `validate()` returns `Skip` ONLY for query types this validator cannot
//! evaluate via either backend (A/AAAA via
//! [`crate::resolver_refresh::hickory_resolve::resolve_with_ttl_validated`]
//! or CNAME/HTTPS/SVCB/MX/TXT via the typed-record backend introduced in
//! the A2 slot). After A2 the validator covers seven query types
//! end-to-end; `Skip` is now the residual fallback for types like
//! `NS`, `PTR`, `SRV`, `SOA`, etc., that the proxy's
//! [`super::DEFAULT_QUERY_TYPES`] does NOT admit by default. Operators
//! who widen `allowed_query_types` to include such residuals get the
//! same call-site policy as before:
//!
//! - `require` + `Skip` → SERVFAIL + event
//!   `reason: "unsupported_query_type_in_require_mode"`. Strict reading:
//!   if the operator asked for require, an unvalidatable response is not
//!   acceptable.
//! - `best_effort` + `Skip` → forward unvalidated. Documented honestly as
//!   a known gap — the seven first-class types (A/AAAA/CNAME/HTTPS/SVCB/MX/TXT)
//!   are validated; the residuals pass through.
//!
//! Off-mode cells never construct a validator; the proxy's hot path stays
//! byte-identical to pre-P3h.1.
//!
//! ## Validation strategy — Option B (post-validate after raw forward)
//!
//! The Phase 3h.1 spec offers two implementation options:
//!
//! - **Option A**: hand the workload's query off to a validating hickory
//!   resolver and synthesize the response back to the workload from the
//!   resolver's `Vec<IpAddr>` output.
//! - **Option B**: keep the existing raw forward, then re-resolve the same
//!   `(qname, qtype)` through a validating hickory resolver to produce an
//!   independent verdict, and dispatch on the verdict.
//!
//! **We chose B.** Reasoning:
//!
//! - [`crate::resolver_refresh::hickory_resolve::resolve_with_ttl_validated`]
//!   returns `Vec<String>` IP targets — it discards CNAME chains and any
//!   additional records (HTTPS/SVCB/EDNS) the upstream may have included.
//!   Synthesizing the workload-facing wire response from just IPs would
//!   silently lose information the workload may have asked for.
//! - Option B preserves whatever the upstream actually said (CNAME chain,
//!   answer additionals, the exact RCODE) when the verdict says "allow",
//!   which is the contract the workload had before P3h.1.
//!
//! Cost: each workload query that has a validator triggers two upstream
//! roundtrips (original raw forward + hickory's validation lookup).
//! Acceptable — DNS volume per cell is low and the validator is per-cell.
//! A future hickory release that exposes a public knob to inject the raw
//! upstream answer into the validator (avoiding the second roundtrip) is
//! a clean Option-A migration target.
//!
//! ## Trust anchors — reuse, not copy
//!
//! Trust-anchor loading reuses [`crate::resolver_refresh::dnssec::TrustAnchors`]
//! verbatim — same `O_NOFOLLOW`, same 32 KiB ceiling, same env-var
//! precedence (`CELLOS_DNSSEC_TRUST_ANCHORS_PATH`). The Hickory 0.24
//! limitation persists at this surface too: operator-supplied anchors are
//! observable via `trustAnchorSource` in events but the validator
//! internally uses hickory's bundled IANA defaults (19036 + 20326). When
//! hickory exposes a public anchor-injection knob, both validation
//! surfaces (resolver-refresh and dataplane) gain it together.
//!
//! ## Event emission — additive `source` field
//!
//! Both surfaces emit the same CloudEvent type
//! (`dev.cellos.events.cell.observability.v1.dns_authority_dnssec_failed`)
//! and now stamp an additive `source` field discriminating which surface
//! produced the event:
//!
//! - `source = "resolver_refresh"` — supervisor-side resolver-refresh.
//! - `source = "dataplane"` — in-netns DNS proxy (this module).
//!
//! The field is **optional** in the v1 schema for one cycle so existing
//! emitters that pre-date Phase 3h.1 still validate. The next major
//! schema bump should make this required.

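/*
Illustrative sketch only, not part of this module: the "Mode mapping" table in
the module docs collapses the two policy booleans into three logical modes.
The block below restates that table as a function. `PolicySketch`,
`LogicalMode`, and `logical_mode` are hypothetical names invented for this
example; the real policy type is `cellos_core::DnsResolverDnssecPolicy`, whose
exact definition is not shown here.

```rust
/// Hypothetical stand-in for the two-boolean DNSSEC policy.
#[derive(Debug, Clone, Copy)]
pub struct PolicySketch {
    pub validate: bool,
    pub fail_closed: bool,
}

/// The three logical modes named by the Phase 3h.1 spec.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LogicalMode {
    Off,
    BestEffort,
    Require,
}

/// Map an optional policy block onto a logical mode, row by row.
pub fn logical_mode(policy: Option<PolicySketch>) -> LogicalMode {
    match policy {
        // No `dnssec` block at all: off.
        None => LogicalMode::Off,
        // `validate = false`: off, regardless of `fail_closed`.
        Some(p) if !p.validate => LogicalMode::Off,
        // `validate = true`, `fail_closed = true`: require.
        Some(p) if p.fail_closed => LogicalMode::Require,
        // `validate = true`, `fail_closed = false`: best_effort.
        Some(_) => LogicalMode::BestEffort,
    }
}
```

In the real module the `Off` case corresponds to `from_authority` returning
`Ok(None)`, so no validator object exists for such cells.
*/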
use std::net::SocketAddr;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;

use cellos_core::{CellosError, DnsAuthority, DnsQueryType};
use hickory_proto::dnssec::Proof;
use hickory_resolver::config::{
    ConnectionConfig, NameServerConfig, ProtocolConfig, ResolveHosts, ResolverConfig, ResolverOpts,
};
use hickory_resolver::net::runtime::TokioRuntimeProvider;
use hickory_resolver::net::{DnsError, NetError};
use hickory_resolver::proto::op::ResponseCode;
use hickory_resolver::proto::rr::RecordType;
use hickory_resolver::Resolver;

use super::parser::parse_query;
use crate::resolver_refresh::hickory_resolve::{
    extract_rrsig_metadata, proof_to_validation_result_with_rrsig,
};
use crate::resolver_refresh::DnssecValidationResult;

// Re-export TrustAnchors so callers in this module's neighborhood can
// reach it without naming the resolver_refresh path. This is a re-export
// (not a copy) — there is exactly one TrustAnchors definition in the
// supervisor crate.
pub use crate::resolver_refresh::dnssec::TrustAnchors;

/// Default upstream-validation timeout when [`from_authority`] does not
/// have an operator-configured budget. Matches the
/// [`super::DnsProxyConfig::upstream_timeout`] production default (400ms)
/// so the validator's roundtrip budget aligns with the proxy's own
/// upstream budget — avoids "validator timed out before upstream did"
/// confusion in operator triage.
const DEFAULT_VALIDATION_TIMEOUT: Duration = Duration::from_millis(400);

/// Outcome of one dataplane DNSSEC validation attempt.
///
/// Matches the P3h vocabulary [`DnssecValidationResult`] one-for-one
/// (`Validated` / `Unsigned` / `Failed`) plus a fourth `Skip` variant for
/// query types this validator cannot evaluate (see module docs).
///
/// Post-A2: `Skip` is returned by [`DataplaneDnssecValidator::validate`]
/// for query types outside the validator's first-class set
/// `{A, AAAA, CNAME, HTTPS, SVCB, MX, TXT}`. The proxy call site decides
/// what to do with it based on `fail_closed` (require vs best_effort) —
/// see the behaviour matrix in module docs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DataplaneDnssecOutcome {
    /// Backend was not dispatched because the query type is outside the
    /// validator's first-class set (post-A2: A/AAAA/CNAME/HTTPS/SVCB/MX/TXT).
    /// Caller dispatches based on policy: in `require` mode this
    /// SERVFAILs with `unsupported_query_type_in_require_mode`; in
    /// `best_effort` it forwards unvalidated.
    Skip,
    /// Chain of trust established back to the configured anchor set.
    Validated,
    /// Resolver returned answers but the zone is not signed. Operator
    /// policy (`fail_closed`) decides allow/deny.
    Unsigned,
    /// Validator rejected the chain — RRSIG missing, signature bogus, or
    /// chained to a key not in the configured trust anchors. `reason` is
    /// a stable static string suitable for the
    /// `dns_authority_dnssec_failed` event's `reason` field.
    Failed {
        /// Stable static reason string. Always one of the schema enum
        /// values: `"validation_failed"`, `"unsigned_in_require_mode"`,
        /// `"unsupported_query_type_in_require_mode"`, or
        /// `"trust_anchor_missing"`.
        reason: &'static str,
    },
}
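
/*
Illustrative sketch only, not part of this module: the behaviour matrix in the
module docs says the proxy call site turns an outcome plus the require /
best_effort flag into either "forward the upstream answer" or "SERVFAIL and
emit an event". The block below is one way that decision could be written.
`OutcomeSketch`, `ProxyAction`, and `decide` are hypothetical names; the real
call site lives in the proxy and is not shown in this file.

```rust
/// Hypothetical mirror of the four validation outcomes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OutcomeSketch {
    Skip,
    Validated,
    Unsigned,
    Failed { reason: &'static str },
}

/// What the workload ends up seeing. In the matrix every `Servfail`
/// row also emits a `dns_authority_dnssec_failed` event.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ProxyAction {
    Forward,
    Servfail { reason: &'static str },
}

/// Apply the behaviour matrix for one validated query.
pub fn decide(require_mode: bool, outcome: &OutcomeSketch) -> ProxyAction {
    match (outcome, require_mode) {
        // A validated chain is always forwarded.
        (OutcomeSketch::Validated, _) => ProxyAction::Forward,
        // Bogus answers are rejected in both modes.
        (OutcomeSketch::Failed { reason }, _) => ProxyAction::Servfail { reason: *reason },
        // Unsigned zones: rejected only in require mode.
        (OutcomeSketch::Unsigned, true) => ProxyAction::Servfail {
            reason: "unsigned_in_require_mode",
        },
        (OutcomeSketch::Unsigned, false) => ProxyAction::Forward,
        // Unvalidatable query types: rejected only in require mode.
        (OutcomeSketch::Skip, true) => ProxyAction::Servfail {
            reason: "unsupported_query_type_in_require_mode",
        },
        (OutcomeSketch::Skip, false) => ProxyAction::Forward,
    }
}
```

The reason strings are the ones listed in the module docs; everything else in
the sketch is an assumption about shape, not a description of the proxy code.
*/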

/// Pluggable A/AAAA backend used by [`DataplaneDnssecValidator`].
///
/// Production wiring constructs a backend that calls
/// [`crate::resolver_refresh::resolve_with_ttl_validated`] inside a
/// captured tokio runtime handle. Unit tests construct synthetic
/// backends that return canned outcomes — this is what makes
/// `validate()` testable without standing up a real DNSSEC upstream.
///
/// The backend receives `(qname, qtype)` and returns a
/// [`DnssecValidationResult`] (the P3h vocabulary). The validator maps
/// that to a [`DataplaneDnssecOutcome`].
///
/// `Send + Sync` because the validator is held behind an `Arc` and
/// shared between the supervisor task and the proxy thread.
pub type DataplaneDnssecBackend =
    dyn Fn(&str, u16) -> std::io::Result<DnssecValidationResult> + Send + Sync;

/// A2 — pluggable typed-record backend used by
/// [`DataplaneDnssecValidator`] for the non-A/AAAA first-class types
/// (CNAME / HTTPS / SVCB / MX / TXT).
///
/// Distinct from [`DataplaneDnssecBackend`] because it dispatches the
/// hickory generic `Resolver::lookup(name, RecordType)` API rather than
/// the `lookup_ip` strategy `resolve_with_ttl_validated` rides on; it
/// does not need to merge A+AAAA answer sets, and the answer records
/// inspected for `Proof` are the per-type set hickory returns rather
/// than the IPv4+IPv6 union.
///
/// Production wiring builds this on top of an internal helper that
/// constructs a fresh validating resolver per call (mirroring
/// `resolve_with_ttl_validated`'s build pattern) and walks the answer
/// records to produce the `DnssecValidationResult` verdict. Tests
/// supply canned outcomes via [`DataplaneDnssecValidator::with_backends`].
///
/// Returns the same [`DnssecValidationResult`] taxonomy as the A/AAAA
/// path so the existing `Validated → Validated`, `Unsigned → Unsigned`,
/// `Failed → Failed{validation_failed}` mapping inside `validate()`
/// applies uniformly. An I/O error from the resolver fail-safes to
/// `Failed{validation_failed}` (require-mode contract: no answer unless
/// the validator could establish a chain — extending to "or unless we
/// couldn't even ask").
pub type DataplaneTypedDnssecBackend =
    dyn Fn(&str, RecordType) -> std::io::Result<DnssecValidationResult> + Send + Sync;
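
/*
Illustrative sketch only, not part of this module: both backend aliases above
are `dyn Fn(..) -> io::Result<..> + Send + Sync` trait objects held behind an
`Arc`, which is what lets tests swap in closures that return canned verdicts.
The block below shows that pattern with stand-in types. `VerdictSketch`,
`BackendSketch`, `canned_backend`, and `failing_backend` are hypothetical
names; the real verdict type is `DnssecValidationResult`, whose variants carry
fields not reproduced here.

```rust
use std::io;
use std::sync::Arc;

/// Hypothetical, field-free stand-in for the validation verdict.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum VerdictSketch {
    Validated,
    Unsigned,
    Failed,
}

/// Same shape as the A/AAAA backend alias: (qname, qtype) -> verdict.
pub type BackendSketch = dyn Fn(&str, u16) -> io::Result<VerdictSketch> + Send + Sync;

/// Backend that ignores its input and always returns `verdict`.
pub fn canned_backend(verdict: VerdictSketch) -> Arc<BackendSketch> {
    Arc::new(move |_qname: &str, _qtype: u16| -> io::Result<VerdictSketch> {
        Ok(verdict.clone())
    })
}

/// Backend that simulates a transport failure (for example a timeout).
pub fn failing_backend() -> Arc<BackendSketch> {
    Arc::new(|_qname: &str, _qtype: u16| -> io::Result<VerdictSketch> {
        Err(io::Error::other("simulated upstream timeout"))
    })
}
```

A test would hand such an `Arc` to a constructor like `with_backend` and then
assert on the outcome `validate()` produces for each canned verdict.
*/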

/// Dataplane DNSSEC validator handed to the proxy via
/// [`super::DnsProxyConfig::dnssec_validator`].
///
/// Construct via [`DataplaneDnssecValidator::from_authority`] in
/// production (returns `Ok(None)` for `mode = off` so the proxy hot
/// path is unchanged), or via the test-only
/// [`DataplaneDnssecValidator::with_backend`] for unit tests.
pub struct DataplaneDnssecValidator {
    /// `true` when the cell's DnsAuthority requested `require` mode
    /// (validate=true, fail_closed=true). Drives the SERVFAIL decision
    /// for `Unsigned` and `Skip` outcomes.
    fail_closed: bool,
    /// Source descriptor for trust anchors stamped into events
    /// (`"iana-default"` or the basename of an operator-supplied file).
    trust_anchor_source: String,
    /// A/AAAA backend callable. In production this captures a tokio
    /// runtime handle and calls
    /// [`crate::resolver_refresh::resolve_with_ttl_validated`]. In tests
    /// this is a closure returning a canned [`DnssecValidationResult`].
    backend: Arc<DataplaneDnssecBackend>,
    /// A2 — typed-record backend for CNAME / HTTPS / SVCB / MX / TXT.
    /// `None` preserves the pre-A2 behaviour (these types Skip; the
    /// proxy call site dispatches based on require vs best_effort).
    /// `Some(_)` engages per-type validation: CNAME/HTTPS/SVCB/MX/TXT
    /// queries get a real Validated/Unsigned/Failed verdict instead of
    /// the placeholder `unsupported_query_type_in_require_mode`
    /// rejection. Production [`Self::from_authority`] populates this
    /// for every `validate=true` policy; tests can supply a synthetic
    /// implementation via [`Self::with_backends`].
    typed_backend: Option<Arc<DataplaneTypedDnssecBackend>>,
}

impl std::fmt::Debug for DataplaneDnssecValidator {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("DataplaneDnssecValidator")
            .field("fail_closed", &self.fail_closed)
            .field("trust_anchor_source", &self.trust_anchor_source)
            .field("backend", &"<Arc<dyn Fn>>")
            .field(
                "typed_backend",
                &self.typed_backend.as_ref().map(|_| "<Arc<dyn Fn>>"),
            )
            .finish()
    }
}

impl DataplaneDnssecValidator {
    /// SEC-21 Phase 3h.1 — construct a validator from a parsed
    /// [`DnsAuthority`].
    ///
    /// Returns:
    /// - `Ok(None)` when the first declared resolver has no `dnssec`
    ///   block or has `dnssec.validate = false` — i.e. the cell opted
    ///   out (`mode = off`). The proxy hot path stays byte-identical
    ///   to today.
    /// - `Ok(Some(..))` when `dnssec.validate = true`. The validator is
    ///   ready to evaluate workload queries. `fail_closed` is taken
    ///   from `dnssec.fail_closed` (true → require, false → best_effort).
    /// - `Err(CellosError::InvalidSpec)` when `dnssec.trust_anchors_path`
    ///   is set but the file is missing, symlinked, or oversized — the
    ///   trust-anchor loader (reused from
    ///   [`crate::resolver_refresh::dnssec::TrustAnchors`]) refuses to
    ///   construct the validator silently. Surfacing this at activation
    ///   time prevents a cell from running with a broken validator and
    ///   a half-enforced policy.
    ///
    /// **Resolver selection**: the first entry of `auth.resolvers` is
    /// consulted; this matches the proxy's existing
    /// [`super::DnsProxyConfig::upstream_resolver_id`] convention (the
    /// proxy talks to one upstream resolver per cell). Cells with multiple
    /// resolvers and per-resolver DNSSEC policies are out of scope for
    /// P3h.1.
    ///
    /// **Tokio context**: this constructor does NOT call `Handle::current()`.
    /// The production backends look up the runtime handle with
    /// `Handle::try_current()` inside each `validate()` call. This keeps
    /// `from_authority` callable from unit tests that don't have a
    /// runtime, while still letting production (which always has a
    /// runtime) drive hickory.
    pub fn from_authority(auth: &DnsAuthority) -> Result<Option<Self>, CellosError> {
        let Some(resolver) = auth.resolvers.first() else {
            return Ok(None);
        };
        let Some(policy) = resolver.dnssec.as_ref() else {
            return Ok(None);
        };
        if !policy.validate {
            return Ok(None);
        }

        // Trust-anchor loading — surfaces operator-misconfiguration as
        // an InvalidSpec error rather than a silent degrade. Reuses the
        // P3h loader (O_NOFOLLOW + 32 KiB ceiling) verbatim.
        let anchors = TrustAnchors::load(policy.trust_anchors_path.as_deref())?;
        let trust_anchor_source = anchors.source.clone();

        // Parse the upstream resolver's endpoint into a SocketAddr for
        // hickory. Phase 3h.1 only supports do53-udp/tcp resolvers (the
        // dataplane proxy itself only does do53). Other protocols would
        // require additional hickory configuration and are out of scope.
        let upstream_addr: SocketAddr = resolver.endpoint.parse().map_err(|e| {
            CellosError::InvalidSpec(format!(
                "dns_proxy::dnssec: cannot parse resolver.endpoint '{}' as SocketAddr: {e}",
                resolver.endpoint
            ))
        })?;

        let anchors = Arc::new(anchors);
        let backend =
            build_hickory_backend(upstream_addr, DEFAULT_VALIDATION_TIMEOUT, anchors.clone());
        let typed_backend =
            build_hickory_typed_backend(upstream_addr, DEFAULT_VALIDATION_TIMEOUT, anchors);

        Ok(Some(Self {
            fail_closed: policy.fail_closed,
            trust_anchor_source,
            backend,
            typed_backend: Some(typed_backend),
        }))
    }

    /// Test-only constructor — builds a validator with a synthetic
    /// A/AAAA backend ONLY. Used by integration tests in
    /// `tests/dataplane_dnssec_validation.rs` to drive the
    /// `Validated/Unsigned/Failed` matrix without standing up a real
    /// DNSSEC-signed upstream.
    ///
    /// Validators built via this constructor have NO typed-record backend
    /// (`typed_backend: None`), preserving pre-A2 Skip behaviour for
    /// CNAME/HTTPS/SVCB/MX/TXT — the existing P3h.1 test matrix that
    /// asserts `Skip + require → SERVFAIL` against a TXT query continues
    /// to fire the same code path it always has.
    ///
    /// For tests that need to exercise the typed-record validator path,
    /// use [`Self::with_backends`].
    ///
    /// Production callers MUST go through [`Self::from_authority`].
    #[doc(hidden)]
    pub fn with_backend(
        fail_closed: bool,
        trust_anchor_source: String,
        backend: Arc<DataplaneDnssecBackend>,
    ) -> Self {
        Self {
            fail_closed,
            trust_anchor_source,
            backend,
            typed_backend: None,
        }
    }

    /// A2 test-only constructor — builds a validator with both an A/AAAA
    /// backend AND a typed-record backend. The typed backend is invoked
    /// for CNAME / HTTPS / SVCB / MX / TXT queries; the A/AAAA backend
    /// for A and AAAA. All other query types still Skip.
    ///
    /// This shape is what the per-type integration tests at
    /// `tests/dataplane_dnssec_per_type_validation.rs` use to drive the
    /// post-A2 `Validated/Unsigned/Failed` matrix per type without
    /// standing up five real DNSSEC-signed zones.
    ///
    /// Production callers MUST go through [`Self::from_authority`].
    #[doc(hidden)]
    pub fn with_backends(
        fail_closed: bool,
        trust_anchor_source: String,
        backend: Arc<DataplaneDnssecBackend>,
        typed_backend: Arc<DataplaneTypedDnssecBackend>,
    ) -> Self {
        Self {
            fail_closed,
            trust_anchor_source,
            backend,
            typed_backend: Some(typed_backend),
        }
    }

    /// `true` when the validator was constructed in `require` mode
    /// (cell's DnsAuthority set `dnssec.fail_closed = true`). Drives the
    /// proxy's SERVFAIL decision for `Unsigned` and `Skip` outcomes.
    #[must_use]
    pub fn is_require_mode(&self) -> bool {
        self.fail_closed
    }

    /// Trust-anchor source descriptor stamped into emitted events
    /// (`"iana-default"` or the basename of an operator-supplied file).
    #[must_use]
    pub fn trust_anchor_source(&self) -> &str {
        &self.trust_anchor_source
    }

    /// Validate a single workload query.
    ///
    /// `query` is the raw UDP payload the workload sent. `_upstream_answer`
    /// is the raw upstream response — **ignored under Option B**: the
    /// validator re-resolves the same `(qname, qtype)` through hickory
    /// to produce an independent verdict (see module docs for why we
    /// chose Option B over Option A).
    ///
    /// The parameter is plumbed for forward-compat to a future Option-A
    /// migration that injects the raw answer into hickory's validator
    /// (when hickory exposes such a knob).
    ///
    /// ## Type dispatch (post-A2)
    ///
    /// `validate()` switches over the parsed `qtype`:
    ///
    /// | Query type                  | Backend dispatched                                                     |
    /// |-----------------------------|------------------------------------------------------------------------|
    /// | A, AAAA                     | `self.backend` (hickory `lookup_ip` strategy)                          |
    /// | CNAME, HTTPS, SVCB, MX, TXT | `self.typed_backend` (hickory generic `lookup`)                        |
    /// | All others                  | `Skip` (call site policy: SERVFAIL in require, forward in best_effort) |
    ///
    /// When `self.typed_backend` is `None` (validators built via the
    /// pre-A2 [`Self::with_backend`] test constructor), CNAME/HTTPS/SVCB/MX/TXT
    /// also Skip — preserving backward compatibility with the existing
    /// P3h.1 test surface that asserts `TXT + require → SERVFAIL`.
    ///
    /// ## Outcomes
    ///
    /// - [`DataplaneDnssecOutcome::Skip`] for query types outside the
    ///   first-class set, for CNAME/HTTPS/SVCB/MX/TXT when the
    ///   typed-record backend is absent, or (defensively) when the query
    ///   bytes fail to parse. Caller policy decides what to do (see
    ///   module docs).
    /// - [`DataplaneDnssecOutcome::Validated`] when hickory's bundled
    ///   validator chained the response back to the configured anchors.
    /// - [`DataplaneDnssecOutcome::Unsigned`] when the zone has no
    ///   DNSSEC chain.
    /// - [`DataplaneDnssecOutcome::Failed`] when the validator rejected
    ///   the chain (RRSIG missing, signature bogus, or chained to a
    ///   non-anchor key) OR the backend returned an I/O error (network
    ///   timeout / SERVFAIL from upstream — fail-safe to Failed).
    pub fn validate(&self, query: &[u8], _upstream_answer: &[u8]) -> DataplaneDnssecOutcome {
        // Parse just enough to extract qname + qtype. If the query is
        // malformed, the proxy already dropped it before calling us;
        // defensive `Skip` if we somehow get here with garbage.
        let view = match parse_query(query) {
            Ok(v) => v,
            Err(_) => return DataplaneDnssecOutcome::Skip,
        };

        // Switch over the typed query enum. A/AAAA dispatch onto the
        // legacy `lookup_ip`-based backend; CNAME/HTTPS/SVCB/MX/TXT
        // dispatch onto the typed-record backend (post-A2). Everything
        // else Skips — call site policy decides.
        let typed = cellos_core::qtype_to_dns_query_type(view.qtype);
        let dispatch = match typed {
            Some(DnsQueryType::A) | Some(DnsQueryType::AAAA) => Dispatch::Legacy,
            Some(DnsQueryType::CNAME) => Dispatch::Typed(RecordType::CNAME),
            Some(DnsQueryType::HTTPS) => Dispatch::Typed(RecordType::HTTPS),
            Some(DnsQueryType::SVCB) => Dispatch::Typed(RecordType::SVCB),
            Some(DnsQueryType::MX) => Dispatch::Typed(RecordType::MX),
            Some(DnsQueryType::TXT) => Dispatch::Typed(RecordType::TXT),
            _ => return DataplaneDnssecOutcome::Skip,
        };

        let result = match dispatch {
            Dispatch::Legacy => (self.backend)(&view.qname, view.qtype),
            Dispatch::Typed(rtype) => match self.typed_backend.as_ref() {
                Some(typed_backend) => (typed_backend)(&view.qname, rtype),
                // Pre-A2 shape: no typed backend wired (test constructor
                // `with_backend` or future operator-toggle). Preserve
                // legacy Skip behaviour so the call site policy still
                // decides SERVFAIL vs forward.
                None => return DataplaneDnssecOutcome::Skip,
            },
        };

        match result {
            Ok(DnssecValidationResult::Validated { .. }) => DataplaneDnssecOutcome::Validated,
            Ok(DnssecValidationResult::Unsigned) => DataplaneDnssecOutcome::Unsigned,
            Ok(DnssecValidationResult::Failed { .. }) => DataplaneDnssecOutcome::Failed {
                reason: "validation_failed",
            },
            // Fail-safe: any I/O error from the validating resolver
            // (timeout, transport, refused) is treated as a validation
            // failure rather than passing through the unvalidated
            // upstream answer. The require-mode contract is "no answer
            // unless validated" — extending that to "or unless we
            // couldn't even ask" is the only safe choice. best_effort
            // gets the same SERVFAIL because the validator's verdict
            // is "couldn't establish chain".
            Err(_) => DataplaneDnssecOutcome::Failed {
                reason: "validation_failed",
            },
        }
    }
}
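
/*
Illustrative sketch only, not part of this module: `validate()` picks a backend
from the parsed query type and falls back to `Skip` when the type is outside
the first-class set or when no typed backend is wired. The block below restates
that dispatch over raw DNS type codes. `DispatchSketch` and
`dispatch_for_qtype` are hypothetical names; the real code goes through
`cellos_core::qtype_to_dns_query_type` and hickory's `RecordType` instead of
matching on numbers. The numeric values are the IANA RR type codes
(A=1, CNAME=5, MX=15, TXT=16, AAAA=28, SVCB=64, HTTPS=65).

```rust
/// Hypothetical mirror of the internal dispatch decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DispatchSketch {
    /// A / AAAA: the `lookup_ip`-style backend.
    Legacy,
    /// CNAME / HTTPS / SVCB / MX / TXT: the typed-record backend.
    Typed(u16),
    /// Anything the validator cannot evaluate.
    Skip,
}

/// Choose a backend for a raw DNS query type code.
pub fn dispatch_for_qtype(qtype: u16, typed_backend_present: bool) -> DispatchSketch {
    match qtype {
        // A, AAAA
        1 | 28 => DispatchSketch::Legacy,
        // CNAME, MX, TXT, SVCB, HTTPS: only when a typed backend exists
        5 | 15 | 16 | 64 | 65 if typed_backend_present => DispatchSketch::Typed(qtype),
        // Residual types (NS, PTR, SRV, SOA, ...) and the pre-A2 shape
        _ => DispatchSketch::Skip,
    }
}
```

With `typed_backend_present = false` a TXT query (type 16) falls through to
`Skip`, which matches the pre-A2 behaviour described for `with_backend`.
*/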

/// Internal dispatch tag — which backend `validate()` should call.
/// Doesn't escape this module.
enum Dispatch {
    /// A/AAAA — hickory `lookup_ip` strategy via `self.backend`.
    Legacy,
    /// Typed record (CNAME/HTTPS/SVCB/MX/TXT) — hickory generic
    /// `lookup(name, RecordType)` via `self.typed_backend`.
    Typed(RecordType),
}

/// Build the production A/AAAA backend — captures the upstream addr,
/// timeout, and trust anchors and dispatches each call onto the current
/// tokio runtime via `Handle::block_on`.
///
/// On each invocation, the closure looks up the ambient runtime with
/// `tokio::runtime::Handle::try_current()`. Production call sites are
/// inside the proxy thread which runs after the supervisor's tokio
/// runtime is alive, so the lookup succeeds. If somehow called outside
/// a runtime (programming error), the backend returns an `io::Error`,
/// which the validator maps to `Failed { reason: "validation_failed" }`,
/// fail-closed.
fn build_hickory_backend(
    upstream: SocketAddr,
    timeout: Duration,
    anchors: Arc<TrustAnchors>,
) -> Arc<DataplaneDnssecBackend> {
    Arc::new(move |hostname: &str, _qtype: u16| {
        let handle = tokio::runtime::Handle::try_current().map_err(|e| {
            std::io::Error::other(format!(
                "dns_proxy::dnssec backend: no tokio runtime in scope: {e}"
            ))
        })?;
        let anchors = anchors.clone();
        let hostname = hostname.to_string();
        let result = handle.block_on(async move {
            crate::resolver_refresh::resolve_with_ttl_validated(
                &hostname, upstream, timeout, &anchors,
            )
            .await
        })?;
        Ok(result.validation)
    })
}

/// A2 — build the production typed-record backend for
/// CNAME / HTTPS / SVCB / MX / TXT validation.
///
/// Mirrors [`build_hickory_backend`]'s shape (capture addr/timeout/anchors,
/// dispatch onto current tokio runtime) but uses hickory's generic
/// `Resolver::lookup(name, RecordType)` API rather than `lookup_ip`.
fn build_hickory_typed_backend(
    upstream: SocketAddr,
    timeout: Duration,
    anchors: Arc<TrustAnchors>,
) -> Arc<DataplaneTypedDnssecBackend> {
    Arc::new(move |hostname: &str, record_type: RecordType| {
        let handle = tokio::runtime::Handle::try_current().map_err(|e| {
            std::io::Error::other(format!(
                "dns_proxy::dnssec typed_backend: no tokio runtime in scope: {e}"
            ))
        })?;
        let anchors = anchors.clone();
        let hostname = hostname.to_string();
        handle.block_on(async move {
            resolve_typed_validated(&hostname, record_type, upstream, timeout, &anchors).await
        })
    })
}
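
/*
Illustrative sketch only, not part of this module: the typed-record helper
`resolve_typed_validated` is documented as walking the per-record proofs of an
answer set and reporting the worst case, with an empty answer set treated as
indeterminate. The block below shows one way such a fold could look.
`ProofSketch`, `TypedVerdictSketch`, and `verdict_from_proofs` are hypothetical
names standing in for hickory's `Proof`, `DnssecValidationResult`, and the
module's own `worst_proof_local`, none of which are reproduced here. The
relative ranking of Bogus over Indeterminate when both appear is an assumption
of this sketch; the source does not state it.

```rust
/// Hypothetical stand-in for a per-record DNSSEC proof state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProofSketch {
    Secure,
    Insecure,
    Bogus,
    Indeterminate,
}

/// Hypothetical stand-in for the verdict returned to the validator.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TypedVerdictSketch {
    Validated,
    Unsigned,
    Failed { reason: &'static str },
}

/// Fold per-record proofs into a single worst-case verdict.
pub fn verdict_from_proofs(proofs: &[ProofSketch]) -> TypedVerdictSketch {
    if proofs.is_empty() {
        // No answers: nothing established a chain of trust.
        return TypedVerdictSketch::Failed {
            reason: "validation_indeterminate",
        };
    }
    if proofs.contains(&ProofSketch::Bogus) {
        // Assumption: Bogus outranks Indeterminate when both are present.
        TypedVerdictSketch::Failed {
            reason: "validation_failed",
        }
    } else if proofs.contains(&ProofSketch::Indeterminate) {
        TypedVerdictSketch::Failed {
            reason: "validation_indeterminate",
        }
    } else if proofs.contains(&ProofSketch::Insecure) {
        // At least one unsigned record, none rejected.
        TypedVerdictSketch::Unsigned
    } else {
        // Every record proved Secure.
        TypedVerdictSketch::Validated
    }
}
```

Note that `validate()` flattens every failed verdict to the single reason
`"validation_failed"`, so the distinction between the two failure reasons only
matters to callers that see the backend result directly.
*/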
592
593/// A2 — async typed-record validating resolution helper.
594///
595/// Builds a fresh validating resolver per call (the same pattern
596/// [`crate::resolver_refresh::resolve_with_ttl_validated`] uses for
597/// A/AAAA) and issues a generic `lookup(name, record_type)`. Walks the
598/// answer records' per-record [`Proof`] to derive the worst-case
599/// verdict, then maps to [`DnssecValidationResult`] via the same
600/// [`proof_to_validation_result_with_rrsig`] mapper.
601///
602/// Returns:
603/// - `Ok(DnssecValidationResult::Validated { .. })` when every record
604///   in the answer set proved Secure. RRSIG metadata is extracted via
605///   [`extract_rrsig_metadata`] when present; the documented placeholder
606///   (`algorithm: "unknown"`, `key_tag: 0`) is stamped otherwise.
607/// - `Ok(DnssecValidationResult::Unsigned)` when at least one record
608///   was Insecure (and none Bogus or Indeterminate).
609/// - `Ok(DnssecValidationResult::Failed { reason })` when any record
610///   was Bogus or Indeterminate (the "validator did not say Secure"
611///   axis). Maps Bogus → `validation_failed`, Indeterminate →
612///   `validation_indeterminate` per the canonical mapper.
/// - `Err(io::Error)` for transport-class failures (timeout, SERVFAIL,
///   network unreachable). The validator's call site fails safe by
///   mapping any I/O error to `Failed { reason: "validation_failed" }`.
///
/// NXDOMAIN / NOERROR-empty: a query with no answers comes back as
/// `Failed { reason: "validation_indeterminate" }`. The resolver
/// usually reports these as a `NoRecordsFound` error, which the lookup
/// error arm maps directly; a successful lookup with an empty answer
/// set reaches the same verdict because [`worst_proof_local`] returns
/// `Indeterminate` for an empty record set. This is fail-safe per the
/// require-mode contract: an unanswerable query is not a
/// chain-of-trust establishment.
622async fn resolve_typed_validated(
623    hostname: &str,
624    record_type: RecordType,
625    upstream: SocketAddr,
626    timeout: Duration,
627    trust_anchors: &TrustAnchors,
628) -> std::io::Result<DnssecValidationResult> {
629    let mut config = ResolverConfig::from_parts(None, Vec::new(), Vec::new());
630    config.add_name_server(build_typed_nameserver_config(upstream));
631
632    let mut opts = ResolverOpts::default();
633    opts.cache_size = 0;
634    opts.attempts = 1;
635    opts.timeout = timeout;
636    opts.use_hosts_file = ResolveHosts::Never;
637    opts.edns0 = false;
638    opts.validate = true;
639
640    if let Some(path) = trust_anchors.path() {
641        opts.trust_anchor = Some(PathBuf::from(path));
642    }
643
644    let mut builder =
645        Resolver::builder_with_config(config, TokioRuntimeProvider::default()).with_options(opts);
646
647    if let Some(path) = trust_anchors.path() {
648        match hickory_proto::dnssec::TrustAnchors::from_file(path) {
649            Ok(loaded) => {
650                builder = builder.with_trust_anchor(std::sync::Arc::new(loaded));
651            }
652            Err(e) => {
653                return Err(std::io::Error::other(format!(
654                    "dns_proxy::dnssec typed_backend: trust anchor parse failed for {}: {e}",
655                    path.display()
656                )));
657            }
658        }
659    }
660
661    let resolver = builder.build().map_err(|e| {
662        std::io::Error::other(format!(
663            "dns_proxy::dnssec typed_backend: hickory-resolver build (validating): {e}"
664        ))
665    })?;
666
    let work = async {
        let lookup = match resolver.lookup(hostname, record_type).await {
            Ok(l) => l,
            Err(e) => {
                // NXDOMAIN / NOERROR-empty means "no answer", not a
                // transport failure: report it as indeterminate.
                if matches!(
                    no_records_response_code(&e),
                    Some(ResponseCode::NXDomain | ResponseCode::NoError)
                ) {
                    return Ok(DnssecValidationResult::Failed {
                        reason: "validation_indeterminate".to_string(),
                    });
                }
                return Err(map_typed_net_error(e));
            }
        };

        // Fold the per-record proofs into one worst-case verdict.
        let answers = lookup.answers();
        let proof = worst_proof_local(answers);
        let rrsig_metadata = extract_rrsig_metadata(answers, &[record_type]);
        Ok(proof_to_validation_result_with_rrsig(proof, rrsig_metadata))
    };
688
689    match tokio::time::timeout(timeout, work).await {
690        Ok(inner) => inner,
691        Err(_) => Err(std::io::Error::new(
692            std::io::ErrorKind::TimedOut,
693            format!(
694                "dns_proxy::dnssec typed_backend: hickory-resolver timed out after {timeout:?} for {hostname} {record_type:?}"
695            ),
696        )),
697    }
698}
699
700/// A2 — build a `NameServerConfig` for the typed-record validating
701/// resolver. Local copy because the helper in `hickory_resolve.rs` is
702/// private to the resolver_refresh module.
703fn build_typed_nameserver_config(upstream: SocketAddr) -> NameServerConfig {
704    let mut udp = ConnectionConfig::new(ProtocolConfig::Udp);
705    udp.port = upstream.port();
706    let mut tcp = ConnectionConfig::new(ProtocolConfig::Tcp);
707    tcp.port = upstream.port();
708    NameServerConfig::new(upstream.ip(), true, vec![udp, tcp])
709}
710
711/// A2 — local copy of `worst_proof` from `hickory_resolve.rs` (private
712/// there). Combines per-record `Proof` flags into a single outcome
713/// using the same conservative ordering: any `Bogus` poisons the set,
714/// then `Indeterminate`, then `Insecure`; only when every record is
715/// `Secure` does the set come back `Secure`. An empty answer set
716/// returns `Indeterminate`.
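///
/// Illustrative precedence, derived from the match arms below (a
/// sketch for readers, not a doctest):
///
/// ```text
/// [Secure, Secure]                 -> Secure
/// [Secure, Insecure]               -> Insecure
/// [Insecure, Indeterminate]        -> Indeterminate
/// [Secure, Indeterminate, Bogus]   -> Bogus
/// []                               -> Indeterminate
/// ```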
717fn worst_proof_local(records: &[hickory_resolver::proto::rr::Record]) -> Proof {
718    let mut have_any = false;
719    let mut all_secure = true;
720    let mut any_insecure = false;
721    let mut any_indeterminate = false;
722    for record in records {
723        have_any = true;
724        match record.proof {
725            Proof::Bogus => return Proof::Bogus,
726            Proof::Indeterminate => {
727                any_indeterminate = true;
728                all_secure = false;
729            }
730            Proof::Insecure => {
731                any_insecure = true;
732                all_secure = false;
733            }
734            Proof::Secure => {}
735        }
736    }
737    if !have_any {
738        return Proof::Indeterminate;
739    }
740    if all_secure {
741        Proof::Secure
742    } else if any_indeterminate {
743        Proof::Indeterminate
744    } else if any_insecure {
745        Proof::Insecure
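    } else {
        // Unreachable for a non-empty set: if not every record is Secure,
        // at least one Indeterminate or Insecure flag is set above. Kept
        // as a conservative default.
        Proof::Indeterminate
    }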
749}
750
751/// A2 — local copy of `no_records_response_code` from `hickory_resolve.rs`.
752fn no_records_response_code(e: &NetError) -> Option<ResponseCode> {
753    match e {
754        NetError::Dns(DnsError::NoRecordsFound(no_records)) => Some(no_records.response_code),
755        _ => None,
756    }
757}
758
759/// A2 — local copy of `map_net_error` from `hickory_resolve.rs`.
760fn map_typed_net_error(e: NetError) -> std::io::Error {
761    let kind = match &e {
762        NetError::Timeout => std::io::ErrorKind::TimedOut,
763        NetError::Io(io_err) => io_err.kind(),
764        _ => std::io::ErrorKind::Other,
765    };
766    std::io::Error::new(
767        kind,
768        format!("dns_proxy::dnssec typed_backend: hickory-resolver error: {e}"),
769    )
770}
771
772#[cfg(test)]
773mod tests {
774    use super::*;
775    use crate::resolver_refresh::ENV_TRUST_ANCHORS_PATH;
776    use cellos_core::{DnsResolver, DnsResolverDnssecPolicy, DnsResolverProtocol};
777    use tempfile::tempdir;
778
779    /// T2-5 — Per-test env-var guard.
780    ///
781    /// `TrustAnchors::load` (the loader behind `from_authority` below)
782    /// gives `CELLOS_DNSSEC_TRUST_ANCHORS_PATH` precedence over the spec
783    /// path. Under `cargo test` with `--test-threads >= 2`, a *different*
784    /// test in the same process — e.g.
785    /// `resolver_refresh::dnssec::tests::loads_path_with_o_nofollow_unix`
786    /// — can `set_var` the env var while one of these path-rejection
787    /// tests is mid-flight. `std::env::set_var` is process-global, so
788    /// the dns_proxy test then sees a *valid* env-supplied trust-anchor
789    /// path, `from_authority` succeeds, and the rejection assertion
790    /// fails. That is the race the cleanup-bundle-v1 / T2-5 entry calls
791    /// out as the "shared-tempdir" symptom.
792    ///
793    /// Mirrors the `EnvGuard` pattern in
794    /// `crate::resolver_refresh::dnssec::tests`: clear on construction,
795    /// restore on drop. Tests that route through `from_authority` MUST
796    /// hold one of these for the duration of the assertion.
797    struct EnvGuard {
798        prior: Option<String>,
799    }
800    impl EnvGuard {
801        fn new() -> Self {
802            let prior = std::env::var(ENV_TRUST_ANCHORS_PATH).ok();
803            std::env::remove_var(ENV_TRUST_ANCHORS_PATH);
804            Self { prior }
805        }
806    }
807    impl Drop for EnvGuard {
808        fn drop(&mut self) {
809            match self.prior.take() {
810                Some(v) => std::env::set_var(ENV_TRUST_ANCHORS_PATH, v),
811                None => std::env::remove_var(ENV_TRUST_ANCHORS_PATH),
812            }
813        }
814    }
815
816    fn authority_with_resolver(resolver: DnsResolver) -> DnsAuthority {
817        DnsAuthority {
818            resolvers: vec![resolver],
819            ..Default::default()
820        }
821    }
822
823    fn make_resolver(dnssec: Option<DnsResolverDnssecPolicy>) -> DnsResolver {
824        DnsResolver {
825            resolver_id: "test-resolver".into(),
826            endpoint: "127.0.0.1:53".into(),
827            protocol: DnsResolverProtocol::Do53Udp,
828            trust_kid: None,
829            dnssec,
830        }
831    }
832
833    /// ISC-21 / ISC-40 — `from_authority` returns `Ok(None)` when the
834    /// authority has no DNSSEC block on the first resolver (mode = off).
835    /// The proxy's hot path must remain byte-identical to pre-P3h.1 in
836    /// this case.
837    #[test]
838    fn from_authority_returns_none_when_dnssec_block_absent() {
839        let auth = authority_with_resolver(make_resolver(None));
840        let v = DataplaneDnssecValidator::from_authority(&auth).expect("ok");
841        assert!(
842            v.is_none(),
843            "mode=off (no dnssec block) MUST yield None so proxy hot path is unchanged"
844        );
845    }
846
847    /// ISC-22 — `from_authority` returns `Ok(None)` when `validate=false`
848    /// even if the dnssec block is present. The block is informational
849    /// only in that case.
850    #[test]
851    fn from_authority_returns_none_when_validate_false() {
852        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
853            validate: false,
854            fail_closed: false,
855            trust_anchors_path: None,
856        })));
857        let v = DataplaneDnssecValidator::from_authority(&auth).expect("ok");
858        assert!(
859            v.is_none(),
860            "validate=false MUST yield None — the block is observational only"
861        );
862    }
863
864    /// `from_authority` returns `Ok(None)` when the authority has zero
865    /// resolvers. Defensive — the supervisor production path always has
866    /// at least one resolver, but the validator must not panic on the
867    /// empty-resolvers shape.
868    #[test]
869    fn from_authority_returns_none_when_no_resolvers() {
870        let auth = DnsAuthority::default();
871        let v = DataplaneDnssecValidator::from_authority(&auth).expect("ok");
872        assert!(v.is_none(), "no resolvers MUST yield None");
873    }
874
875    /// ISC-23 / ISC-41 — `from_authority` returns `Ok(Some)` when the
876    /// first resolver requests require mode (validate=true,
877    /// fail_closed=true) with bundled IANA anchors (no
878    /// trust_anchors_path override).
879    #[test]
880    fn from_authority_returns_some_for_require_with_iana_defaults() {
881        // T2-5: assertion depends on the env var being unset (env path
882        // would supersede the `trust_anchors_path: None` spec and yield
883        // a non-IANA source).
884        let _guard = EnvGuard::new();
885        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
886            validate: true,
887            fail_closed: true,
888            trust_anchors_path: None,
889        })));
890        let v = DataplaneDnssecValidator::from_authority(&auth)
891            .expect("ok")
892            .expect("Some");
893        assert!(v.is_require_mode(), "fail_closed=true → require");
894        assert_eq!(
895            v.trust_anchor_source(),
896            "iana-default",
897            "no path override → bundled IANA defaults"
898        );
899    }
900
901    /// ISC-24 — `from_authority` returns `Ok(Some)` for best_effort
902    /// (validate=true, fail_closed=false). `is_require_mode` returns
903    /// false.
904    #[test]
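    fn from_authority_returns_some_for_best_effort() {
        // T2-5: `validate=true` reaches the trust-anchor loader, so hold
        // the env-var guard like the other loader-dependent tests.
        let _guard = EnvGuard::new();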
906        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
907            validate: true,
908            fail_closed: false,
909            trust_anchors_path: None,
910        })));
911        let v = DataplaneDnssecValidator::from_authority(&auth)
912            .expect("ok")
913            .expect("Some");
914        assert!(!v.is_require_mode(), "fail_closed=false → best_effort");
915    }
916
917    /// ISC-25 / ISC-42 — `from_authority` rejects an unreadable
918    /// trust-anchor path. The validator MUST surface
919    /// CellosError::InvalidSpec rather than constructing a
920    /// half-broken validator that would silently fail every query.
921    #[test]
922    fn from_authority_rejects_missing_trust_anchor_path() {
        // T2-5: this test was flaking under `cargo test` with
        // `--test-threads >= 2`. Root cause: `TrustAnchors::load` gives the
        // `CELLOS_DNSSEC_TRUST_ANCHORS_PATH` env var precedence over the
        // spec path. A sibling test in `resolver_refresh::dnssec::tests`
        // (`loads_path_with_o_nofollow_unix`) sets that env var to point
        // at *its own* tempdir. If it ran between this test's `tempdir()`
        // and the `from_authority` call, the sibling's valid path silently
        // superseded the bogus spec path and the assertion failed. The
        // `EnvGuard` clears and restores the env var so this test's spec
        // is the only signal that reaches the loader. The tempdirs
        // themselves are already per-test (`tempfile::tempdir` hands each
        // caller a uniquely-named directory).
935        let _guard = EnvGuard::new();
936        let dir = tempdir().expect("tempdir");
937        let bogus = dir.path().join("does-not-exist.bin");
938        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
939            validate: true,
940            fail_closed: true,
941            trust_anchors_path: Some(bogus.to_string_lossy().into_owned()),
942        })));
943        let err = DataplaneDnssecValidator::from_authority(&auth)
944            .expect_err("missing trust-anchor path MUST be rejected at activation");
945        let msg = format!("{err}");
946        assert!(
947            msg.contains("trust anchors") && msg.contains("does-not-exist.bin"),
948            "rejection must mention the path for operator triage; got {msg}"
949        );
950    }
951
952    /// ISC-26 / ISC-43 — `from_authority` rejects a symlinked
953    /// trust-anchor path (mirrors the W6 SEC-25 / P3h discipline —
954    /// O_NOFOLLOW refuses to follow a swapped-in symlink at the final
955    /// path component).
956    #[cfg(unix)]
957    #[test]
958    fn from_authority_rejects_symlinked_trust_anchor_path() {
959        // T2-5: same env-var race as
960        // `from_authority_rejects_missing_trust_anchor_path` above —
961        // see that test's comment for the full root-cause writeup.
962        let _guard = EnvGuard::new();
963        let dir = tempdir().expect("tempdir");
964        let real = dir.path().join("real-anchor.bin");
965        let link = dir.path().join("symlinked-anchor.bin");
966        std::fs::write(&real, b"REAL-KEY-BYTES").expect("write real anchor");
967        std::os::unix::fs::symlink(&real, &link).expect("create symlink");
968
969        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
970            validate: true,
971            fail_closed: true,
972            trust_anchors_path: Some(link.to_string_lossy().into_owned()),
973        })));
974        let err = DataplaneDnssecValidator::from_authority(&auth)
975            .expect_err("symlinked trust-anchor path MUST be rejected by O_NOFOLLOW");
976        let msg = format!("{err}");
977        assert!(
978            msg.contains(link.to_str().unwrap()),
979            "rejection must include the symlinked path; got {msg}"
980        );
981    }
982
983    /// `from_authority` rejects an unparseable resolver endpoint. The
984    /// supervisor production path puts a valid `host:port` in the
985    /// endpoint, but the validator must not panic on garbage input.
986    #[test]
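    fn from_authority_rejects_unparseable_endpoint() {
        // T2-5: this test routes through `from_authority`, so hold the
        // env-var guard in case the trust-anchor loader runs before the
        // endpoint is parsed.
        let _guard = EnvGuard::new();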
988        let mut resolver = make_resolver(Some(DnsResolverDnssecPolicy {
989            validate: true,
990            fail_closed: true,
991            trust_anchors_path: None,
992        }));
993        resolver.endpoint = "not-a-socket-addr".into();
994        let auth = authority_with_resolver(resolver);
995        let err = DataplaneDnssecValidator::from_authority(&auth)
996            .expect_err("unparseable endpoint MUST be rejected");
997        let msg = format!("{err}");
998        assert!(
999            msg.contains("resolver.endpoint"),
1000            "rejection must reference the field for operator triage; got {msg}"
1001        );
1002    }
1003
    /// `validate()` returns `Skip` for a TXT query (qtype=16) when the
    /// validator is built with the A/AAAA-only `with_backend`
    /// constructor (no typed backend wired). The proxy's call-site
    /// policy decides what to do (forward in best_effort, SERVFAIL in
    /// require).
1008    #[test]
1009    fn validate_returns_skip_for_non_a_aaaa_qtype() {
1010        let backend: Arc<DataplaneDnssecBackend> =
1011            Arc::new(|_h, _t| panic!("backend MUST NOT be called for non-A/AAAA"));
1012        let v = DataplaneDnssecValidator::with_backend(true, "iana-default".into(), backend);
1013        let q = build_query("api.example.com", 16); // TXT
1014        let outcome = v.validate(&q, &[]);
1015        assert!(
1016            matches!(outcome, DataplaneDnssecOutcome::Skip),
1017            "TXT must Skip; got {outcome:?}"
1018        );
1019    }
1020
1021    /// `validate()` maps `Validated` → `Validated`.
1022    #[test]
1023    fn validate_maps_validated_outcome() {
1024        let backend: Arc<DataplaneDnssecBackend> = Arc::new(|_h, _t| {
1025            Ok(DnssecValidationResult::Validated {
1026                algorithm: "RSASHA256".into(),
1027                key_tag: 12345,
1028            })
1029        });
1030        let v = DataplaneDnssecValidator::with_backend(false, "iana-default".into(), backend);
1031        let q = build_query("api.example.com", 1); // A
1032        assert!(matches!(
1033            v.validate(&q, &[]),
1034            DataplaneDnssecOutcome::Validated
1035        ));
1036    }
1037
1038    /// `validate()` maps `Unsigned` → `Unsigned`.
1039    #[test]
1040    fn validate_maps_unsigned_outcome() {
1041        let backend: Arc<DataplaneDnssecBackend> =
1042            Arc::new(|_h, _t| Ok(DnssecValidationResult::Unsigned));
1043        let v = DataplaneDnssecValidator::with_backend(false, "iana-default".into(), backend);
1044        let q = build_query("api.example.com", 1);
1045        assert!(matches!(
1046            v.validate(&q, &[]),
1047            DataplaneDnssecOutcome::Unsigned
1048        ));
1049    }
1050
1051    /// `validate()` maps `Failed{..}` → `Failed{reason:"validation_failed"}`.
1052    #[test]
1053    fn validate_maps_failed_outcome() {
1054        let backend: Arc<DataplaneDnssecBackend> = Arc::new(|_h, _t| {
1055            Ok(DnssecValidationResult::Failed {
1056                reason: "synthetic".into(),
1057            })
1058        });
1059        let v = DataplaneDnssecValidator::with_backend(true, "iana-default".into(), backend);
1060        let q = build_query("api.example.com", 1);
1061        let outcome = v.validate(&q, &[]);
1062        assert!(
1063            matches!(outcome, DataplaneDnssecOutcome::Failed { reason } if reason == "validation_failed"),
1064            "Failed must map to validation_failed; got {outcome:?}"
1065        );
1066    }
1067
1068    /// `validate()` fails closed when the backend returns an I/O error
1069    /// (timeout, transport). The require-mode contract demands no
1070    /// unvalidated answer ever reaches the workload.
1071    #[test]
1072    fn validate_fails_closed_on_backend_io_error() {
1073        let backend: Arc<DataplaneDnssecBackend> =
1074            Arc::new(|_h, _t| Err(std::io::Error::other("synthetic-transport-failure")));
1075        let v = DataplaneDnssecValidator::with_backend(true, "iana-default".into(), backend);
1076        let q = build_query("api.example.com", 1);
1077        let outcome = v.validate(&q, &[]);
1078        assert!(
1079            matches!(outcome, DataplaneDnssecOutcome::Failed { .. }),
1080            "I/O error MUST fail closed; got {outcome:?}"
1081        );
1082    }
1083
1084    /// `validate()` returns `Skip` for a malformed query. Defensive —
1085    /// the proxy already drops malformed packets before calling us, but
1086    /// the validator must not panic.
1087    #[test]
1088    fn validate_returns_skip_for_malformed_query() {
1089        let backend: Arc<DataplaneDnssecBackend> =
1090            Arc::new(|_h, _t| panic!("backend MUST NOT be called for malformed query"));
1091        let v = DataplaneDnssecValidator::with_backend(false, "iana-default".into(), backend);
1092        let outcome = v.validate(&[0u8; 4], &[]); // < 12-byte header
1093        assert!(matches!(outcome, DataplaneDnssecOutcome::Skip));
1094    }
1095
1096    /// Builds a minimal DNS query packet for tests — same shape as the
1097    /// proxy module's existing `build_query_packet` helper but local
1098    /// here so the validator's tests are self-contained.
1099    fn build_query(qname: &str, qtype: u16) -> Vec<u8> {
1100        let mut p = Vec::new();
1101        p.extend_from_slice(&[
1102            0xab, 0xcd, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
1103        ]);
1104        for label in qname.split('.') {
1105            p.push(label.len() as u8);
1106            p.extend_from_slice(label.as_bytes());
1107        }
1108        p.push(0);
1109        p.extend_from_slice(&qtype.to_be_bytes());
1110        p.extend_from_slice(&[0x00, 0x01]);
1111        p
1112    }
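
    /// Illustrative addition, not part of the original suite: pins the
    /// wire layout `build_query` emits so the header/QNAME/QTYPE/QCLASS
    /// framing the validator tests rely on is visible at a glance. The
    /// test name and the `a.io` input are assumptions made for the sketch.
    #[test]
    fn build_query_emits_expected_wire_layout() {
        let q = build_query("a.io", 1); // A
        let expected: &[u8] = &[
            // Header: id=0xabcd, flags=0x0100 (RD), QDCOUNT=1, other counts 0.
            0xab, 0xcd, 0x01, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            // QNAME: length-prefixed labels "a" and "io", then the root label.
            0x01, b'a', 0x02, b'i', b'o', 0x00,
            // QTYPE=1 (A), QCLASS=1 (IN), both big-endian.
            0x00, 0x01, 0x00, 0x01,
        ];
        assert_eq!(q.as_slice(), expected);
    }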
1113
1114    // ========================================================================
1115    // A2 — typed-record validation unit tests
1116    // ========================================================================
1117    //
1118    // These tests exercise `validate()`'s post-A2 dispatch shape: when the
1119    // validator was built via `with_backends(_, _, backend, typed_backend)`,
1120    // queries for CNAME/HTTPS/SVCB/MX/TXT route to `typed_backend` and
1121    // produce typed `Validated/Unsigned/Failed` outcomes instead of `Skip`.
1122    //
1123    // The legacy `with_backend` (no typed backend) shape is exercised by
1124    // `validate_returns_skip_for_non_a_aaaa_qtype` above — TXT continues
1125    // to Skip when no typed backend is wired, preserving backward
1126    // compatibility for tests/operators who only configured A/AAAA.
1127
1128    /// A2 — `validate()` dispatches CNAME (qtype=5) to the typed backend
1129    /// and maps `Validated → Validated`. The legacy A/AAAA backend MUST
1130    /// NOT be called.
1131    #[test]
1132    fn validate_routes_cname_to_typed_backend() {
1133        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1134            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for CNAME"));
1135        let typed_backend: Arc<DataplaneTypedDnssecBackend> = Arc::new(|_h, rt| {
1136            assert_eq!(rt, RecordType::CNAME, "typed backend MUST receive CNAME");
1137            Ok(DnssecValidationResult::Validated {
1138                algorithm: "RSASHA256".into(),
1139                key_tag: 12345,
1140            })
1141        });
1142        let v = DataplaneDnssecValidator::with_backends(
1143            true,
1144            "iana-default".into(),
1145            aaaa_backend,
1146            typed_backend,
1147        );
1148        let q = build_query("api.example.com", 5); // CNAME
1149        assert!(matches!(
1150            v.validate(&q, &[]),
1151            DataplaneDnssecOutcome::Validated
1152        ));
1153    }
1154
1155    /// A2 — `validate()` dispatches HTTPS (qtype=65) to the typed backend
1156    /// and maps `Unsigned → Unsigned`.
1157    #[test]
1158    fn validate_routes_https_to_typed_backend_unsigned() {
1159        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1160            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for HTTPS"));
1161        let typed_backend: Arc<DataplaneTypedDnssecBackend> = Arc::new(|_h, rt| {
1162            assert_eq!(rt, RecordType::HTTPS, "typed backend MUST receive HTTPS");
1163            Ok(DnssecValidationResult::Unsigned)
1164        });
1165        let v = DataplaneDnssecValidator::with_backends(
1166            false,
1167            "iana-default".into(),
1168            aaaa_backend,
1169            typed_backend,
1170        );
1171        let q = build_query("api.example.com", 65); // HTTPS
1172        assert!(matches!(
1173            v.validate(&q, &[]),
1174            DataplaneDnssecOutcome::Unsigned
1175        ));
1176    }
1177
1178    /// A2 — `validate()` dispatches SVCB (qtype=64) to the typed backend
1179    /// and maps `Failed → Failed{validation_failed}`.
1180    #[test]
1181    fn validate_routes_svcb_to_typed_backend_failed() {
1182        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1183            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for SVCB"));
1184        let typed_backend: Arc<DataplaneTypedDnssecBackend> = Arc::new(|_h, rt| {
1185            assert_eq!(rt, RecordType::SVCB, "typed backend MUST receive SVCB");
1186            Ok(DnssecValidationResult::Failed {
1187                reason: "synthetic-bogus".into(),
1188            })
1189        });
1190        let v = DataplaneDnssecValidator::with_backends(
1191            true,
1192            "iana-default".into(),
1193            aaaa_backend,
1194            typed_backend,
1195        );
1196        let q = build_query("api.example.com", 64); // SVCB
1197        let outcome = v.validate(&q, &[]);
1198        assert!(
1199            matches!(outcome, DataplaneDnssecOutcome::Failed { reason } if reason == "validation_failed"),
1200            "SVCB Failed → validation_failed; got {outcome:?}"
1201        );
1202    }
1203
1204    /// A2 — `validate()` dispatches MX (qtype=15) to the typed backend
1205    /// and fails closed when the backend returns an I/O error. Mirrors
1206    /// the existing A/AAAA fail-closed contract.
1207    #[test]
1208    fn validate_routes_mx_to_typed_backend_io_error_fails_closed() {
1209        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1210            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for MX"));
1211        let typed_backend: Arc<DataplaneTypedDnssecBackend> = Arc::new(|_h, rt| {
1212            assert_eq!(rt, RecordType::MX, "typed backend MUST receive MX");
1213            Err(std::io::Error::other("synthetic-transport-failure"))
1214        });
1215        let v = DataplaneDnssecValidator::with_backends(
1216            true,
1217            "iana-default".into(),
1218            aaaa_backend,
1219            typed_backend,
1220        );
1221        let q = build_query("mail.example.com", 15); // MX
1222        let outcome = v.validate(&q, &[]);
1223        assert!(
1224            matches!(outcome, DataplaneDnssecOutcome::Failed { reason } if reason == "validation_failed"),
1225            "MX I/O error MUST fail closed; got {outcome:?}"
1226        );
1227    }
1228
1229    /// A2 — `validate()` dispatches TXT (qtype=16) to the typed backend
1230    /// when one is wired. Confirms that the legacy
1231    /// `validate_returns_skip_for_non_a_aaaa_qtype` Skip behaviour was
1232    /// the consequence of the test using the no-typed-backend constructor,
1233    /// not a hardcoded TXT skip.
1234    #[test]
1235    fn validate_routes_txt_to_typed_backend_validated() {
1236        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1237            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for TXT"));
1238        let typed_backend: Arc<DataplaneTypedDnssecBackend> = Arc::new(|_h, rt| {
1239            assert_eq!(rt, RecordType::TXT, "typed backend MUST receive TXT");
1240            Ok(DnssecValidationResult::Validated {
1241                algorithm: "ED25519".into(),
1242                key_tag: 60_999,
1243            })
1244        });
1245        let v = DataplaneDnssecValidator::with_backends(
1246            true,
1247            "iana-default".into(),
1248            aaaa_backend,
1249            typed_backend,
1250        );
1251        let q = build_query("api.example.com", 16); // TXT
1252        assert!(matches!(
1253            v.validate(&q, &[]),
1254            DataplaneDnssecOutcome::Validated
1255        ));
1256    }
1257
1258    /// A2 — `validate()` continues to `Skip` for query types outside the
1259    /// first-class set (NS, PTR, SRV, etc.) even when both backends are
1260    /// wired. The first-class set is bounded; residuals stay on the
1261    /// pre-A2 Skip path.
1262    #[test]
1263    fn validate_skips_residual_types_even_with_typed_backend() {
1264        let aaaa_backend: Arc<DataplaneDnssecBackend> =
1265            Arc::new(|_h, _t| panic!("A/AAAA backend MUST NOT be called for NS"));
1266        let typed_backend: Arc<DataplaneTypedDnssecBackend> =
1267            Arc::new(|_h, _rt| panic!("typed backend MUST NOT be called for NS"));
1268        let v = DataplaneDnssecValidator::with_backends(
1269            true,
1270            "iana-default".into(),
1271            aaaa_backend,
1272            typed_backend,
1273        );
1274        let q = build_query("ns1.example.com", 2); // NS — outside first-class set
1275        assert!(
1276            matches!(v.validate(&q, &[]), DataplaneDnssecOutcome::Skip),
1277            "NS (qtype=2) MUST Skip even when typed backend is wired"
1278        );
1279    }
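
    /// Illustrative addition, not part of the original suite: the
    /// counterpart to the typed-routing tests above. It assumes that
    /// A/AAAA queries keep routing to the legacy backend when a typed
    /// backend is also wired, which the A2 notes imply but this file's
    /// tests do not otherwise pin down. The test name is hypothetical.
    #[test]
    fn validate_keeps_aaaa_on_legacy_backend_with_typed_backend() {
        let aaaa_backend: Arc<DataplaneDnssecBackend> =
            Arc::new(|_h, _t| Ok(DnssecValidationResult::Unsigned));
        let typed_backend: Arc<DataplaneTypedDnssecBackend> =
            Arc::new(|_h, _rt| panic!("typed backend MUST NOT be called for AAAA"));
        let v = DataplaneDnssecValidator::with_backends(
            false,
            "iana-default".into(),
            aaaa_backend,
            typed_backend,
        );
        let q = build_query("api.example.com", 28); // AAAA
        assert!(matches!(
            v.validate(&q, &[]),
            DataplaneDnssecOutcome::Unsigned
        ));
    }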
1280
1281    /// A2 — `from_authority` populates BOTH backends so the production
1282    /// validator covers all seven first-class types end-to-end. We can't
1283    /// observe the backend Arc directly through the public API, but the
1284    /// debug impl exposes whether `typed_backend` is `Some` — assert the
1285    /// shape via the rendered debug string.
1286    #[test]
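    fn from_authority_populates_typed_backend_in_require_mode() {
        // T2-5: `validate=true` reaches the trust-anchor loader, so hold
        // the env-var guard like the other loader-dependent tests.
        let _guard = EnvGuard::new();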
1288        let auth = authority_with_resolver(make_resolver(Some(DnsResolverDnssecPolicy {
1289            validate: true,
1290            fail_closed: true,
1291            trust_anchors_path: None,
1292        })));
1293        let v = DataplaneDnssecValidator::from_authority(&auth)
1294            .expect("ok")
1295            .expect("Some");
1296        let dbg = format!("{v:?}");
1297        assert!(
1298            dbg.contains("typed_backend: Some"),
1299            "from_authority MUST populate typed_backend so post-A2 validators \
1300             validate CNAME/HTTPS/SVCB/MX/TXT end-to-end; got {dbg}"
1301        );
1302    }
1303}