Struct SearchConfig

pub struct SearchConfig {
    pub enabled: bool,
    pub searxng_url: Option<String>,
    pub timeout_ms: u64,
    pub default_limit: u32,
    pub max_limit: u32,
    pub research_engines: Vec<String>,
    pub github_engines: Vec<String>,
    pub rerank_enabled: bool,
    pub query_expand: bool,
    pub query_expand_variants: usize,
    pub multi_round: bool,
    pub passage_select: bool,
    pub page2_fallback: bool,
    pub answer_calibrated: bool,
    pub answer_guarded: bool,
    pub use_structured_sources: bool,
    pub wikidata_lookup: bool,
    pub snippet_fallback: bool,
}

Configuration for the /v1/search endpoint and its SearXNG backend.

When searxng_url is unset the endpoint returns HTTP 503 with error_code: "search_disabled" — the route remains mounted so that startup doesn’t have to know whether search will ever be configured.

Fields§

§enabled: bool

Master switch. Defaults to true; set to false to refuse all /v1/search requests even if searxng_url is configured.

§searxng_url: Option<String>

Base URL of the SearXNG instance (e.g. http://searxng:8080). None (the default) disables the endpoint with a clear error.
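The gating described by enabled and searxng_url can be sketched as a single check. This is a minimal illustration, not the crate's handler: Gate, Refusal, and resolve_backend are hypothetical names, and reusing the same 503 refusal for enabled = false is an assumption (the source only documents it for an unset searxng_url).

```rust
/// Hypothetical mirror of the two gating fields; not the crate's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct Gate {
    pub enabled: bool,
    pub searxng_url: Option<String>,
}

/// Illustrative refusal payload: HTTP status plus error_code.
#[derive(Debug, PartialEq)]
pub struct Refusal {
    pub status: u16,
    pub error_code: &'static str,
}

/// Returns the backend base URL if search can run, otherwise a 503 refusal.
pub fn resolve_backend(gate: &Gate) -> Result<&str, Refusal> {
    let disabled = Refusal { status: 503, error_code: "search_disabled" };
    if !gate.enabled {
        // Assumption: the master switch reuses the documented refusal.
        return Err(disabled);
    }
    match gate.searxng_url.as_deref() {
        Some(url) => Ok(url),
        // Documented case: searxng_url unset.
        None => Err(disabled),
    }
}
```

Because the check runs per request, the route can stay mounted whether or not a backend is ever configured.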

§timeout_ms: u64

End-to-end timeout for the SearXNG call in milliseconds.

§default_limit: u32

Value used for limit when the request omits it.

§max_limit: u32

Hard cap on limit per request. SaaS uses 20.
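How default_limit and max_limit might combine can be sketched as below. effective_limit is a hypothetical helper, and capping the default itself is an assumption; the source only states that max_limit is a hard cap per request.

```rust
/// Resolve the number of results to return for one request.
/// `requested` is the request's optional `limit` field.
pub fn effective_limit(requested: Option<u32>, default_limit: u32, max_limit: u32) -> u32 {
    requested
        .unwrap_or(default_limit) // fall back when the request omits `limit`
        .min(max_limit) // never exceed the hard cap
}
```

With max_limit set to 20, a request for 50 results would be served 20, and a request without limit would get default_limit.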

§research_engines: Vec<String>

SearXNG engines invoked when the request includes categories: ["research"]. Defaults match the SaaS implementation.

§github_engines: Vec<String>

SearXNG engines invoked when the request includes categories: ["github"].

§rerank_enabled: bool

Re-rank the flat result pool for the LLM answer / summarize path (RRF + junk/coverage/geo filter + BM25 + domain dedupe) instead of the raw SearXNG-score sort. Defaults to true. The plain (non-LLM) path is unaffected and keeps SaaS byte-parity regardless of this flag.
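The RRF step named above can be illustrated with a generic reciprocal rank fusion sketch. This is the textbook formula, not the crate's reranker: rrf_fuse is a hypothetical name, the input is assumed to be one ranked URL list per engine or query, and the junk/coverage/geo filter, BM25, and domain dedupe stages are not shown.

```rust
use std::collections::HashMap;

/// Fuse several ranked URL lists with reciprocal rank fusion.
/// Each URL scores the sum of 1 / (k + rank) over the lists it appears in.
/// `k` dampens the weight of top ranks; 60.0 is a commonly used value.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, url) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64; // ranks are 1-based
            *scores.entry(*url).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(url, score)| (url.to_string(), score))
        .collect();
    // Highest score first; ties broken by URL so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

A URL that appears near the top of several lists outranks one that tops a single list, which is why fusion can beat a raw per-engine score sort.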

§query_expand: bool

Multi-query expansion for the LLM answer / summarize path: before the SearXNG fetch, generate an entity/keyword-focused rewrite of the query, fetch both the original and the rewrite, and UNION the candidate pools (recall can only increase — the original’s results are always kept). Targets “retrieval-miss” failures where the answer’s source never surfaced for the user’s phrasing. Costs one extra small LLM call + one extra SearXNG fetch. Defaults to false (gated); the plain path and the answer layer are untouched, so precision/SaaS-parity are preserved.
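The union step can be sketched as an order-preserving merge keyed by URL. Candidate and union_pools are hypothetical names, and deduplicating by exact URL string is an assumption; the point is only that results for the original query come first and are never dropped in favor of the rewrite's.

```rust
use std::collections::HashSet;

/// Hypothetical candidate record; the real result type has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub url: String,
    pub title: String,
}

/// Union two candidate pools, keeping the first occurrence of each URL.
/// The original query's results are visited first, so each of its URLs survives.
pub fn union_pools(original: Vec<Candidate>, rewrite: Vec<Candidate>) -> Vec<Candidate> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut pool = Vec::with_capacity(original.len() + rewrite.len());
    for candidate in original.into_iter().chain(rewrite) {
        if seen.insert(candidate.url.clone()) {
            pool.push(candidate);
        }
    }
    pool
}
```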

§query_expand_variants: usize

Number of LLM-generated query rewrites to fetch + union when query_expand is on. 1 reproduces the original single-variant behavior. Higher values request more DIVERSE reformulations (abbreviation/acronym-expanded, keyword-focused) and fetch their pools in parallel, raising recall on retrieval-miss queries (e.g. an unexpanded acronym whose page never surfaced) at the cost of one extra SearXNG fetch each. Clamped to MAX_QUERY_EXPAND_VARIANTS in the route.

§multi_round: bool

Adaptive multi-round retrieval (the “evidence-scout” loop). When the round-1 answer ABSTAINS (sources lacked the fact), an LLM scout reads the round-1 evidence and emits targeted follow-up queries (acronym-expanded, exact-entity, predicate/date-specific); their results are scraped, unioned into the pool, and the answer is re-synthesized ONCE. Bounded (one extra round, capped follow-up queries) so the worst case stays within the request deadline. Only fires on abstention, so most queries keep the single-shot fast path. Recall-only and monotone-safe: a still-abstaining round 2 is discarded, keeping round 1. Targets “the answer page never entered the first pool”, the dominant remaining miss. Defaults to false (gated).
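The control flow of the loop can be sketched with the LLM calls abstracted as closures. Answer, answer_with_scout, and both closure parameters are hypothetical stand-ins; the sketch only shows the bounded, abstention-triggered, monotone-safe shape described above.

```rust
/// Illustrative answer outcome; the real type is not shown in the source.
#[derive(Debug, Clone, PartialEq)]
pub enum Answer {
    Committed(String),
    Abstained,
}

/// One optional extra round, taken only when round 1 abstains.
/// `synthesize` stands in for the answer LLM, `scout` for follow-up retrieval.
pub fn answer_with_scout<F, G>(
    mut synthesize: F,
    mut scout: G,
    multi_round: bool,
    pool: Vec<String>,
) -> Answer
where
    F: FnMut(&[String]) -> Answer,
    G: FnMut(&[String]) -> Vec<String>,
{
    let round1 = synthesize(&pool);
    if !multi_round || round1 != Answer::Abstained {
        return round1; // fast path: flag off, or round 1 already committed
    }
    // Union the scout's evidence into the pool and re-synthesize exactly once.
    let mut widened = pool;
    let extra = scout(&widened);
    widened.extend(extra);
    match synthesize(&widened) {
        Answer::Abstained => round1, // discard a still-abstaining round 2
        committed => committed,
    }
}
```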

§passage_select: bool

Passage-level relevance gate for the LLM answer path: split each scraped source into passages and feed the answer LLM only the query-relevant ones (DeepSeek-scored, no new ML deps). Subtractive — removes noise, never adds sources or forces commits; falls back to the full source on any failure (byte-identical to off), so it is monotone-safe. Defaults to false (gated); answer prompt + plain path untouched.
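The subtractive gate with its fallback can be sketched as follows. select_passages, the blank-line passage split, the score closure, and the threshold are all assumptions made for illustration; returning the full source when no passage survives is also an assumption, while the fallback on scorer failure follows the description above.

```rust
/// Keep only passages whose relevance score reaches `threshold`.
/// `score` returns None when scoring fails for a passage.
pub fn select_passages<F>(source: &str, score: F, threshold: f32) -> String
where
    F: Fn(&str) -> Option<f32>,
{
    let mut kept: Vec<&str> = Vec::new();
    // Assumption: passages are separated by blank lines.
    for passage in source.split("\n\n") {
        match score(passage) {
            Some(s) if s >= threshold => kept.push(passage),
            Some(_) => {} // below threshold: drop the passage
            // Any failure falls back to the full, unmodified source.
            None => return source.to_string(),
        }
    }
    if kept.is_empty() {
        // Assumption: never hand the answer LLM an empty source.
        source.to_string()
    } else {
        kept.join("\n\n")
    }
}
```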

§page2_fallback: bool

Page-2 fallback for the LLM answer / summarize path: if the reranked (junk-filtered, deduped) candidate pool comes back thinner than the answer needs (< answer_top_n), fetch the SAME query’s SearXNG page 2 once and union it in, then re-rank. The trigger is evaluated POST-rerank, so a junk-heavy first page does not suppress it; the extra fetch only fires on already-under-yielding queries (QPS never doubles across the corpus). Recall-only, and abstention is untouched (a sparse page1+page2 pool still abstains). Defaults to false (gated); requires rerank_enabled.
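The trigger condition can be written as a small predicate. needs_page2 is a hypothetical helper; it assumes the pool length passed in is measured after reranking, which is what keeps a junk-heavy first page from suppressing the fallback.

```rust
/// Decide whether to fetch SearXNG page 2 for the same query.
/// `reranked_len` is the pool size AFTER junk filtering and dedupe.
pub fn needs_page2(
    page2_fallback: bool,
    rerank_enabled: bool,
    reranked_len: usize,
    answer_top_n: usize,
) -> bool {
    // The flag is gated and depends on the reranker being active.
    page2_fallback && rerank_enabled && reranked_len < answer_top_n
}
```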

§answer_calibrated: bool

Calibrated answer path (gated): reduce recoverable OVER-abstentions by (a) feeding more sources to the answer LLM by default (top_n 5->8, so the answer in result #6-8 or behind a failed top-5 scrape still reaches it) and (b) swapping the answer prompt’s abstention rule for an anti-hedge variant — commit when the sources DO contain the answer (even indirectly / one inference step), abstain ONLY when they genuinely lack it. The “use ONLY sources” grounding is untouched, so this is the precise inverse of the cycle-1 blunt “always commit” failure (which forced commits on no-source cases). Default false; A/B with an INCORRECT-guard before flip.

§answer_guarded: bool

Moat-hardening abstention (gated). Appends a clause making the answer model (a) REJECT a false/unverifiable premise instead of answering as though it were true, (b) report when sources CONFLICT rather than picking one confidently, and (c) abstain when not confident. Targets the adversarial failure SealQA Seal-0 exposed: 32% confident-WRONG (hallucination) on conflicting-source / false-premise questions, where the “use ONLY sources” rule alone is insufficient. Complements (does not replace) answer_calibrated. Default false; A/B requires Seal-0 hallucination DOWN with SimpleQA accuracy NOT regressed before flip.

§use_structured_sources: bool

Use SearXNG structured sources (gated, W0). SearXNG’s infoboxes[] / answers[] arrays carry Wikidata/Wikipedia knowledge-panel facts (entity attributes like religion/capital/director) that the results[] transform path discards. With this on, those facts are parsed and pinned as a high-trust source at the FRONT of the answer pool (still UNTRUSTED-wrapped — widens evidence, never bypasses the safety wrapper). Targets the obscure-entity recall gap (PopQA). Default false; A/B on diag500 gold-in-sources with the wrong-non-abstain invariant before flip.

§wikidata_lookup: bool

Deterministic Wikidata entity-relation lookup (gated, W3). For <relation> of <entity> questions (PopQA’s obscure long tail that web search can’t surface), classify -> wbsearchentities -> property fetch and pin the fact as a structured source (UNTRUSTED-wrapped, runs in parallel with SearXNG, 3s-bounded, any error falls through). Free open data, no AI, no SPARQL hot-path. Default false; A/B on diag500 PopQA accuracy + the wrong-non-abstain invariant before flip.

§snippet_fallback: bool

Snippet fallback for the LLM answer path (gated): when a top-N result’s scrape failed (empty markdown), the result is normally dropped from the answer pool — if it was the answer-bearing page, crw abstains though retrieval succeeded (diagnosed Pattern A). With this on, such results fall back to their SearXNG description snippet as a thin source instead of vanishing. The snippet is verbatim upstream text, so it cannot inject a fact not already present — near-zero INCORRECT exposure. Default false.
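The fallback can be sketched as a filter over scraped results. Scraped and answer_sources are hypothetical names, and treating whitespace-only markdown as a failed scrape is an assumption.

```rust
/// Hypothetical scraped result: page markdown plus the upstream snippet.
#[derive(Debug, Clone, PartialEq)]
pub struct Scraped {
    pub url: String,
    pub markdown: String,
    pub description: String,
}

/// Build (url, text) sources for the answer pool.
/// A failed scrape is dropped unless `snippet_fallback` is on and a snippet exists.
pub fn answer_sources(results: Vec<Scraped>, snippet_fallback: bool) -> Vec<(String, String)> {
    results
        .into_iter()
        .filter_map(|r| {
            if !r.markdown.trim().is_empty() {
                Some((r.url, r.markdown)) // normal path: scraped page text
            } else if snippet_fallback && !r.description.trim().is_empty() {
                Some((r.url, r.description)) // thin source: verbatim snippet
            } else {
                None // scrape failed and no fallback available
            }
        })
        .collect()
}
```

With the flag off, a result whose scrape failed vanishes from the pool; with it on, the same result stays as a thin source carrying only upstream text.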

Trait Implementations§

impl Clone for SearchConfig

fn clone(&self) -> SearchConfig

fn clone_from(&mut self, source: &Self)

impl Debug for SearchConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for SearchConfig

fn default() -> Self

impl<'de> Deserialize<'de> for SearchConfig

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Auto Trait Implementations§

impl Freeze for SearchConfig

impl RefUnwindSafe for SearchConfig

impl Send for SearchConfig

impl Sync for SearchConfig

impl Unpin for SearchConfig

impl UnsafeUnpin for SearchConfig

impl UnwindSafe for SearchConfig

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> From<T> for T

fn from(t: T) -> T

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

fn in_current_span(self) -> Instrumented<Self>

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Sized + Policy<B, E>, P: Policy<B, E>,

impl<T> ToOwned for T
where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

fn with_current_subscriber(self) -> WithDispatch<Self>
