pub struct OutputSafetyConfig {
pub enabled: bool,
pub toxicity_enabled: bool,
pub toxicity_threshold: f32,
pub block_on_critical: bool,
pub hallucination_enabled: bool,
pub hallucination_model: String,
pub hallucination_threshold: f32,
pub hallucination_min_response_length: usize,
}
Output safety configuration for response content analysis.
When enabled, the proxy analyses LLM response content for toxicity, PII leakage, secret exposure, and hallucinations. This is a post-processing step that runs after the upstream response is received.
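A minimal sketch of constructing this configuration in Rust, with values mirroring the YAML example below (the struct is mirrored here so the snippet compiles standalone; the value choices are illustrative, not defaults):

```rust
// Mirrored from the struct definition above so this snippet is self-contained.
pub struct OutputSafetyConfig {
    pub enabled: bool,
    pub toxicity_enabled: bool,
    pub toxicity_threshold: f32,
    pub block_on_critical: bool,
    pub hallucination_enabled: bool,
    pub hallucination_model: String,
    pub hallucination_threshold: f32,
    pub hallucination_min_response_length: usize,
}

fn main() {
    // Toxicity analysis on, hallucination detection off, matching the YAML example.
    let cfg = OutputSafetyConfig {
        enabled: true,
        toxicity_enabled: true,
        toxicity_threshold: 0.7,
        block_on_critical: false,
        hallucination_enabled: false,
        hallucination_model: "vectara/hallucination_evaluation_model".to_string(),
        hallucination_threshold: 0.5,
        hallucination_min_response_length: 50,
    };
    assert!(cfg.enabled && cfg.toxicity_enabled);
    println!("toxicity threshold: {}", cfg.toxicity_threshold);
}
```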
Example (YAML)
output_safety:
  enabled: true
  toxicity_enabled: true
  toxicity_threshold: 0.7
  block_on_critical: false
  hallucination_enabled: false
  hallucination_model: "vectara/hallucination_evaluation_model"
  hallucination_threshold: 0.5
  hallucination_min_response_length: 50

Fields
enabled: bool
Enable output safety analysis on LLM responses.

toxicity_enabled: bool
Enable toxicity detection on response content.

toxicity_threshold: f32
Confidence threshold for toxicity detection (0.0–1.0).

block_on_critical: bool
Block (replace) the response if critical toxicity is detected.

hallucination_enabled: bool
Enable hallucination detection on response content. When enabled, response sentences are scored against the user's prompt for factual consistency using a cross-encoder model.

hallucination_model: String
HuggingFace model ID for hallucination detection.

hallucination_threshold: f32
Threshold below which a sentence is considered potentially hallucinated (0.0–1.0). Sentences scoring below this are flagged.

hallucination_min_response_length: usize
Minimum response length (in characters) to run hallucination detection. Responses shorter than this are skipped to save compute.
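The gating described above can be sketched as a small helper. This is a hypothetical illustration, not part of the crate's API; the struct subset mirrors the fields documented here, and `should_check_hallucination` is an invented name:

```rust
// Minimal subset of the config, mirrored for illustration only.
struct OutputSafetyConfig {
    enabled: bool,
    hallucination_enabled: bool,
    hallucination_min_response_length: usize,
}

// Hypothetical gate: run the (expensive) hallucination model only when the
// feature is on and the response is long enough to be worth scoring.
fn should_check_hallucination(cfg: &OutputSafetyConfig, response: &str) -> bool {
    cfg.enabled
        && cfg.hallucination_enabled
        && response.chars().count() >= cfg.hallucination_min_response_length
}

fn main() {
    let cfg = OutputSafetyConfig {
        enabled: true,
        hallucination_enabled: true,
        hallucination_min_response_length: 50,
    };
    // An 11-character response is below the 50-character floor, so it is skipped.
    assert!(!should_check_hallucination(&cfg, "short reply"));
    println!("short responses are skipped");
}
```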
Trait Implementations
impl Clone for OutputSafetyConfig

fn clone(&self) -> OutputSafetyConfig
Returns a copy of the value.

fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.