# TruthLens 🔍

[![Crates.io](https://img.shields.io/crates/v/truthlens.svg)](https://crates.io/crates/truthlens)
[![Docs.rs](https://docs.rs/truthlens/badge.svg)](https://docs.rs/truthlens)

**AI Hallucination Detector — Formally Verified Trust Scoring for LLM Outputs**

Analyze AI-generated text for hallucination risk. No API keys needed. No LLM calls required. Fast, local, and formally verified.

**Published package:** <https://crates.io/crates/truthlens>
**API docs:** <https://docs.rs/truthlens>

## Quick Start

### Install

```bash
cargo add truthlens
```

### Run locally from the repo

```bash
cd rust
cargo run --release
```

### Use as a library

```rust
use truthlens::analyze;

let report = analyze("Einstein was born in 1879 in Ulm, Germany.");
println!("Trust: {:.0}% — {}", report.score * 100.0, report.risk_level);
// Trust: 52% — HIGH
```

## What It Does

TruthLens decomposes AI text into atomic claims and scores each for hallucination risk using linguistic signals — **no LLM calls, no API keys, no external dependencies**.

```
Input:  "Python 4.0 was released in December 2025 with native quantum computing support."

Output: 🔴 Trust: 49% [HIGH]
        → specific verifiable claim — verify independently
        → overconfident language without hedging
```

## How It Works

### 1. Claim Extraction
Text → atomic sentences → each is an independent claim to evaluate.
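
A minimal sketch of this step, assuming a naive splitter on sentence-ending punctuation. The function name is illustrative, not the crate's API; the real extractor in `claim.rs` presumably handles abbreviations and quoted text more carefully.

```rust
/// Naive claim extraction: split on sentence-ending punctuation.
/// Illustrative only; a production splitter needs abbreviation handling.
fn extract_claims(text: &str) -> Vec<String> {
    text.split_inclusive(|c: char| matches!(c, '.' | '!' | '?'))
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}
```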

### 2. Signal Analysis (per claim)

| Signal | What It Measures | Weight |
|--------|-----------------|--------|
| **Confidence** | Overconfident language without hedging (hallucination red flag) | 35% |
| **Hedging** | Uncertainty markers ("might", "possibly") — correlates with lower hallucination | 25% |
| **Specificity** | How concrete/verifiable the claim is (numbers, names, dates) | 20% |
| **Verifiability** | Whether the claim contains fact-checkable entities | 15% |
| **Consistency** | Multi-sample agreement (optional, requires LLM) | 5% |

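The table's weights combine into a straightforward weighted sum. The struct and the handling of a missing consistency signal below are a hedged sketch, not the crate's actual types:

```rust
/// Illustrative per-claim signals, each in [0.0, 1.0].
struct Signals {
    confidence: f64,
    hedging: f64,
    specificity: f64,
    verifiability: f64,
    consistency: Option<f64>, // requires LLM sampling; often absent
}

/// Weighted sum using the table's weights: 35/25/20/15/5.
fn trust_score(s: &Signals) -> f64 {
    let mut score = 0.35 * s.confidence
        + 0.25 * s.hedging
        + 0.20 * s.specificity
        + 0.15 * s.verifiability;
    match s.consistency {
        Some(c) => score += 0.05 * c,
        // Assumption: when consistency is unavailable, renormalize the
        // remaining 95% of weight so the score still spans [0, 1].
        None => score /= 0.95,
    }
    score.clamp(0.0, 1.0)
}
```

Renormalizing by 0.95 when `consistency` is absent is one plausible choice; the crate may instead default the signal or drop the term.
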
### 3. Trust Score
Signals are aggregated into a single trust score in **[0.0, 1.0]**:

| Score | Risk Level | Meaning |
|-------|-----------|---------|
| 0.75–1.0 | ✅ LOW | Likely factual or appropriately hedged |
| 0.55–0.74 | ⚠️ MEDIUM | Some uncertain claims, verify key facts |
| 0.35–0.54 | 🔴 HIGH | Multiple suspicious claims, verify everything |
| 0.0–0.34 | 💀 CRITICAL | Likely contains hallucinations |
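
The thresholds above reduce to a simple match; `risk_level` here is an illustrative helper, not the library's exact signature:

```rust
/// Map a trust score in [0.0, 1.0] to the risk bands above.
fn risk_level(score: f64) -> &'static str {
    match score {
        s if s >= 0.75 => "LOW",
        s if s >= 0.55 => "MEDIUM",
        s if s >= 0.35 => "HIGH",
        _ => "CRITICAL",
    }
}
```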

### 4. Passage Scoring
Passage score = 0.7 × average claim score + 0.3 × worst claim score. One bad claim drags down the whole passage.
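
In code, this aggregation rule reads as follows (a sketch; the empty-passage default is an assumption):

```rust
/// Passage score: 70% mean claim score + 30% worst claim score,
/// so a single bad claim pulls the whole passage down.
fn passage_score(claim_scores: &[f64]) -> f64 {
    if claim_scores.is_empty() {
        return 0.0; // assumption: an empty passage scores 0
    }
    let avg = claim_scores.iter().sum::<f64>() / claim_scores.len() as f64;
    let worst = claim_scores.iter().copied().fold(f64::INFINITY, f64::min);
    0.7 * avg + 0.3 * worst
}
```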

## Key Design Decisions

- **No LLM required** — linguistic analysis only. Fast (microseconds), private (local), free.
- **Hedging = good** — unlike most "confidence detectors", we score hedged claims HIGHER. A model that says "might" is better calibrated than one that states falsehoods with certainty.
- **Specificity is double-edged** — specific claims are more useful but also more damaging if wrong. We flag them for independent verification.
- **Formally verified** — Lean 4 proofs guarantee score bounds, monotonicity, and composition properties.

## What's Proven (Lean 4)

### Score Bounds
- `signal_nonneg` — all signals ≥ 0
- `weighted_contrib_bounded` — w·s ≤ w·max when s ≤ max
- `clamped_score_in_range` — final score ∈ [0, 100] after clamp
- `truthlens_weights_sum` — weights sum to 100%

### Monotonicity
- `signal_increase_improves_score` — improving a signal improves the score
- `total_score_improves` — better signal + same rest = better total
- `good_claim_improves_passage` — adding a good claim raises the average

### Composition
- `passage_score_bounded` — 70%·avg + 30%·min ≤ 100%·max
- `passage_at_least_worst` — passage score ≥ 30% of worst claim
- `score_order_independent` — claim order doesn't affect passage score
- `score_deterministic` — same inputs → same output (functional purity)
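
For flavor, here is an illustrative Lean 4 sketch in the spirit of `clamped_score_in_range`. It assumes Mathlib is available; the repository's actual definitions and proofs may differ.

```lean
import Mathlib

-- Illustrative sketch, not the repository's source.
def clamp (x : ℚ) : ℚ := min 100 (max 0 x)

theorem clamp_in_range (x : ℚ) : 0 ≤ clamp x ∧ clamp x ≤ 100 :=
  ⟨le_min (by norm_num) (le_max_left 0 x), min_le_left _ _⟩
```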

## Examples

### Factual text
```
"Albert Einstein was born on March 14, 1879, in Ulm, Germany."
→ 🔴 52% HIGH — specific verifiable claim, verify independently
```

### Well-hedged text
```
"Climate change might be linked to increased hurricane frequency."
→ ⚠️ 65% MEDIUM — appropriately hedged
```

### Overconfident hallucination
```
"The Great Wall is exactly 21,196.18 kilometers long."
→ 🔴 52% HIGH — overconfident without hedging; highly specific
```

### Vague filler
```
"Various factors contribute to the situation."
→ 🔴 40% HIGH — vague claim with low specificity
```

## JSON Output

```json
{
  "score": 0.49,
  "risk_level": "High",
  "summary": "1 claims analyzed. 1 high-risk claims detected.",
  "claims": [
    {
      "text": "Einstein invented the telephone in 1876.",
      "trust": {
        "score": 0.49,
        "signals": {
          "confidence": 0.5,
          "specificity": 0.3,
          "hedging": 0.5,
          "verifiability": 0.7,
          "consistency": null
        },
        "risk_level": "High"
      }
    }
  ]
}
```

## Repository Structure

```
truthlens/
├── rust/                       # Core library + CLI
│   ├── src/
│   │   ├── lib.rs              # Public API: analyze()
│   │   ├── claim.rs            # Claim extraction + linguistic analysis
│   │   ├── scorer.rs           # Trust scoring + signal aggregation
│   │   └── main.rs             # CLI demo
│   └── Cargo.toml
├── lean/                       # Formal proofs
│   ├── TruthLens/
│   │   ├── ScoreBounds.lean    # Score ∈ [0, 1], weight sum, clamp
│   │   ├── Monotonicity.lean   # Better signals → better score
│   │   └── Composition.lean    # Passage aggregation properties
│   └── lakefile.lean
├── bridge/                     # Lean ↔ Rust mapping (coming)
└── README.md
```

## Build

```bash
# Rust
cd rust
cargo test        # 10 tests (9 unit + 1 doc)
cargo run         # demo with examples

# Lean
cd lean
lake build        # compile all proofs
```

## Roadmap

- [x] **v0.1** — Linguistic analysis: claim extraction, hedging detection, specificity scoring
- [x] **v0.2** — Confidence trajectory: detects oscillating, flat, or convergent confidence patterns using second-order dynamical system modeling
- [ ] **v0.3** — Multi-response consistency: paste N responses to the same prompt, detect contradictions via semantic divergence analysis
- [ ] **v0.4** — Entity cross-reference: verify extracted entities, dates, and numbers against knowledge bases (optional network, offline cache)
- [ ] **v0.5** — Python bindings (PyO3) → `pip install truthlens`
- [ ] **v0.6** — Browser extension (Chrome/Firefox) — highlight suspicious claims inline
- [ ] **v0.7** — CLI tool: `truthlens check "paste AI text here"` with colored terminal output
- [ ] **v1.0** — API server + dashboard + enterprise features

### Design Principles (all versions)
- **Zero API calls by default** — every version works offline, locally, for free
- **Formally verified** — Lean 4 proofs for all scoring properties
- **Hedging = trustworthy** — a model that says "might" is more honest than one stating falsehoods with certainty
- **Fast** — microsecond analysis, no model inference required

## Why TruthLens?

Most existing hallucination detectors either require multiple LLM API calls (expensive, slow) or access to model logprobs (grey-box only). TruthLens works on **any AI output** with **zero API calls** — you paste text, you get a trust score. And the scoring properties are **formally proven** in Lean 4, which no comparable tool offers.

## License

Apache-2.0