# TruthLens πŸ”

[![Crates.io](https://img.shields.io/crates/v/truthlens.svg)](https://crates.io/crates/truthlens)
[![Docs.rs](https://docs.rs/truthlens/badge.svg)](https://docs.rs/truthlens)

**AI Hallucination Detector β€” Formally Verified Trust Scoring for LLM Outputs**

Analyze AI-generated text for hallucination risk. No API keys needed. No LLM calls required. Fast, local, and formally verified, with color-coded terminal output.

**Published package:** <https://crates.io/crates/truthlens>
**API docs:** <https://docs.rs/truthlens>

## Quick Start

### Install as CLI

```bash
cargo install truthlens
```

### Usage

```bash
# Analyze text directly
truthlens "Einstein invented the telephone in 1876."
#  Trust: 49% [β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] HIGH
#  πŸ”΄ Claim 1: 49% β€” specific verifiable claim β€” verify independently

# JSON output (for scripts/API integration)
truthlens --json "Python 4.0 has quantum computing support."

# Pipe from file or other commands
cat ai_response.txt | truthlens

# Pipe from clipboard (macOS)
pbpaste | truthlens

# Analyze an AI API response on the fly
curl -s "https://api.example.com/chat" | truthlens --json

# Compare multiple AI responses for contradictions
truthlens --consistency "response 1" "response 2" "response 3"

# Run built-in demo examples
truthlens --demo
```

### Use as a Rust library

```rust
use truthlens::analyze;

let report = analyze("Einstein was born in 1879 in Ulm, Germany.");
println!("Trust: {:.0}% β€” {}", report.score * 100.0, report.risk_level);
// Trust: 52% β€” HIGH

// Access per-claim breakdown
for claim in &report.claims {
    println!("  {} β€” {}", claim.text, claim.trust.risk_level);
}

// Access trajectory analysis
println!("Pattern: {}", report.trajectory.pattern);
println!("Damping: ΞΆβ‰ˆ{:.2}", report.trajectory.damping_estimate);

// JSON serialization
let json = serde_json::to_string_pretty(&report).unwrap();
```

### Multi-response consistency check (v0.3)

Pass N responses to the same prompt, and TruthLens detects contradictions between them.

```rust
use truthlens::check_consistency;

let report = check_consistency(&[
    "Einstein was born in 1879 in Ulm, Germany.",
    "Einstein was born in 1879 in Munich, Germany.",  // ← contradiction
    "Einstein was born in 1879 in Ulm, Germany.",
]);

println!("Consistency: {:.0}%", report.consistency_score * 100.0);
// Consistency: 75%

// Contradictions detected
for c in &report.contradictions {
    println!("⚠️  {} vs {} β€” {}", c.claim_a, c.claim_b, c.conflict);
}
// ⚠️  "Ulm, Germany" vs "Munich, Germany"

// Claims unique to one response (potential hallucination)
for u in &report.unique_claims {
    println!("πŸ” Unique to response {}: {}", u.response_idx, u.text);
}
```

```bash
# CLI: compare multiple responses as separate arguments
truthlens --consistency \
  "Einstein was born in 1879 in Ulm, Germany." \
  "Einstein was born in 1879 in Munich, Germany." \
  "Einstein was born in 1879 in Ulm, Germany."
#  Consistency: 70% [β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘]
#  ❌ Contradictions:
#     Response 1 vs 2: "Ulm, Germany" vs "Munich, Germany"
#  βœ… Consistent claims:
#     3/3 agree: einstein was born in: 1879

# JSON output
truthlens --consistency --json "resp1" "resp2" "resp3"

# Pipe JSON array from stdin
echo '["Python was created in 1991.", "Python was created in 1989."]' \
  | truthlens --consistency
```

### Use as a Python library (v0.5)

```bash
pip install truthlens
```

```python
from truthlens import analyze, check_consistency, extract_claims, extract_entities

# Analyze text for hallucination risk
report = analyze("Einstein was born in 1879 in Ulm, Germany.")
print(f"Trust: {report['score']:.0%} β€” {report['risk_level']}")

# Per-claim breakdown
for claim in report["claims"]:
    print(f"  {claim['text']} β€” {claim['trust']['risk_level']}")

# Multi-response consistency check
result = check_consistency([
    "Einstein was born in 1879 in Ulm.",
    "Einstein was born in 1879 in Munich.",
])
print(f"Consistency: {result['consistency_score']:.0%}")

# Extract atomic claims
claims = extract_claims("Python was created in 1991. It is widely used.")

# Extract named entities
entities = extract_entities("Marie Curie won the Nobel Prize in 1903.")
print(entities)  # ['1903', 'Marie Curie']
```

### Install via Snap (v0.5)

```bash
# Install from Snap Store (Ubuntu/Linux)
sudo snap install truthlens

# Analyze text
truthlens "Einstein invented the telephone in 1876."

# JSON output
truthlens --json "Python was created in 1991."

# Compare multiple AI responses
truthlens --consistency \
  "Einstein was born in Ulm." \
  "Einstein was born in Munich."

# Entity verification (requires network)
truthlens --verify "Marie Curie won the Nobel Prize in 1903."

# Run demo examples
truthlens --demo

# Show help
truthlens --help
```

### Entity verification (v0.4)

Cross-reference named entities (people, places, dates) against Wikidata to boost or reduce trust scores.

```bash
# Install with verification support
cargo install truthlens --features verify

# Verify entities in a claim
truthlens --verify "Albert Einstein was born in 1879 in Ulm, Germany."
#  Trust: 67% [β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘β–‘] MEDIUM
#  πŸ” Verified: Albert Einstein (Q937) β€” birth year: 1879, birthplace: Ulm βœ“

# Combine with JSON output
truthlens --verify --json "Marie Curie won the Nobel Prize in 1903."
```

> **Note:** The `--verify` flag requires the `verify` feature (adds the `ureq` HTTP dependency).
> Without `--features verify`, TruthLens works fully offline with no network dependencies.

```toml
# Cargo.toml
[dependencies]
truthlens = "0.5"

# With entity verification
# truthlens = { version = "0.5", features = ["verify"] }
```

## What It Does

TruthLens decomposes AI text into atomic claims and scores each for hallucination risk using linguistic signals β€” **no LLM calls, no API keys, no external dependencies**.

```
Input:  "Python 4.0 was released in December 2025 with native quantum computing support."

Output: πŸ”΄ Trust: 49% [HIGH]
        β†’ specific verifiable claim β€” verify independently
        β†’ overconfident language without hedging
```

## How It Works

### 1. Claim Extraction
Text β†’ atomic sentences β†’ each is an independent claim to evaluate.

### 2. Signal Analysis (per claim)

| Signal | What It Measures | Weight |
|--------|-----------------|--------|
| **Confidence** | Overconfident language without hedging (hallucination red flag) | 35% |
| **Hedging** | Uncertainty markers ("might", "possibly") β€” correlates with lower hallucination | 25% |
| **Specificity** | How concrete/verifiable the claim is (numbers, names, dates) | 20% |
| **Verifiability** | Whether the claim contains fact-checkable entities | 15% |
| **Consistency** | Multi-sample agreement (optional; requires multiple responses to the same prompt) | 5% |
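The weighted aggregation above can be sketched in a few lines. This is illustrative only, not the crate's internals: the function name is ours, and the way the optional consistency weight is redistributed when that signal is absent is an assumption.

```rust
// Sketch of the weighted signal aggregation from the table above.
// Weights mirror the table; when the optional consistency signal is
// missing, its 5% weight is redistributed proportionally (assumption).
fn weighted_trust(
    confidence: f64,
    hedging: f64,
    specificity: f64,
    verifiability: f64,
    consistency: Option<f64>,
) -> f64 {
    let mut pairs = vec![
        (0.35, confidence),
        (0.25, hedging),
        (0.20, specificity),
        (0.15, verifiability),
    ];
    if let Some(c) = consistency {
        pairs.push((0.05, c));
    }
    let total_weight: f64 = pairs.iter().map(|(w, _)| w).sum();
    let score: f64 = pairs.iter().map(|(w, s)| w * s).sum::<f64>() / total_weight;
    score.clamp(0.0, 1.0)
}
```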

### 3. Trust Score
Signals are aggregated into a single trust score in **[0.0, 1.0]**:

| Score | Risk Level | Meaning |
|-------|-----------|---------|
| 0.75–1.0 | βœ… LOW | Likely factual or appropriately hedged |
| 0.55–0.74 | ⚠️ MEDIUM | Some uncertain claims, verify key facts |
| 0.35–0.54 | πŸ”΄ HIGH | Multiple suspicious claims, verify everything |
| 0.0–0.34 | πŸ’€ CRITICAL | Likely contains hallucinations |
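The thresholds in the table map to risk levels as a simple cascade. A minimal sketch (function and label names are ours, not the crate's API):

```rust
// Threshold mapping from the risk-level table above (illustrative).
fn risk_level(score: f64) -> &'static str {
    match score {
        s if s >= 0.75 => "LOW",
        s if s >= 0.55 => "MEDIUM",
        s if s >= 0.35 => "HIGH",
        _ => "CRITICAL",
    }
}
```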

### 4. Passage Scoring
Passage score = 70% average + 30% worst claim. One bad claim drags down the whole passage.
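The 70/30 rule can be written out directly. A minimal sketch under the stated formula (not the crate's code; the empty-input behavior is our assumption):

```rust
// Passage score = 70% average + 30% worst claim.
fn passage_score(claims: &[f64]) -> f64 {
    if claims.is_empty() {
        return 0.0; // assumption: no claims means no trust to report
    }
    let avg = claims.iter().sum::<f64>() / claims.len() as f64;
    let worst = claims.iter().cloned().fold(f64::INFINITY, f64::min);
    0.7 * avg + 0.3 * worst
}
```

With claim scores `[0.9, 0.9, 0.3]`, the average is 0.7 but the single bad claim pulls the passage down to 0.58.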

## Key Design Decisions

- **No LLM required** β€” linguistic analysis only. Fast (microseconds), private (local), free.
- **Hedging = good** β€” unlike most "confidence detectors", we score hedged claims HIGHER. A model that says "might" is better calibrated than one that states falsehoods with certainty.
- **Specificity is double-edged** β€” specific claims are more useful but also more damaging if wrong. We flag them for independent verification.
- **Formally verified** β€” Lean 4 proofs guarantee score bounds, monotonicity, and composition properties.

## What's Proven (Lean 4)

### Score Bounds
- `signal_nonneg` β€” all signals β‰₯ 0
- `weighted_contrib_bounded` β€” wΒ·s ≀ wΒ·max when s ≀ max
- `clamped_score_in_range` β€” final score ∈ [0, 100] after clamp
- `truthlens_weights_sum` β€” weights sum to 100%
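To give a flavor of what such proofs look like, here is a simplified Lean 4 statement in the spirit of `clamped_score_in_range`. This is illustrative only: it assumes Mathlib, and the repository's actual definitions and proof scripts will differ.

```lean
import Mathlib

-- Simplified clamp over the reals (the real code clamps to [0, 100]).
def clampScore (x : ℝ) : ℝ := min 100 (max 0 x)

-- Upper bound: the clamped score never exceeds 100.
theorem clampScore_le (x : ℝ) : clampScore x ≤ 100 :=
  min_le_left _ _

-- Lower bound: the clamped score is never negative.
theorem clampScore_nonneg (x : ℝ) : 0 ≤ clampScore x :=
  le_min (by norm_num) (le_max_left 0 x)
```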

### Monotonicity
- `signal_increase_improves_score` β€” improving a signal improves the score
- `total_score_improves` β€” better signal + same rest = better total
- `good_claim_improves_passage` β€” adding a good claim raises the average

### Composition
- `passage_score_bounded` β€” 70%Β·avg + 30%Β·min ≀ 100%Β·max
- `passage_at_least_worst` β€” passage score β‰₯ 30% of worst claim
- `score_order_independent` β€” claim order doesn't affect passage score
- `score_deterministic` β€” same inputs β†’ same output (functional purity)

### Trajectory (v0.2)
- `adjusted_score_bounded` β€” score + modifier stays bounded after clamp
- `transitions_bounded` β€” direction changes ≀ n_claims βˆ’ 2
- `damping_positive` β€” damping estimate is always positive (stable system)
- `penalty_still_nonneg` β€” score after penalty β‰₯ 0 after clamp
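The `transitions_bounded` property is a counting argument: n claim scores yield n βˆ’ 1 consecutive differences, so there can be at most n βˆ’ 2 sign flips between them. A sketch of the quantity being bounded (our framing, not the crate's trajectory code):

```rust
// Count direction changes in a score sequence: a change occurs whenever
// two consecutive differences have opposite signs. With n scores there
// are n - 1 differences, hence at most n - 2 direction changes.
fn direction_changes(scores: &[f64]) -> usize {
    scores
        .windows(3)
        .filter(|w| (w[1] - w[0]) * (w[2] - w[1]) < 0.0)
        .count()
}
```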

### Consistency (v0.3)
- `consistency_bounded` β€” consistency score ∈ [0, 100] after clamp
- `contradictions_bounded` β€” contradiction count ≀ comparison pairs
- `agreement_ratio_valid` β€” agreement ≀ total responses
- `agreeing_response_improves` β€” adding agreement increases count
- `contradiction_symmetric` β€” if A contradicts B, B contradicts A
- `unique_bounded` β€” unique claims ≀ total claims

### Verification (v0.4)
- `verification_modifier_bounded` β€” modifier ∈ [0, 15] scaled after clamp
- `combined_modifier_bounded` β€” combined modifier ∈ [-15, +15]
- `adjusted_score_with_verification` β€” score + verification modifier stays in [0, 100]
- `adjusted_score_with_both` β€” score + trajectory + verification modifier stays in [0, 100]
- `entity_partition` β€” verified + contradicted + unknown = total
- `verified_contradicted_disjoint` β€” verified + contradicted ≀ total
- `empty_verification_neutral` β€” no entities β†’ zero modifier
- `all_verified_max` β€” all verified β†’ maximum positive modifier
- `all_contradicted_max` β€” all contradicted β†’ maximum negative modifier
- `more_verified_improves` β€” adding verified entity increases modifier (monotonic)
- `more_contradicted_worsens` β€” adding contradicted entity decreases modifier (monotonic)
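Several of these bullets (`empty_verification_neutral`, `all_verified_max`, `all_contradicted_max`, the two monotonicity properties) are visible in one small function. This is a sketch of a modifier with those properties, not the crate's actual formula:

```rust
// Illustrative verification modifier in [-15, +15] points: scales with
// the share of verified minus contradicted entities. No entities yields
// a neutral modifier; all verified hits +15; all contradicted hits -15.
fn verification_modifier(verified: usize, contradicted: usize, total: usize) -> f64 {
    if total == 0 {
        return 0.0; // empty_verification_neutral
    }
    let net = verified as f64 - contradicted as f64;
    (15.0 * net / total as f64).clamp(-15.0, 15.0)
}
```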

## Examples

### Factual text
```
"Albert Einstein was born on March 14, 1879, in Ulm, Germany."
β†’ πŸ”΄ 52% HIGH β€” specific verifiable claim, verify independently
```

### Well-hedged passage (βœ… LOW risk)
```
"Climate change might be linked to increased hurricane frequency.
 Some researchers believe ocean temperatures could affect storm intensity.
 It is possible that sea levels will rise over the next century."
β†’ βœ… 60% LOW β€” Trajectory: FLAT LOW (consistently cautious), trust bonus +10%
```

### Single hedged claim
```
"Climate change might be linked to increased hurricane frequency."
β†’ ⚠️ 65% MEDIUM β€” appropriately hedged
```

### Overconfident hallucination
```
"The Great Wall is exactly 21,196.18 kilometers long."
β†’ πŸ”΄ 52% HIGH β€” overconfident without hedging; highly specific
```

### Vague filler
```
"Various factors contribute to the situation."
β†’ πŸ”΄ 40% HIGH β€” vague claim with low specificity
```

## JSON Output

```json
{
  "score": 0.49,
  "risk_level": "High",
  "summary": "1 claims analyzed. 1 high-risk claims detected.",
  "claims": [
    {
      "text": "Einstein invented the telephone in 1876.",
      "trust": {
        "score": 0.49,
        "signals": {
          "confidence": 0.5,
          "specificity": 0.3,
          "hedging": 0.5,
          "verifiability": 0.7,
          "consistency": null
        },
        "risk_level": "High"
      }
    }
  ]
}
```

## Repository Structure

```
truthlens/
β”œβ”€β”€ rust/                       # Core library + CLI
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ lib.rs              # Public API: analyze(), check_consistency()
β”‚   β”‚   β”œβ”€β”€ claim.rs            # Claim extraction + linguistic analysis
β”‚   β”‚   β”œβ”€β”€ scorer.rs           # Trust scoring + signal aggregation
β”‚   β”‚   β”œβ”€β”€ trajectory.rs       # Confidence trajectory analysis (v0.2)
β”‚   β”‚   β”œβ”€β”€ consistency.rs      # Multi-response consistency checker (v0.3)
β”‚   β”‚   β”œβ”€β”€ entity.rs           # Entity cross-reference with Wikidata (v0.4)
β”‚   β”‚   └── main.rs             # CLI: analyze, --consistency, --verify, --demo
β”‚   β”œβ”€β”€ tests/
β”‚   β”‚   └── integration.rs      # End-to-end integration tests
β”‚   └── Cargo.toml
β”œβ”€β”€ python/                     # Python bindings (v0.5)
β”‚   β”œβ”€β”€ src/lib.rs              # PyO3 wrapper
β”‚   β”œβ”€β”€ truthlens/              # Python package
β”‚   β”‚   β”œβ”€β”€ __init__.py         # Re-exports + docstrings
β”‚   β”‚   β”œβ”€β”€ __init__.pyi        # Type stubs (PEP 561)
β”‚   β”‚   └── py.typed            # PEP 561 marker
β”‚   β”œβ”€β”€ tests/
β”‚   β”‚   └── test_truthlens.py   # Python test suite
β”‚   β”œβ”€β”€ Cargo.toml              # cdylib crate
β”‚   └── pyproject.toml          # maturin build config
β”œβ”€β”€ lean/                       # Formal proofs
β”‚   β”œβ”€β”€ TruthLens/
β”‚   β”‚   β”œβ”€β”€ ScoreBounds.lean    # Score ∈ [0, 1], weight sum, clamp
β”‚   β”‚   β”œβ”€β”€ Monotonicity.lean   # Better signals β†’ better score
β”‚   β”‚   β”œβ”€β”€ Composition.lean    # Passage aggregation properties
β”‚   β”‚   β”œβ”€β”€ Trajectory.lean     # Trajectory modifier bounds + correctness
β”‚   β”‚   β”œβ”€β”€ Consistency.lean    # Contradiction bounds, agreement, symmetry
β”‚   β”‚   └── Verification.lean   # Entity verification modifier bounds (v0.4)
β”‚   └── lakefile.lean
β”œβ”€β”€ snap/                       # Snap package config (v0.5)
β”‚   └── snapcraft.yaml
β”œβ”€β”€ bridge/                     # Lean ↔ Rust mapping (coming)
└── README.md
```

## Build

```bash
# Rust (default β€” no network dependencies)
cd rust
cargo test                    # unit + doc tests
cargo test --features verify  # includes entity verification tests

# Python bindings
cd python
pip install maturin pytest
maturin develop               # build + install locally
pytest tests/ -v               # run Python tests

# Lean
cd lean
lake build        # 6 proof modules, zero sorry
```

## Roadmap

- [x] **v0.1** β€” Linguistic analysis: claim extraction, hedging detection, specificity scoring
- [x] **v0.2** β€” Confidence trajectory: detects oscillating, flat, or convergent confidence patterns using second-order dynamical system modeling
- [x] **v0.3** β€” Multi-response consistency, CLI (`cargo install truthlens`), colored output
- [x] **v0.4** β€” Entity cross-reference: verify extracted entities against Wikidata SPARQL (optional `verify` feature flag)
- [x] **v0.5** β€” Python bindings (PyO3) β†’ `pip install truthlens`, Snap package
- [ ] **v0.6** β€” Claude Code / MCP integration: local stdio MCP server, `analyze_text` + `analyze_file` tools, auto-checks AI text claims in-context
- [ ] **v0.7** β€” VS Code extension: analyze selection/file, inline diagnostics for docs/comments/markdown, status bar trust score
- [ ] **v0.8** β€” CI/CD integration: GitHub Action, fail builds on low trust score, policy thresholds (`--min-score`)
- [ ] **v0.9** β€” Browser extension: highlight claims in ChatGPT/Claude UI with inline trust indicators
- [ ] **v1.0** β€” TruthLens Platform: unified trust layer across CLI, VS Code, MCP, and CI pipelines with policy enforcement and fully local execution
- [ ] **v2.0** β€” Enterprise Trust System: policy engine, dashboard, audit & compliance reporting, enterprise API, team governance

### Design Principles (all versions)
- **Zero API calls by default** β€” every version works offline, locally, for free
- **Formally verified** β€” Lean 4 proofs for all scoring properties
- **Hedging = trustworthy** β€” a model that says "might" is more honest than one stating falsehoods with certainty
- **Fast** β€” microsecond analysis, no model inference required

## Why TruthLens?

Every existing hallucination detector either requires multiple LLM API calls (expensive, slow) or access to model logprobs (grey-box only). TruthLens works on **any AI output** with **zero API calls**: you paste text, you get a trust score. And the scoring properties are **formally proven** in Lean 4, a guarantee no comparable tool offers.

## License

Apache-2.0