1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
//! LLM provider abstraction.
//!
//! `Provider` is the trait every backend implements. Today there's one
//! production impl ([`anthropic::AnthropicProvider`]) and one test impl
//! ([`MockProvider`]). OpenAI and Ollama follow-ups will plug in here
//! without touching the rest of the crate (per Phase 7 plan Q4 —
//! Anthropic-first, others later).
//!
//! The trait is deliberately narrow: one method, sync, one prompt
//! shape in, one parsed response out. Schema-aware prompt construction
//! lives one layer up in `crate::prompt`, so providers stay generic
//! over what's being asked.
use crateAskError;
use crate;
pub use MockProvider;
/// One LLM call's worth of input. Mirrors the Anthropic Messages
/// request shape because it's the most expressive of the three
/// providers we'll support; OpenAI and Ollama adapters convert to
/// their native shapes inside their own `complete` impls.
/// What every provider returns. We keep this minimal — `text` is the
/// raw string the model produced (the caller parses it), `usage`
/// surfaces token counts so callers can verify cache hits.
/// Token-usage breakdown. Names match Anthropic's API field names so
/// the mapping stays obvious; OpenAI's `prompt_tokens` /
/// `completion_tokens` will fan into `input_tokens` / `output_tokens`
/// when that adapter lands.
///
/// **Verifying cache hits:** if `cache_read_input_tokens` is zero
/// across repeated `ask()` calls with the same schema, something in
/// the prefix is invalidating the cache (a silent invalidator —
/// `datetime.now()` in a system block, varying tool list, etc.).
/// A single one-shot call. Sync because every supported provider has
/// a sync HTTPS entry point and `ask()` itself is sync (matches the
/// engine's surface — `Connection::execute` etc. are all sync).