1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
//! `CostCalculator` — small trait that computes a monetary cost
//! from a request context, a model name, and a usage record.
//!
//! Sits in `entelix-core` so observability layers (`entelix-otel`)
//! and policy layers (`entelix-policy`) can both speak the same
//! abstraction without one depending on the other. Concrete
//! implementations live in `entelix-policy::CostMeter` (for billing
//! ledgers) and operator-supplied custom types (for non-pricing
//! computations such as carbon tracking).
//!
//! ## Why async, when the canonical impl is synchronous?
//!
//! `compute_cost` is async even though the shipped
//! `entelix_policy::CostMeter` impl is purely synchronous (`Decimal`
//! arithmetic over a `parking_lot::RwLock`). The trade-off is
//! deliberate: dynamic-pricing deployments need to consult remote
//! sources at compute time — vendor pricing APIs, per-tenant rate
//! sheets in a database, FX-rate services for cross-currency
//! billing — and standardising on async now means those impls slot
//! in without a breaking surface change later. The async overhead
//! on the synchronous path is one `async_trait` desugaring per
//! call, dwarfed by the tracing event the cost feeds into. Sync
//! callers wanting maximum throughput compose a `CostMeter` directly
//! via `entelix_policy::CostMeter::charge` (which is not async).
//!
//! ## Per-tenant pricing
//!
//! `compute_cost` accepts `&ExecutionContext` so calculators that
//! need the originating tenant — multi-tenant SaaS deployments
//! charging different per-token rates per customer tier — can read
//! `ctx.tenant_id()` at compute time. Calculators with a global
//! pricing table simply ignore the parameter; the surface stays
//! identical for both shapes.
//!
//! `cost` is `Option<f64>` rather than a fixed type:
//! - `None` — the calculator does not know the model. Telemetry
//! layers omit the `gen_ai.usage.cost` attribute rather than
//! surfacing a misleading zero.
//! - `Some(amount)` — the computed monetary cost in the currency
//! the operator chose at calculator construction time. Conversion
//! to a billing currency is the caller's responsibility.
//!
//! Width: returning `f64` accepts a small loss-of-precision risk in
//! exchange for trivial JSON serialisation. Callers that need
//! financial-grade exactness use the `entelix-policy` ledger
//! directly; the value emitted into telemetry is for dashboards.
use async_trait;
use Decimal;
use crateExecutionContext;
use crate;
/// Compute a monetary cost for one model invocation.
///
/// Implementors are pure and side-effect free with respect to the
/// caller's request — they may consult internal caches but must
/// not mutate the caller's state. Implementations are typically
/// shared across many calls, so they must be `Send + Sync + 'static`.
/// Compute a monetary cost for one tool dispatch.
///
/// Tool calls cost wall-clock time, sometimes external-API spend
/// (paid SaaS endpoints, per-call billing on third-party search
/// engines, MCP servers behind paywalls). A separate trait keeps
/// the model-cost path lightweight while letting deployments that
/// care about tool spend roll their own calculator.
///
/// `output` is the tool's serialised JSON output — calculators
/// that price by output size, response status, or response shape
/// (e.g. row counts) read it; flat per-call calculators ignore it.
/// Precision-accurate cost projection for [`crate::RunBudget`] axis
/// enforcement.
///
/// Distinct from [`CostCalculator`] (f64 telemetry) — budget
/// enforcement is currency arithmetic that demands `Decimal`
/// precision (invariant 12: cost is computed transactionally;
/// silent rounding drift in the cumulative ledger is not
/// recoverable). Two traits over one trade off: the telemetry path
/// stays JSON-friendly while the budget path stays exact.
///
/// Implementations typically live alongside the operator's pricing
/// catalogue. `entelix-policy::CostMeter` is the reference impl —
/// it implements both [`CostCalculator`] (for `gen_ai.usage.cost`)
/// and `BudgetCostEstimator` (for `RunBudget::check_pre_request_cost`
/// + `observe_cost`) over the same `PricingTable`.