# orion-error 0.8.0 Architecture
This document describes the intended design of orion-error `0.8.0`: the design constraints behind the public API, the core error flow, and the governance goals. Struct snippets are conceptual models, not exact source snapshots; the implementation in `src/` remains the source of truth for precise fields.
## The Problem
In large Rust services, error handling faces five unmet needs:
1. **Convergence without loss.** Lower-layer technical errors must be abstracted into upper-layer stable semantics — but the original cause (source chain, detail, context) must remain available for diagnostics.
2. **Cross-layer propagation.** An error passes through multiple layers (handler → service → repository → database). Each layer needs to attach its own context without discarding what came before.
3. **Boundary projection.** The same error must be presented differently to different audiences: end users (safe message), operators (component + retryability), protocol clients (stable code + structure), and developers (full chain).
4. **Governable identity.** Errors need stable, machine-readable identities that survive refactoring, across HTTP/RPC/log/CLI boundaries.
5. **Structured carrier.** Errors carry detail, source chain, operation context, and metadata — all as structured fields, not string concatenation.
Existing libraries solve a subset:
| Library | Solves | Does not solve |
| --- | --- | --- |
| `thiserror` | Local error enum modeling, `Display` + `From` generation | Cross-layer propagation, context attachment, protocol projection |
| `anyhow` | Application-level error unification, `context()` | Stable identity, protocol output, fine-grained category routing |
| `color-eyre` | Rich diagnostic reports | Same as `anyhow`: no protocol or identity layer |
**orion-error** targets the gap: **governance at scale** — what happens when errors travel through 3–5 layers and must emerge at a protocol boundary with stable structure.
---
## Core Insight: Reason/Carrier Separation
The central design decision: **separate the error's semantic classification (reason) from its propagation mechanism (carrier).**
```rust
// Reason = what kind of error
enum AppReason {
    InvalidInput,
    OrderNotFound,
    General(UnifiedReason),
}
// Carrier = how it propagates
let err: StructError<AppReason> = AppReason::OrderNotFound
    .to_err()
    .with_detail("order #42 not found")
    .with_source(db_error)
    .with_context(ctx);
```
### Why separate?
If reason and carrier are combined — as in typical `thiserror` enum usage — every piece of runtime machinery (context attachment, source tracking, protocol projection) must be reimplemented for each enum. The carrier (`StructError<T>`) implements it once.
The reason stays thin — a `DomainReason` marker trait requiring only `PartialEq + Display + Debug + Send + Sync + 'static`. The carrier does the rest.
```rust
pub trait DomainReason: PartialEq + Display + Debug + Send + Sync + 'static {}
```
| Bound | Rationale |
| --- | --- |
| `Display` + `Debug` | Errors must be printable for diagnostics and logging. |
| `PartialEq` | Enables assertion in tests. |
| `Send + Sync` | Required for `StructError` to cross async task boundaries. |
| `'static` | Enables type erasure via `dyn Error` and storage in `SourceFrame`. |
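
The separation can be illustrated with a minimal, self-contained sketch. The `Reason` trait, the `Carrier` type, and the two reason enums below are hypothetical stand-ins written for this example; they are not the crate's `DomainReason` or `StructError` API. The point is only that the detail and context machinery is written once and reused by every reason type:

```rust
use std::fmt::{self, Debug, Display};

// Hypothetical marker trait mirroring the bounds of `DomainReason`.
trait Reason: PartialEq + Display + Debug + Send + Sync + 'static {}

// Hypothetical carrier: detail and context handling is written once.
#[derive(Debug, PartialEq)]
struct Carrier<R: Reason> {
    reason: R,
    detail: Option<String>,
    context: Vec<String>,
}

impl<R: Reason> Carrier<R> {
    fn new(reason: R) -> Self {
        Carrier { reason, detail: None, context: Vec::new() }
    }

    fn with_detail(mut self, detail: &str) -> Self {
        self.detail = Some(detail.to_string());
        self
    }

    fn with_context(mut self, ctx: &str) -> Self {
        self.context.push(ctx.to_string());
        self
    }
}

// Two unrelated reason enums stay thin: a classification plus `Display`.
#[derive(Debug, PartialEq)]
enum OrderReason {
    NotFound,
}

#[derive(Debug, PartialEq)]
enum AuthReason {
    Denied,
}

impl Display for OrderReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("order not found")
    }
}

impl Display for AuthReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("access denied")
    }
}

impl Reason for OrderReason {}
impl Reason for AuthReason {}
```

Both `Carrier<OrderReason>` and `Carrier<AuthReason>` get `with_detail` and `with_context` without either enum implementing anything beyond the marker trait.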
---
## Error Flow
```text
raw std error ──→ .source_err(reason, detail) ──→ first entry into structured system
│
conv_err()
(reason remap)
│
report / exposure / display_chain
```
### 1. Entry: `source_err(reason, detail)`
The unified entry point. Works for both raw `std::error::Error` and already-structured `StructError` sources:
```rust
let content = std::fs::read_to_string("config.toml")
    .source_err(AppReason::system_error(), "read config failed")?;
```
- The raw error is stored as a source frame, preserving its `Display` and `Debug` output.
- The `reason` becomes the error's stable classification.
- The `detail` provides layer-specific explanation.
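
To make the three points above concrete, here is a rough sketch of how such an entry point could be built as an extension trait on `Result`. Everything in it (`Structured`, `SourceErrSketch`, `source_err_sketch`, `parse_port`) is hypothetical and deliberately simplified; the real `source_err` is provided by orion-error and stores a richer source frame:

```rust
use std::error::Error;

// Hypothetical structured error: stable reason, layer detail, captured source text.
#[derive(Debug)]
struct Structured<R> {
    reason: R,
    detail: String,
    source_display: String,
    source_debug: String,
}

// Hypothetical extension trait standing in for the crate's entry point.
trait SourceErrSketch<T, R> {
    fn source_err_sketch(self, reason: R, detail: &str) -> Result<T, Structured<R>>;
}

impl<T, E: Error, R> SourceErrSketch<T, R> for Result<T, E> {
    fn source_err_sketch(self, reason: R, detail: &str) -> Result<T, Structured<R>> {
        // Keep both renderings of the raw error so diagnostics lose nothing.
        self.map_err(|e| Structured {
            reason,
            detail: detail.to_string(),
            source_display: e.to_string(),
            source_debug: format!("{:?}", e),
        })
    }
}

#[derive(Debug, PartialEq)]
enum AppReason {
    System,
}

// A raw `ParseIntError` enters the structured system at this call site.
fn parse_port(raw: &str) -> Result<u16, Structured<AppReason>> {
    let port = raw
        .parse::<u16>()
        .source_err_sketch(AppReason::System, "parse port failed")?;
    Ok(port)
}
```

The call site stays a single chained expression, which is the property the real API is designed around.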
### 2. Cross-layer conversion: `conv_err()`
When the upstream error is already `StructError<R1>` and only the reason type needs to change:
```rust
fn upper_layer() -> Result<(), StructError<UpperReason>> {
    lower_layer().conv_err()?;
    Ok(())
}
```
Requires `UpperReason: From<LowerReason>`. All detail, context, source chain, and metadata survive the conversion.
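This conversion cannot be offered as a `From` impl by either side. Inside orion-error, a blanket `impl<R1, R2: From<R1>> From<StructError<R1>> for StructError<R2>` would overlap with the standard library's reflexive `impl<T> From<T> for T` whenever `R1` and `R2` are the same type. Inside a user's crate, a per-pair `From<StructError<LowerReason>> for StructError<UpperReason>` is rejected by Rust's orphan rule, because neither `From` nor `StructError` is local to that crate. An explicit trait method is the intended path.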
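
A minimal sketch of the idea, using hypothetical names (`Carrier`, `ConvErrSketch`, `conv_err_sketch`, `RepoReason`, `ServiceReason`) rather than the crate's real types: an explicit extension trait remaps only the reason through `From`, while the rest of the carrier moves across untouched.

```rust
// Hypothetical carrier with a reason plus state that must survive conversion.
#[derive(Debug, PartialEq)]
struct Carrier<R> {
    reason: R,
    detail: Option<String>,
    context: Vec<String>,
}

// Hypothetical explicit conversion trait, standing in for `conv_err()`.
trait ConvErrSketch<T, R2> {
    fn conv_err_sketch(self) -> Result<T, Carrier<R2>>;
}

impl<T, R1, R2: From<R1>> ConvErrSketch<T, R2> for Result<T, Carrier<R1>> {
    fn conv_err_sketch(self) -> Result<T, Carrier<R2>> {
        // Only the reason is remapped; detail and context are moved as-is.
        self.map_err(|e| Carrier {
            reason: R2::from(e.reason),
            detail: e.detail,
            context: e.context,
        })
    }
}

#[derive(Debug, PartialEq)]
enum RepoReason {
    RowMissing,
}

#[derive(Debug, PartialEq)]
enum ServiceReason {
    OrderNotFound,
}

impl From<RepoReason> for ServiceReason {
    fn from(r: RepoReason) -> Self {
        match r {
            RepoReason::RowMissing => ServiceReason::OrderNotFound,
        }
    }
}

fn repo() -> Result<(), Carrier<RepoReason>> {
    Err(Carrier {
        reason: RepoReason::RowMissing,
        detail: Some("order #42".to_string()),
        context: vec!["load_order".to_string()],
    })
}

// The return type selects the target reason for the conversion.
fn service() -> Result<(), Carrier<ServiceReason>> {
    repo().conv_err_sketch()
}
```

Because the conversion is a named method rather than a `From` impl, it does not collide with the reflexive `From<T> for T` impl in the standard library.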
### 3. First entry vs. cross-layer distinction
| Method | Semantic effect | Handling of the upstream error |
| --- | --- | --- |
| `source_err(reason, detail)` | Creates a new semantic boundary | Wraps as unstructured or structured source |
| `conv_err()` | Only remaps reason type | Preserves all detail, context, source, metadata |
---
## Core Types
### `StructError<T: DomainReason>`
The universal runtime carrier. Conceptually, it stores the reason and the runtime propagation state behind a small carrier:
```rust
pub struct StructError<T: DomainReason> {
    imp: Box<StructErrorImpl<T>>,
}
```
`Box` is used to keep `StructError` small (pointer-sized), as it is expected to be returned through `Result` frequently.
### `StructErrorImpl<T>`
Holds the data needed for error propagation. Simplified model:
```rust
struct StructErrorImpl<T> {
    reason: T,
    detail: Option<String>,
    position: Option<String>,
    context: Option<Arc<Vec<OperationContext>>>,
    source_payload: Option<InternalSourcePayload>,
}
```
Key decisions:
- **`context: Option<Arc<Vec<...>>>`**: lazy allocation. An error without context allocates no context vector, and `Arc` makes cloning the context chain cheap.
- **`Box<StructErrorImpl<T>>`**: `StructError` itself stays small (one pointer), minimizing `Result` size.
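
The size argument can be checked with a small sketch. `Impl`, `Inline`, and `Boxed` are hypothetical types loosely modeled on the simplified struct above; exact byte counts depend on the target, so the functions only expose sizes for comparison:

```rust
use std::mem::size_of;
use std::sync::Arc;

// Hypothetical payload, loosely modeled on the simplified struct above.
#[allow(dead_code)]
struct Impl<R> {
    reason: R,
    detail: Option<String>,
    position: Option<String>,
    // `None` means no context vector was ever allocated.
    context: Option<Arc<Vec<String>>>,
}

// Inline carrier: every field travels inside each `Result`.
#[allow(dead_code)]
struct Inline<R>(Impl<R>);

// Boxed carrier: a single pointer travels inside each `Result`.
#[allow(dead_code)]
struct Boxed<R>(Box<Impl<R>>);

// Size of a `Result` carrying the inline variant as its error.
fn inline_result_size() -> usize {
    size_of::<Result<(), Inline<u8>>>()
}

// Size of a `Result` carrying the boxed variant as its error.
fn boxed_result_size() -> usize {
    size_of::<Result<(), Boxed<u8>>>()
}

// Size of the boxed carrier alone: one pointer.
fn boxed_carrier_size() -> usize {
    size_of::<Boxed<u8>>()
}
```

The trade-off is one heap allocation on the error path in exchange for a smaller `Result` on every path, including the success path.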
### `OperationContext`
Carries runtime context. Conceptually it describes what the current layer was doing, what it was accessing, which diagnostic fields were attached, and whether operation logging should be emitted:
```rust
pub struct OperationContext {
    action: Option<String>,
    locator: Option<String>,
    fields: Vec<(String, String)>,
    path: Vec<String>,
    metadata: ErrorMetadata,
    result: OperationResult,
    exit_log: bool,
}
```
- `doing(...)` — what operation was running ("load config", "validate order")
- `at(...)` — what resource was being accessed ("config.toml", "order #42")
- `with_field(...)` — human-readable diagnostic fields
- `with_meta(...)` — machine-oriented metadata (serialization only)
- `success()` / `fail()` / `cancel()` and logging helpers — record operation outcome with little call-site code
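
As an illustration of the builder style listed above, here is a reduced sketch. `ContextSketch` is a hypothetical type; the method names follow the list, but their signatures are assumptions made for this example and may differ from the crate's `OperationContext`:

```rust
// Hypothetical, reduced context type covering three of the fields above.
#[derive(Debug, Default, Clone, PartialEq)]
struct ContextSketch {
    action: Option<String>,
    locator: Option<String>,
    fields: Vec<(String, String)>,
}

impl ContextSketch {
    // Start a context by naming the operation that is running.
    fn doing(action: &str) -> Self {
        ContextSketch {
            action: Some(action.to_string()),
            ..Default::default()
        }
    }

    // Record the resource being accessed.
    fn at(mut self, locator: &str) -> Self {
        self.locator = Some(locator.to_string());
        self
    }

    // Attach one human-readable diagnostic field.
    fn with_field(mut self, key: &str, value: &str) -> Self {
        self.fields.push((key.to_string(), value.to_string()));
        self
    }
}

// Build the context a config-loading layer might attach.
fn load_config_context() -> ContextSketch {
    ContextSketch::doing("load config")
        .at("config.toml")
        .with_field("attempt", "1")
}
```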
### `SourceFrame`
Represents one element in the source chain. Simplified model:
```rust
pub struct SourceFrame {
    pub index: usize,
    pub message: SmolStr,
    pub display: Option<SmolStr>,
    pub debug: Option<SmolStr>,
    pub type_name: Option<SmolStr>,
    pub error_code: Option<i32>,
    pub reason: Option<SmolStr>,
    pub path: Option<SmolStr>,
    pub detail: Option<SmolStr>,
    pub metadata: ErrorMetadata,
    pub is_root_cause: bool,
    pub context_fields: Vec<(SmolStr, SmolStr)>,
}
```
String fields use `SmolStr`, which stores short strings inline without a heap allocation and clones cheaply, so traversing and copying the source chain stays fast.
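
One way to picture how a chain of frames could be produced is to walk `std::error::Error::source()` and record one frame per level. The sketch below is an assumption about the general technique, not the crate's implementation: `FrameSketch` keeps only three of the fields above, uses `String` instead of `SmolStr`, and `Outer` is a throwaway error type for the demonstration.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical reduced frame: position, rendered message, root-cause flag.
#[derive(Debug, PartialEq)]
struct FrameSketch {
    index: usize,
    message: String,
    is_root_cause: bool,
}

// Walk `Error::source()` from the outermost error down to the root cause.
fn collect_frames(err: &(dyn Error + 'static)) -> Vec<FrameSketch> {
    let mut frames = Vec::new();
    let mut current: Option<&(dyn Error + 'static)> = Some(err);
    while let Some(e) = current {
        let index = frames.len();
        frames.push(FrameSketch {
            index,
            message: e.to_string(),
            is_root_cause: false,
        });
        current = e.source();
    }
    // The last frame has no further source, so it is the root cause.
    if let Some(last) = frames.last_mut() {
        last.is_root_cause = true;
    }
    frames
}

// Two-level error used to exercise the walk.
#[derive(Debug)]
struct Outer(std::num::ParseIntError);

impl fmt::Display for Outer {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("outer source")
    }
}

impl Error for Outer {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.0)
    }
}

fn sample_error() -> Outer {
    Outer("x".parse::<i32>().unwrap_err())
}
```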
---
## Consumption Paths
Four independent consumption paths, each returning a different view of the same error:
### `report()` → `DiagnosticReport`
Human-readable diagnostics. Only requires `DomainReason`.
```rust
let report: DiagnosticReport = err.report();
println!("{}", report.render());
```
Output:
```text
reason: system error
detail: read config failed
context:
[0] place_order [user_id: 42]
```
### `exposure(&policy)` → `ErrorProtocolSnapshot`
Protocol-boundary projection. Requires `ErrorIdentityProvider` (provided by `#[derive(OrionError)]`).
```rust
let proto = err.exposure(&MyPolicy);
let http_json = proto.to_http_error_json()?; // {"status": 500, "code": "sys.io_error", ...}
let log_json = proto.to_log_error_json()?; // full structured log output
let cli_json = proto.to_cli_error_json()?; // operator-facing summary
let rpc_json = proto.to_rpc_error_json()?; // upstream-facing protocol
```
The `ExposurePolicy` trait controls the decision:
| Method | Default | Override frequency |
| --- | --- | --- |
| `http_status()` | 500 | Most common |
| `visibility()` | `Internal` (Biz → `Public`) | Common |
| `retryable()` | `false` | Occasional |
| `default_hints()` | `[]` | Rare |
`Visibility` controls which error information reaches the external caller:
| Output field | `Public` | `Internal` |
| --- | --- | --- |
| HTTP `message` | Uses detail | Uses reason (hides detail) |
| RPC `detail` | Exposed | `null` |
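
The visibility rules can be sketched as two small projection functions. This assumes the two value columns of the table correspond to `Public` and `Internal` visibility, in that order. `ErrorView`, `http_message`, and `rpc_detail` are hypothetical names, and falling back to the reason when a public error has no detail is an assumption of this sketch:

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Visibility {
    Public,
    Internal,
}

// Hypothetical projection input: the parts of an error relevant to exposure.
struct ErrorView<'a> {
    reason: &'a str,
    detail: Option<&'a str>,
}

// HTTP `message`: detail when public, reason (detail hidden) when internal.
fn http_message(view: &ErrorView<'_>, vis: Visibility) -> String {
    match vis {
        // Assumption: fall back to the reason if no detail was attached.
        Visibility::Public => view.detail.unwrap_or(view.reason).to_string(),
        Visibility::Internal => view.reason.to_string(),
    }
}

// RPC `detail`: exposed when public, `None` (serialized as null) when internal.
fn rpc_detail(view: &ErrorView<'_>, vis: Visibility) -> Option<String> {
    match vis {
        Visibility::Public => view.detail.map(str::to_string),
        Visibility::Internal => None,
    }
}
```

Keeping the decision in one place means a handler cannot leak detail by formatting the error itself.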
### `display_chain()` → formatted string
Source chain expansion for debugging. No trait requirement beyond `DomainReason`.
```text
system error
-> Info: read config failed
-> Caused by:
1. outer source
2. inner source
```
### `identity_snapshot()` → `ErrorIdentity`
Stable identity inspection without protocol projection:
```rust
let id = err.identity_snapshot();
assert_eq!(id.code, "sys.io_error");
```
---
## UnifiedReason
`UnifiedReason` is the built-in universal reason classification. It covers the common error categories found in most services:
| Category | Code range | Examples |
| --- | --- | --- |
| Business | 100-105 | `validation_error`, `not_found` |
| Infrastructure | 200-204 | `system_error`, `network_error`, `timeout` |
| Configuration | 300-301 | `core_conf`, `external_error` |
Designed as a catch-all for errors that don't need a domain-specific reason. Domain enums typically include it as a transparent variant:
```rust
#[derive(OrionError)]
enum AppReason {
    #[orion_error(identity = "biz.invalid")]
    Invalid,
    #[orion_error(transparent)]
    General(UnifiedReason),
}
```
The `#[orion_error(transparent)]` attribute delegates `stable_code()`, `error_category()`, and `Display` to the inner `UnifiedReason`.
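
Written by hand, this kind of delegation amounts to forwarding each method to the wrapped value. The sketch below uses hypothetical stand-ins (`BuiltIn`, `StableCode`, and the `demo.*` codes are invented for illustration) and is not the macro's generated output:

```rust
use std::fmt;

// Hypothetical stand-in for the built-in reason type.
#[derive(Debug, PartialEq)]
enum BuiltIn {
    SystemError,
    NotFound,
}

// Hypothetical stand-in for the identity trait.
trait StableCode {
    fn stable_code(&self) -> &'static str;
}

impl StableCode for BuiltIn {
    fn stable_code(&self) -> &'static str {
        match self {
            BuiltIn::SystemError => "demo.system",
            BuiltIn::NotFound => "demo.not_found",
        }
    }
}

impl fmt::Display for BuiltIn {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BuiltIn::SystemError => f.write_str("system error"),
            BuiltIn::NotFound => f.write_str("not found"),
        }
    }
}

#[derive(Debug, PartialEq)]
enum AppReason {
    Invalid,
    General(BuiltIn),
}

// The domain variant has its own code; the wrapping variant forwards.
impl StableCode for AppReason {
    fn stable_code(&self) -> &'static str {
        match self {
            AppReason::Invalid => "biz.invalid",
            AppReason::General(inner) => inner.stable_code(),
        }
    }
}

impl fmt::Display for AppReason {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppReason::Invalid => f.write_str("invalid"),
            AppReason::General(inner) => fmt::Display::fmt(inner, f),
        }
    }
}

impl AppReason {
    // Forwarded constructor, so call sites need not name the wrapper variant.
    fn system_error() -> Self {
        AppReason::General(BuiltIn::SystemError)
    }
}
```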
---
## Explicit StdError Bridge
`StructError<T>` does **not** implement `std::error::Error`. This is intentional:
1. **Prevents accidental type erasure.** If `StructError` implemented `StdError`, calling code could unintentionally erase the reason type with `.into()` or `Box<dyn Error>`, losing structured identity.
2. **Keeps boundary crossing explicit.** When interop with `StdError` ecosystem is needed, the conversion is explicit:
```rust
let std_ref: StdStructRef<'_, AppReason> = err.as_std();
let owned: OwnedStdStructError<AppReason> = err.into_std();
let dyn_owned: OwnedDynStdStructError = err.into_dyn_std();
```
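
The effect of this choice can be shown with a small sketch built on hypothetical types (`Structured`, `StdBridge`, `into_std_sketch`). Because the structured type does not implement `std::error::Error`, it cannot be boxed into a `dyn Error` by accident; the only route is through the named bridge:

```rust
use std::error::Error;
use std::fmt;

// Hypothetical structured error that deliberately does not implement `Error`.
#[derive(Debug)]
struct Structured {
    reason: &'static str,
    detail: String,
}

// Hypothetical owned bridge: the only way into the `dyn Error` world.
#[derive(Debug)]
struct StdBridge(Structured);

impl fmt::Display for StdBridge {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}: {}", self.0.reason, self.0.detail)
    }
}

impl Error for StdBridge {}

impl Structured {
    // Crossing the boundary is a visible, named step at the call site.
    fn into_std_sketch(self) -> StdBridge {
        StdBridge(self)
    }
}

fn erase(err: Structured) -> Box<dyn Error + Send + Sync> {
    // `Box::new(err)` alone would not compile here: `Structured` is not an `Error`.
    Box::new(err.into_std_sketch())
}
```

The structured value is still recoverable after erasure by downcasting to the bridge type.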
---
## Derive Macro
`#[derive(OrionError)]` generates the core trait implementations:
| Trait | Purpose | Source |
| --- | --- | --- |
| `Display` | Human-readable error message | From `message` attribute, or auto-generated from `identity` |
| `DomainReason` | Carrier compatibility | Empty marker impl |
| `ErrorCode` | Legacy numeric compatibility code | From `code` attribute, or default 500 |
| `ErrorIdentityProvider` | Stable code + category | From `identity` and `category` attributes |
### Attributes
| Attribute | Required | Effect |
| --- | --- | --- |
| `identity = "biz.foo"` | Yes (unless `transparent`) | `stable_code()` returns `"biz.foo"` |
| `category = Biz` | No (inferred from `identity` prefix) | `error_category()` returns specified category |
| `transparent` | Alternative to `identity` | Delegates all methods to inner type |
| `message = "..."` | No (auto-generated from `identity`) | Custom `Display` output |
| `code = ...` | No (default 500) | Legacy numeric `error_code()` |
Protocol outputs, log aggregation, and monitoring should use `ErrorIdentity.code` / `stable_code()` as the stable identity. `ErrorCode` is a numeric compatibility layer, not the recommended primary key for new external contracts.
### Transparent Variant Constructor Delegation
When an enum has a transparent variant wrapping `UnifiedReason`, all `UnifiedReason` constructors are generated as methods on the enum:
```rust
#[derive(OrionError)]
enum AppReason {
    #[orion_error(transparent)]
    General(UnifiedReason),
}
// Generated automatically:
AppReason::system_error() // instead of AppReason::General(UnifiedReason::system_error())
AppReason::validation_error()
AppReason::not_found_error()
```
---
## Third-Party Error Integration
Third-party error types enter the structured system through `source_err()`. Supported types:
| Error type | Feature flag | Integration |
| --- | --- | --- |
| `std::io::Error` | Built-in (no feature) | Direct `UnstructuredSource` impl |
| `serde_json::Error` | `serde_json` | Direct `UnstructuredSource` impl |
| `anyhow::Error` | `anyhow` | Attempts structured recovery, falls back to unstructured |
| `toml::de::Error` | `toml` | Direct `UnstructuredSource` impl |
| Custom types | None | Opt-in via `RawStdError` + `raw_source()` |
The opt-in design (`RawStdError`) prevents silent structured-to-unstructured downgrade:
```rust,ignore
impl RawStdError for MyError {}
let result: Result<(), MyError> = Err(MyError);
let err = result
    .map_err(raw_source)
    .source_err(AppReason::system_error(), "my operation failed")?;
```
---
## Design Evolution
### Naming: UvsReason → CommonReason → UnifiedReason
The built-in reason type went through three names:
- **`UvsReason`** — original name, meaning unclear to new users
- **`CommonReason`** — intermediate rename, but "common" sounded like "ordinary" rather than "unified"
- **`UnifiedReason`** — final name, reflecting its role: concrete errors converge (are unified) into this classification
The deprecated `pub type UvsReason = UnifiedReason;` alias is retained for migration compatibility.
### Variant name: Uvs → General
The transparent variant in domain enums was renamed to `General`:
```rust
// Before
Uvs(UnifiedReason),
// After
General(UnifiedReason),
```
`General` communicates "this is the catch-all for non-domain-specific errors" more clearly than the opaque `Uvs`.
### Consumption path convergence: snapshot is not the main path
The orion-error 0.8.0 architecture centers on `report()`, `exposure()`, `display_chain()`, and `identity_snapshot()`.
Stable machine identity is provided by `identity_snapshot()`. HTTP/RPC/CLI/log boundary output is handled by `exposure()` and `ErrorProtocolSnapshot`. Human diagnostics are handled by `report()`. This avoids making users learn a separate snapshot type hierarchy while preserving stable identity and protocol projection.
### API naming: exposure
`exposure()` is named for consistency with `report()`. The short name reflects the intent: expose this error at a boundary according to a policy, without making users first learn an internal snapshot model.
---
## Feature Gating
| Feature | Provides | Default |
| --- | --- | --- |
| `derive` | Proc-macro derive macros (`OrionError`, `ErrorCode`, `ErrorIdentityProvider`) | Yes |
| `log` | `OperationContext` log methods (`ctx.info()`, `.debug()`, `.warn()`, `.error()`) and `Drop` auto-logging | Yes |
| `tracing` | Tracing integration (preferred over `log` when both are enabled) | No |
| `serde` | `Serialize` / `Deserialize` on core types | No |
| `serde_json` | Protocol JSON projection methods (`to_http_error_json()`, etc.) | No |
| `anyhow` | `anyhow::Error` interop with structured source recovery | No |
| `toml` | `toml::de::Error` / `toml::ser::Error` interop | No |
---
## Project Structure
```text
src/
lib.rs — Crate root, re-exports, layered modules
core/
domain.rs — DomainReason trait
reason.rs — ErrorCode trait, ErrorCategory enum, ErrorIdentityProvider trait
universal.rs — UnifiedReason enum (built-in classification)
error/
carrier.rs — StructError<T>, StructErrorImpl<T>
builder.rs — StructErrorBuilder<T>
identity.rs — ErrorIdentity struct, identity_snapshot()
source_chain.rs — SourceFrame, source payload infrastructure
std_bridge.rs — StdStructRef, OwnedStdStructError, OwnedDynStdStructError
context/
types.rs — OperationContext, OperationScope
convert.rs — ContextAdd trait
metadata.rs — ErrorMetadata, MetadataValue
report/
diagnostic.rs — DiagnosticReport, redaction
protocol.rs — ErrorProtocolSnapshot, ExposurePolicy, Visibility
traits/
contextual.rs — ErrorWith trait
conversion.rs — ConvErr, ConvStructError, ToStructError
source_err.rs — SourceErr, RawStdError, RawSource
testing.rs — Test assertion helpers
```
```text
docs/
en/book.toml — English mdBook config
en/src/ — English mdBook source
zh/book.toml — Chinese mdBook config
zh/src/ — Chinese mdBook source
index.html — Language selector copied to site root
site/
en/ — Generated English book
zh/ — Generated Chinese book
```
---
## Constraints
### Orphan Rule
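A blanket `From<StructError<R1>> for StructError<R2>` cannot be provided. Within this crate it would overlap with the standard library's reflexive `impl<T> From<T> for T` when `R1` and `R2` are the same type; within a user's crate, the orphan rule rejects the impl because neither `From` (std) nor `StructError` (this crate) is local. The explicit `conv_err()` method is the intended path: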
```rust
let result: Result<(), StructError<UpperReason>> = lower_result.conv_err();
```
### Send + Sync
`DomainReason` requires `Send + Sync`. This is necessary for `StructError` to be used across async task boundaries and captured by `anyhow::Error` or `Box<dyn Error + Send + Sync>`. For single-threaded use, this adds a small but unavoidable constraint.
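
What the bound buys can be sketched with standard threads in place of an async runtime. `Carrier`, `AppReason`, and `assert_send_sync` are hypothetical names for this example; the same requirement applies when an error is returned from a spawned task:

```rust
use std::thread;

// Hypothetical carrier; it is `Send + Sync` only if its reason type is.
#[derive(Debug, PartialEq)]
struct Carrier<R> {
    reason: R,
    detail: String,
}

#[derive(Debug, PartialEq)]
enum AppReason {
    Timeout,
}

// Compile-time check: this call only type-checks for `Send + Sync + 'static` types.
fn assert_send_sync<T: Send + Sync + 'static>() {}

fn fail_on_worker() -> Result<(), Carrier<AppReason>> {
    assert_send_sync::<Carrier<AppReason>>();
    // The error is created on another thread and moved back through `join`.
    thread::spawn(|| {
        Err(Carrier {
            reason: AppReason::Timeout,
            detail: "upstream timed out".to_string(),
        })
    })
    .join()
    .expect("worker thread panicked")
}
```

A reason type holding a non-`Send` value such as `Rc<str>` would fail the `assert_send_sync` check at compile time, which is the constraint single-threaded users have to accept.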