// SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
// SPDX-License-Identifier: Apache-2.0
//! LLM codec traits: bidirectional request translation and decode-only response parsing.
use crateLlmRequest;
use crateResult;
use crateJson;
use AnnotatedLlmRequest;
use AnnotatedLlmResponse;
// ---------------------------------------------------------------------------
// LlmCodec trait
// ---------------------------------------------------------------------------
/// A bidirectional translator between opaque [`LlmRequest`] content and
/// structured [`AnnotatedLlmRequest`].
///
/// Codecs are implemented by integration patches (LangChain, LangChain-NVIDIA,
/// LangGraph, etc.) since each SDK has its own request format. They are
/// registered by name in the global codec registry.
///
/// # Design
///
/// - **Synchronous**: `decode`/`encode` are pure data transforms (JSON
///   restructuring), not I/O operations. This matches existing guardrails
///   and request intercepts.
/// - **`Send + Sync`**: Required because [`NemoFlowContextState`](crate::api::runtime::NemoFlowContextState)
///   is behind `Arc<RwLock<>>` and accessed from async contexts.
/// - **Trait object**: Codecs are registered at runtime (e.g., by Python
///   patches), so the Rust core cannot know concrete types at compile time.
///   Store them as `Arc<dyn LlmCodec>`.
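///
/// # Example (sketch)
///
/// The design points above can be illustrated with a minimal, std-only
/// sketch. Everything in it is hypothetical: `RawRequest` and `Annotated`
/// stand in for [`LlmRequest`] and [`AnnotatedLlmRequest`], `SketchCodec`
/// only assumes a plausible shape for this trait (the real method signatures
/// may differ), and `Registry` is not the actual global codec registry.
///
/// ```rust
/// use std::collections::HashMap;
/// use std::sync::{Arc, RwLock};
///
/// // Hypothetical stand-ins for the opaque and the structured request types.
/// #[derive(Debug, Clone, PartialEq)]
/// struct RawRequest(String);
///
/// #[derive(Debug, Clone, PartialEq)]
/// struct Annotated {
///     model: String,
///     prompt: String,
/// }
///
/// // Assumed trait shape: synchronous, and `Send + Sync` so that trait
/// // objects can be shared across threads behind a lock.
/// trait SketchCodec: Send + Sync {
///     fn decode(&self, raw: &RawRequest) -> Result<Annotated, String>;
///     fn encode(&self, annotated: &Annotated) -> Result<RawRequest, String>;
/// }
///
/// // Toy wire format "<model>|<prompt>", used only for illustration.
/// struct PipeCodec;
///
/// impl SketchCodec for PipeCodec {
///     fn decode(&self, raw: &RawRequest) -> Result<Annotated, String> {
///         let (model, prompt) = raw
///             .0
///             .split_once('|')
///             .ok_or_else(|| "missing '|' separator".to_string())?;
///         Ok(Annotated { model: model.to_string(), prompt: prompt.to_string() })
///     }
///
///     fn encode(&self, annotated: &Annotated) -> Result<RawRequest, String> {
///         Ok(RawRequest(format!("{}|{}", annotated.model, annotated.prompt)))
///     }
/// }
///
/// // Name-keyed registry of trait objects; concrete codec types are only
/// // known at runtime.
/// type Registry = RwLock<HashMap<String, Arc<dyn SketchCodec>>>;
///
/// fn new_registry() -> Registry {
///     let mut codecs: HashMap<String, Arc<dyn SketchCodec>> = HashMap::new();
///     codecs.insert("pipe".to_string(), Arc::new(PipeCodec));
///     RwLock::new(codecs)
/// }
///
/// // Look up a codec by name, decode the request, then encode it back.
/// fn round_trip(registry: &Registry, name: &str, raw: &RawRequest) -> Result<RawRequest, String> {
///     // Clone the `Arc` so the read lock is released before codec work begins.
///     let codec = registry
///         .read()
///         .map_err(|_| "registry lock poisoned".to_string())?
///         .get(name)
///         .cloned()
///         .ok_or_else(|| format!("no codec named {name}"))?;
///     let annotated = codec.decode(raw)?;
///     codec.encode(&annotated)
/// }
/// ```
///
/// Cloning the `Arc` out of the map keeps the lock hold time short, which is
/// one reason storing `Arc<dyn ...>` rather than `Box<dyn ...>` is convenient
/// in this sketch.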
// ---------------------------------------------------------------------------
// LlmResponseCodec trait
// ---------------------------------------------------------------------------
/// Decode-only codec for LLM API responses.
///
/// Unlike [`LlmCodec`] (which is bidirectional for requests), response codecs
/// are introspection-only: they parse a raw response into structured form but
/// never need to encode back. This matches the pipeline design where responses
/// are observed, not modified.
///
/// # Design
///
/// - **Synchronous**: `decode_response` is a pure data transform (JSON parsing),
///   not an I/O operation.
/// - **`Send + Sync`**: Required for storage in `Arc` behind `RwLock`.
/// - **Trait object**: Codecs are registered at runtime and stored as
///   `Arc<dyn LlmResponseCodec>`.
/// - **Non-fatal**: Returns `Result`, but callers treat an error as
///   "no annotation available" rather than as a pipeline failure.
///
/// # Two-Phase Decode
///
/// Implementations should use a two-phase decode pattern:
///
/// 1. Deserialize raw JSON into API-specific intermediate structs.
/// 2. Map intermediate structs into the normalized [`AnnotatedLlmResponse`].
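///
/// # Example (sketch)
///
/// The two phases, together with the non-fatal error handling described
/// under Design, might look like the std-only sketch below. All names are
/// hypothetical: `ProviderResponse` plays the API-specific intermediate
/// struct, `NormalizedResponse` stands in for [`AnnotatedLlmResponse`], and a
/// toy `key=value;` parser replaces real JSON deserialization so that the
/// example needs no external crates.
///
/// ```rust
/// // Phase-1 type: mirrors one hypothetical provider's field names.
/// #[derive(Debug)]
/// struct ProviderResponse {
///     model: String,
///     completion_tokens: u32,
///     text: String,
/// }
///
/// // Phase-2 type: provider-independent, normalized form.
/// #[derive(Debug, PartialEq)]
/// struct NormalizedResponse {
///     model: String,
///     output_tokens: u32,
///     content: String,
/// }
///
/// // Phase 1: parse the raw payload into the provider-specific struct.
/// // A real codec would deserialize JSON here instead of `key=value;` pairs.
/// fn parse_provider(raw: &str) -> Result<ProviderResponse, String> {
///     let mut model = None;
///     let mut tokens = None;
///     let mut text = None;
///     for pair in raw.split(';').filter(|p| !p.is_empty()) {
///         let (key, value) = pair
///             .split_once('=')
///             .ok_or_else(|| format!("malformed pair: {pair}"))?;
///         match key {
///             "model" => model = Some(value.to_string()),
///             "completion_tokens" => {
///                 tokens = Some(value.parse::<u32>().map_err(|e| e.to_string())?)
///             }
///             "text" => text = Some(value.to_string()),
///             _ => {} // Unknown provider fields are ignored.
///         }
///     }
///     Ok(ProviderResponse {
///         model: model.ok_or("missing model")?,
///         completion_tokens: tokens.ok_or("missing completion_tokens")?,
///         text: text.ok_or("missing text")?,
///     })
/// }
///
/// // Phase 2: map provider-specific names onto the normalized struct.
/// fn normalize(p: ProviderResponse) -> NormalizedResponse {
///     NormalizedResponse {
///         model: p.model,
///         output_tokens: p.completion_tokens,
///         content: p.text,
///     }
/// }
///
/// // Both phases chained; the error type is a plain `String` in this sketch.
/// fn decode_response(raw: &str) -> Result<NormalizedResponse, String> {
///     parse_provider(raw).map(normalize)
/// }
///
/// // Caller side: a decode error means "no annotation available".
/// fn annotate(raw: &str) -> Option<NormalizedResponse> {
///     decode_response(raw).ok()
/// }
/// ```
///
/// Keeping the provider struct separate from the normalized one means a
/// change in one API's field names only touches phase 1 and the mapping.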