//! Experimental context-management strategies applied before RLM compaction.
//!
//! The agentic loop re-sends the entire conversation every step, which
//! means two structural wastes dominate token usage:
//!
//! 1. **Duplicate tool outputs.** Agents frequently re-read the same
//!    file, re-run the same `ls`, or re-grep for the same pattern
//!    across many steps. The verbatim content appears multiple times
//!    in the history. See [`dedup`].
//! 2. **Stale oversized tool outputs.** A 40 KB `read_file` result from
//!    step 2 is rarely relevant at step 30, yet it still costs full
//!    input tokens every turn. See [`snippet`].
//!
//! Both strategies are **lossy** in the strict sense but preserve
//! referenceability: the model can always ask the agent to re-run the
//! original tool call if it needs the full output back.
//!
//! # Composition
//!
//! [`apply_all`] runs every strategy in a fixed order against the live
//! [`Message`] buffer, mutating in place. Callers (the two prompt loops)
//! invoke it immediately before
//! [`enforce_context_window`](super::compression::enforce_context_window)
//! so the RLM compaction pass sees the already-shrunken buffer. The
//! returned [`ExperimentalStats`] is logged at `info` level for
//! observability.
//!
//! # Default-on, no config
//!
//! These strategies are always active; there is intentionally no env
//! flag to disable them. If a future regression requires an escape
//! hatch, add a field to [`crate::config::Config`] rather than a magic
//! env var so the setting is discoverable.
//!
//! # Examples
//!
//! ```rust
//! use codetether_agent::provider::{ContentPart, Message, Role};
//! use codetether_agent::session::helper::experimental::apply_all;
//!
//! let tool_result = ContentPart::ToolResult {
//!     tool_call_id: "call_a".into(),
//!     content: "file contents: hello world".repeat(40),
//! };
//! let duplicate = ContentPart::ToolResult {
//!     tool_call_id: "call_b".into(),
//!     content: "file contents: hello world".repeat(40),
//! };
//!
//! let mut msgs = vec![
//!     Message { role: Role::Tool, content: vec![tool_result] },
//!     Message { role: Role::Tool, content: vec![duplicate] },
//! ];
//!
//! let stats = apply_all(&mut msgs);
//! assert!(stats.total_bytes_saved > 0);
//! assert!(stats.dedup_hits >= 1);
//! ```

use crate::provider::Message;

/// Aggregate outcome of every strategy in [`apply_all`].
/// Apply every experimental strategy in order, mutating `messages` in
/// place. Returns aggregate statistics suitable for logging.
///
/// Order matters:
///
/// 1. [`dedup::dedup_tool_outputs`] runs first because it can eliminate
///    a duplicate in full before [`snippet`] has to think about it.
/// 2. [`snippet::snippet_stale_tool_outputs`] runs second, snipping any
///    remaining oversized tool outputs older than the recency window.
///
/// # Examples
///
/// ```rust
/// use codetether_agent::provider::{ContentPart, Message, Role};
/// use codetether_agent::session::helper::experimental::apply_all;
///
/// let mut msgs: Vec<Message> = Vec::new();
/// let stats = apply_all(&mut msgs);
/// assert_eq!(stats.total_bytes_saved, 0);
/// ```