# cupel
Context window management pipeline for LLM applications.
Given a set of context items — messages, documents, tool outputs, memory — and a
token budget, Cupel determines the optimal subset and ordering for a model's
context window. The pipeline follows a fixed six-stage flow: classify inputs,
**score** every candidate for relevance, deduplicate, sort, **slice** the list
to fit the budget, and **place** the selected items in an attention-optimal
order. Each configurable stage (scorer, slicer, placer) is independently
swappable through trait implementations, and every inclusion/exclusion decision
carries a traceable reason.
Cupel is framework-agnostic. It accepts pre-counted token lengths and returns
plain `Vec<ContextItem>` — no LLM client, tokenizer, or async runtime required.
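To make the stage order concrete, here is a minimal std-only sketch of the flow described above. It is not Cupel's implementation: `Item`, `Decision`, and `select` are hypothetical names, classification is omitted because the sketch has no kinds or pinned items, and the scoring, deduplication, and slicing rules are simplifying assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a context item; not Cupel's `ContextItem`.
#[derive(Debug, Clone, PartialEq)]
struct Item {
    content: &'static str,
    tokens: usize,
    age_secs: u64, // smaller means more recent
}

/// Why an item was kept or dropped (illustrative names only).
#[derive(Debug, Clone, PartialEq)]
enum Decision {
    Included,
    Duplicate,
    OverBudget,
}

/// Score, deduplicate, sort, slice, and place, recording a reason per item.
fn select(items: &[Item], budget: usize) -> (Vec<Item>, Vec<(&'static str, Decision)>) {
    let mut trace = Vec::new();

    // Score: newer items score higher (here simply the negated age).
    let mut scored: Vec<(i64, &Item)> =
        items.iter().map(|i| (-(i.age_secs as i64), i)).collect();

    // Deduplicate by content, keeping the first occurrence (an assumed rule).
    let mut seen = HashSet::new();
    scored.retain(|(_, item)| {
        let fresh = seen.insert(item.content);
        if !fresh {
            trace.push((item.content, Decision::Duplicate));
        }
        fresh
    });

    // Sort by score, highest first.
    scored.sort_by(|a, b| b.0.cmp(&a.0));

    // Slice: greedily take each item that still fits the budget.
    let mut used = 0;
    let mut kept = Vec::new();
    for (_, item) in scored {
        if used + item.tokens <= budget {
            used += item.tokens;
            trace.push((item.content, Decision::Included));
            kept.push(item.clone());
        } else {
            trace.push((item.content, Decision::OverBudget));
        }
    }

    // Place: oldest first, i.e. chronological order.
    kept.sort_by(|a, b| b.age_secs.cmp(&a.age_secs));
    (kept, trace)
}
```

Returning the trace alongside the selection is one way to keep every decision explainable without logging inside the stages.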
## Glossary
| Term | Description |
|------|-------------|
| **ContextItem** | An immutable record representing a single piece of context (message, document, tool output, etc.). Constructed via `ContextItemBuilder`. |
| **ContextBudget** | Token budget constraints: hard ceiling (`max_tokens`), soft goal (`target_tokens`), output reserve, per-kind reserved slots, and safety margin. Validated at construction. |
| **Scorer** | A trait that computes a relevance score for a context item. Eight built-in implementations cover common strategies. |
| **Slicer** | A trait that selects items from the scored list to fit within the budget. Built-in: `GreedySlice`, `KnapsackSlice`, `QuotaSlice`. |
| **Placer** | A trait that determines the final presentation order. Built-in: `ChronologicalPlacer`, `UShapedPlacer`. |
| **Pipeline** | The fixed six-stage executor: Classify, Score, Deduplicate, Sort, Slice, Place. Built via `Pipeline::builder()`. |
| **ContextKind** | Extensible string enum classifying item type (`Message`, `Document`, `ToolOutput`, `Memory`, `SystemPrompt`). Case-insensitive comparison. |
| **ContextSource** | Extensible string enum identifying item origin (`Chat`, `Tool`, `Rag`). Case-insensitive comparison. |
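
The "extensible string enum" with case-insensitive comparison can be pictured as a newtype over `String`. The sketch below is an assumption about the general pattern, not Cupel's source: `Kind` is a hypothetical stand-in, and the empty-name check is an assumed validation rule.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an extensible, case-insensitive string enum.
#[derive(Debug, Clone)]
struct Kind(String);

impl Kind {
    /// Rejects empty or whitespace-only names (an assumed rule).
    fn new(name: &str) -> Result<Self, String> {
        let trimmed = name.trim();
        if trimmed.is_empty() {
            return Err("kind name must not be empty".to_string());
        }
        Ok(Kind(trimmed.to_string()))
    }
}

impl PartialEq for Kind {
    // Compare without regard to ASCII case.
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl Eq for Kind {}

impl Hash for Kind {
    // Hash the lowercased form so that equal kinds hash identically.
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.to_ascii_lowercase().hash(state);
    }
}
```

Keeping `Hash` consistent with `PartialEq` matters here: a per-kind map keyed by `"ToolOutput"` should also be found under `"tooloutput"`.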
## Quickstart
A minimal pipeline that scores three items by recency, greedily fills a
500-token budget, and places results in chronological order:
```rust
use std::collections::HashMap;
use chrono::Utc;
use cupel::*;

# fn main() -> Result<(), CupelError> {
let now = Utc::now();

// Three items with pre-counted token lengths, oldest first.
let items = vec![
    ContextItemBuilder::new("System: you are a helpful assistant", 20)
        .kind(ContextKind::new("SystemPrompt")?)
        .timestamp(now - chrono::Duration::seconds(30))
        .build()?,
    ContextItemBuilder::new("User: summarize the RFC", 15)
        .kind(ContextKind::new("Message")?)
        .timestamp(now - chrono::Duration::seconds(20))
        .build()?,
    ContextItemBuilder::new("Tool: RFC-1234 full text (4k tokens)", 400)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(10))
        .build()?,
];

let budget = ContextBudget::new(
    1024,           // max_tokens (model context window)
    500,            // target_tokens (soft goal)
    200,            // output_reserve
    HashMap::new(), // per-kind reserved slots (none)
    0.0,            // safety margin
)?;

let pipeline = Pipeline::builder()
    .scorer(Box::new(RecencyScorer))
    .slicer(Box::new(GreedySlice))
    .placer(Box::new(ChronologicalPlacer))
    .build()?;

let result = pipeline.run(&items, &budget)?;
assert!(!result.is_empty());
# Ok(())
# }
```
### Multi-scorer pipeline
A more realistic configuration combines `KindScorer` and `RecencyScorer`
through a weighted composite (60% kind, 40% recency). `QuotaSlice` reserves
at least 20% of the budget for tool outputs (capped at 80%) and caps messages
at 50%. `UShapedPlacer` positions the highest-scored items at the start and
end of the window, where LLM attention is strongest.
```rust
use std::collections::HashMap;
use chrono::Utc;
use cupel::*;

# fn main() -> Result<(), CupelError> {
let now = Utc::now();

let items = vec![
    ContextItemBuilder::new("System: you are a code reviewer", 25)
        .kind(ContextKind::new("SystemPrompt")?)
        .timestamp(now - chrono::Duration::seconds(60))
        .pinned(true)
        .build()?,
    ContextItemBuilder::new("User: review my PR", 10)
        .kind(ContextKind::new("Message")?)
        .timestamp(now - chrono::Duration::seconds(50))
        .build()?,
    ContextItemBuilder::new("Memory: user prefers concise feedback", 12)
        .kind(ContextKind::new("Memory")?)
        .timestamp(now - chrono::Duration::seconds(45))
        .build()?,
    ContextItemBuilder::new("Tool: git diff output (3k tokens)", 350)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(5))
        .build()?,
    ContextItemBuilder::new("Tool: lint warnings", 80)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(3))
        .build()?,
];

// Weighted composite: 60% kind relevance, 40% recency
let scorer = CompositeScorer::new(vec![
    (Box::new(ScaledScorer::new(Box::new(KindScorer::with_default_weights()))), 0.6),
    (Box::new(ScaledScorer::new(Box::new(RecencyScorer))), 0.4),
])?;

// Tool outputs: at least 20%, at most 80%. Messages: at most 50%.
let quotas = vec![
    QuotaEntry::new(ContextKind::new("ToolOutput")?, 20.0, 80.0)?,
    QuotaEntry::new(ContextKind::new("Message")?, 0.0, 50.0)?,
];
let slicer = QuotaSlice::new(quotas, Box::new(GreedySlice))?;

let budget = ContextBudget::new(2048, 800, 400, HashMap::new(), 0.0)?;

let pipeline = Pipeline::builder()
    .scorer(Box::new(scorer))
    .slicer(Box::new(slicer))
    .placer(Box::new(UShapedPlacer))
    .build()?;

let result = pipeline.run(&items, &budget)?;

// Pinned system prompt is always included regardless of scoring
assert!(!result.is_empty());
# Ok(())
# }
```
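Two ideas in this configuration are easier to see in isolation: how percentage quotas turn into token bounds, and what a U-shaped ordering looks like. The std-only sketch below illustrates both under stated assumptions; `quota_bounds` and `u_shaped` are hypothetical helpers, the percentages are assumed to apply to the 800-token target and to round down, and the alternating front/back scheme is one common way to build a U shape rather than a description of `UShapedPlacer` internals.

```rust
/// Convert percentage quotas into (minimum, maximum) token bounds.
/// Assumes percentages are taken against the given budget and rounded down.
fn quota_bounds(budget_tokens: usize, min_pct: f64, max_pct: f64) -> (usize, usize) {
    let to_tokens = |pct: f64| (budget_tokens as f64 * pct / 100.0).floor() as usize;
    (to_tokens(min_pct), to_tokens(max_pct))
}

/// Arrange items that are already sorted best-first so the strongest
/// sit at both ends and the weakest end up in the middle.
fn u_shaped<T: Clone>(ranked: &[T]) -> Vec<T> {
    let mut front = Vec::new();
    let mut back = Vec::new();
    for (i, item) in ranked.iter().enumerate() {
        // Even ranks fill the front, odd ranks fill the back.
        if i % 2 == 0 {
            front.push(item.clone());
        } else {
            back.push(item.clone());
        }
    }
    // Reverse the back half so its best item lands in the last position.
    back.reverse();
    front.extend(back);
    front
}
```

Under these assumptions, `quota_bounds(800, 20.0, 80.0)` yields `(160, 640)` for tool outputs, and `u_shaped(&[5, 4, 3, 2, 1])` yields `[5, 3, 1, 2, 4]`: the two best scores occupy the first and last positions.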
## Scorers
| Scorer | Use case | Notes |
|--------|----------|-------|
| `RecencyScorer` | Prefer recent items | Rank-based; requires timestamp |
| `PriorityScorer` | Explicit importance ranking | Rank-based; requires priority field |
| `KindScorer` | Weight by content type | Absolute; configurable weight map |
| `TagScorer` | Weighted tag matching | Absolute; configurable tag weights |
| `FrequencyScorer` | Boost commonly-tagged items | Relative; requires tags |
| `ReflexiveScorer` | Pass through `future_relevance_hint` | Absolute; requires hint field |
| `CompositeScorer` | Combine multiple strategies | Weighted average; meta-scorer |
| `ScaledScorer` | Normalize any scorer to [0, 1] | Min-max normalization; wrapper |
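
The notes column names three techniques: rank-based scoring, min-max normalization, and weighted averaging. The sketch below shows the arithmetic behind each in plain Rust. The function names are hypothetical, and the edge-case choices (a single item ranks 1.0, a constant input normalizes to 0.5, zero total weight yields 0.0) are assumptions rather than documented Cupel behavior.

```rust
/// Rank-based score: the oldest timestamp maps to 0.0, the newest to 1.0.
fn rank_scores(timestamps: &[i64]) -> Vec<f64> {
    let n = timestamps.len();
    if n <= 1 {
        return vec![1.0; n];
    }
    timestamps
        .iter()
        .map(|t| {
            // Rank = number of items strictly older than this one.
            let older = timestamps.iter().filter(|o| *o < t).count();
            older as f64 / (n - 1) as f64
        })
        .collect()
}

/// Min-max normalization of raw scores into [0, 1].
fn min_max(raw: &[f64]) -> Vec<f64> {
    let min = raw.iter().copied().fold(f64::INFINITY, f64::min);
    let max = raw.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    if max == min {
        // All scores equal: there is no spread to normalize.
        return vec![0.5; raw.len()];
    }
    raw.iter().map(|x| (x - min) / (max - min)).collect()
}

/// Weighted average of (score, weight) pairs.
fn weighted_average(parts: &[(f64, f64)]) -> f64 {
    let total: f64 = parts.iter().map(|(_, w)| w).sum();
    if total == 0.0 {
        return 0.0;
    }
    parts.iter().map(|(s, w)| s * w).sum::<f64>() / total
}
```

Normalizing each scorer before combining them keeps the weights meaningful: with both inputs in [0, 1], a 0.6/0.4 split expresses relative importance rather than differences in scale.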
## Serde support
Enable the `serde` feature for `Serialize`/`Deserialize` on all model types:
```toml
[dependencies]
cupel = { version = "1.1", features = ["serde"] }
```
`ContextBudget` and `ContextItem` validate their constraints during
deserialization, so invalid input is rejected at the boundary rather than at
pipeline runtime.
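The validate-at-the-boundary pattern can be sketched with a fallible conversion from an unvalidated shape into a validated one. The example below uses only the standard library; `RawBudget`, `Budget`, and `checked` are hypothetical, and the rules shown (no negative values, neither the target nor the output reserve above the ceiling) are assumed for illustration. With serde, the container attribute `#[serde(try_from = "RawBudget")]` routes deserialization through a conversion like this one; whether Cupel does so internally is not stated here.

```rust
use std::convert::TryFrom;

/// Unvalidated shape, as it might arrive from JSON or a config file.
#[derive(Debug)]
struct RawBudget {
    max_tokens: i64,
    target_tokens: i64,
    output_reserve: i64,
}

/// Validated budget: holding one means the checks below passed.
#[derive(Debug, PartialEq)]
struct Budget {
    max_tokens: u32,
    target_tokens: u32,
    output_reserve: u32,
}

/// Convert one field, rejecting negative or oversized values.
fn checked(name: &str, value: i64) -> Result<u32, String> {
    u32::try_from(value).map_err(|_| format!("{} is out of range: {}", name, value))
}

impl TryFrom<RawBudget> for Budget {
    type Error = String;

    fn try_from(raw: RawBudget) -> Result<Self, Self::Error> {
        let max_tokens = checked("max_tokens", raw.max_tokens)?;
        let target_tokens = checked("target_tokens", raw.target_tokens)?;
        let output_reserve = checked("output_reserve", raw.output_reserve)?;

        // Assumed rules: neither the soft goal nor the reserve may exceed the ceiling.
        if target_tokens > max_tokens {
            return Err("target_tokens exceeds max_tokens".to_string());
        }
        if output_reserve > max_tokens {
            return Err("output_reserve exceeds max_tokens".to_string());
        }
        Ok(Budget { max_tokens, target_tokens, output_reserve })
    }
}
```

Because the only way to obtain a `Budget` is through `try_from`, later stages can rely on its invariants without rechecking them.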
## License
MIT