cupel 1.2.0

cupel

Context window management pipeline for LLM applications.

Given a set of context items (messages, documents, tool outputs, memory) and a token budget, Cupel selects the subset and ordering that best fit a model's context window. The pipeline follows a fixed six-stage flow: classify inputs, score every candidate for relevance, deduplicate, sort, slice the list to fit the budget, and place the selected items in an attention-optimal order. Three of these stages (scorer, slicer, placer) are configurable and independently swappable through trait implementations, and every inclusion/exclusion decision carries a traceable reason.

Cupel is framework-agnostic. It accepts pre-counted token lengths and returns a plain Vec<ContextItem>; no LLM client, tokenizer, or async runtime is required.
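
The six-stage flow described above can be sketched with three small traits and a struct that holds one boxed implementation of each. This is a toy model, not Cupel's actual trait definitions: Item, ScoreStage, SliceStage, PlaceStage, MiniPipeline, and the three toy implementations are hypothetical names, and the classify and deduplicate rules shown here (pinned items bypass scoring, duplicates are matched on exact content) are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a context item with a pre-counted token length.
#[derive(Debug, Clone, PartialEq)]
pub struct Item {
    pub content: String,
    pub tokens: usize,
    pub pinned: bool,
}

// One trait per swappable stage (hypothetical signatures).
pub trait ScoreStage { fn score(&self, item: &Item) -> f64; }
pub trait SliceStage { fn slice(&self, sorted: Vec<(Item, f64)>, budget: usize) -> Vec<(Item, f64)>; }
pub trait PlaceStage { fn place(&self, chosen: Vec<(Item, f64)>) -> Vec<Item>; }

pub struct MiniPipeline {
    pub scorer: Box<dyn ScoreStage>,
    pub slicer: Box<dyn SliceStage>,
    pub placer: Box<dyn PlaceStage>,
}

impl MiniPipeline {
    pub fn run(&self, items: &[Item], budget: usize) -> Vec<Item> {
        // 1. Classify: pinned items bypass scoring and slicing (assumed rule).
        let (pinned, candidates): (Vec<Item>, Vec<Item>) =
            items.iter().cloned().partition(|item| item.pinned);
        let pinned_tokens: usize = pinned.iter().map(|item| item.tokens).sum();
        // 2. Score every candidate.
        let scored: Vec<(Item, f64)> = candidates
            .into_iter()
            .map(|item| {
                let score = self.scorer.score(&item);
                (item, score)
            })
            .collect();
        // 3. Deduplicate on exact content, keeping the first occurrence.
        let mut seen = HashSet::new();
        let mut unique: Vec<(Item, f64)> = scored
            .into_iter()
            .filter(|(item, _)| seen.insert(item.content.clone()))
            .collect();
        // 4. Sort by score, highest first.
        unique.sort_by(|a, b| b.1.total_cmp(&a.1));
        // 5. Slice to the budget left over after the pinned items.
        let chosen = self.slicer.slice(unique, budget.saturating_sub(pinned_tokens));
        // 6. Place: pinned items lead, then the placer orders the rest.
        let mut out = pinned;
        out.extend(self.placer.place(chosen));
        out
    }
}

/// Toy scorer: fewer tokens means a higher score.
pub struct PreferShort;
impl ScoreStage for PreferShort {
    fn score(&self, item: &Item) -> f64 { 1.0 / (1.0 + item.tokens as f64) }
}

/// Toy slicer: walk the sorted list and keep each item that still fits.
pub struct FirstFit;
impl SliceStage for FirstFit {
    fn slice(&self, sorted: Vec<(Item, f64)>, budget: usize) -> Vec<(Item, f64)> {
        let mut left = budget;
        sorted
            .into_iter()
            .filter(|(item, _)| {
                let fits = item.tokens <= left;
                if fits { left -= item.tokens; }
                fits
            })
            .collect()
    }
}

/// Toy placer: keep the order it receives.
pub struct KeepOrder;
impl PlaceStage for KeepOrder {
    fn place(&self, chosen: Vec<(Item, f64)>) -> Vec<Item> {
        chosen.into_iter().map(|(item, _)| item).collect()
    }
}
```

Because MiniPipeline only sees the three traits, replacing PreferShort with another scorer does not touch the slicing or placement code, which is the property the swappable-stage design is after.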

Glossary

| Term | Description |
| --- | --- |
| ContextItem | An immutable record representing a single piece of context (message, document, tool output, etc.). Constructed via ContextItemBuilder. |
| ContextBudget | Token budget constraints: hard ceiling (max_tokens), soft goal (target_tokens), output reserve, per-kind reserved slots, and safety margin. Validated at construction. |
| Scorer | A trait that computes a relevance score for a context item. Eight built-in implementations cover common strategies. |
| Slicer | A trait that selects items from the scored list to fit within the budget. Built-in: GreedySlice, KnapsackSlice, QuotaSlice. |
| Placer | A trait that determines the final presentation order. Built-in: ChronologicalPlacer, UShapedPlacer. |
| Pipeline | The fixed six-stage executor: Classify, Score, Deduplicate, Sort, Slice, Place. Built via Pipeline::builder(). |
| ContextKind | Extensible string enum classifying item type (Message, Document, ToolOutput, Memory, SystemPrompt). Case-insensitive comparison. |
| ContextSource | Extensible string enum identifying item origin (Chat, Tool, Rag). Case-insensitive comparison. |
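
An "extensible string enum with case-insensitive comparison", as the glossary describes ContextKind and ContextSource, is commonly modelled as a newtype around String whose equality and hash both ignore ASCII case. The sketch below shows that pattern using only the standard library. Kind is a hypothetical name, and the rule that empty names are rejected is an assumption; none of this is taken from Cupel's source.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical open-ended "string enum": any non-empty name is a valid kind.
#[derive(Debug, Clone)]
pub struct Kind(String);

impl Kind {
    /// Rejects blank names (assumed rule) and keeps the caller's casing for display.
    pub fn new(name: &str) -> Result<Self, String> {
        let trimmed = name.trim();
        if trimmed.is_empty() {
            return Err("kind name must not be empty".to_string());
        }
        Ok(Kind(trimmed.to_string()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Equality ignores ASCII case, so "ToolOutput" and "tooloutput" are the same kind.
impl PartialEq for Kind {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}
impl Eq for Kind {}

// The hash must agree with equality, so it is computed over lowercased bytes.
impl Hash for Kind {
    fn hash<H: Hasher>(&self, state: &mut H) {
        for byte in self.0.bytes() {
            state.write_u8(byte.to_ascii_lowercase());
        }
    }
}
```

Keeping Hash consistent with the custom PartialEq is what lets such a type serve as a HashMap key, for example in a per-kind map of reserved slots.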

Quickstart

A minimal pipeline that scores three items by recency, greedily fills a 500-token budget, and places results in chronological order:

use std::collections::HashMap;
use chrono::Utc;
use cupel::*;

# fn main() -> Result<(), CupelError> {
let now = Utc::now();

let items = vec![
    ContextItemBuilder::new("System: you are a helpful assistant", 20)
        .kind(ContextKind::new("SystemPrompt")?)
        .timestamp(now - chrono::Duration::seconds(30))
        .build()?,
    ContextItemBuilder::new("User: summarize the RFC", 15)
        .kind(ContextKind::new("Message")?)
        .timestamp(now - chrono::Duration::seconds(20))
        .build()?,
    ContextItemBuilder::new("Tool: RFC-1234 full text (4k tokens)", 400)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(10))
        .build()?,
];

let budget = ContextBudget::new(
    1024,  // max_tokens (model context window)
    500,   // target_tokens (soft goal)
    200,   // output_reserve
    HashMap::new(),  // per-kind reserved slots (none here)
    0.0,   // safety margin
)?;

let pipeline = Pipeline::builder()
    .scorer(Box::new(RecencyScorer))
    .slicer(Box::new(GreedySlice))
    .placer(Box::new(ChronologicalPlacer))
    .build()?;

let result = pipeline.run(&items, &budget)?;
assert!(!result.is_empty());
# Ok(())
# }
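
To make "greedily fills a budget" concrete, here is a standalone sketch of one common greedy strategy: visit items from highest to lowest score and keep each one that still fits in the remaining tokens. Scored and greedy_fill are hypothetical names, and whether Cupel's GreedySlice orders by raw score, by score per token, or by something else is not stated above, so treat this as an illustration of the idea rather than the crate's algorithm.

```rust
/// Hypothetical stand-in for an item that has already been scored.
#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub label: &'static str,
    pub tokens: usize,
    pub score: f64,
}

/// Greedy fill: highest score first, skipping anything that no longer fits.
pub fn greedy_fill(mut items: Vec<Scored>, budget: usize) -> Vec<Scored> {
    // Stable sort, descending by score.
    items.sort_by(|a, b| b.score.total_cmp(&a.score));

    let mut remaining = budget;
    let mut kept = Vec::new();
    for item in items {
        if item.tokens <= remaining {
            remaining -= item.tokens;
            kept.push(item);
        }
    }
    kept
}
```

With items costing 400, 15, and 20 tokens and scores of 1.0, 0.5, and 0.2, a 500-token budget keeps all three, while a 420-token budget keeps only the first two because the 20-token item no longer fits in the 5 tokens left.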

Multi-scorer pipeline

A more realistic configuration combining KindScorer and RecencyScorer through a weighted composite. QuotaSlice reserves at least 20% of the budget for tool outputs while capping messages at 50%. UShapedPlacer positions the highest-scored items at the start and end of the window where LLM attention is strongest.

use std::collections::HashMap;
use chrono::Utc;
use cupel::*;

# fn main() -> Result<(), CupelError> {
let now = Utc::now();

let items = vec![
    ContextItemBuilder::new("System: you are a code reviewer", 25)
        .kind(ContextKind::new("SystemPrompt")?)
        .timestamp(now - chrono::Duration::seconds(60))
        .pinned(true)
        .build()?,
    ContextItemBuilder::new("User: review my PR", 10)
        .kind(ContextKind::new("Message")?)
        .timestamp(now - chrono::Duration::seconds(50))
        .build()?,
    ContextItemBuilder::new("Memory: user prefers concise feedback", 12)
        .kind(ContextKind::new("Memory")?)
        .timestamp(now - chrono::Duration::seconds(45))
        .build()?,
    ContextItemBuilder::new("Tool: git diff output (3k tokens)", 350)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(5))
        .build()?,
    ContextItemBuilder::new("Tool: lint warnings", 80)
        .kind(ContextKind::new("ToolOutput")?)
        .timestamp(now - chrono::Duration::seconds(3))
        .build()?,
];

// Weighted composite: 60% kind relevance, 40% recency
let scorer = CompositeScorer::new(vec![
    (Box::new(ScaledScorer::new(Box::new(KindScorer::with_default_weights()))), 0.6),
    (Box::new(ScaledScorer::new(Box::new(RecencyScorer))), 0.4),
])?;

// Tool outputs: reserve at least 20%, cap at 80%; messages: cap at 50%
let quotas = vec![
    QuotaEntry::new(ContextKind::new("ToolOutput")?, 20.0, 80.0)?,
    QuotaEntry::new(ContextKind::new("Message")?, 0.0, 50.0)?,
];
let slicer = QuotaSlice::new(quotas, Box::new(GreedySlice))?;

let budget = ContextBudget::new(2048, 800, 400, HashMap::new(), 0.0)?;

let pipeline = Pipeline::builder()
    .scorer(Box::new(scorer))
    .slicer(Box::new(slicer))
    .placer(Box::new(UShapedPlacer))
    .build()?;

let result = pipeline.run(&items, &budget)?;
// Pinned system prompt is always included regardless of scoring
assert!(result.iter().any(|item| item.content().contains("code reviewer")));
# Ok(())
# }
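
The U-shaped placement used in this example can be pictured as dealing items alternately to the front and the back of the window, best first, so the weakest items end up in the middle. The function below is a minimal sketch of that idea under the assumption that simple alternation is acceptable. u_shaped is a hypothetical name, and Cupel's UShapedPlacer may distribute items differently.

```rust
/// Orders items so the highest scores sit at both ends and the lowest in the middle.
pub fn u_shaped<T>(mut scored: Vec<(T, f64)>) -> Vec<T> {
    // Best first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));

    let mut front = Vec::new();
    let mut back = Vec::new();
    for (rank, (item, _)) in scored.into_iter().enumerate() {
        // Even ranks fill from the start, odd ranks fill from the end.
        if rank % 2 == 0 {
            front.push(item);
        } else {
            back.push(item);
        }
    }

    // `back` was collected best-first, so reverse it to put its best item last.
    back.reverse();
    front.extend(back);
    front
}
```

For five items ranked a > b > c > d > e, this yields a, c, e, d, b: the top two items occupy the first and last positions and the lowest-scored item lands in the center.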

Scorers

| Scorer | Use case | Mechanism |
| --- | --- | --- |
| RecencyScorer | Prefer recent items | Rank-based; requires timestamp |
| PriorityScorer | Explicit importance ranking | Rank-based; requires priority field |
| KindScorer | Weight by content type | Absolute; configurable weight map |
| TagScorer | Weighted tag matching | Absolute; configurable tag weights |
| FrequencyScorer | Boost commonly-tagged items | Relative; requires tags |
| ReflexiveScorer | Pass through future_relevance_hint | Absolute; requires hint field |
| CompositeScorer | Combine multiple strategies | Weighted average; meta-scorer |
| ScaledScorer | Normalize any scorer to [0,1] | Min-max normalization; wrapper |
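
The two meta-scorers in the table rest on simple arithmetic: min-max normalization maps raw scores onto [0,1], and a weighted average blends several normalized score lists into one. The sketch below shows both on plain slices of f64. min_max and weighted_average are hypothetical helper names, and mapping an all-equal input to 0.5 is an assumption made here, not documented Cupel behavior.

```rust
/// Min-max normalization: the smallest raw score maps to 0.0 and the largest to 1.0.
pub fn min_max(raw: &[f64]) -> Vec<f64> {
    let min = raw.iter().copied().fold(f64::INFINITY, f64::min);
    let max = raw.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let range = max - min;
    raw.iter()
        // With no spread there is nothing to rank, so fall back to 0.5 (assumed choice).
        .map(|&x| if range > 0.0 { (x - min) / range } else { 0.5 })
        .collect()
}

/// Weighted average of several score lists, computed per item.
/// Assumes all lists have the same length and the weights sum to a positive number.
pub fn weighted_average(parts: &[(&[f64], f64)]) -> Vec<f64> {
    let total: f64 = parts.iter().map(|&(_, weight)| weight).sum();
    let len = parts.first().map_or(0, |&(scores, _)| scores.len());
    (0..len)
        .map(|i| {
            let sum: f64 = parts.iter().map(|&(scores, weight)| scores[i] * weight).sum();
            sum / total
        })
        .collect()
}
```

Normalizing before combining matters because the inputs may live on different scales: without it, a scorer that emits values in the hundreds would dominate one that emits values below one, whatever the weights say.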

Serde support

Enable the serde feature for Serialize/Deserialize on all model types:

[dependencies]
cupel = { version = "1.2", features = ["serde"] }

ContextBudget and ContextItem validate their constraints during deserialization, so input that violates them is rejected at the boundary rather than at pipeline runtime.
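
One common way to get "validate at the boundary" behavior is to deserialize into an unvalidated raw struct and convert it with TryFrom; with serde this is typically wired up through the container attribute #[serde(try_from = "...")]. The sketch below shows only the conversion half, using the standard library alone. RawBudget, Budget, and both validation rules are hypothetical; the source does not spell out Cupel's actual constraints or how it implements them.

```rust
use std::convert::TryFrom;

/// Unvalidated shape, as it might arrive from a config file (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct RawBudget {
    pub max_tokens: u32,
    pub target_tokens: u32,
    pub output_reserve: u32,
}

/// Validated budget; the fields are private so values are built through `try_from`.
#[derive(Debug, Clone, PartialEq)]
pub struct Budget {
    max_tokens: u32,
    target_tokens: u32,
    output_reserve: u32,
}

impl Budget {
    pub fn max_tokens(&self) -> u32 {
        self.max_tokens
    }
}

impl TryFrom<RawBudget> for Budget {
    type Error = String;

    fn try_from(raw: RawBudget) -> Result<Self, Self::Error> {
        // Assumed rule: the soft goal may not exceed the hard ceiling.
        if raw.target_tokens > raw.max_tokens {
            return Err(format!(
                "target_tokens ({}) exceeds max_tokens ({})",
                raw.target_tokens, raw.max_tokens
            ));
        }
        // Assumed rule: the output reserve must fit inside the window.
        if raw.output_reserve > raw.max_tokens {
            return Err(format!(
                "output_reserve ({}) exceeds max_tokens ({})",
                raw.output_reserve, raw.max_tokens
            ));
        }
        Ok(Budget {
            max_tokens: raw.max_tokens,
            target_tokens: raw.target_tokens,
            output_reserve: raw.output_reserve,
        })
    }
}
```

Routing every construction path through one fallible conversion means a Budget value that exists has already passed validation, so later pipeline stages do not need to re-check it.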

License

MIT