Skip to main content

Module context_packing

Module context_packing 

Source
Expand description

Submodular context packing — choosing which items to include under a budget so the selected set maximises coverage of what the task cares about.

§Why submodular?

Picking a set of context items to maximise term coverage is the classic maximum coverage problem. Coverage is monotone submodular: each added item helps, but with diminishing returns (a second file covering the same terms adds little). For such objectives the simple greedy algorithm — repeatedly take the item with the largest marginal gain per unit cost — is provably within a 1 − 1/e ≈ 0.63 factor of the optimum (Nemhauser, Wolsey & Fisher 1978). That guarantee, plus O(n·k) cost, is exactly what we want for online context assembly.

This module is deliberately generic and side-effect free: callers map their domain objects (symbol bodies, ranked files, chunks) onto CoverageItems and get back the indices to keep, in selection order.

Structs§

CoverageItem
A candidate for inclusion: the set of weighted terms it covers and the budget cost of including it (e.g. its token count).

Functions§

greedy_max_coverage
Greedy submodular maximisation of weighted term coverage under budget.