Module statistical_filter

Statistical token importance filtering (LLMLingua-inspired, model-free)

This module implements a compression strategy similar to LLMLingua, but it scores tokens with purely statistical heuristics instead of model-based perplexity.
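As a rough illustration of what model-free scoring can look like, the sketch below ranks words by in-text rarity plus a length bonus and keeps the top fraction in their original order. The function names (`score_words`, `compress`), the `keep_ratio` parameter, and the scoring formula are assumptions made for this example; they are not taken from this module's actual API or heuristics.

```rust
use std::collections::HashMap;

/// Hypothetical scoring heuristic: words that are rarer within the text
/// and longer receive higher scores. Not the module's actual formula.
fn score_words(words: &[&str]) -> Vec<f64> {
    // Count case-insensitive occurrences of each word.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for w in words {
        *counts.entry(w.to_lowercase()).or_insert(0) += 1;
    }
    let total = words.len() as f64;
    words
        .iter()
        .map(|w| {
            let freq = counts[&w.to_lowercase()] as f64 / total;
            let rarity = -freq.ln(); // 0.0 when the text is one repeated word
            let length_bonus = (w.chars().count() as f64).ln_1p();
            rarity + length_bonus
        })
        .collect()
}

/// Keeps roughly `keep_ratio` of the words (rounded up), choosing the
/// highest-scoring ones and emitting them in their original order.
fn compress(text: &str, keep_ratio: f64) -> String {
    let words: Vec<&str> = text.split_whitespace().collect();
    if words.is_empty() {
        return String::new();
    }
    let scores = score_words(&words);
    let keep = (words.len() as f64 * keep_ratio.clamp(0.0, 1.0)).ceil() as usize;

    // Rank indices by descending score; ties go to the earlier word.
    let mut ranked: Vec<usize> = (0..words.len()).collect();
    ranked.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]).then(a.cmp(&b)));

    // Restore document order for the words that survive.
    let mut kept: Vec<usize> = ranked.into_iter().take(keep).collect();
    kept.sort_unstable();
    kept.iter().map(|&i| words[i]).collect::<Vec<_>>().join(" ")
}
```

With this heuristic, `compress("the model the the perplexity the", 0.5)` keeps the two rare words and the first `the`. A perplexity-based scorer would need a language model at this step, while the statistics here come only from the input text.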

Enhanced with token-aware semantic preservation:

  • Protects code blocks, JSON, paths, identifiers
  • Contextual stopword filtering
  • Preserves negations, comparators, domain terms
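The protections listed above could be approximated as follows. This is a minimal sketch under stated assumptions: the word lists, the `is_protected` and `filter_tokens` helpers, and the path/identifier/JSON detection rules are all hypothetical and much cruder than a real implementation would need to be. It also splits on whitespace, so line breaks inside code fences are not preserved.

```rust
// Hypothetical word lists; the module's real lists are not shown in the source.
const STOPWORDS: &[&str] = &["the", "a", "an", "of", "to", "is", "and"];
const NEGATIONS: &[&str] = &["not", "no", "never", "without"];
const COMPARATORS: &[&str] = &["<", ">", "<=", ">=", "==", "!="];

/// Returns true for tokens that should never be dropped.
fn is_protected(token: &str) -> bool {
    let lower = token.to_lowercase();
    NEGATIONS.contains(&lower.as_str())
        || COMPARATORS.contains(&token)
        || token.contains('/') // path-like
        || token.contains('_') // snake_case identifier
        || token.contains("::") // qualified identifier
        || token.starts_with('{') // JSON-like object
        || token.starts_with('[') // JSON-like array
}

/// Drops stopwords unless they are protected or sit inside a fenced code block.
fn filter_tokens(text: &str) -> String {
    let mut in_code = false;
    let mut out: Vec<&str> = Vec::new();
    for token in text.split_whitespace() {
        // A fence marker toggles code mode and is always kept.
        if token.starts_with("```") {
            in_code = !in_code;
            out.push(token);
            continue;
        }
        let lower = token.to_lowercase();
        let is_stopword = STOPWORDS.contains(&lower.as_str());
        if in_code || is_protected(token) || !is_stopword {
            out.push(token);
        }
    }
    out.join(" ")
}
```

For example, `filter_tokens("the value is not the same")` yields `value not same`: the stopwords are removed, but the negation survives because dropping it would invert the meaning.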

Structs

StatisticalFilter
Statistical token filter (model-free alternative to LLMLingua)
StatisticalFilterConfig
Configuration for statistical filtering
WordImportance
Importance score for a word based on statistical features