Module long_text

Long text processing: truncation and chunked selection/summarization.

Strategy: head + tail (default), head_tail_extract (Position + Discourse + Entity scoring), or mapreduce_full. Configurable via SKILLLITE_LONG_TEXT_STRATEGY:

  • head_tail_only: take first N + last M chunks (existing behavior)
  • head_tail_extract: score all chunks, take top-K by score, preserve order
  • mapreduce_full: process ALL chunks (no filtering), then merge the results in a Reduce step; best with SKILLLITE_MAP_MODEL
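
The two selective strategies above differ only in which chunk indices survive. The sketch below illustrates that difference; it is not the module's actual code. The helper names `head_tail_indices` and `top_k_in_order` are hypothetical, and the scores are assumed to come from whatever Position + Discourse + Entity scorer is in use.

```rust
/// Hypothetical helper: indices kept by a head + tail selection.
/// If head + tail covers every chunk, all indices are returned once.
fn head_tail_indices(total: usize, head: usize, tail: usize) -> Vec<usize> {
    if head.saturating_add(tail) >= total {
        return (0..total).collect();
    }
    let mut idx: Vec<usize> = (0..head).collect();
    idx.extend((total - tail)..total);
    idx
}

/// Hypothetical helper: indices of the `k` highest-scoring chunks,
/// returned in original document order.
fn top_k_in_order(scores: &[f64], k: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..scores.len()).collect();
    // Stable sort by descending score, so earlier chunks win ties.
    idx.sort_by(|&a, &b| scores[b].total_cmp(&scores[a]));
    idx.truncate(k);
    // Restore document order so the selected chunks read coherently.
    idx.sort_unstable();
    idx
}
```

For example, `head_tail_indices(10, 2, 3)` keeps chunks 0, 1, 7, 8 and 9, while `top_k_in_order(&[0.1, 0.9, 0.5, 0.7], 2)` keeps chunks 1 and 3.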

MapReduce model: when SKILLLITE_MAP_MODEL is set, the Map stage uses that cheaper model; the Reduce stage uses the main model.
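
A minimal sketch of that model split, assuming the Map stage falls back to the main model when no map model is configured. The function name `stage_models` is hypothetical, and treating a blank value as unset is an assumption rather than documented behavior.

```rust
/// Hypothetical helper: returns (map_model, reduce_model).
/// `map_model` stands for the value of SKILLLITE_MAP_MODEL, if any.
fn stage_models(main_model: &str, map_model: Option<&str>) -> (String, String) {
    let map = map_model
        .map(str::trim)
        // Assumption: an empty or whitespace-only value counts as unset.
        .filter(|m| !m.is_empty())
        .unwrap_or(main_model);
    (map.to_string(), main_model.to_string())
}
```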

Env: SKILLLITE_CHUNK_SIZE, SKILLLITE_HEAD_CHUNKS, SKILLLITE_TAIL_CHUNKS, SKILLLITE_MAX_OUTPUT_CHARS, SKILLLITE_LONG_TEXT_STRATEGY, SKILLLITE_EXTRACT_TOP_K_RATIO, SKILLLITE_MAP_MODEL (optional, for Map stage)
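
The settings above could be read along the following lines. This is a sketch under stated assumptions, not the module's implementation: the `Strategy` enum, `parse_strategy`, `env_usize`, and `strategy_from_env` are hypothetical names, and the fallback to head + tail for an unset or unrecognized value is inferred from that strategy being described as the default. Numeric defaults are left to the caller because the page does not state them.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Strategy {
    HeadTailOnly,
    HeadTailExtract,
    MapReduceFull,
}

/// Hypothetical parser for the SKILLLITE_LONG_TEXT_STRATEGY value.
/// Assumption: unset or unrecognized values fall back to head + tail.
fn parse_strategy(raw: Option<&str>) -> Strategy {
    match raw.map(|s| s.trim().to_ascii_lowercase()).as_deref() {
        Some("head_tail_extract") => Strategy::HeadTailExtract,
        Some("mapreduce_full") => Strategy::MapReduceFull,
        _ => Strategy::HeadTailOnly,
    }
}

/// Hypothetical helper: read a numeric setting such as SKILLLITE_CHUNK_SIZE,
/// using `default` when the variable is missing or not a valid number.
fn env_usize(name: &str, default: usize) -> usize {
    std::env::var(name)
        .ok()
        .and_then(|v| v.trim().parse().ok())
        .unwrap_or(default)
}

/// Reads the strategy from the process environment.
fn strategy_from_env() -> Strategy {
    parse_strategy(std::env::var("SKILLLITE_LONG_TEXT_STRATEGY").ok().as_deref())
}
```

Keeping `parse_strategy` separate from the environment lookup makes the parsing rules testable without mutating process-wide state.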

Functions

maybe_process_user_input
Guard user input against context overflow.
summarize_long_content
Summarize long content using LLM with configurable chunk selection strategy.
truncate_content
Simple truncation with notice.
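
As an illustration of truncation with a notice, the sketch below cuts on a character boundary and appends a marker. The real signature of truncate_content is not shown on this page, so the function name `truncate_with_notice`, its parameters, and the notice wording are all hypothetical.

```rust
/// Hypothetical sketch: keep at most `max_chars` characters and append a
/// notice when anything was cut. Counts Unicode scalar values, so the cut
/// never lands inside a multi-byte UTF-8 sequence.
fn truncate_with_notice(content: &str, max_chars: usize) -> String {
    match content.char_indices().nth(max_chars) {
        // Fewer than or exactly `max_chars` characters: return unchanged.
        None => content.to_string(),
        // `byte_idx` is the byte offset of the first character to drop.
        Some((byte_idx, _)) => {
            let total = content.chars().count();
            format!(
                "{}\n[... truncated: showing {} of {} chars ...]",
                &content[..byte_idx],
                max_chars,
                total
            )
        }
    }
}
```

Slicing by byte offset from `char_indices` avoids the panic that `&content[..max_chars]` would cause on non-ASCII input.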