Module span

Span extraction from documents

This module handles extracting meaningful text spans from documents. Spans are the fundamental unit of retrieval in AvocadoDB.

Key Principles

  • Spans should be 20-50 lines (roughly 200-500 tokens)
  • Respect natural boundaries (paragraphs, code blocks, sections)
  • Never split mid-sentence
  • Maintain line number accuracy for citations
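
The principles above can be illustrated with a minimal sketch. This is not AvocadoDB's actual implementation: the `SketchSpan` type, the `sketch_extract_spans` function, and the `max_lines` parameter are hypothetical names introduced for illustration, and the real `extract_spans` signature may differ. The sketch assumes that blank lines mark paragraph boundaries and that sentences do not cross them. It packs whole paragraphs into a span until the next paragraph would push the span past `max_lines`, and it records 1-based line numbers so citations stay accurate.

```rust
/// Hypothetical span type for illustration only; not AvocadoDB's real struct.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchSpan {
    /// 1-based line number of the first line in the span (inclusive).
    pub start_line: usize,
    /// 1-based line number of the last line in the span (inclusive).
    pub end_line: usize,
    /// The span text, lines joined with '\n'.
    pub text: String,
}

/// Greedy paragraph packing (an assumed strategy, not the library's method):
/// a span only ever ends at a paragraph boundary, so no paragraph is split.
pub fn sketch_extract_spans(text: &str, max_lines: usize) -> Vec<SketchSpan> {
    let lines: Vec<&str> = text.lines().collect();
    let mut spans = Vec::new();
    // 0-based (first, last) line indices of the span being built.
    let mut current: Option<(usize, usize)> = None;

    let mut i = 0;
    while i < lines.len() {
        // Skip blank lines between paragraphs.
        if lines[i].trim().is_empty() {
            i += 1;
            continue;
        }
        // Consume one paragraph: a run of consecutive non-blank lines.
        let para_start = i;
        while i < lines.len() && !lines[i].trim().is_empty() {
            i += 1;
        }
        let para_end = i - 1;

        current = match current {
            // Adding this paragraph would exceed the limit: close the span
            // and start a new one at this paragraph.
            Some((s, e)) if para_end - s + 1 > max_lines => {
                spans.push(make_span(&lines, s, e));
                Some((para_start, para_end))
            }
            // Still within the limit: extend the span to cover the paragraph.
            Some((s, _)) => Some((s, para_end)),
            None => Some((para_start, para_end)),
        };
    }
    // Flush the final span, which may be shorter than the target size.
    if let Some((s, e)) = current {
        spans.push(make_span(&lines, s, e));
    }
    spans
}

/// Build a span from 0-based inclusive indices, converting to 1-based lines.
fn make_span(lines: &[&str], s: usize, e: usize) -> SketchSpan {
    SketchSpan {
        start_line: s + 1,
        end_line: e + 1,
        text: lines[s..=e].join("\n"),
    }
}
```

In this sketch a single paragraph longer than `max_lines` is kept whole rather than split, which favors the "never split mid-sentence" rule over the size target. Calling `sketch_extract_spans(text, 50)` would correspond to the upper bound listed above; handling of code blocks and section headings is left out for brevity.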

Functions

extract_spans
Extract spans from document text