Module chunk_text


CHUNK_TEXT(text, chunk_size, overlap, strategy) — deterministic text splitting.

Splits a text string into overlapping chunks using one of three strategies:

  • character: split at character boundaries, respecting chunk_size and overlap
  • sentence: split at sentence boundaries (. ! ? followed by whitespace)
  • paragraph: split at double-newline boundaries
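The boundary rules for the sentence and paragraph strategies can be sketched as follows. This is an illustration only: `split_sentences` and `split_paragraphs` are hypothetical helper names, not part of this module's API, and the sketch only locates boundaries. It does not group the pieces by chunk_size or apply overlap.

```rust
/// Hypothetical helper (not this module's API): splits after `.`, `!`, or `?`
/// only when the next character is whitespace.
fn split_sentences(text: &str) -> Vec<&str> {
    let mut sentences = Vec::new();
    let mut start = 0;
    let mut chars = text.char_indices().peekable();

    while let Some((i, c)) = chars.next() {
        if matches!(c, '.' | '!' | '?') {
            if let Some(&(_, next)) = chars.peek() {
                if next.is_whitespace() {
                    // `end` is a byte offset just past the terminator,
                    // which is always a valid char boundary.
                    let end = i + c.len_utf8();
                    let sentence = text[start..end].trim();
                    if !sentence.is_empty() {
                        sentences.push(sentence);
                    }
                    start = end;
                }
            }
        }
    }

    // Whatever follows the last boundary is the final sentence.
    let tail = text[start..].trim();
    if !tail.is_empty() {
        sentences.push(tail);
    }
    sentences
}

/// Hypothetical helper (not this module's API): splits at double newlines
/// and drops empty pieces.
fn split_paragraphs(text: &str) -> Vec<&str> {
    text.split("\n\n")
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .collect()
}
```

Under this rule a period inside a number such as `3.14` is not treated as a boundary, because it is not followed by whitespace.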

All operations are UTF-8 safe (split on char boundaries, not byte boundaries). Shared between Origin and Lite.
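A minimal sketch of what character-boundary splitting with overlap might look like. The function name `chunk_by_chars`, its return type, and its validation rules (chunk_size greater than zero, overlap smaller than chunk_size) are assumptions made for illustration. The real `chunk_text` signature, `TextChunk` fields, and `ChunkError` variants are not shown on this page.

```rust
/// Hypothetical helper (not this module's API). Produces chunks of at most
/// `chunk_size` characters; each chunk after the first starts `overlap`
/// characters before the previous chunk ended.
fn chunk_by_chars(text: &str, chunk_size: usize, overlap: usize) -> Result<Vec<String>, String> {
    // Assumed validation rules; the real ChunkError conditions may differ.
    if chunk_size == 0 {
        return Err("chunk_size must be greater than zero".to_string());
    }
    if overlap >= chunk_size {
        return Err("overlap must be smaller than chunk_size".to_string());
    }

    // Byte offset of every char start, plus the end of the string, so every
    // slice below lands on a valid UTF-8 boundary.
    let mut bounds: Vec<usize> = text.char_indices().map(|(i, _)| i).collect();
    bounds.push(text.len());
    let char_count = bounds.len() - 1;

    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < char_count {
        let end = (start + chunk_size).min(char_count);
        chunks.push(text[bounds[start]..bounds[end]].to_string());
        if end == char_count {
            break; // the last chunk reached the end of the text
        }
        start += step;
    }
    Ok(chunks)
}
```

Indexing by characters instead of bytes means a multi-byte character such as `é` or `ö` is never cut in half, which a naive byte slice like `&text[0..chunk_size]` could do (and panic on).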

Structs

TextChunk
A single chunk produced by text splitting.

Enums

ChunkError
Error returned when chunk parameters are invalid.
ChunkStrategy
Chunking strategy.

Functions

chunk_text
Split text into chunks using the specified strategy.