Content chunking: 100-line chunks with 10-line overlap. Ports logic from src/gobby/code_index/chunker.py.
This remains gcode-owned because BM25 content indexing stores
line-based ContentChunk records with project, path, line range, language,
and timestamp fields. The generic gobby_core::indexing::Chunk and
ChunkIdentity primitives model byte ranges with opaque metadata, so
composing them here would hide a domain-specific projection rather than
remove shared foundation logic. gcode also derives incremental state from
PostgreSQL indexed_files.content_hash rows instead of consuming core
IndexEvent snapshots.
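The 100-line window with a 10-line overlap can be illustrated with a minimal sketch. The `LineChunk` struct, the `chunk_lines` function, and both constants are hypothetical names chosen for this example; they are not the module's actual API, and the real `ContentChunk` record also carries project, path, language, and timestamp fields. Tail handling here (stopping once a window reaches the last line) is likewise an assumption, not something the source confirms.

```rust
/// Hypothetical stand-in for a line-based chunk record.
/// The real ContentChunk stores more fields (project, path, language, timestamp).
#[derive(Debug, Clone, PartialEq)]
pub struct LineChunk {
    /// 1-based, inclusive first line of the chunk.
    pub line_start: usize,
    /// 1-based, inclusive last line of the chunk.
    pub line_end: usize,
    /// The chunk's lines joined with '\n'.
    pub content: String,
}

/// Window size and overlap taken from the module description.
pub const CHUNK_LINES: usize = 100;
pub const OVERLAP_LINES: usize = 10;

/// Split `source` into windows of `size` lines, where each window
/// repeats the last `overlap` lines of the previous one.
pub fn chunk_lines(source: &str, size: usize, overlap: usize) -> Vec<LineChunk> {
    assert!(size > 0 && overlap < size, "overlap must be smaller than size");

    let lines: Vec<&str> = source.lines().collect();
    let step = size - overlap; // how far each window advances
    let mut chunks = Vec::new();
    let mut start = 0; // 0-based index of the window's first line

    while start < lines.len() {
        let end = (start + size).min(lines.len()); // exclusive
        chunks.push(LineChunk {
            line_start: start + 1,
            line_end: end,
            content: lines[start..end].join("\n"),
        });
        if end == lines.len() {
            break; // assumed tail rule: the last line is covered, so stop
        }
        start += step;
    }
    chunks
}
```

Under these assumptions, a 250-line file yields three chunks covering lines 1-100, 91-190, and 181-250, so any 10-line run that straddles a window boundary still appears whole in at least one chunk.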
Functions
- chunk_file_content - Split file content into overlapping chunks for FTS indexing.