Semantic boundary detection for BAR-RAG
§Semantic Boundary Detection for Boundary-Aware Chunking
This module implements intelligent detection of semantic boundaries in text, enabling chunking strategies that respect natural document structure.
Key capabilities:
- Sentence boundary detection (NLTK-style rules)
- Paragraph detection (newline patterns)
- Heading detection (Markdown, RST, plaintext)
- List boundary detection
- Code block detection
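The capabilities listed above can be illustrated with a minimal, line-oriented sketch. It is not this module's implementation: `Kind`, `Found`, and `detect` are hypothetical names, only Markdown-style headings, `-`/`*` list items, and triple-backtick fences are recognized, and the sentence rule (terminal punctuation, one space, then an uppercase letter) is a deliberate simplification of NLTK-style rules.

```rust
/// Hypothetical boundary categories, loosely mirroring the capability list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Sentence,
    Paragraph,
    Heading,
    ListItem,
    CodeFence,
}

/// A detected boundary: the byte offset where a new unit starts, plus its kind.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Found {
    offset: usize,
    kind: Kind,
}

/// Scan `text` line by line and report boundaries in ascending offset order.
/// Assumption: this is an illustrative heuristic, not the crate's detector.
fn detect(text: &str) -> Vec<Found> {
    let mut out = Vec::new();
    let mut offset = 0; // byte offset of the current line start
    let mut in_code = false; // inside a fenced code block?
    let mut prev_blank = true; // treat the start of the text like a blank line

    for line in text.split_inclusive('\n') {
        let trimmed = line.trim();
        if trimmed.starts_with("```") {
            // Opening and closing fences are both reported.
            out.push(Found { offset, kind: Kind::CodeFence });
            in_code = !in_code;
        } else if !in_code {
            // Line-level boundaries: heading, list item, or paragraph start.
            let line_kind = if trimmed.is_empty() {
                None
            } else if trimmed.starts_with('#') {
                Some(Kind::Heading)
            } else if trimmed.starts_with("- ") || trimmed.starts_with("* ") {
                Some(Kind::ListItem)
            } else if prev_blank {
                Some(Kind::Paragraph)
            } else {
                None
            };
            if let Some(kind) = line_kind {
                out.push(Found { offset, kind });
            }
            // Sentence boundaries: '.', '!' or '?', one space, then uppercase.
            for (i, c) in line.char_indices() {
                if matches!(c, '.' | '!' | '?') {
                    let next_is_upper = line[i + 1..]
                        .strip_prefix(' ')
                        .and_then(|rest| rest.chars().next())
                        .map_or(false, |n| n.is_uppercase());
                    if next_is_upper {
                        // The new sentence starts after the punctuation and space.
                        out.push(Found { offset: offset + i + 2, kind: Kind::Sentence });
                    }
                }
            }
        }
        prev_blank = trimmed.is_empty();
        offset += line.len();
    }
    out
}
```

Text inside a fenced block is skipped on purpose, so punctuation in code never produces sentence boundaries. A fuller detector would also need to handle abbreviations, RST headings, and numbered lists.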
§References
- BAR-RAG Paper: “Boundary-Aware Retrieval-Augmented Generation”
- Target: +40% semantic coherence, -60% entity fragmentation
Structs§
- Boundary - Represents a detected boundary in text
- BoundaryDetectionConfig - Configuration for boundary detection
- BoundaryDetector - Boundary detector for semantic text segmentation
Enums§
- BoundaryType - Type of boundary detected
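
As a rough illustration of how detected boundaries might drive boundary-aware chunking, the sketch below cuts text only at supplied boundary offsets. It does not use the items listed on this page, whose fields and methods are not shown here: `chunk_at_boundaries` is a hypothetical helper that takes plain byte offsets (assumed sorted in ascending order) and a soft size limit `max_len`.

```rust
/// Split `text` into chunks, cutting only at the given boundary offsets.
/// Chunks stay within `max_len` bytes whenever a boundary allows it; a unit
/// longer than `max_len` is kept whole rather than being split mid-unit.
/// Assumption: `boundaries` is sorted ascending (hypothetical input format).
fn chunk_at_boundaries<'a>(text: &'a str, boundaries: &[usize], max_len: usize) -> Vec<&'a str> {
    let mut chunks = Vec::new();
    let mut start = 0; // start of the chunk being built
    let mut last_cut = 0; // most recent usable boundary after `start`

    // Treat the end of the text as a final boundary.
    for b in boundaries.iter().copied().chain(std::iter::once(text.len())) {
        // Ignore offsets that are stale, out of range, or inside a UTF-8 sequence.
        if b <= start || b > text.len() || !text.is_char_boundary(b) {
            continue;
        }
        // Extending to `b` would exceed the limit: close the chunk at the
        // previous boundary, if there is one.
        if b - start > max_len && last_cut > start {
            chunks.push(&text[start..last_cut]);
            start = last_cut;
        }
        last_cut = b;
    }
    if start < text.len() {
        chunks.push(&text[start..]);
    }
    chunks
}
```

For example, with sentence starts at offsets 5, 10, and 17 in `"One. Two. Three. Four."` and `max_len = 12`, the greedy pass yields two chunks, `"One. Two. "` and `"Three. Four."`, and no sentence is split across chunks.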