Semantic coherence scoring for BAR-RAG
§Semantic Coherence Scoring for Boundary-Aware Chunking
This module implements semantic coherence analysis using sentence embeddings to optimize chunk boundaries for maximum semantic unity.
Key capabilities:
- Cosine similarity calculation between sentence embeddings
- Intra-chunk coherence scoring
- Optimal split-point detection via binary search
- Adaptive threshold based on embedding distances
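The first three capabilities can be illustrated with a minimal sketch. It is not this module's implementation: the function names `cosine_similarity`, `chunk_coherence`, and `best_split` are hypothetical, embeddings are assumed to be plain `Vec<f32>` values, intra-chunk coherence is assumed to mean the average pairwise cosine similarity, and the split point is found with a simple linear scan rather than the binary search the module uses.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns 0.0 if the lengths differ, the input is empty,
/// or either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Intra-chunk coherence, assumed here to be the mean pairwise cosine
/// similarity of the sentence embeddings in the chunk.
/// A chunk with fewer than two sentences is treated as fully coherent.
fn chunk_coherence(embeddings: &[Vec<f32>]) -> f32 {
    let n = embeddings.len();
    if n < 2 {
        return 1.0;
    }
    let mut total = 0.0_f32;
    let mut pairs = 0u32;
    for i in 0..n {
        for j in (i + 1)..n {
            total += cosine_similarity(&embeddings[i], &embeddings[j]);
            pairs += 1;
        }
    }
    total / pairs as f32
}

/// Linear scan over candidate split indices (a stand-in for binary search).
/// Returns the index `k` that maximizes the average coherence of
/// `embeddings[..k]` and `embeddings[k..]`, together with that score.
fn best_split(embeddings: &[Vec<f32>]) -> Option<(usize, f32)> {
    let n = embeddings.len();
    if n < 2 {
        return None;
    }
    let mut best: Option<(usize, f32)> = None;
    for k in 1..n {
        let left = chunk_coherence(&embeddings[..k]);
        let right = chunk_coherence(&embeddings[k..]);
        let score = 0.5 * (left + right);
        if best.map_or(true, |(_, s)| score > s) {
            best = Some((k, score));
        }
    }
    best
}
```

For four sentences whose embeddings are `[1, 0]`, `[1, 0]`, `[0, 1]`, `[0, 1]`, `best_split` picks index 2, separating the two topically distinct pairs. The pairwise average costs O(n²) per chunk, which is acceptable for a sketch but is one reason a real scorer might cache similarities or search fewer candidates.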
§References
- BAR-RAG Paper: “Boundary-Aware Retrieval-Augmented Generation”
- Target: +40% semantic coherence improvement
Structs§
- CoherenceConfig - Configuration for semantic coherence scoring
- OptimalSplit - Result of split-point optimization
- ScoredChunk - Represents a candidate chunk with coherence score
- SemanticCoherenceScorer - Semantic coherence scorer using sentence embeddings