slabs 0.1.2

Text chunking for RAG: fixed, sentence, recursive, and semantic strategies

slabs


Text chunking for RAG pipelines.

Dual-licensed under MIT or Apache-2.0.

Quickstart

[dependencies]
slabs = "0.1.0"

use slabs::{Chunker, RecursiveChunker};

let chunker = RecursiveChunker::prose(500);
let text = "Your long document here...";
let slabs = chunker.chunk(text);

for slab in slabs {
    println!("[{}..{}]: {}", slab.start, slab.end, slab.text);
}

Strategies

| Strategy | Use Case | Complexity |
|----------|----------|------------|
| Fixed | Homogeneous content, baselines | $O(n)$ |
| Sentence | Prose, articles | $O(n)$ |
| Recursive | General-purpose | $O(n \log n)$ |
| Semantic | Topic coherence (semantic feature) | $O(nd)$ |
| Late | Contextual embeddings across chunk boundaries | Depends on base chunker |
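
To make the simplest row of the table concrete, here is a minimal sketch of fixed-size chunking in plain Rust. It is an illustration only and does not use or reproduce the slabs implementation: the `Piece` type, the `fixed_chunks` function, and the byte-based size limit are all assumptions made for this example. It walks the input once, which is where the $O(n)$ cost of a fixed strategy comes from.

```rust
/// Hypothetical stand-in for a chunk with byte offsets (not the slabs type).
#[derive(Debug, PartialEq)]
struct Piece<'a> {
    start: usize, // byte offset, inclusive
    end: usize,   // byte offset, exclusive
    text: &'a str,
}

/// Split `text` into consecutive pieces of at most `max_bytes` bytes each.
/// Cuts are pulled back to the nearest UTF-8 character boundary.
fn fixed_chunks(text: &str, max_bytes: usize) -> Vec<Piece<'_>> {
    // A UTF-8 character is at most 4 bytes, so this guarantees progress.
    assert!(max_bytes >= 4, "max_bytes must be at least 4");
    let mut pieces = Vec::new();
    let mut start = 0;
    while start < text.len() {
        // Tentative cut point, clamped to the end of the input.
        let mut end = (start + max_bytes).min(text.len());
        // Never cut inside a multi-byte character.
        while !text.is_char_boundary(end) {
            end -= 1;
        }
        pieces.push(Piece { start, end, text: &text[start..end] });
        start = end;
    }
    pieces
}

fn main() {
    let text = "Chunking splits a long document into smaller pieces.";
    for p in fixed_chunks(text, 16) {
        println!("[{}..{}]: {}", p.start, p.end, p.text);
    }
}
```

A fixed strategy like this ignores sentence and paragraph boundaries entirely, which is why the table lists it for homogeneous content and baselines rather than for prose.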

Features

| Feature | What it enables |
|---------|-----------------|
| semantic | Semantic chunker (requires fastembed, innr, textprep) |
| code | Code-aware chunker via tree-sitter (Rust, Python, TypeScript, Go) |
| cli | slabs CLI binary |
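
Optional features are switched on through the usual Cargo mechanism. As a sketch, assuming the feature names listed above and the version from the Quickstart, a `Cargo.toml` entry enabling the semantic and code-aware chunkers might look like this:

```toml
[dependencies]
# Feature names are taken from the table above; enable only the ones you need.
slabs = { version = "0.1.0", features = ["semantic", "code"] }
```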