pub struct RecursiveCharacterTextSplitter { /* private fields */ }
Recursive character-based text splitter
Tries to split on natural boundaries in this order:
- Double newlines (paragraphs)
- Single newlines (lines)
- Sentences (periods, question marks, exclamation points)
- Words (spaces)
- Characters (last resort)
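The fallback order above can be sketched in a few lines of plain Rust. This is only an illustration of the general idea, not vecstore's implementation: `split_recursive` is a hypothetical helper, the separator list is passed in by the caller, and overlap is ignored to keep the example short.

```rust
/// Hypothetical helper (not part of vecstore): split `text` into pieces of at
/// most `chunk_size` characters, trying each separator in order and falling
/// back to the next one only for pieces that are still too long.
fn split_recursive(text: &str, chunk_size: usize, separators: &[&str]) -> Vec<String> {
    assert!(chunk_size > 0, "chunk_size must be positive");

    // Already small enough: keep the text whole (drop empty input).
    if text.chars().count() <= chunk_size {
        return if text.is_empty() { Vec::new() } else { vec![text.to_string()] };
    }

    // No separators left: last resort, cut fixed-size character windows.
    let (sep, rest) = match separators.split_first() {
        Some(pair) => pair,
        None => {
            let chars: Vec<char> = text.chars().collect();
            return chars
                .chunks(chunk_size)
                .map(|c| c.iter().collect::<String>())
                .collect();
        }
    };

    let mut chunks: Vec<String> = Vec::new();
    let mut current = String::new();

    // `split_inclusive` keeps each separator attached to the piece before it,
    // so concatenating all chunks reproduces the input exactly.
    for piece in text.split_inclusive(*sep) {
        let piece_len = piece.chars().count();

        // Merge neighbouring pieces while they still fit in one chunk.
        if current.chars().count() + piece_len <= chunk_size {
            current.push_str(piece);
            continue;
        }
        if !current.is_empty() {
            chunks.push(std::mem::take(&mut current));
        }
        if piece_len <= chunk_size {
            current.push_str(piece);
        } else {
            // Still too long: retry this piece with the finer separators.
            chunks.extend(split_recursive(piece, chunk_size, rest));
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Calling `split_recursive("aa\n\nbb", 4, &["\n\n", "\n", " "])` in this sketch yields the two chunks `"aa\n\n"` and `"bb"`. The real splitter may trim separators or treat sentence punctuation differently.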
§Example
use vecstore::text_splitter::{RecursiveCharacterTextSplitter, TextSplitter};
let splitter = RecursiveCharacterTextSplitter::new(1000, 100);
let text = "First paragraph.\n\nSecond paragraph with more content...";
let chunks = splitter.split_text(text)?;
Implementations§
impl RecursiveCharacterTextSplitter
pub fn new(chunk_size: usize, chunk_overlap: usize) -> Self
Create a new recursive splitter
§Arguments
- chunk_size - Maximum characters per chunk
- chunk_overlap - Characters to overlap between chunks (for context continuity)
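To make the relationship between the two arguments concrete, the sketch below cuts fixed-size character windows in which each window repeats the last `chunk_overlap` characters of the previous one. `overlapping_windows` is a hypothetical helper written for this illustration. The actual splitter prefers separator boundaries, so its chunks will generally differ, and the requirement that the overlap be smaller than the chunk size is an assumption of this sketch.

```rust
/// Hypothetical illustration (not vecstore's implementation) of how
/// `chunk_size` and `chunk_overlap` interact on a plain character window.
fn overlapping_windows(text: &str, chunk_size: usize, chunk_overlap: usize) -> Vec<String> {
    // Assumption of this sketch: the overlap must leave room to advance.
    assert!(chunk_overlap < chunk_size, "overlap must be smaller than chunk size");

    let chars: Vec<char> = text.chars().collect();
    // Each new window starts this many characters after the previous one.
    let step = chunk_size - chunk_overlap;

    let mut windows: Vec<String> = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        let window: String = chars[start..end].iter().collect();
        windows.push(window);
        if end == chars.len() {
            break; // the final window reached the end of the text
        }
        start += step;
    }
    windows
}
```

With `overlapping_windows("abcdefghij", 4, 1)` the windows are `"abcd"`, `"defg"`, and `"ghij"`: each one begins with the final character of its predecessor.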
§Example
use vecstore::text_splitter::RecursiveCharacterTextSplitter;
// 500 char chunks with 50 char overlap
let splitter = RecursiveCharacterTextSplitter::new(500, 50);
pub fn with_separators(self, separators: Vec<String>) -> Self
Consumes the splitter and returns it configured to use the given custom separators.
Trait Implementations§
Auto Trait Implementations§
impl Freeze for RecursiveCharacterTextSplitter
impl RefUnwindSafe for RecursiveCharacterTextSplitter
impl Send for RecursiveCharacterTextSplitter
impl Sync for RecursiveCharacterTextSplitter
impl Unpin for RecursiveCharacterTextSplitter
impl UnwindSafe for RecursiveCharacterTextSplitter
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.