pub struct ChunkConfig {
    pub max_chars: usize,
    pub overlap: usize,
    pub min_chunk_size: usize,
}
Configuration for text chunking.
Fields§
max_chars: usize
Maximum characters per chunk. Default: 2000 (~500 tokens for most models).
overlap: usize
Number of characters to overlap between chunks, for context continuity. Default: 200 (~50 tokens).
min_chunk_size: usize
Minimum chunk size, which avoids tiny trailing chunks. Default: 100 characters.
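As a rough illustration of how the three fields interact, the sketch below runs a sliding character window over a string. It is not the crate's implementation: the `ChunkConfig` here is a local stand-in that mirrors the documented fields and defaults, `chunk_text` is a hypothetical helper, and the choice to fold a too-small tail into the previous chunk is an assumption about what "avoids tiny trailing chunks" might mean. It also assumes `max_chars` is greater than `overlap`.

```rust
/// Local stand-in mirroring the documented fields; not the crate's own type.
#[derive(Clone, Debug)]
pub struct ChunkConfig {
    pub max_chars: usize,
    pub overlap: usize,
    pub min_chunk_size: usize,
}

impl Default for ChunkConfig {
    fn default() -> Self {
        // Default values as stated in the field documentation.
        Self { max_chars: 2000, overlap: 200, min_chunk_size: 100 }
    }
}

/// Hypothetical helper: split `text` into overlapping windows of characters.
pub fn chunk_text(text: &str, cfg: &ChunkConfig) -> Vec<String> {
    // Work on chars, not bytes, so multi-byte characters are never split.
    let chars: Vec<char> = text.chars().collect();
    // Each window starts `max_chars - overlap` after the previous one;
    // clamp to 1 so the loop always advances.
    let step = cfg.max_chars.saturating_sub(cfg.overlap).max(1);

    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;
    let mut prev_end = 0;
    while start < chars.len() {
        let end = (start + cfg.max_chars).min(chars.len());
        if end - start < cfg.min_chunk_size && !chunks.is_empty() {
            // Assumed policy: a too-small tail is appended to the previous
            // chunk (only the characters that chunk does not already hold).
            if let Some(last) = chunks.last_mut() {
                last.extend(chars[prev_end..end].iter());
            }
            break;
        }
        chunks.push(chars[start..end].iter().collect());
        prev_end = end;
        if end == chars.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With `max_chars: 10` and `overlap: 2`, consecutive chunks share their last and first two characters, which is the "context continuity" the `overlap` field describes.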
Implementations§
impl ChunkConfig
pub fn for_ollama() -> Self
Create a config optimized for Ollama nomic-embed-text.
nomic-embed-text has an 8192 token context window. We use conservative chunking to stay well under the limit.
pub fn for_minilm() -> Self
Create a config for HuggingFace MiniLM models.
MiniLM models have a 256 token limit, so we use smaller chunks.
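The values these presets actually return are not shown on this page. As a sketch of the reasoning only, a character budget can be derived from a model's token limit using the roughly 4 characters per token implied by the documented defaults (2000 characters for about 500 tokens). The `ChunkConfig` below is a local stand-in, and `config_for_token_limit`, `CHARS_PER_TOKEN`, and `headroom_pct` are hypothetical names introduced for illustration; the 10% overlap and 5% minimum ratios are copied from the documented defaults, not from the presets.

```rust
/// Local stand-in mirroring the documented fields; not the crate's own type.
#[derive(Clone, Debug, PartialEq)]
pub struct ChunkConfig {
    pub max_chars: usize,
    pub overlap: usize,
    pub min_chunk_size: usize,
}

/// Rough ratio implied by the documented defaults (2000 chars ~ 500 tokens).
/// Real tokenizers vary, so treat this as a heuristic.
const CHARS_PER_TOKEN: usize = 4;

/// Hypothetical helper: size a config from a model's token limit.
/// `headroom_pct` is the percentage of the token window a chunk may fill
/// (values above 100 are clamped to 100).
pub fn config_for_token_limit(token_limit: usize, headroom_pct: usize) -> ChunkConfig {
    let budget_tokens = token_limit * headroom_pct.min(100) / 100;
    let max_chars = budget_tokens * CHARS_PER_TOKEN;
    ChunkConfig {
        max_chars,
        // Same proportions as the documented defaults: 200/2000 and 100/2000.
        overlap: max_chars / 10,
        min_chunk_size: max_chars / 20,
    }
}
```

Under these assumptions, a 256-token model with 75% headroom gets a 768-character budget, while an 8192-token model leaves far more room than the 2000-character default uses, which is consistent with the "conservative" wording for `for_ollama`.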
Trait Implementations§
impl Clone for ChunkConfig
fn clone(&self) -> ChunkConfig
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for ChunkConfig
Auto Trait Implementations§
impl Freeze for ChunkConfig
impl RefUnwindSafe for ChunkConfig
impl Send for ChunkConfig
impl Sync for ChunkConfig
impl Unpin for ChunkConfig
impl UnsafeUnpin for ChunkConfig
impl UnwindSafe for ChunkConfig
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.