Struct text_splitter::ChunkCapacity
pub struct ChunkCapacity { /* private fields */ }
Describes the valid chunk size(s) that can be generated.
The `desired` size is the target size for the chunk. In most cases, this will also serve as the maximum size of the chunk. A returned chunk may always be smaller than the `desired` value, because adding the next piece of text would have made it larger than the `desired` capacity.
The `max` size is the maximum possible chunk size that can be generated. Setting this to a larger value than `desired` means that the chunk should be as close to `desired` as possible, but can be larger if that means staying at a larger semantic level.
The splitter will consume text until the chunk size is somewhere between `desired` and `max`, if they differ, but never above `max`.
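The relationship between a candidate chunk size and the two bounds can be sketched as a three-way comparison. The block below is a self-contained model, not the crate's code: `CapacityModel` and its `fits` method are hypothetical names introduced for illustration, and the way the real splitter applies the check internally is an assumption.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a capacity with a target and an upper bound.
/// Illustration only; this is not the crate's `ChunkCapacity`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CapacityModel {
    desired: usize,
    max: usize,
}

impl CapacityModel {
    /// Classify a candidate chunk size against the capacity.
    /// `Less`: below `desired`, so more text could still be added.
    /// `Equal`: between `desired` and `max` (inclusive), so acceptable.
    /// `Greater`: above `max`, so the chunk is too large.
    fn fits(&self, chunk_size: usize) -> Ordering {
        if chunk_size < self.desired {
            Ordering::Less
        } else if chunk_size > self.max {
            Ordering::Greater
        } else {
            Ordering::Equal
        }
    }
}

fn main() {
    let capacity = CapacityModel { desired: 500, max: 2000 };
    assert_eq!(capacity.fits(120), Ordering::Less);
    assert_eq!(capacity.fits(500), Ordering::Equal);
    assert_eq!(capacity.fits(1800), Ordering::Equal);
    assert_eq!(capacity.fits(2001), Ordering::Greater);
}
```

When `desired` and `max` are equal, the `Equal` band collapses to a single value, which corresponds to the fixed-size case.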
If you need to ensure a fixed size, set `desired` and `max` to the same value, for example if you are trying to maximize the context window for an embedding.
If you are loosely targeting a size but have some extra room, for example in a RAG use case where you roughly want a certain part of a document, you can set `max` to your absolute maximum, and the splitter can stay at a higher semantic level when determining the chunk.
Implementations
impl ChunkCapacity
pub fn desired(&self) -> usize
The `desired` size is the target size for the chunk. In most cases, this will also serve as the maximum size of the chunk. A returned chunk may always be smaller than the `desired` value, because adding the next piece of text would have made it larger than the `desired` capacity.
pub fn max(&self) -> usize
The `max` size is the maximum possible chunk size that can be generated. Setting this to a larger value than `desired` means that the chunk should be as close to `desired` as possible, but can be larger if that means staying at a larger semantic level.
pub fn with_max(self, max: usize) -> Result<Self, ChunkCapacityError>
If you need to ensure a fixed size, set `desired` and `max` to the same value, for example if you are trying to maximize the context window for an embedding.
If you are loosely targeting a size but have some extra room, for example in a RAG use case where you roughly want a certain part of a document, you can set `max` to your absolute maximum, and the splitter can stay at a higher semantic level when determining the chunk.
Errors
If the `max` size is less than the `desired` size, an error is returned.
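The validation rule described here can be illustrated with a small stand-alone model. Everything in the block is hypothetical: `CapacityModel`, its `new` constructor, and the `MaxLessThanDesired` error are names made up for this sketch, since the fields of `ChunkCapacity` and the contents of `ChunkCapacityError` are private and not shown on this page.

```rust
/// Hypothetical error for this sketch; it stands in for the crate's
/// `ChunkCapacityError`, whose internals are not documented here.
#[derive(Debug, PartialEq)]
struct MaxLessThanDesired;

/// Hypothetical model of a capacity with a target and an upper bound.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CapacityModel {
    desired: usize,
    max: usize,
}

impl CapacityModel {
    /// Start from a fixed size: `desired` and `max` are the same value.
    fn new(desired: usize) -> Self {
        Self { desired, max: desired }
    }

    /// Raise the upper bound, rejecting any `max` below `desired`.
    fn with_max(mut self, max: usize) -> Result<Self, MaxLessThanDesired> {
        if max < self.desired {
            Err(MaxLessThanDesired)
        } else {
            self.max = max;
            Ok(self)
        }
    }
}

fn main() {
    // Loose target: aim for 500, allow up to 2000.
    let loose = CapacityModel::new(500).with_max(2000);
    assert_eq!(loose, Ok(CapacityModel { desired: 500, max: 2000 }));

    // Fixed size: `max` equal to `desired` is accepted.
    let fixed = CapacityModel::new(500).with_max(500);
    assert_eq!(fixed, Ok(CapacityModel { desired: 500, max: 500 }));

    // Invalid: `max` below `desired` is an error.
    assert_eq!(CapacityModel::new(500).with_max(100), Err(MaxLessThanDesired));
}
```

Returning a `Result` rather than silently clamping keeps a misconfigured capacity visible at construction time instead of at split time.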
Trait Implementations
impl Clone for ChunkCapacity
fn clone(&self) -> ChunkCapacity
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from `source`.
impl Debug for ChunkCapacity
impl From<RangeFull> for ChunkCapacity
impl From<RangeInclusive<usize>> for ChunkCapacity
fn from(range: RangeInclusive<usize>) -> Self
impl From<RangeToInclusive<usize>> for ChunkCapacity
fn from(range: RangeToInclusive<usize>) -> Self
impl From<usize> for ChunkCapacity
impl PartialEq for ChunkCapacity
fn eq(&self, other: &ChunkCapacity) -> bool
Tests for `self` and `other` values to be equal, and is used by `==`.
impl Copy for ChunkCapacity
impl StructuralPartialEq for ChunkCapacity
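The `From` implementations listed above suggest that a capacity can be built from a plain size or from a range. The sketch below models two of those conversions on a hypothetical `CapacityModel` type. The mapping itself is an assumption: a bare `usize` is taken as a fixed size, and `start..=end` is taken as `desired..=max`. The page does not document how the crate maps each range type, so treat this as one plausible reading.

```rust
use std::ops::RangeInclusive;

/// Hypothetical model type; not the crate's `ChunkCapacity`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct CapacityModel {
    desired: usize,
    max: usize,
}

/// Assumed mapping: a single size means a fixed capacity.
impl From<usize> for CapacityModel {
    fn from(size: usize) -> Self {
        Self { desired: size, max: size }
    }
}

/// Assumed mapping: the range start is the target, the end is the limit.
/// This sketch does not validate that `start <= end`.
impl From<RangeInclusive<usize>> for CapacityModel {
    fn from(range: RangeInclusive<usize>) -> Self {
        Self {
            desired: *range.start(),
            max: *range.end(),
        }
    }
}

fn main() {
    let fixed = CapacityModel::from(512_usize);
    assert_eq!(fixed, CapacityModel { desired: 512, max: 512 });

    let loose = CapacityModel::from(200_usize..=800);
    assert_eq!(loose, CapacityModel { desired: 200, max: 800 });
}
```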
Auto Trait Implementations
impl Freeze for ChunkCapacity
impl RefUnwindSafe for ChunkCapacity
impl Send for ChunkCapacity
impl Sync for ChunkCapacity
impl Unpin for ChunkCapacity
impl UnwindSafe for ChunkCapacity
Blanket Implementations
impl<T> BorrowMut<T> for T
where T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts `self` into a `Left` variant of `Either<Self, Self>` if `into_left` is `true`. Converts `self` into a `Right` variant of `Either<Self, Self>` otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts `self` into a `Left` variant of `Either<Self, Self>` if `into_left(&self)` returns `true`. Converts `self` into a `Right` variant of `Either<Self, Self>` otherwise.