pub struct HtmlSplitter { /* private fields */ }
HTML-aware splitter. Splits at heading boundaries; metadata records the active heading at each level.
Extension path: implement crate::splitters::TextSplitter
directly when the heading-based scheme doesn’t fit. The
HtmlSplitter::with_levels / HtmlSplitter::with_strip_tags /
HtmlSplitter::with_min_chunk_size knobs cover the common
configuration axes for the built-in heuristic.
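The heading-tracking idea described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the crate's implementation: `Chunk`, `heading_open`, `flush`, and `split_by_headings` are hypothetical names, only lowercase `<h1>`..`<h6>` tags are recognised, and resetting deeper levels when a shallower heading appears is an assumption about how "the active heading at each level" behaves.

```rust
/// One chunk of text plus the headings that were active where it started.
#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    /// `headings[i]` holds the active `<h{i+1}>` title, if any.
    pub headings: [Option<String>; 6],
    pub text: String,
}

/// If `s` starts with a lowercase heading open tag such as `<h2>` or
/// `<h2 class="x">`, returns its level (1..=6) and the tag's byte length.
fn heading_open(s: &str) -> Option<(usize, usize)> {
    let b = s.as_bytes();
    if b.len() < 4 || b[0] != b'<' || b[1] != b'h' || !(b'1'..=b'6').contains(&b[2]) {
        return None;
    }
    // The level digit must be followed by `>` or whitespace (attributes).
    if b[3] != b'>' && !b[3].is_ascii_whitespace() {
        return None;
    }
    let end = s.find('>')?;
    Some(((b[2] - b'0') as usize, end + 1))
}

/// Pushes the accumulated body as a chunk (if non-blank) and clears it.
fn flush(chunks: &mut Vec<Chunk>, active: &[Option<String>; 6], body: &mut String) {
    let text = body.trim();
    if !text.is_empty() {
        chunks.push(Chunk { headings: active.clone(), text: text.to_string() });
    }
    body.clear();
}

/// Splits `html` at headings whose level is listed in `levels`.
/// Headings at other levels are left in the chunk text untouched.
pub fn split_by_headings(html: &str, levels: &[u8]) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut active: [Option<String>; 6] = Default::default();
    let mut body = String::new();
    let mut rest = html;

    while !rest.is_empty() {
        if let Some((level, open_len)) = heading_open(rest) {
            if levels.contains(&(level as u8)) {
                let close = format!("</h{level}>");
                if let Some(close_at) = rest[open_len..].find(&close) {
                    // A new heading closes the chunk collected so far.
                    flush(&mut chunks, &active, &mut body);
                    let title = rest[open_len..open_len + close_at].trim().to_string();
                    active[level - 1] = Some(title);
                    // Assumption: a shallower heading clears all deeper ones.
                    for slot in active[level..].iter_mut() {
                        *slot = None;
                    }
                    rest = &rest[open_len + close_at + close.len()..];
                    continue;
                }
            }
        }
        // Not a tracked heading: copy one character into the current body.
        if let Some(ch) = rest.chars().next() {
            body.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    flush(&mut chunks, &active, &mut body);
    chunks
}
```

Calling `split_by_headings(html, &[1, 2])` on a page with `<h1>` and `<h2>` sections yields one chunk per section, each carrying the `<h1>` and `<h2>` titles that were in effect where the chunk began.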
Implementations
impl HtmlSplitter
pub fn with_levels<I: IntoIterator<Item = u8>>(self, levels: I) -> Self
Restrict splitting to specific heading levels (e.g. [1, 2] for the top two levels only).
HtmlSplitter::with_strip_tags: whether to strip remaining HTML tags from chunk text. Default: true.
pub fn with_min_chunk_size(self, n: usize) -> Self
Coalesce trailing chunks smaller than n chars into the previous one.
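The three builder knobs chain in the usual consuming-`self` style. The sketch below uses a stand-in type, `SplitterSketch`, because the real struct's fields are private: its field layout, the `bool` parameter of `with_strip_tags`, every default other than `strip_tags = true`, and the `coalesce` helper are assumptions made for illustration. `coalesce` shows one possible reading of the minimum-size rule, in which any chunk shorter than `n` characters is appended to the chunk before it.

```rust
/// Stand-in for the documented builder surface; not the crate's type.
#[derive(Clone, Debug)]
pub struct SplitterSketch {
    pub levels: Vec<u8>,
    pub strip_tags: bool,
    pub min_chunk_size: usize,
}

impl Default for SplitterSketch {
    fn default() -> Self {
        // `strip_tags: true` follows the docs; the other defaults are guesses.
        Self { levels: (1..=6).collect(), strip_tags: true, min_chunk_size: 0 }
    }
}

impl SplitterSketch {
    /// Same signature as the documented `with_levels`.
    pub fn with_levels<I: IntoIterator<Item = u8>>(mut self, levels: I) -> Self {
        self.levels = levels.into_iter().collect();
        self
    }

    /// Assumed to take a `bool`; the real signature is not shown in the docs.
    pub fn with_strip_tags(mut self, strip: bool) -> Self {
        self.strip_tags = strip;
        self
    }

    /// Same signature as the documented `with_min_chunk_size`.
    pub fn with_min_chunk_size(mut self, n: usize) -> Self {
        self.min_chunk_size = n;
        self
    }

    /// Merges every chunk shorter than `min_chunk_size` chars into the
    /// chunk before it. A short first chunk has no predecessor and is kept.
    pub fn coalesce(&self, chunks: Vec<String>) -> Vec<String> {
        let mut out: Vec<String> = Vec::new();
        for chunk in chunks {
            let too_small = chunk.chars().count() < self.min_chunk_size;
            if too_small && !out.is_empty() {
                let prev = out.last_mut().expect("checked non-empty above");
                prev.push('\n');
                prev.push_str(&chunk);
            } else {
                out.push(chunk);
            }
        }
        out
    }
}
```

With the real type, the equivalent chain would presumably start from `HtmlSplitter::default()`, since `Default` is implemented, followed by the same `with_*` calls.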
Trait Implementations
impl Clone for HtmlSplitter
fn clone(&self) -> HtmlSplitter
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for HtmlSplitter
impl Default for HtmlSplitter
Auto Trait Implementations
impl Freeze for HtmlSplitter
impl RefUnwindSafe for HtmlSplitter
impl Send for HtmlSplitter
impl Sync for HtmlSplitter
impl Unpin for HtmlSplitter
impl UnsafeUnpin for HtmlSplitter
impl UnwindSafe for HtmlSplitter
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.