pub struct HtmlHeaderTextSplitter { /* private fields */ }
Splits HTML content by header tags (h1, h2, h3, etc.).
Each section between headers becomes a separate document/chunk. Header text is added as metadata on the resulting chunks.
Implementations§
impl HtmlHeaderTextSplitter
pub fn new(headers_to_split_on: Vec<(String, String)>) -> Self
Create a new splitter with custom header-to-metadata-key mappings.
Each tuple is (tag_name, metadata_key), e.g., ("h1", "Header 1").
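Because `new` takes owned `String` pairs, callers who start from string literals need a small conversion step. The sketch below shows one way to build that vector; `header_mappings` is a hypothetical helper name used only for illustration and is not part of this crate.

```rust
/// Convert borrowed (tag_name, metadata_key) pairs into the owned
/// `Vec<(String, String)>` shape that `HtmlHeaderTextSplitter::new` accepts.
/// NOTE: `header_mappings` is a hypothetical helper, not a crate function.
fn header_mappings(pairs: &[(&str, &str)]) -> Vec<(String, String)> {
    pairs
        .iter()
        // `&str` is `Copy`, so the tuple can be destructured through the reference.
        .map(|&(tag, key)| (tag.to_owned(), key.to_owned()))
        .collect()
}
```

The result can then be passed straight to the constructor, for example `HtmlHeaderTextSplitter::new(header_mappings(&[("h1", "Header 1"), ("h2", "Header 2")]))`.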
pub fn default_headers() -> Self
Default configuration: split on h1, h2, h3.
pub fn split_html(&self, text: &str) -> Vec<Document>
Split HTML and return documents with header metadata.
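To make the idea of header-based splitting concrete, here is a simplified, self-contained stand-in. It is not this crate's implementation. `Chunk`, `split_by_headers`, `push_chunk`, and `strip_tags` are hypothetical names, since the fields of the real `Document` type are not shown on this page. The sketch assumes lowercase header tags without attributes, and it assumes that a new header clears metadata for headers listed after it in the mapping. The actual splitter may behave differently on both points.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the crate's `Document` type.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    content: String,
    metadata: BTreeMap<String, String>,
}

/// Drop anything between '<' and '>' (naive; no entity or comment handling).
fn strip_tags(fragment: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in fragment.chars() {
        match c {
            '<' => in_tag = true,
            '>' => in_tag = false,
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out
}

/// Push a chunk for `fragment` unless it contains no visible text.
fn push_chunk(chunks: &mut Vec<Chunk>, fragment: &str, metadata: &BTreeMap<String, String>) {
    let text = strip_tags(fragment);
    let text = text.trim();
    if !text.is_empty() {
        chunks.push(Chunk { content: text.to_string(), metadata: metadata.clone() });
    }
}

/// Split simple HTML on the given (tag_name, metadata_key) pairs.
fn split_by_headers(html: &str, headers: &[(&str, &str)]) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    let mut metadata: BTreeMap<String, String> = BTreeMap::new();
    let mut rest = html;
    loop {
        // Locate the earliest opening header tag in the remaining input.
        let next = headers
            .iter()
            .enumerate()
            .filter_map(|(idx, &(tag, key))| {
                rest.find(&format!("<{}>", tag)).map(|pos| (pos, idx, tag, key))
            })
            .min_by_key(|&(pos, _, _, _)| pos);
        let (pos, idx, tag, key) = match next {
            Some(found) => found,
            None => {
                // No more headers: the remainder is the final section.
                push_chunk(&mut chunks, rest, &metadata);
                break;
            }
        };
        // Text before this header belongs to the section that is ending.
        push_chunk(&mut chunks, &rest[..pos], &metadata);
        let after_open = &rest[pos + tag.len() + 2..];
        let close = format!("</{}>", tag);
        let end = match after_open.find(&close) {
            Some(end) => end,
            None => {
                // Unclosed header: treat the remainder as body text.
                push_chunk(&mut chunks, after_open, &metadata);
                break;
            }
        };
        // Assumption: headers later in the mapping are nested under this one.
        for &(_, lower_key) in &headers[idx + 1..] {
            metadata.remove(lower_key);
        }
        metadata.insert(key.to_string(), after_open[..end].trim().to_string());
        rest = &after_open[end + close.len()..];
    }
    chunks
}
```

With the real type, the equivalent call would be `HtmlHeaderTextSplitter::default_headers().split_html(html)`, which returns a `Vec<Document>`.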
Trait Implementations§
Auto Trait Implementations§
impl Freeze for HtmlHeaderTextSplitter
impl RefUnwindSafe for HtmlHeaderTextSplitter
impl Send for HtmlHeaderTextSplitter
impl Sync for HtmlHeaderTextSplitter
impl Unpin for HtmlHeaderTextSplitter
impl UnsafeUnpin for HtmlHeaderTextSplitter
impl UnwindSafe for HtmlHeaderTextSplitter
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.