Struct Splitter

pub struct Splitter { /* private fields */ }

Splits text into segments using the most desirable separator found in that text.

§Examples

use semchunk_rs::Splitter;
let splitter = Splitter::default();
let text = "Hello World\nGoodbye World";
let (separator, is_whitespace, segments) = splitter.split_text(text);
assert_eq!(separator, "\n");
assert!(is_whitespace);
assert_eq!(segments, vec!["Hello World", "Goodbye World"]);

Implementations§

impl Splitter

pub fn split_text<'a>(&self, text: &'a str) -> (&'a str, bool, Vec<&'a str>)

Splits the given text into segments based on the most desirable separator found.

The method prioritizes separators in the following order:

  1. The longest sequence of newlines and/or carriage returns.
  2. The longest sequence of tabs.
  3. The longest sequence of whitespace characters.
  4. A semantically meaningful non-whitespace separator.

If no semantically meaningful separator is found, the text is split into individual characters.
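
The priority order above can be sketched with the standard library alone. This is an illustration of the documented behavior, not the crate's implementation: `longest_run`, `split_by_priority`, and the contents of `NON_WHITESPACE_SEPARATORS` are hypothetical, and both the empty separator and the `false` flag returned by the character fallback are assumptions.

```rust
/// Hypothetical list of non-whitespace separators, in priority order.
/// The real crate may use a different set.
const NON_WHITESPACE_SEPARATORS: [&str; 5] = [".", "?", "!", ",", ";"];

/// Hypothetical helper: returns the longest run (measured in bytes) of
/// consecutive characters satisfying `pred`, borrowed from `text`.
/// On ties, the first run wins.
fn longest_run<'a>(text: &'a str, pred: impl Fn(char) -> bool) -> Option<&'a str> {
    let mut best: Option<&'a str> = None;
    let mut run_start: Option<usize> = None;
    for (i, c) in text.char_indices() {
        if pred(c) {
            // Start a new run here, or extend the one already in progress.
            let start = *run_start.get_or_insert(i);
            let run = &text[start..i + c.len_utf8()];
            if best.map_or(true, |b| run.len() > b.len()) {
                best = Some(run);
            }
        } else {
            run_start = None;
        }
    }
    best
}

/// Sketch of the documented priority order, using the same return shape
/// as `split_text`: (separator, is_whitespace, segments).
fn split_by_priority<'a>(text: &'a str) -> (&'a str, bool, Vec<&'a str>) {
    // Steps 1 to 3: newlines/carriage returns, then tabs, then any whitespace.
    let whitespace = longest_run(text, |c| c == '\n' || c == '\r')
        .or_else(|| longest_run(text, |c| c == '\t'))
        .or_else(|| longest_run(text, char::is_whitespace));
    if let Some(sep) = whitespace {
        return (sep, true, text.split(sep).collect());
    }
    // Step 4: the first non-whitespace separator that occurs in the text.
    for sep in NON_WHITESPACE_SEPARATORS {
        if text.contains(sep) {
            return (sep, false, text.split(sep).collect());
        }
    }
    // Fallback: one segment per character. The empty separator and the
    // `false` flag are assumptions; the crate may report something else.
    let chars = text
        .char_indices()
        .map(|(i, c)| &text[i..i + c.len_utf8()])
        .collect();
    ("", false, chars)
}

fn main() {
    let (sep, is_whitespace, segments) = split_by_priority("Hello World\nGoodbye World");
    println!("{:?} {} {:?}", sep, is_whitespace, segments);
}
```

Note that in this sketch a double newline outranks a single one, so `"a\n\nb\nc"` splits into `"a"` and `"b\nc"` rather than into three segments.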

§Arguments

  • text - A string slice that holds the text to be split.

§Returns

A tuple containing:

  • The separator used for splitting the text.
  • A boolean indicating whether the separator is whitespace.
  • A vector of string slices representing the segments of the split text.
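
One way to see how the three tuple elements relate: if the segments come from a plain split on the separator, joining them with that separator rebuilds the original text. The sketch below uses only the standard library and copies its values from the documented example instead of calling the crate; `rejoin` is a hypothetical helper, and the round-trip property is an assumption about `split_text`, not something its documentation states.

```rust
/// Hypothetical helper: rebuilds a text from a separator and its segments.
fn rejoin(separator: &str, segments: &[&str]) -> String {
    segments.join(separator)
}

fn main() {
    let text = "Hello World\nGoodbye World";
    // Same shape as the tuple returned by `split_text`; the values are taken
    // from the documented example, not produced by the crate.
    let (separator, is_whitespace, segments) =
        ("\n", true, vec!["Hello World", "Goodbye World"]);

    assert!(is_whitespace);
    // Joining the segments with the separator restores the input.
    assert_eq!(rejoin(separator, &segments), text);
    println!("round trip ok: {} segments", segments.len());
}
```

Because the real method returns `&'a str` slices tied to the input's lifetime, the segments borrow from `text` and cannot outlive it.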

§Examples

use semchunk_rs::Splitter;
let splitter = Splitter::default();
let text = "Hello World\nGoodbye World";
let (separator, is_whitespace, segments) = splitter.split_text(text);
assert_eq!(separator, "\n");
assert!(is_whitespace);
assert_eq!(segments, vec!["Hello World", "Goodbye World"]);

Trait Implementations§

impl Debug for Splitter

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for Splitter

fn default() -> Self

Returns the “default value” for a type.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.