pub struct LinguaLanguageBlockSentenceSplitter { /* private fields */ }
Combination of the [LinguaLanguageBlockSplitter][crate::segmentation::LinguaLanguageBlockSplitter] and the UnicodeSentenceSplitter.
This is intended to be used at or near the start of the segmentation chain.
If no language is detected, it falls back to splitting sentences only.
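The fallback described above can be pictured with a small, self-contained sketch. It does not use this crate or lingua: `detect_language`, `split_sentences` and `segment` are hypothetical stand-ins, the "detector" is a toy keyword lookup, and the whole input is treated as a single language block for brevity. The point is only that sentence splitting still happens when detection yields nothing.

```rust
/// Hypothetical stand-in for a language detector: returns a language tag
/// only when the text contains a known marker word.
fn detect_language(text: &str) -> Option<&'static str> {
    let lower = text.to_lowercase();
    let mut words = lower.split_whitespace();
    words.find_map(|w| match w {
        "the" => Some("eng"),
        "der" => Some("deu"),
        _ => None,
    })
}

/// Naive sentence splitter: cuts after '.', '!' or '?'.
fn split_sentences(text: &str) -> Vec<&str> {
    text.split_inclusive(|c| matches!(c, '.' | '!' | '?'))
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect()
}

/// Returns (sentence, detected language) pairs. When no language is
/// detected, the sentences are still produced, just without a tag.
fn segment(text: &str) -> Vec<(&str, Option<&'static str>)> {
    let language = detect_language(text);
    split_sentences(text)
        .into_iter()
        .map(|sentence| (sentence, language))
        .collect()
}

fn main() {
    for input in ["The cat sat. It ran!", "12345. 678?"] {
        for (sentence, language) in segment(input) {
            println!("{language:?}: {sentence}");
        }
    }
}
```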
Performance: The first use of this struct incurs a significant spin-up cost that subsequent uses do not pay. This happens only once per program, regardless of whether you keep a specific instance around. See the lingua documentation for RAM requirements.
Using low accuracy mode probably isn’t worth it.
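A once-per-program cost that is independent of any particular instance is the behaviour you would get from process-wide lazy initialisation. The following is a minimal sketch of that pattern using `std::sync::OnceLock`; it is an assumption about the general shape, not this crate's or lingua's actual implementation, and `Models`, `Splitter`, `shared_models` and `LOAD_COUNT` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive load actually runs.
static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for preloaded language models.
struct Models {
    languages: Vec<&'static str>,
}

/// Process-wide storage: initialised on first access, then reused.
static MODELS: OnceLock<Models> = OnceLock::new();

fn shared_models() -> &'static Models {
    MODELS.get_or_init(|| {
        // The expensive part: runs at most once per process.
        LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
        Models {
            languages: vec!["eng", "deu", "fra"],
        }
    })
}

/// Hypothetical splitter that owns no models itself.
struct Splitter;

impl Splitter {
    fn new() -> Self {
        Splitter
    }

    fn language_count(&self) -> usize {
        shared_models().languages.len()
    }
}

fn main() {
    // First use pays the load cost.
    let first = Splitter::new();
    println!("languages: {}", first.language_count());
    drop(first);

    // A fresh instance reuses the already loaded models.
    let second = Splitter::new();
    println!("languages: {}", second.language_count());
    println!("loads: {}", LOAD_COUNT.load(Ordering::SeqCst));
}
```

In this model, dropping and recreating the splitter is cheap; only the very first call that touches the shared state is slow.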
Language support: Currently this crate uses whatlang::Lang to communicate languages, so the supported languages are the intersection of what whatlang::Lang and lingua::Language both cover.
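The intersection can be thought of as a partial mapping from the detector's language type to the reporting type. The sketch below uses two hypothetical enums (`DetectorLanguage`, `ReportedLanguage`) in place of lingua::Language and whatlang::Lang; the variants, including which one falls outside the overlap, are made up for illustration and say nothing about what either real crate supports.

```rust
/// Hypothetical subset of a detector-side language enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DetectorLanguage {
    English,
    German,
    /// Illustrative language that the reporting enum cannot express.
    OnlyInDetector,
}

/// Hypothetical subset of the enum used to report languages.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReportedLanguage {
    Eng,
    Deu,
}

/// Maps a detected language to the reporting type, or `None` when the
/// language lies outside the overlap of the two enums.
fn to_reported(language: DetectorLanguage) -> Option<ReportedLanguage> {
    match language {
        DetectorLanguage::English => Some(ReportedLanguage::Eng),
        DetectorLanguage::German => Some(ReportedLanguage::Deu),
        DetectorLanguage::OnlyInDetector => None,
    }
}

fn main() {
    let detected = [
        DetectorLanguage::English,
        DetectorLanguage::German,
        DetectorLanguage::OnlyInDetector,
    ];
    for language in detected {
        println!("{language:?} -> {:?}", to_reported(language));
    }
}
```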
Implementations§
impl LinguaLanguageBlockSentenceSplitter
pub fn new() -> Self
Create a new LinguaLanguageBlockSentenceSplitter instance that is configured to preload all languages on first use.
pub fn new_with_builder(builder: LanguageDetectorBuilder) -> Self
Create a new LinguaLanguageBlockSentenceSplitter from a custom LanguageDetectorBuilder.
Trait Implementations§
impl Segmenter for LinguaLanguageBlockSentenceSplitter
type SubdivisionIter<'a> = IntoIter<SegmentedToken<'a>>
The iterator type returned by the subdivide function if it has multiple results.
fn subdivide<'a>(&self, token: SegmentedToken<'a>) -> UseOrSubdivide<SegmentedToken<'a>, IntoIter<SegmentedToken<'a>>>
Subdivides token into zero, one or more subtokens.
Auto Trait Implementations§
impl Freeze for LinguaLanguageBlockSentenceSplitter
impl !RefUnwindSafe for LinguaLanguageBlockSentenceSplitter
impl Send for LinguaLanguageBlockSentenceSplitter
impl Sync for LinguaLanguageBlockSentenceSplitter
impl Unpin for LinguaLanguageBlockSentenceSplitter
impl !UnwindSafe for LinguaLanguageBlockSentenceSplitter
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.