pub struct NormalizationRustStemmers {
    pub anyway_above_confidence: f64,
}
Runs stemming using the language tagged onto the token, if a stemming algorithm is available for that language.
This uses the rust_stemmers crate under the hood.
It is recommended to run this after an AugmentationDetectLanguage has been applied; it does nothing if no language metadata is available.
Tokens will be ignored if:
- They are known not to be a SegmentedTokenKind::AlphaNumeric
- They already have normalized_text set. Apply things like lowercasing after this.
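The skip rules above can be sketched as a single predicate. This is a minimal illustration, not the crate's implementation: `Token`, `TokenKind`, and `is_stemming_candidate` are hypothetical stand-ins, and the field names only approximate what a SegmentedToken might carry.

```rust
// Hypothetical stand-ins for illustration; these are not the crate's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    AlphaNumeric,
    Punctuation,
}

struct Token {
    kind: Option<TokenKind>,         // None: the kind is not known
    language: Option<&'static str>,  // None: no language metadata
    normalized_text: Option<String>, // Some: already normalized by an earlier step
}

/// Returns true if the token would be considered for stemming under the
/// documented rules (assumed reading of those rules).
fn is_stemming_candidate(token: &Token) -> bool {
    // Skip tokens whose kind is known and is something other than alphanumeric.
    let known_non_alphanumeric =
        matches!(token.kind, Some(k) if k != TokenKind::AlphaNumeric);
    // Skip tokens that already carry normalized text.
    let already_normalized = token.normalized_text.is_some();
    // Without language metadata there is no algorithm to choose.
    let has_language = token.language.is_some();

    !known_non_alphanumeric && !already_normalized && has_language
}
```

Under this reading, a token of unknown kind is still a candidate, which is why a lowercasing step that sets normalized_text has to come after stemming rather than before it.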
Fields§
§anyway_above_confidence: f64
Threshold above which the language detection's own reliability flag is ignored and the detected language is used for normalization anyway. Setting this can help with shorter texts.
A value of 1.0 means the flag is never ignored; 0.0 means it is always ignored.
The default is 0.4, as that is usually “good enough” for correct stemming.
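One way to read the threshold described above is as a simple decision function. This is a sketch under assumptions: `is_reliable` and `confidence` are hypothetical names for the language detector's output, and the strict `>` comparison is a guess, since the documentation does not say whether the bound is inclusive.

```rust
/// Decides whether the detected language should be used for stemming.
/// Assumption: `confidence` lies in 0.0..=1.0 and the comparison is strict.
fn use_detected_language(
    is_reliable: bool,
    confidence: f64,
    anyway_above_confidence: f64,
) -> bool {
    // A reliable detection is always used; an unreliable one is used only
    // when its confidence exceeds the configured threshold.
    is_reliable || confidence > anyway_above_confidence
}
```

With a threshold of 1.0, no confidence in the assumed range can exceed it, so the reliability flag always decides. With 0.0, any positive confidence overrides the flag.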
Implementations§
impl NormalizationRustStemmers
pub fn new() -> Self
Create a new NormalizationRustStemmers instance with the default settings.
pub fn set_anyway_above_confidence(self, anyway_above_confidence: f64) -> Self
Adjusts the value of anyway_above_confidence, builder style: takes self by value and returns the updated instance.
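The builder-style pattern used by the two signatures above can be sketched with a stand-in type. `StemmerSettings` is hypothetical and only mirrors the documented signatures and the documented default of 0.4; it is not the crate's NormalizationRustStemmers.

```rust
// Hypothetical stand-in mirroring the documented constructor and setter.
#[derive(Debug, Clone)]
struct StemmerSettings {
    anyway_above_confidence: f64,
}

impl StemmerSettings {
    /// Creates an instance with the documented default threshold.
    fn new() -> Self {
        Self { anyway_above_confidence: 0.4 }
    }

    /// Consumes `self`, updates the threshold, and returns the instance,
    /// so calls can be chained directly after `new()`.
    fn set_anyway_above_confidence(mut self, anyway_above_confidence: f64) -> Self {
        self.anyway_above_confidence = anyway_above_confidence;
        self
    }
}
```

With the real type, the equivalent call would presumably look like `NormalizationRustStemmers::new().set_anyway_above_confidence(1.0)`, which makes the stemmer always respect the detector's reliability flag.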
Trait Implementations§
impl Augmenter for NormalizationRustStemmers
fn augment<'a>(&self, token: SegmentedToken<'a>) -> SegmentedToken<'a>
impl Clone for NormalizationRustStemmers
fn clone(&self) -> NormalizationRustStemmers
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for NormalizationRustStemmers
Auto Trait Implementations§
impl Freeze for NormalizationRustStemmers
impl RefUnwindSafe for NormalizationRustStemmers
impl Send for NormalizationRustStemmers
impl Sync for NormalizationRustStemmers
impl Unpin for NormalizationRustStemmers
impl UnwindSafe for NormalizationRustStemmers
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Segmenter for T
where
    T: Augmenter,
type SubdivisionIter<'a> = IntoIter<SegmentedToken<'a>>
The iterator type returned by the subdivide function if it has multiple results.
fn subdivide<'a>(
    &self,
    token: SegmentedToken<'a>,
) -> UseOrSubdivide<SegmentedToken<'a>, <T as Segmenter>::SubdivisionIter<'a>>
Subdivides token into zero, one or more subtokens.