pub struct DecompositionFst<D: AsRef<[u8]>> { /* private fields */ }
Decomposes compound words into their parts found in a given dictionary. Useful for compressed, memory-mapped dictionaries.
This implementation is based on the fst crate.
A word is decomposed only if it can be fully resolved using the dictionary. Any text containing unknown “words” is passed through as-is.
This is an adaptation of the FstSegmenter from charabia for this crate. Unlike charabia, it does not split unresolved text into unknown segments using heuristics or character splitting.
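The all-or-nothing behaviour described above can be illustrated with a small standard-library sketch. This is not the crate's implementation: the `decompose` helper is hypothetical, it uses a `BTreeSet` instead of an fst, and the longest-prefix-first matching with backtracking is an assumption made for the example.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper, not part of the crate: splits `word` into parts that
/// are all present in `dict`, or returns `None` if any remainder is unknown.
fn decompose<'a>(word: &'a str, dict: &BTreeSet<&str>) -> Option<Vec<&'a str>> {
    if word.is_empty() {
        return Some(Vec::new());
    }
    // Collect every char boundary after the first character, plus the end.
    let mut ends: Vec<usize> = word.char_indices().map(|(i, _)| i).skip(1).collect();
    ends.push(word.len());
    // Assumption: try longer prefixes first and backtrack if the rest fails.
    for &end in ends.iter().rev() {
        let (head, tail) = word.split_at(end);
        if dict.contains(head) {
            if let Some(mut rest) = decompose(tail, dict) {
                let mut parts = vec![head];
                parts.append(&mut rest);
                return Some(parts);
            }
        }
    }
    None
}

fn main() {
    let dict: BTreeSet<&str> = ["ball", "foot"].iter().copied().collect();
    // Fully resolvable: the word is split into its dictionary parts.
    assert_eq!(decompose("football", &dict), Some(vec!["foot", "ball"]));
    // "er" is not in the dictionary, so nothing is split at all.
    assert_eq!(decompose("footballer", &dict), None);
}
```

A caller that receives `None` would keep the original token unchanged, which corresponds to the pass-through behaviour.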
Implementations§
impl DecompositionFst<Vec<u8>>
pub fn from_dictionary<I, P>(dict: I) -> Result<Self, Error>
Convenience constructor for DecompositionFst.
Takes a list of lexicographically ordered words
to recognize as valid parts for decomposition.
If you’re using this constructor outside of testing and development, please have a look at the DecompositionAhoCorasick implementation, as it is more likely to fit your use case of an in-memory automaton matcher.
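Because the constructor expects lexicographically ordered words, a word list from an arbitrary source usually needs to be sorted first. The following is a minimal sketch using only the standard library; `prepare_dictionary` is a hypothetical helper, and the deduplication step is a precaution rather than a documented requirement. The fst crate orders keys byte-wise, which matches the default ordering of Rust's `String`.

```rust
/// Hypothetical helper, not part of the crate: sorts words in byte-wise
/// lexicographic order and removes duplicates.
fn prepare_dictionary(mut words: Vec<String>) -> Vec<String> {
    // `String` compares byte by byte, so uppercase ASCII sorts before lowercase.
    words.sort_unstable();
    // Precaution (assumption): drop repeated entries after sorting.
    words.dedup();
    words
}

fn main() {
    let words = vec![
        "spiel".to_string(),
        "Ball".to_string(),
        "fuß".to_string(),
        "ball".to_string(),
        "spiel".to_string(),
    ];
    let dict = prepare_dictionary(words);
    assert_eq!(dict, ["Ball", "ball", "fuß", "spiel"]);
}
```

The resulting list could then be passed to from_dictionary, assuming its (not shown) bounds on `I` and `P` accept a `Vec<String>`.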
Trait Implementations§
impl<D: Clone + AsRef<[u8]>> Clone for DecompositionFst<D>
fn clone(&self) -> DecompositionFst<D>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl<D> Segmenter for DecompositionFst<D>
type SubdivisionIter<'a> = IntoIter<SegmentedToken<'a>>
Iterator type returned by the subdivide function if it has multiple results.
fn subdivide<'a>( &self, token: SegmentedToken<'a>, ) -> UseOrSubdivide<SegmentedToken<'a>, IntoIter<SegmentedToken<'a>>> ⓘ
Subdivide token into zero, one or more subtokens.
Auto Trait Implementations§
impl<D> Freeze for DecompositionFst<D> where
    D: Freeze,
impl<D> RefUnwindSafe for DecompositionFst<D> where
    D: RefUnwindSafe,
impl<D> Send for DecompositionFst<D> where
    D: Send,
impl<D> Sync for DecompositionFst<D> where
    D: Sync,
impl<D> Unpin for DecompositionFst<D> where
    D: Unpin,
impl<D> UnwindSafe for DecompositionFst<D> where
    D: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.