Enum imdb_index::NgramType
The style of ngram extraction to use.
The same style of ngram extraction is always used at index time and at query time.
Each ngram type uses the ngram size configuration differently.
All ngram styles use Unicode codepoints as the definition of a character. For example, a 3-gram might contain up to 12 bytes, if it contains 3 Unicode codepoints that each require 4 UTF-8 code units.
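To make the codepoint-versus-byte distinction concrete, here is a minimal sketch that uses only the standard library. The helper name `ngram_sizes` is hypothetical and is not part of `imdb_index`; it simply reports both measurements for a given ngram.

```rust
/// Returns `(codepoint count, UTF-8 byte count)` for an ngram.
/// Hypothetical helper for illustration; not part of `imdb_index`.
fn ngram_sizes(ngram: &str) -> (usize, usize) {
    // `chars()` iterates over Unicode scalar values (codepoints),
    // while `len()` reports the length in UTF-8 bytes.
    (ngram.chars().count(), ngram.len())
}
```

For an ASCII 3-gram such as `hom`, both counts are 3. For a 3-gram made of three copies of U+1F600, which needs 4 UTF-8 bytes per codepoint, the codepoint count is still 3 but the byte count is 12.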
Variants
Window
A windowing ngram.
This is the traditional style of ngram, where a sliding window of size N is moved across the entire content to be indexed. For example, the 3-grams for the string homer are hom, ome and mer.
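The sliding window can be sketched with the standard library alone. This is an illustration of the idea, not the crate's implementation: the function name `window_ngrams` is hypothetical, and returning nothing for text shorter than the window is an assumption, since the documentation does not say how that case is handled.

```rust
/// Collects every window of `n` consecutive codepoints in `text`.
/// Hypothetical helper for illustration; not part of `imdb_index`.
fn window_ngrams(text: &str, n: usize) -> Vec<String> {
    // Work on codepoints rather than bytes, matching the definition
    // of a character used by all ngram styles.
    let chars: Vec<char> = text.chars().collect();
    // Assumption: a zero-sized window or too-short text yields no ngrams.
    if n == 0 || chars.len() < n {
        return Vec::new();
    }
    chars
        .windows(n)
        .map(|w| w.iter().collect::<String>())
        .collect()
}
```

Calling `window_ngrams("homer", 3)` yields `hom`, `ome` and `mer`, matching the example given for this variant.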
Edge
An edge ngram.
This style of ngram produces progressively longer ngrams, where each ngram is anchored to the start of a word. Words are determined simply by splitting on whitespace.
For example, the edge ngrams of homer simpson, where the max ngram size is 5, would be: hom, home, homer, sim, simp, simps. Generally, for this ngram type, one wants to use a large maximum ngram size, perhaps somewhere close to the maximum number of characters in any word in the corpus.
Note that there is no way to set the minimum ngram size (which is 3).
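Edge ngram extraction can be sketched in the same way. This is a rough illustration under stated assumptions, not the crate's implementation: `edge_ngrams` and `MIN_NGRAM_SIZE` are hypothetical names, the minimum of 3 is taken from the note above, and skipping words shorter than the minimum is an assumption.

```rust
/// Minimum ngram size, fixed at 3 according to the documentation above.
/// Hypothetical constant name; not part of `imdb_index`.
const MIN_NGRAM_SIZE: usize = 3;

/// Produces prefixes of each whitespace-separated word, from
/// `MIN_NGRAM_SIZE` codepoints up to `max_size` codepoints.
/// Hypothetical helper for illustration; not part of `imdb_index`.
fn edge_ngrams(text: &str, max_size: usize) -> Vec<String> {
    let mut ngrams = Vec::new();
    for word in text.split_whitespace() {
        let chars: Vec<char> = word.chars().collect();
        // A prefix can never be longer than the word itself.
        let longest = max_size.min(chars.len());
        // Assumption: words shorter than the minimum produce nothing,
        // because this range is then empty.
        for size in MIN_NGRAM_SIZE..=longest {
            ngrams.push(chars[..size].iter().collect::<String>());
        }
    }
    ngrams
}
```

With a maximum size of 5, `edge_ngrams("homer simpson", 5)` yields `hom`, `home`, `homer`, `sim`, `simp` and `simps`. The final two codepoints of `simpson` are never covered, which is why a large maximum size is usually preferable for this variant.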
Methods
impl NgramType
pub fn possible_names() -> &'static [&'static str]
Return all possible ngram types.
pub fn as_str(&self) -> &'static str
Return a string representation of this type.
Trait Implementations
impl PartialEq<NgramType> for NgramType
fn eq(&self, other: &NgramType) -> bool
#[must_use]
fn ne(&self, other: &Rhs) -> bool
1.0.0
This method tests for !=.
impl Copy for NgramType
impl Eq for NgramType
impl Default for NgramType
impl Clone for NgramType
fn clone(&self) -> NgramType
fn clone_from(&mut self, source: &Self)
1.0.0
Performs copy-assignment from source.
impl Hash for NgramType
fn hash<__H: Hasher>(&self, state: &mut __H)
fn hash_slice<H>(data: &[Self], state: &mut H) where
H: Hasher,
1.3.0
Feeds a slice of this type into the given Hasher.
impl Debug for NgramType
impl Display for NgramType
impl FromStr for NgramType
type Err = Error
The associated error which can be returned from parsing.
fn from_str(s: &str) -> Result<NgramType>
impl<'de> Deserialize<'de> for NgramType
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error> where
__D: Deserializer<'de>,
impl Serialize for NgramType
Auto Trait Implementations
impl Send for NgramType
impl Unpin for NgramType
impl Sync for NgramType
impl UnwindSafe for NgramType
impl RefUnwindSafe for NgramType
Blanket Implementations
impl<T> ToString for T where
T: Display + ?Sized,
impl<T> ToOwned for T where
T: Clone,
type Owned = T
The resulting type after obtaining ownership.
fn to_owned(&self) -> T
fn clone_into(&self, target: &mut T)
impl<T> From<T> for T
impl<T, U> Into<U> for T where
U: From<T>,
impl<T, U> TryFrom<U> for T where
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
type Error = <U as TryFrom<T>>::Error
The type returned in the event of a conversion error.
fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Borrow<T> for T where
T: ?Sized,
impl<T> Any for T where
T: 'static + ?Sized,
impl<T> DeserializeOwned for T where
T: Deserialize<'de>,