pub struct KeywordExtractionConfig<'a> {
    pub sentence_embeddings_config: SentenceEmbeddingsConfig,
    pub tokenizer_stopwords: Option<HashSet<&'a str>>,
    pub tokenizer_pattern: Option<Regex>,
    pub scorer_type: KeywordScorerType,
    pub ngram_range: (usize, usize),
    pub num_keywords: usize,
    pub diversity: Option<f64>,
    pub max_sum_candidates: Option<usize>,
}

Fields§

§sentence_embeddings_config: SentenceEmbeddingsConfig

SentenceEmbeddingsConfig defining the sentence embeddings model to use

§tokenizer_stopwords: Option<HashSet<&'a str>>

Optional set of tokenizer stopwords to exclude from the keyword candidate list. Defaults to a list of English stopwords.

§tokenizer_pattern: Option<Regex>

Optional tokenization regex pattern. Defaults to a sequence of word characters.

§scorer_type: KeywordScorerType

KeywordScorerType used to rank keywords.

§ngram_range: (usize, usize)

N-gram range (inclusive) for keywords. For example, (1, 2) considers all 1-word and 2-word n-grams as keyword candidates.
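To illustrate how the stopword set and the inclusive n-gram range interact, here is a minimal standard-library sketch of candidate generation. It is not the library's implementation: `candidate_ngrams` is a hypothetical helper, the character-based split stands in for the default word-character regex, and dropping stopwords before building n-grams is an assumption made for this example.

```rust
use std::collections::HashSet;

/// Hypothetical helper: builds candidate n-grams for every n in the
/// inclusive range `ngram_range`, after removing stopwords.
fn candidate_ngrams(
    text: &str,
    stopwords: &HashSet<&str>,
    ngram_range: (usize, usize),
) -> Vec<String> {
    // Assumption: lowercase the text and split on anything that is not a
    // word character, as a stand-in for a "word characters" regex.
    let lowered = text.to_lowercase();
    let tokens: Vec<&str> = lowered
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .filter(|t| !t.is_empty() && !stopwords.contains(t))
        .collect();

    let (min_n, max_n) = ngram_range;
    let mut candidates = Vec::new();
    // Both bounds are included; `windows` yields nothing when n exceeds
    // the number of tokens.
    for n in min_n.max(1)..=max_n {
        for window in tokens.windows(n) {
            candidates.push(window.join(" "));
        }
    }
    candidates
}
```

With the stopwords "is" and "a" and a range of (1, 2), the text "Rust is a fast language" yields three single words followed by two 2-word candidates in this sketch.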

§num_keywords: usize

Number of keywords to return

§diversity: Option<f64>

Optional diversity parameter used for the MaximalMarginRelevance ranker. Defaults to 0.5. A high diversity (closer to 1.0) gives more weight to returning varied keywords, at the cost of lower relevance to the original document.
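The trade-off controlled by the diversity parameter can be sketched with a greedy maximal marginal relevance loop over precomputed similarities. This is a hedged illustration of the general technique, not the library's code: `mmr_select`, `doc_sim` (similarity of each candidate to the document) and `pair_sim` (similarity between candidates) are names assumed for this example.

```rust
/// Hypothetical helper: greedy maximal marginal relevance selection.
/// Returns the indices of the chosen candidates, in selection order.
fn mmr_select(
    doc_sim: &[f64],
    pair_sim: &[Vec<f64>],
    num_keywords: usize,
    diversity: f64,
) -> Vec<usize> {
    let n = doc_sim.len();
    let mut selected: Vec<usize> = Vec::new();
    while selected.len() < num_keywords.min(n) {
        let mut best: Option<(usize, f64)> = None;
        for i in 0..n {
            if selected.contains(&i) {
                continue;
            }
            // Redundancy: highest similarity to any keyword already chosen
            // (zero for the first pick, when nothing is selected yet).
            let redundancy = if selected.is_empty() {
                0.0
            } else {
                selected
                    .iter()
                    .map(|&j| pair_sim[i][j])
                    .fold(f64::NEG_INFINITY, f64::max)
            };
            // Relevance is weighted by (1 - diversity), redundancy by diversity.
            let score = (1.0 - diversity) * doc_sim[i] - diversity * redundancy;
            if best.map_or(true, |(_, s)| score > s) {
                best = Some((i, score));
            }
        }
        match best {
            Some((i, _)) => selected.push(i),
            None => break,
        }
    }
    selected
}
```

With a diversity of 0.0 the loop reduces to ranking by document similarity alone. As diversity grows, a candidate that nearly duplicates an already selected keyword is penalized more heavily, so a less relevant but more distinct candidate can be chosen instead.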

§max_sum_candidates: Option<usize>

Optional number of candidate sets used for the MaxSum ranker. Higher values are more likely to identify a global optimum for the ranker criterion, but are more likely to include sets that are less relevant to the input document. Larger values also have a higher computational and memory cost (scales as N²).
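The effect of the candidate count can be sketched with a brute-force version of a max-sum style ranker: keep the candidates most similar to the document, then pick the subset of `num_keywords` whose members are least similar to each other. This follows one common formulation of the criterion and is an assumption about the general approach, not a description of the library's internals; `max_sum_select` and `search` are hypothetical helpers, and the exhaustive search is only practical for small inputs.

```rust
/// Hypothetical helper: restricts the pool to the `max_sum_candidates`
/// candidates most similar to the document, then returns the subset of
/// size `num_keywords` with the smallest summed pairwise similarity.
fn max_sum_select(
    doc_sim: &[f64],
    pair_sim: &[Vec<f64>],
    num_keywords: usize,
    max_sum_candidates: usize,
) -> Vec<usize> {
    let mut pool: Vec<usize> = (0..doc_sim.len()).collect();
    // Sort by descending similarity to the document.
    pool.sort_by(|&a, &b| doc_sim[b].total_cmp(&doc_sim[a]));
    pool.truncate(max_sum_candidates);

    let mut best: Option<(f64, Vec<usize>)> = None;
    let mut current = Vec::new();
    search(&pool, pair_sim, num_keywords, 0, &mut current, &mut best);
    best.map(|(_, set)| set).unwrap_or_default()
}

/// Enumerates every subset of size `k` from `pool` and records the one
/// with the lowest total pairwise similarity.
fn search(
    pool: &[usize],
    pair_sim: &[Vec<f64>],
    k: usize,
    start: usize,
    current: &mut Vec<usize>,
    best: &mut Option<(f64, Vec<usize>)>,
) {
    if current.len() == k {
        let mut total = 0.0;
        for (pos, &i) in current.iter().enumerate() {
            for &j in &current[pos + 1..] {
                total += pair_sim[i][j];
            }
        }
        if best.as_ref().map_or(true, |(t, _)| total < *t) {
            *best = Some((total, current.clone()));
        }
        return;
    }
    for idx in start..pool.len() {
        current.push(pool[idx]);
        search(pool, pair_sim, k, idx + 1, current, best);
        current.pop();
    }
}
```

In this sketch a larger pool lets weakly relevant candidates into the search, and because the criterion only rewards mutual dissimilarity, such a candidate can end up in the selected set. The pairwise similarity matrix also grows quadratically with the pool size.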

Trait Implementations§

Default::default(): returns the “default value” for a type.
