pub struct SearchEngineBuilder<K, D = u32, T = DefaultTokenizer> { /* private fields */ }
A consuming builder for SearchEngine. K is the type of the document id, D is the type of the token embedder and T is the type of the tokenizer.
Implementations
impl<K, D, T> SearchEngineBuilder<K, D, T>
where
    K: Hash + Eq + Clone,
    D: TokenEmbedder,
    D::EmbeddingSpace: Eq + Hash + Clone,
    T: Tokenizer + Sync,
pub fn with_avgdl(avgdl: f32) -> SearchEngineBuilder<K, D, T>
where
    T: Default,
Constructs a new SearchEngineBuilder with the given average document length. Use this if you
know the average document length in advance. If you don’t, but you have your full corpus
ahead of time, use with_documents or with_corpus instead.
If you have neither the full corpus nor a sample of it, you can configure the embedder to
disregard document length by setting b to 0.0. In this case, it doesn’t matter what
value you pass to with_avgdl.
The average document length is the average number of tokens in a document from your corpus;
if you need access to this value, you can construct an Embedder and call avgdl on it.
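To make the roles of avgdl and b concrete, here is a minimal sketch in plain Rust rather than the crate's own code. It assumes whitespace tokenization and the textbook BM25 length-normalization factor 1 - b + b * (dl / avgdl). The function names average_document_length and length_normalization are hypothetical and are not part of this crate's API.

```rust
/// Average number of tokens per document.
/// Assumption: whitespace splitting stands in for a real tokenizer.
fn average_document_length(corpus: &[&str]) -> f32 {
    if corpus.is_empty() {
        return 0.0;
    }
    let total_tokens: usize = corpus
        .iter()
        .map(|doc| doc.split_whitespace().count())
        .sum();
    total_tokens as f32 / corpus.len() as f32
}

/// Textbook BM25 length-normalization factor: 1 - b + b * (dl / avgdl).
/// Assumption: avgdl is positive; the crate's internals may differ.
fn length_normalization(doc_len: f32, avgdl: f32, b: f32) -> f32 {
    1.0 - b + b * (doc_len / avgdl)
}
```

With b set to 0.0 the factor is 1.0 for any positive avgdl, which is why the value passed to with_avgdl stops mattering in that configuration.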
pub fn with_tokenizer_and_documents(
    tokenizer: T,
    documents: impl IntoIterator<Item = impl Into<Document<K>>>,
) -> SearchEngineBuilder<K, D, T>
Constructs a new SearchEngineBuilder with the given documents. The search engine will fit
to the given documents, using the given tokenizer. When you call build, the builder
will pre-populate the search engine with the given documents, and pass on the tokenizer.
pub fn with_tokenizer_and_corpus(
    tokenizer: T,
    corpus: impl IntoIterator<Item = impl Into<String>>,
) -> SearchEngineBuilder<u32, D, T>
Constructs a new SearchEngineBuilder with the given corpus. The search engine will fit
to the given corpus, using the given tokenizer. When you call build, the builder
will pre-populate the search engine with the given corpus, and pass on the tokenizer.
This function will automatically generate u32 ids for each entry in your corpus.
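As a rough illustration of automatic id generation, the sketch below pairs each corpus entry with a u32 id. It assumes ids are assigned sequentially from 0 in iteration order, which the text above does not state. The Doc struct and assign_ids function are hypothetical stand-ins, not the crate's Document type or its implementation.

```rust
/// Hypothetical stand-in for a document with an auto-generated id.
#[derive(Debug, Clone, PartialEq)]
struct Doc {
    id: u32,
    contents: String,
}

/// Pairs each corpus entry with a u32 id.
/// Assumption: ids count up from 0 in iteration order.
fn assign_ids<I, S>(corpus: I) -> Vec<Doc>
where
    I: IntoIterator<Item = S>,
    S: Into<String>,
{
    corpus
        .into_iter()
        .enumerate()
        .map(|(index, contents)| Doc {
            id: index as u32,
            contents: contents.into(),
        })
        .collect()
}
```

If you need stable ids that you control, the documents-based constructors let you supply your own K instead.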
pub fn build(self) -> SearchEngine<K, D, T>
Consumes the builder and returns the configured SearchEngine.
impl<K, D> SearchEngineBuilder<K, D, DefaultTokenizer>
pub fn with_documents(
    language_mode: impl Into<LanguageMode>,
    documents: impl IntoIterator<Item = impl Into<Document<K>>>,
) -> Self
Constructs a new SearchEngineBuilder with the given documents. The search engine will fit
to the given documents, using the default tokenizer configured with the given language mode.
When you call build, the builder will pre-populate the search engine with the given
documents, and pass on the tokenizer.
pub fn with_corpus(
    language_mode: impl Into<LanguageMode>,
    corpus: impl IntoIterator<Item = impl Into<String>>,
) -> SearchEngineBuilder<u32, D, DefaultTokenizer>
Constructs a new SearchEngineBuilder with the given corpus. The search engine will fit
to the given corpus, using the default tokenizer configured with the given language mode.
When you call build, the builder will pre-populate the search engine with the given
corpus and pass on the tokenizer. This function will automatically generate u32 ids for
each entry in your corpus.
pub fn language_mode(self, language_mode: impl Into<LanguageMode>) -> Self
Sets the tokenizer to the default tokenizer with the given language mode.
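The consuming-builder shape used by the methods above (setters that take self and return Self, finished by build(self)) can be sketched with a toy type. Everything in this block (ToyLanguage, ToyEngine, ToyEngineBuilder) is hypothetical and only mirrors the method signatures. It does not reproduce the crate's tokenizer or fitting logic.

```rust
/// Hypothetical language setting, standing in for a language mode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyLanguage {
    English,
    German,
}

/// Hypothetical finished product of the builder.
#[derive(Debug, PartialEq)]
struct ToyEngine {
    language: ToyLanguage,
    documents: Vec<String>,
}

/// Hypothetical consuming builder: every method takes ownership of self.
struct ToyEngineBuilder {
    language: ToyLanguage,
    documents: Vec<String>,
}

impl ToyEngineBuilder {
    /// Starts a builder from a language and a corpus.
    fn with_corpus(language: ToyLanguage, corpus: Vec<String>) -> Self {
        ToyEngineBuilder {
            language,
            documents: corpus,
        }
    }

    /// Overrides the language; consumes and returns the builder so calls chain.
    fn language_mode(mut self, language: ToyLanguage) -> Self {
        self.language = language;
        self
    }

    /// Consumes the builder and moves its state into the engine.
    fn build(self) -> ToyEngine {
        ToyEngine {
            language: self.language,
            documents: self.documents,
        }
    }
}
```

Because each method takes self by value, the builder cannot be reused after build is called, and the documents move into the engine without being cloned.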
Auto Trait Implementations
impl<K, D, T> Freeze for SearchEngineBuilder<K, D, T>
where
    T: Freeze,
impl<K, D, T> RefUnwindSafe for SearchEngineBuilder<K, D, T>
impl<K, D, T> Send for SearchEngineBuilder<K, D, T>
impl<K, D, T> Sync for SearchEngineBuilder<K, D, T>
impl<K, D, T> Unpin for SearchEngineBuilder<K, D, T>
impl<K, D, T> UnwindSafe for SearchEngineBuilder<K, D, T>
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true.
Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self> otherwise.