The index (a lookup table of words) lives here. The traits Provider and OccurenceProvider allow multiple types of indices to be defined.

The only implementation (for now) is Simple. For each word, it stores a list of all the documents in the input data (e.g. web pages) that contain it. To find occurrences, it then fetches those documents again and searches within them.
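
The two-step lookup described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `SketchIndex`, `DocId`, `digest`, `documents_with`, and `occurrences_in` are hypothetical names, and the matching is case-sensitive and substring-based for brevity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for the crate's document id type.
type DocId = u32;

/// Illustrative index: maps each word to the set of documents containing it.
#[derive(Default)]
struct SketchIndex {
    words: BTreeMap<String, BTreeSet<DocId>>,
}

impl SketchIndex {
    /// Record every alphanumeric word of `text` as present in `doc`.
    fn digest(&mut self, doc: DocId, text: &str) {
        for word in text.split(|c: char| !c.is_alphanumeric()) {
            if word.is_empty() {
                continue;
            }
            self.words.entry(word.to_owned()).or_default().insert(doc);
        }
    }

    /// Step 1: list the documents that contain `word`, in ascending id order.
    fn documents_with(&self, word: &str) -> Vec<DocId> {
        self.words
            .get(word)
            .map(|docs| docs.iter().copied().collect())
            .unwrap_or_default()
    }
}

/// Step 2: given the re-fetched text of a candidate document, return the
/// byte offsets at which `word` starts. This matches substrings, so a real
/// implementation would also check word boundaries.
fn occurrences_in(text: &str, word: &str) -> Vec<usize> {
    text.match_indices(word).map(|(offset, _)| offset).collect()
}
```

The index itself stays small because it only records which documents contain a word; the exact positions are recomputed from the document text when a query needs them.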


The DocumentMap makes it fast to get a document's ID from its name and vice versa.
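
One common way to get fast lookups in both directions is to keep two hash maps in sync. The sketch below assumes that approach; `SketchDocumentMap`, `SketchId`, `reserve_id`, `get_id`, and `get_name` are hypothetical names chosen for illustration, and the actual DocumentMap may be structured differently.

```rust
use std::collections::HashMap;

/// Hypothetical id newtype; the crate's own id type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct SketchId(u64);

/// Two maps kept in sync, so each lookup direction is a single hash lookup.
#[derive(Default)]
struct SketchDocumentMap {
    name_to_id: HashMap<String, SketchId>,
    id_to_name: HashMap<SketchId, String>,
    next: u64,
}

impl SketchDocumentMap {
    /// Return the existing id for `name`, or allocate the next free one.
    fn reserve_id(&mut self, name: &str) -> SketchId {
        if let Some(id) = self.name_to_id.get(name).copied() {
            return id;
        }
        let id = SketchId(self.next);
        self.next += 1;
        self.name_to_id.insert(name.to_owned(), id);
        self.id_to_name.insert(id, name.to_owned());
        id
    }

    /// Look up the id registered for `name`, if any.
    fn get_id(&self, name: &str) -> Option<SketchId> {
        self.name_to_id.get(name).copied()
    }

    /// Look up the name registered for `id`, if any.
    fn get_name(&self, id: SketchId) -> Option<&str> {
        self.id_to_name.get(&id).map(String::as_str)
    }
}
```

Storing the name twice costs some memory, but it avoids a linear scan whenever an id has to be turned back into a document name.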

Structs

Wrapper for representing T as only containing alphanumeric characters.

If an Occurence is part of an AND, these can be associated with it to tell where the other parts of the AND chain are.

Map of documents and their ids, for quickly getting a name from an id and vice versa.

Id of a document.

Index which keeps track of all occurrences of all words.

The docs this word exists in. Each doc has an associated LosslessDocOccurrences which keeps track of all the occurrences in that document.

The occurrences of a word in this document.

Get occurrences of a word (or similar words) from this Lossless index.

A list of missing occurrences collected when searching for occurrences using SimpleOccurences.

An occurrence of crate::Query.

Needed to index a custom struct in maps. The key type has to be the same for lookups and storage, so this acts as both the borrowed and the owned form.

Eq isn’t implemented, as you’d probably want to check which document it belongs to as well.

Traits

Allows inserting words and removing occurrences from documents.

Functions

Returns the next valid UTF-8 character.