Module inverted_index

Inverted Index for Lexical Search (Task 4)

This module implements an inverted index for BM25-based lexical search.

§Structure

Term → PostingList
┌────────────────────────────────────────────────────────────────┐
│  "hello" → [(doc_1, tf=2, positions=[0,5]), (doc_3, tf=1, ...)]│
│  "world" → [(doc_1, tf=1, positions=[1]), (doc_2, tf=3, ...)]  │
│  ...                                                           │
└────────────────────────────────────────────────────────────────┘
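
The layout in the diagram above can be sketched in Rust roughly as follows. This is a minimal illustration, not the module's actual definitions: the concrete widths of `DocId`, `Position`, and `TermFreq`, the field names on `Posting` and `PostingList`, and the helper `index_document` are all assumptions made for the example.

```rust
use std::collections::HashMap;

// Assumed alias widths; the real module may use different integer types.
type DocId = u32;
type Position = u32;
type TermFreq = u32;

/// One document's entry for a term (field names are hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Posting {
    doc_id: DocId,
    tf: TermFreq,
    positions: Vec<Position>,
}

/// All postings recorded for a single term (hypothetical layout).
#[derive(Debug, Default)]
struct PostingList {
    postings: Vec<Posting>,
}

/// Hypothetical helper: adds one tokenized document to a term -> posting list map.
fn index_document(index: &mut HashMap<String, PostingList>, doc_id: DocId, tokens: &[&str]) {
    // Group token positions by term; tf is the number of positions collected.
    let mut by_term: HashMap<&str, Vec<Position>> = HashMap::new();
    for (pos, tok) in tokens.iter().enumerate() {
        by_term.entry(*tok).or_default().push(pos as Position);
    }
    for (term, positions) in by_term {
        index
            .entry(term.to_string())
            .or_default()
            .postings
            .push(Posting {
                doc_id,
                tf: positions.len() as TermFreq,
                positions,
            });
    }
}
```

Indexing a document with ID 1 whose tokens are `hello world a b c hello` would yield the `"hello"` entry from the diagram: `tf = 2` with positions `[0, 5]`.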

§Query Execution

  1. Tokenize query
  2. Look up posting lists for each query term
  3. Score documents using BM25
  4. Return top-K results
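
The four steps above can be sketched as a single function. This is a simplified stand-in and not the `InvertedIndex` API: `TinyIndex`, its fields, and `search` are hypothetical names, the tokenizer is a plain lowercase whitespace split, and the BM25 parameters `k1 = 1.2` and `b = 0.75` are common defaults assumed here rather than values taken from this module.

```rust
use std::collections::HashMap;

/// Hypothetical minimal index: term -> [(doc_id, term frequency)], plus document lengths.
struct TinyIndex {
    postings: HashMap<String, Vec<(u32, u32)>>,
    doc_lens: HashMap<u32, u32>,
}

impl TinyIndex {
    /// Returns up to `k` (doc_id, score) pairs, highest BM25 score first.
    fn search(&self, query: &str, k: usize) -> Vec<(u32, f64)> {
        const K1: f64 = 1.2; // assumed default
        const B: f64 = 0.75; // assumed default

        let n = self.doc_lens.len() as f64;
        if n == 0.0 {
            return Vec::new();
        }
        let avgdl = self.doc_lens.values().map(|&l| l as f64).sum::<f64>() / n;
        if avgdl == 0.0 {
            return Vec::new();
        }

        let mut scores: HashMap<u32, f64> = HashMap::new();
        // 1. Tokenize the query (naive whitespace tokenizer).
        for term in query.split_whitespace().map(|t| t.to_lowercase()) {
            // 2. Look up the posting list for the term; skip unknown terms.
            if let Some(list) = self.postings.get(&term) {
                let df = list.len() as f64;
                // IDF variant with +1 inside the log so it stays non-negative.
                let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
                // 3. Accumulate each matching document's BM25 contribution.
                for &(doc, tf) in list {
                    let tf = tf as f64;
                    let dl = self.doc_lens.get(&doc).copied().unwrap_or(0) as f64;
                    let norm = tf * (K1 + 1.0) / (tf + K1 * (1.0 - B + B * dl / avgdl));
                    *scores.entry(doc).or_insert(0.0) += idf * norm;
                }
            }
        }

        // 4. Sort by descending score (ties broken by doc ID) and keep the top k.
        let mut ranked: Vec<(u32, f64)> = scores.into_iter().collect();
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        ranked.truncate(k);
        ranked
    }
}
```

Sorting every scored document is the simplest way to take the top K; for large result sets a bounded min-heap of size K would avoid the full sort.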

Structs§

DocumentInfo
Information about an indexed document
InvertedIndex
Inverted index for lexical search
InvertedIndexBuilder
Builder for batch index construction
Posting
A posting for a single document
PostingList
A posting list for a term

Type Aliases§

DocId
Document ID type
Position
Term position in document
TermFreq
Term frequency