Module query_parser

shodh_memory::memory

Linguistic Query Parser

Based on:

Lioma & Ounis (2006): “Content Load of Part of Speech Blocks”
Bendersky & Croft (2008): “Discovering Key Concepts in Verbose Queries”
Porter (1980): Stemming algorithm for term normalization

Extracts focal entities (nouns), discriminative modifiers (adjectives), and relational context (verbs) from natural language queries.
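As a rough illustration of that split, the sketch below tags words by lexicon lookup and sorts them into the three roles. It is not this module's implementation: `Pos`, `Roles`, `split_roles`, and `demo_lexicon` are hypothetical names, and the tiny hand-written lexicon stands in for real POS tagging.

```rust
use std::collections::HashMap;

/// Coarse part-of-speech classes for this sketch (hypothetical, not the crate's `PosTag`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Pos {
    Noun,
    Adjective,
    Verb,
    Other,
}

/// Words grouped by the role the module description assigns to each class.
#[derive(Debug, Default, PartialEq)]
struct Roles {
    focal_entities: Vec<String>, // nouns
    modifiers: Vec<String>,      // adjectives
    relations: Vec<String>,      // verbs
}

/// Tag each word by lexicon lookup; words missing from the lexicon are ignored.
fn split_roles(query: &str, lexicon: &HashMap<&str, Pos>) -> Roles {
    let mut roles = Roles::default();
    for raw in query.split_whitespace() {
        // Normalize: strip surrounding punctuation and lowercase.
        let word = raw
            .trim_matches(|c: char| !c.is_alphanumeric())
            .to_lowercase();
        if word.is_empty() {
            continue;
        }
        match lexicon.get(word.as_str()).copied().unwrap_or(Pos::Other) {
            Pos::Noun => roles.focal_entities.push(word),
            Pos::Adjective => roles.modifiers.push(word),
            Pos::Verb => roles.relations.push(word),
            Pos::Other => {}
        }
    }
    roles
}

/// A toy lexicon, assumed for the example only.
fn demo_lexicon() -> HashMap<&'static str, Pos> {
    HashMap::from([
        ("robot", Pos::Noun),
        ("obstacle", Pos::Noun),
        ("red", Pos::Adjective),
        ("detected", Pos::Verb),
    ])
}

fn main() {
    let roles = split_roles("Which red robot detected the obstacle?", &demo_lexicon());
    println!("{:?}", roles);
}
```

A lookup table cannot disambiguate words that change class with context, which is presumably why the module lists context-aware POS disambiguation as a separate feature.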

§Polished Features (v2)

Porter2 stemming for term normalization
Compound noun detection (bigrams/trigrams)
Context-aware POS disambiguation
Negation scope tracking
IDF-inspired term rarity weighting
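To make the last item concrete, here is a minimal sketch of an IDF-style rarity weight. The smoothed formula `ln((N + 1) / (df + 1)) + 1` is an assumption chosen for illustration; the module's actual weighting is not documented here, and `document_frequencies` and `rarity_weight` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

/// Count, for each term, how many documents contain it at least once.
fn document_frequencies(docs: &[&str]) -> HashMap<String, usize> {
    let mut df: HashMap<String, usize> = HashMap::new();
    for doc in docs {
        // Deduplicate within a document so repeated words count once.
        let unique: HashSet<String> = doc
            .split_whitespace()
            .map(|w| w.to_lowercase())
            .collect();
        for term in unique {
            *df.entry(term).or_insert(0) += 1;
        }
    }
    df
}

/// Smoothed IDF (assumed formula): ln((N + 1) / (df + 1)) + 1.
/// Terms found in fewer documents receive a larger weight.
fn rarity_weight(term: &str, df: &HashMap<String, usize>, total_docs: usize) -> f64 {
    let seen = df.get(&term.to_lowercase()).copied().unwrap_or(0);
    ((total_docs as f64 + 1.0) / (seen as f64 + 1.0)).ln() + 1.0
}

fn main() {
    let docs = ["the cat sat", "the dog ran", "the cat ran"];
    let df = document_frequencies(&docs);
    for term in ["the", "cat", "dog"] {
        println!("{term}: {:.3}", rarity_weight(term, &df, docs.len()));
    }
}
```

With this smoothing, a term that appears in every document gets a weight of exactly 1.0, and an unseen term gets the largest weight rather than causing a division by zero.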

§Shallow Parsing / Chunking (v3)

Sentence-level chunking for co-occurrence detection
POS-based entity extraction (all nouns, verbs, and adjectives, not just the top N)
Designed for both query analysis AND memory storage
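The idea of sentence-level co-occurrence can be sketched as follows. This is a simplified stand-in, not the behavior of `extract_chunks`: it splits sentences on terminal punctuation and pairs every word, with no POS filtering, and the names `split_sentences` and `cooccurring_pairs` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Split text into sentences on '.', '!' and '?'.
/// This naive rule will mis-split abbreviations such as "e.g.".
fn split_sentences(text: &str) -> Vec<String> {
    text.split(|c: char| matches!(c, '.' | '!' | '?'))
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect()
}

/// Collect unordered pairs of distinct words that share a sentence.
fn cooccurring_pairs(text: &str) -> BTreeSet<(String, String)> {
    let mut pairs = BTreeSet::new();
    for sentence in split_sentences(text) {
        // A BTreeSet deduplicates and sorts, so each pair is emitted once
        // in (smaller, larger) order.
        let words: BTreeSet<String> = sentence
            .split_whitespace()
            .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
            .filter(|w| !w.is_empty())
            .collect();
        let words: Vec<String> = words.into_iter().collect();
        for i in 0..words.len() {
            for j in (i + 1)..words.len() {
                pairs.insert((words[i].clone(), words[j].clone()));
            }
        }
    }
    pairs
}

fn main() {
    for (a, b) in cooccurring_pairs("Alice met Bob. Carol slept.") {
        println!("{a} <-> {b}");
    }
}
```

Words from different sentences never pair up here, which is the property that makes sentence boundaries useful for co-occurrence detection.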

§Temporal Extraction (v4)

Extract dates from natural language text (“May 7, 2023”, “yesterday”, “last week”)
Detect temporal queries (“when did”, “what date”, “how long ago”)
Based on the TEMPR approach (from the Hindsight paper, which reports 89.6% on LoCoMo)
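A minimal sketch of the two ideas above, using only the example phrases quoted in the list. It is illustrative rather than the module's logic: `looks_temporal` and `relative_day_offset` are hypothetical helpers, the cue list is deliberately tiny, and treating "last week" as exactly seven days back is a simplifying assumption.

```rust
/// Cue phrases taken from the examples quoted in the list above.
const TEMPORAL_CUES: [&str; 3] = ["when did", "what date", "how long ago"];

/// Return true if the query contains one of the cue phrases (case-insensitive).
fn looks_temporal(query: &str) -> bool {
    let q = query.to_lowercase();
    TEMPORAL_CUES.iter().any(|cue| q.contains(*cue))
}

/// Map a relative expression to a day offset from "today".
/// Assumption: "last week" is approximated as seven days ago.
/// Absolute dates such as "May 7, 2023" are not handled and yield `None`.
fn relative_day_offset(text: &str) -> Option<i64> {
    let t = text.to_lowercase();
    if t.contains("yesterday") {
        Some(-1)
    } else if t.contains("last week") {
        Some(-7)
    } else {
        None
    }
}

fn main() {
    println!("{}", looks_temporal("When did Alice move to Berlin?"));
    println!("{:?}", relative_day_offset("We spoke last week"));
}
```

Substring matching keeps the sketch short but is easy to fool; a fuller version would parse absolute dates and resolve relative ones against a reference timestamp.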

Structs§

AttributeQuery: Extracted attribute query components
ChunkExtraction: Result of chunking a document
FocalEntity: Focal entity extracted from query (noun)
Modifier: Discriminative modifier (adjective/qualifier)
QueryAnalysis: Complete linguistic analysis of a query
Relation: Relational context (verb)
SentenceChunk: A sentence chunk containing tagged words
TaggedWord: A word with its POS annotation
TemporalExtraction: Result of temporal extraction from text
TemporalRef: A temporal reference extracted from text

Enums§

PosTag: Part of speech tag
QueryIntent: Query intent type for retrieval strategy selection (SHO-D6)
QueryType: Type of query for routing to appropriate retrieval strategy
TemporalIntent: Query temporal intent
TemporalRefType: Type of temporal reference

Functions§

analyze_query: Parse query using linguistic analysis with Porter2 stemming
asks_for_temporal_answer: Check if a query is asking FOR a temporal answer (when did X happen?)
classify_query: Classify a query to determine retrieval strategy
detect_attribute_query: Detect and extract attribute query components
detect_temporal_intent: Detect temporal intent in a query
extract_chunks: Extract chunks from text using shallow parsing
extract_temporal_refs: Extract temporal references from text
requires_temporal_filtering: Check if a query requires temporal filtering for accurate retrieval