Module extraction


Hybrid entity and URL extraction pipeline: a regex pre-filter combined with candle BERT NER, with graceful degradation.

Runs named-entity recognition and regex heuristics to extract structured entities and hyperlinks from raw memory bodies before embedding.
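The hybrid approach described above can be pictured with a small sketch. It assumes that "graceful degradation" means the cheap pre-filter always runs and model output is merged in only when the model is available; this page does not confirm that. Every name below (`SketchEntity`, `SketchExtractor`, `CapitalisedWords`, `UnavailableModel`, `extract_hybrid`) is hypothetical and is not this module's API. A capitalised-word heuristic stands in for the regex pre-filter so the example needs only the standard library.

```rust
/// Hypothetical entity record; the real `ExtractedEntity` fields are not shown on this page.
#[derive(Debug, Clone, PartialEq)]
struct SketchEntity {
    text: String,
    entity_type: String,
}

/// Hypothetical stand-in for the module's `Extractor` trait.
trait SketchExtractor {
    fn extract(&self, body: &str) -> Result<Vec<SketchEntity>, String>;
}

/// Cheap heuristic standing in for the regex pre-filter:
/// capitalised words longer than one byte become candidate entities.
struct CapitalisedWords;

impl SketchExtractor for CapitalisedWords {
    fn extract(&self, body: &str) -> Result<Vec<SketchEntity>, String> {
        Ok(body
            .split(|c: char| !c.is_alphanumeric())
            .filter(|w| w.len() > 1 && w.chars().next().map_or(false, |c| c.is_uppercase()))
            .map(|w| SketchEntity {
                text: w.to_string(),
                entity_type: "candidate".to_string(),
            })
            .collect())
    }
}

/// Simulates an NER model whose weights could not be loaded.
struct UnavailableModel;

impl SketchExtractor for UnavailableModel {
    fn extract(&self, _body: &str) -> Result<Vec<SketchEntity>, String> {
        Err("model weights not available".to_string())
    }
}

/// Always run the pre-filter; merge model entities only if the model succeeds.
fn extract_hybrid(
    prefilter: &dyn SketchExtractor,
    model: &dyn SketchExtractor,
    body: &str,
) -> Vec<SketchEntity> {
    let mut out = prefilter.extract(body).unwrap_or_default();
    if let Ok(found) = model.extract(body) {
        for entity in found {
            if !out.contains(&entity) {
                out.push(entity);
            }
        }
    }
    // If the model failed, `out` still holds the pre-filter results.
    out
}
```

Calling `extract_hybrid(&CapitalisedWords, &UnavailableModel, body)` returns the heuristic results alone, which is the degraded path this sketch is meant to illustrate.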

Structs§

ExtractedEntity
ExtractedUrl
URL with source offset extracted from the memory body.
ExtractionResult
RegexExtractor

Traits§

Extractor

Functions§

extract_graph_auto
extract_urls
Extracts URLs from a memory body, deduplicated by text. URLs are stored in the memory_urls table, separately from graph entities. Since v1.0.24, this logic is split out of the URL block that previously polluted apply_regex_prefilter with entity_type='concept' entries.
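As an illustration of "deduplicated by text" together with a source offset, here is a minimal standard-library sketch. It is not the crate's `extract_urls`: the type `SketchUrl`, the function `sketch_extract_urls`, the scheme list, and the trailing-punctuation trimming rule are all assumptions made for this example.

```rust
use std::collections::HashSet;

/// Hypothetical shape; the real `ExtractedUrl` fields are not listed on this page.
#[derive(Debug, Clone, PartialEq)]
struct SketchUrl {
    url: String,
    /// Byte offset of the first occurrence in the body.
    offset: usize,
}

/// Scans `body` for http/https URLs and keeps the first occurrence of each distinct text.
fn sketch_extract_urls(body: &str) -> Vec<SketchUrl> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut out = Vec::new();
    let mut start = 0;

    while start < body.len() {
        let rest = &body[start..];
        // Locate the nearest scheme prefix in the remaining text.
        let rel = match (rest.find("http://"), rest.find("https://")) {
            (Some(a), Some(b)) => a.min(b),
            (Some(a), None) | (None, Some(a)) => a,
            (None, None) => break,
        };
        let begin = start + rel;
        let tail = &body[begin..];
        // A URL runs until the next whitespace character (or end of input).
        let len = tail.find(char::is_whitespace).unwrap_or(tail.len());
        // Assumed rule: drop punctuation that usually belongs to the sentence.
        let url = tail[..len].trim_end_matches(|c: char| matches!(c, '.' | ',' | ';' | ')'));

        // `insert` returns false for a text already seen, so repeats are skipped.
        if seen.insert(url.to_string()) {
            out.push(SketchUrl {
                url: url.to_string(),
                offset: begin,
            });
        }
        start = begin + len;
    }
    out
}
```

Keeping the first offset for each distinct URL text is one reasonable reading of the description; a real implementation might normalise URLs before comparing them.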