Module gleaning_extractor

Gleaning-based entity extraction with TRUE LLM inference

This module implements iterative gleaning refinement using actual LLM calls, not pattern matching. Based on Microsoft GraphRAG and LightRAG research.
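The refinement loop described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual implementation: the `EntityModel` trait, the `glean` function, and the `ScriptedModel` test double are all hypothetical names introduced here, and the early-stop rule (stop when a round adds nothing new) is an assumption about how gleaning typically terminates.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for an LLM client; not part of this module's API.
trait EntityModel {
    /// Returns entity names found in `chunk`, given the entities already known.
    fn extract(&mut self, chunk: &str, known: &BTreeSet<String>) -> Vec<String>;
}

/// Runs up to `max_rounds` extraction passes over one chunk and stops early
/// when a pass contributes no new entities (assumed stopping rule).
/// Returns the accumulated entities and the number of rounds actually run.
fn glean<M: EntityModel>(
    model: &mut M,
    chunk: &str,
    max_rounds: usize,
) -> (BTreeSet<String>, usize) {
    let mut known = BTreeSet::new();
    let mut rounds_used = 0;
    for _ in 0..max_rounds {
        rounds_used += 1;
        // Each round is one model call that sees what was found so far.
        let found = model.extract(chunk, &known);
        let before = known.len();
        known.extend(found);
        if known.len() == before {
            break; // the round found nothing new, so further calls are skipped
        }
    }
    (known, rounds_used)
}

/// Test double that replays canned responses, one per round.
struct ScriptedModel {
    responses: Vec<Vec<String>>,
}

impl EntityModel for ScriptedModel {
    fn extract(&mut self, _chunk: &str, _known: &BTreeSet<String>) -> Vec<String> {
        if self.responses.is_empty() {
            Vec::new()
        } else {
            self.responses.remove(0)
        }
    }
}
```

Passing the already-known entities into each call is what distinguishes gleaning from simply repeating the same prompt: every later round asks only for what earlier rounds missed.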

Expected performance: 15-30 seconds per chunk per round. For a 1000-page book with 4 gleaning rounds, expect 2-4 hours of processing time.
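As a rough check on the estimate above, total time for sequential processing is chunks × rounds × seconds per call. The helper below is a hypothetical sketch, not part of the module. The chunk count of 120 used in the comments is an assumption inferred from the stated figures (it is the count that makes 15-30 seconds per call over 4 rounds come out to 2-4 hours), not a value documented by the module.

```rust
use std::time::Duration;

/// Estimated wall-clock time when every chunk is processed sequentially and
/// every round costs one LLM call. All parameters are illustrative.
fn estimate_processing_time(chunks: u32, rounds: u32, per_call: Duration) -> Duration {
    per_call * chunks * rounds
}

/// Converts a duration to fractional hours for reporting.
fn as_hours(d: Duration) -> f64 {
    d.as_secs_f64() / 3600.0
}

// With an assumed 120 chunks and 4 rounds:
//   15 s per call: 120 * 4 * 15 = 7_200 s  = 2 hours
//   30 s per call: 120 * 4 * 30 = 14_400 s = 4 hours
```

The estimate scales linearly with the number of rounds, so under the same assumptions halving the rounds from 4 to 2 would halve the expected time.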

Structs

ExtractionCompletionStatus
Status of entity extraction completion
GleaningConfig
Configuration for gleaning-based entity extraction
GleaningEntityExtractor
Entity extractor with iterative gleaning refinement using TRUE LLM calls
GleaningStatistics
Statistics for gleaning extraction process