Natural language processing utilities
§Advanced NLP Module
This module provides advanced natural language processing capabilities:
- Semantic chunking algorithms
- Custom NER training pipeline
- Syntax analysis
§Features
§Semantic Chunking
- Multiple chunking strategies (sentence, paragraph, topic, semantic, hybrid)
- Intelligent boundary detection
- Coherence scoring
- Configurable chunk sizes and overlap
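To make the "configurable chunk sizes and overlap" idea concrete, here is a minimal sketch of the sentence strategy only. It is not this module's implementation: `split_sentences`, `chunk_sentences`, and the `max_sentences` and `overlap` parameters are hypothetical names chosen for illustration, and the boundary rule (terminal punctuation followed by whitespace or end of input) is a deliberately simple assumption.

```rust
/// Hypothetical helper, not part of this module's API.
/// Splits on '.', '!' or '?' when followed by whitespace or end of input,
/// so a decimal such as "3.14" is not treated as a sentence boundary.
fn split_sentences(text: &str) -> Vec<&str> {
    let mut sentences = Vec::new();
    let mut start = 0;
    let mut chars = text.char_indices().peekable();
    while let Some((i, c)) = chars.next() {
        if matches!(c, '.' | '!' | '?') {
            let end = i + c.len_utf8();
            let at_boundary = match chars.peek() {
                None => true,
                Some(&(_, next)) => next.is_whitespace(),
            };
            if at_boundary {
                let sentence = text[start..end].trim();
                if !sentence.is_empty() {
                    sentences.push(sentence);
                }
                start = end;
            }
        }
    }
    // Keep trailing text that has no terminal punctuation.
    let tail = text[start..].trim();
    if !tail.is_empty() {
        sentences.push(tail);
    }
    sentences
}

/// Hypothetical helper: groups sentences into chunks of at most
/// `max_sentences`, repeating the last `overlap` sentences of each chunk
/// at the start of the next one.
fn chunk_sentences(text: &str, max_sentences: usize, overlap: usize) -> Vec<String> {
    assert!(max_sentences > 0, "max_sentences must be positive");
    assert!(overlap < max_sentences, "overlap must be smaller than max_sentences");
    let sentences = split_sentences(text);
    let step = max_sentences - overlap;
    let mut chunks = Vec::new();
    let mut i = 0;
    while i < sentences.len() {
        let end = (i + max_sentences).min(sentences.len());
        chunks.push(sentences[i..end].join(" "));
        if end == sentences.len() {
            break;
        }
        i += step;
    }
    chunks
}
```

Measuring chunk size in sentences rather than characters keeps the sketch short. A size limit in characters or tokens would follow the same sliding-window shape.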
§Custom NER
- Pattern-based entity extraction
- Dictionary/gazetteer matching
- Rule-based extraction with priorities
- Training dataset management
- Active learning support
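Dictionary matching combined with rule priorities can be sketched as follows. This is an illustration under stated assumptions, not the `custom_ner` API: `GazetteerEntry`, `Match`, `entry`, and `find_entities` are hypothetical names, matching is ASCII case-insensitive substring search with no word-boundary check, and overlapping candidates are resolved in favour of the higher priority, then the longer span.

```rust
/// Hypothetical dictionary entry: a phrase, the label it maps to,
/// and a priority used to resolve overlapping matches.
struct GazetteerEntry {
    phrase: String,
    label: String,
    priority: u32,
}

/// Hypothetical match result with byte offsets into the input text.
#[derive(Debug, Clone, PartialEq)]
struct Match {
    start: usize,
    end: usize,
    label: String,
    priority: u32,
}

/// Convenience constructor for the sketch.
fn entry(phrase: &str, label: &str, priority: u32) -> GazetteerEntry {
    GazetteerEntry {
        phrase: phrase.to_string(),
        label: label.to_string(),
        priority,
    }
}

/// Finds all dictionary phrases in `text` and keeps a non-overlapping
/// subset, preferring higher priority, then longer spans, then earlier
/// positions. The result is sorted by start offset.
fn find_entities(text: &str, entries: &[GazetteerEntry]) -> Vec<Match> {
    // ASCII lowercasing preserves byte lengths, so offsets found in the
    // lowercased copy are valid offsets into the original text.
    let haystack = text.to_ascii_lowercase();
    let mut candidates = Vec::new();
    for e in entries {
        let needle = e.phrase.to_ascii_lowercase();
        if needle.is_empty() {
            continue;
        }
        for (start, _) in haystack.match_indices(needle.as_str()) {
            candidates.push(Match {
                start,
                end: start + needle.len(),
                label: e.label.clone(),
                priority: e.priority,
            });
        }
    }
    candidates.sort_by(|a, b| {
        b.priority
            .cmp(&a.priority)
            .then((b.end - b.start).cmp(&(a.end - a.start)))
            .then(a.start.cmp(&b.start))
    });
    let mut accepted: Vec<Match> = Vec::new();
    for c in candidates {
        let overlaps = accepted.iter().any(|m| c.start < m.end && m.start < c.end);
        if !overlaps {
            accepted.push(c);
        }
    }
    accepted.sort_by_key(|m| m.start);
    accepted
}
```

With entries for "New York" at priority 1 and "New York City" at priority 2, the longer, higher-priority entry wins where the two overlap, while a standalone "New York" elsewhere in the text is still reported.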
§Syntax Analysis
- Part-of-speech tagging
- Dependency parsing
- Noun phrase extraction
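As an illustration of rule-based noun phrase extraction over tagged tokens, the sketch below collects spans matching an optional determiner, any number of adjectives, and one or more nouns. It assumes the tokens are already tagged. The `Tag` enum and `noun_phrases` function are hypothetical and far smaller than whatever `POSTag` and `SyntaxAnalyzer` actually provide.

```rust
/// Hypothetical, deliberately tiny tag set for the sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tag {
    Det,
    Adj,
    Noun,
    Verb,
    Other,
}

/// Extracts spans matching the pattern `Det? Adj* Noun+` from a tagged
/// token sequence and returns each span as a space-joined string.
fn noun_phrases(tokens: &[(&str, Tag)]) -> Vec<String> {
    let mut phrases = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let start = i;
        let mut j = i;
        // Optional determiner.
        if tokens[j].1 == Tag::Det {
            j += 1;
        }
        // Zero or more adjectives.
        while j < tokens.len() && tokens[j].1 == Tag::Adj {
            j += 1;
        }
        // One or more nouns are required for a match.
        let noun_start = j;
        while j < tokens.len() && tokens[j].1 == Tag::Noun {
            j += 1;
        }
        if j > noun_start {
            let words: Vec<&str> = tokens[start..j].iter().map(|(w, _)| *w).collect();
            phrases.push(words.join(" "));
            i = j;
        } else {
            // No noun head here; retry from the next token.
            i += 1;
        }
    }
    phrases
}
```

A fixed tag pattern like this is easy to reason about but misses constructions such as prepositional attachments, which is where dependency arcs become useful.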
Modules§
- custom_ner - Custom NER Training Pipeline
- semantic_chunking - Semantic Chunking
- syntax_analyzer - Rule-based Syntax Analysis
Structs§
- AnnotatedExample - Annotated text example
- ChunkingConfig - Chunking configuration
- ChunkingStats - Chunking statistics
- CustomNER - Custom NER model
- DatasetStatistics - Dataset statistics
- Dependency - A dependency arc between tokens
- EntityType - Entity type definition
- ExtractedEntity - Extracted entity
- ExtractionRule - Extraction rule
- NounPhrase - A noun phrase
- SemanticChunk - Text chunk with metadata
- SemanticChunker - Semantic chunker
- SyntaxAnalyzer - Rule-based syntax analyzer
- SyntaxAnalyzerConfig - Configuration for syntax analyzer
- Token - A token with POS tag
- TrainingDataset - Training dataset for custom NER
Enums§
- ChunkingStrategy - Chunking strategy
- DependencyRelation - Dependency relation type
- POSTag - Part-of-Speech tag
- RuleType - Rule types