Module nlp

Natural language processing utilities

§Advanced NLP Module

This module provides advanced natural language processing capabilities:

  • Semantic chunking algorithms
  • Custom named entity recognition (NER) training pipeline
  • Syntax analysis

§Features

§Semantic Chunking

  • Multiple chunking strategies (sentence, paragraph, topic, semantic, hybrid)
  • Intelligent boundary detection
  • Coherence scoring
  • Configurable chunk sizes and overlap
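
As a rough illustration of what "configurable chunk sizes and overlap" can mean for a sentence strategy, the sketch below groups sentences into sliding windows. It does not use this module's API: `SketchConfig`, `split_sentences`, and `chunk_sentences` are hypothetical names introduced only for this example, and the sentence splitter is deliberately naive (it breaks on `.`, `!`, and `?` only).

```rust
/// Hypothetical configuration for this sketch; NOT the crate's `ChunkingConfig`.
#[derive(Debug, Clone)]
pub struct SketchConfig {
    /// Maximum number of sentences per chunk.
    pub max_sentences: usize,
    /// Number of trailing sentences repeated at the start of the next chunk.
    pub overlap: usize,
}

/// Split text into sentences at '.', '!' or '?' (a naive boundary rule,
/// assumed here for brevity; abbreviations and decimals are not handled).
pub fn split_sentences(text: &str) -> Vec<String> {
    let mut sentences = Vec::new();
    let mut current = String::new();
    for ch in text.chars() {
        current.push(ch);
        if matches!(ch, '.' | '!' | '?') {
            let trimmed = current.trim();
            if !trimmed.is_empty() {
                sentences.push(trimmed.to_string());
            }
            current.clear();
        }
    }
    // Keep any trailing text that lacks terminal punctuation.
    let trimmed = current.trim();
    if !trimmed.is_empty() {
        sentences.push(trimmed.to_string());
    }
    sentences
}

/// Group sentences into chunks using a sliding window with overlap.
pub fn chunk_sentences(text: &str, cfg: &SketchConfig) -> Vec<String> {
    let sentences = split_sentences(text);
    if sentences.is_empty() || cfg.max_sentences == 0 {
        return Vec::new();
    }
    // The window must advance by at least one sentence so the loop terminates,
    // even if `overlap >= max_sentences`.
    let step = cfg.max_sentences.saturating_sub(cfg.overlap).max(1);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < sentences.len() {
        let end = (start + cfg.max_sentences).min(sentences.len());
        chunks.push(sentences[start..end].join(" "));
        if end == sentences.len() {
            break;
        }
        start += step;
    }
    chunks
}
```

With `max_sentences: 2` and `overlap: 1`, each chunk shares one sentence with its neighbor, which trades some duplication for context that survives chunk boundaries. How the module's own strategies pick boundaries or score coherence is not shown on this page.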

§Custom NER

  • Pattern-based entity extraction
  • Dictionary/gazetteer matching
  • Rule-based extraction with priorities
  • Training dataset management
  • Active learning support
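
To make dictionary/gazetteer matching with priorities concrete, here is a minimal standalone sketch. It is an assumption-laden illustration rather than this module's implementation: `GazetteerEntry`, `Match`, and `extract` are hypothetical names, matching is case-sensitive substring search on byte offsets, and no word-boundary checks are performed.

```rust
/// Hypothetical gazetteer entry for this sketch; NOT the crate's `ExtractionRule`.
pub struct GazetteerEntry {
    pub phrase: &'static str,
    pub label: &'static str,
    /// Higher values win when candidate spans overlap.
    pub priority: u32,
}

/// A matched span, expressed as byte offsets into the input text.
#[derive(Debug, PartialEq)]
pub struct Match {
    pub start: usize,
    pub end: usize,
    pub label: &'static str,
    pub priority: u32,
}

/// Find all gazetteer phrases in `text`, resolving overlaps by priority.
pub fn extract(text: &str, gazetteer: &[GazetteerEntry]) -> Vec<Match> {
    let mut candidates = Vec::new();
    for entry in gazetteer {
        // An empty pattern would match at every position, so skip it.
        if entry.phrase.is_empty() {
            continue;
        }
        for (start, _) in text.match_indices(entry.phrase) {
            candidates.push(Match {
                start,
                end: start + entry.phrase.len(),
                label: entry.label,
                priority: entry.priority,
            });
        }
    }
    // Highest priority first; ties go to the longer span, then the earlier one.
    candidates.sort_by(|a, b| {
        b.priority
            .cmp(&a.priority)
            .then((b.end - b.start).cmp(&(a.end - a.start)))
            .then(a.start.cmp(&b.start))
    });
    // Greedily accept candidates that do not overlap an already accepted span.
    let mut accepted: Vec<Match> = Vec::new();
    for c in candidates {
        let overlaps = accepted.iter().any(|m| c.start < m.end && m.start < c.end);
        if !overlaps {
            accepted.push(c);
        }
    }
    // Return matches in document order.
    accepted.sort_by_key(|m| m.start);
    accepted
}
```

In the text "Alice moved to New York City from York.", a higher-priority "New York City" entry suppresses the "York" match nested inside it, while the standalone "York" at the end is still reported. A real pipeline would likely add tokenization and case folding, which this sketch omits.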

§Syntax Analysis

  • Part-of-speech tagging
  • Dependency parsing
  • Noun phrase extraction
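
Rule-based noun phrase extraction is often built on top of POS tags. The sketch below shows one common pattern (an optional determiner, any number of adjectives, then one or more nouns) over pre-tagged tokens. It is not this module's API: the `Tag` enum and `noun_phrases` function are hypothetical stand-ins, and the tags are supplied by hand rather than produced by a tagger.

```rust
/// Hypothetical coarse tag set for this sketch; NOT the crate's `POSTag`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tag {
    Det,
    Adj,
    Noun,
    Verb,
    Other,
}

/// Extract noun phrases matching the pattern `Det? Adj* Noun+`
/// from a slice of (word, tag) pairs.
pub fn noun_phrases(tokens: &[(&str, Tag)]) -> Vec<String> {
    let mut phrases = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        let start = i;
        let mut j = i;
        // Optional determiner.
        if tokens[j].1 == Tag::Det {
            j += 1;
        }
        // Zero or more adjectives.
        while j < tokens.len() && tokens[j].1 == Tag::Adj {
            j += 1;
        }
        // One or more nouns are required for a match.
        let noun_start = j;
        while j < tokens.len() && tokens[j].1 == Tag::Noun {
            j += 1;
        }
        if j > noun_start {
            let words: Vec<&str> = tokens[start..j].iter().map(|(w, _)| *w).collect();
            phrases.push(words.join(" "));
            i = j; // Continue after the matched phrase.
        } else {
            i = start + 1; // No match here; try the next token.
        }
    }
    phrases
}
```

Given the tagged sentence "The quick brown fox jumps over the lazy dog", this pattern yields the phrases "The quick brown fox" and "the lazy dog". Dependency parsing would add head and relation information on top of such spans, which is outside the scope of this sketch.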

Modules§

custom_ner
Custom NER Training Pipeline
semantic_chunking
Semantic Chunking
syntax_analyzer
Rule-based Syntax Analysis

Structs§

AnnotatedExample
Annotated text example
ChunkingConfig
Chunking configuration
ChunkingStats
Chunking statistics
CustomNER
Custom NER model
DatasetStatistics
Dataset statistics
Dependency
A dependency arc between tokens
EntityType
Entity type definition
ExtractedEntity
Extracted entity
ExtractionRule
Extraction rule
NounPhrase
A noun phrase
SemanticChunk
Text chunk with metadata
SemanticChunker
Semantic chunker
SyntaxAnalyzer
Rule-based syntax analyzer
SyntaxAnalyzerConfig
Configuration for syntax analyzer
Token
A token with POS tag
TrainingDataset
Training dataset for custom NER

Enums§

ChunkingStrategy
Chunking strategy
DependencyRelation
Dependency relation type
POSTag
Part-of-Speech tag
RuleType
Rule types