Processing module for ReasonKit Core.
Provides document and text processing and transformation utilities for the RAG pipeline.
§Overview
This module handles:
- Text normalization and cleaning
- Token counting and estimation
- Text chunking strategies
- Processing pipeline orchestration
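As a rough illustration of how the stages listed above could fit together, the sketch below normalizes text and then packs words greedily into chunks under a token budget. It is not this module's implementation: `SketchOptions`, `normalize_sketch`, `chunk_sketch`, and `process_sketch` are hypothetical names, the option fields are assumptions (the real `NormalizationOptions` fields are not shown on this page), and the budget reuses the ~4 characters per token approximation documented for `estimate_tokens`.

```rust
/// Hypothetical options for this sketch only; not the crate's
/// `NormalizationOptions`.
#[derive(Debug, Clone, Copy)]
pub struct SketchOptions {
    pub collapse_whitespace: bool,
    pub lowercase: bool,
}

/// Normalize text according to the (assumed) options.
pub fn normalize_sketch(text: &str, opts: SketchOptions) -> String {
    let mut out = if opts.collapse_whitespace {
        // Collapse every run of whitespace into a single space and trim the ends.
        text.split_whitespace().collect::<Vec<_>>().join(" ")
    } else {
        text.to_string()
    };
    if opts.lowercase {
        out = out.to_lowercase();
    }
    out
}

/// Greedy word-based chunking under a token budget, assuming ~4 chars per token.
/// A single word longer than the budget becomes its own oversized chunk.
pub fn chunk_sketch(text: &str, max_tokens: usize) -> Vec<String> {
    let max_chars = max_tokens.max(1) * 4;
    let mut chunks = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let word_len = word.chars().count();
        // Length of the current chunk if this word were appended after a space.
        let needed = if current.is_empty() {
            word_len
        } else {
            current.chars().count() + 1 + word_len
        };
        if !current.is_empty() && needed > max_chars {
            // Budget exceeded: close the current chunk and start a new one.
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}

/// Orchestrate the two stages: normalize first, then chunk.
pub fn process_sketch(text: &str, opts: SketchOptions, max_tokens: usize) -> Vec<String> {
    chunk_sketch(&normalize_sketch(text, opts), max_tokens)
}
```

Normalizing before chunking keeps the budget arithmetic stable, since collapsed whitespace no longer counts against the character limit.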
Modules§
- chunking - Document chunking module
Structs§
- NormalizationOptions - Text normalization options
- ProcessingPipeline - Processing pipeline for documents
Functions§
- count_words - Count words in text
- estimate_tokens - Estimate token count for text (rough approximation: ~4 chars per token)
- extract_sentences - Extract sentences from text
- normalize_text - Normalize text according to options
- split_paragraphs - Split text into paragraphs
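The signatures of these functions are not shown on this page, so the following is only a minimal sketch of what the documented behavior could look like. The `_sketch` functions are hypothetical stand-ins, not the crate's code; in particular, rounding up in the token estimate, splitting words on whitespace, and splitting paragraphs on blank lines are assumptions.

```rust
/// Hypothetical stand-in for `estimate_tokens`: roughly 4 characters per token.
pub fn estimate_tokens_sketch(text: &str) -> usize {
    // Count Unicode scalar values rather than bytes so multi-byte
    // characters are not over-counted.
    let chars = text.chars().count();
    // Ceiling division (an assumption): any non-empty text yields at least one token.
    (chars + 3) / 4
}

/// Hypothetical stand-in for `count_words`: whitespace-delimited words.
pub fn count_words_sketch(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Hypothetical stand-in for `split_paragraphs`: split on blank lines,
/// trimming each paragraph and dropping empty ones.
pub fn split_paragraphs_sketch(text: &str) -> Vec<&str> {
    text.split("\n\n")
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .collect()
}
```

A character-based estimate like this is cheap and tokenizer-independent, but it can drift noticeably for code, non-English text, or text with many short words, so it is best suited to budgeting rather than exact limits.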