pub struct ExtractTextChunksPipeline { /* private fields */ }
The extract text chunks pipeline.
This pipeline handles the first two stages of cognify:
- Document classification (text/* only)
- Text chunking
Implementations§
impl ExtractTextChunksPipeline
pub fn new(storage: Arc<dyn StorageTrait>) -> Self
pub async fn extract_chunks(
    &self,
    data_items: Vec<Data>,
    max_chunk_size: usize,
) -> Result<Vec<DocumentChunk>, ChunkingError>
Extract text chunks from a set of Data items.
Implements:
- Document classification (text/* only)
- Text chunking
Returns the generated chunks.
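The two stages can be pictured with a minimal, synchronous sketch. It is not the crate's implementation: the `Data` and `DocumentChunk` structs below are hypothetical stand-ins with invented fields, the helper names are made up for illustration, and counting size in whitespace-separated words is an assumption about what `max_chunk_size` measures.

```rust
// Stand-in types for illustration only; the fields of the crate's real
// `Data` and `DocumentChunk` are not shown on this page.
#[derive(Debug, Clone)]
struct Data {
    id: u32,
    mime_type: String,
    text: String,
}

#[derive(Debug, Clone, PartialEq)]
struct DocumentChunk {
    document_id: u32,
    chunk_index: usize,
    text: String,
}

/// Stage 1 (classification): accept only items whose MIME type is `text/*`.
fn is_text(item: &Data) -> bool {
    item.mime_type.starts_with("text/")
}

/// Stage 2 (chunking): split the text into chunks of at most
/// `max_chunk_size` words. Using words as the size unit is an assumption.
fn chunk_text(item: &Data, max_chunk_size: usize) -> Vec<DocumentChunk> {
    let words: Vec<&str> = item.text.split_whitespace().collect();
    words
        .chunks(max_chunk_size.max(1)) // `chunks(0)` would panic, so clamp to 1
        .enumerate()
        .map(|(chunk_index, part)| DocumentChunk {
            document_id: item.id,
            chunk_index,
            text: part.join(" "),
        })
        .collect()
}

/// Runs both stages over a batch of items and returns all chunks in order.
fn extract_chunks_sketch(data_items: Vec<Data>, max_chunk_size: usize) -> Vec<DocumentChunk> {
    data_items
        .iter()
        .filter(|item| is_text(item))
        .flat_map(|item| chunk_text(item, max_chunk_size))
        .collect()
}
```

In this sketch non-text items are skipped silently; whether the real pipeline skips them or reports a `ChunkingError` is not stated here.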
pub async fn extract_chunks_with_counter<C: TokenCounter>(
    &self,
    data_items: Vec<Data>,
    max_chunk_size: usize,
    counter: &C,
) -> Result<Vec<DocumentChunk>, ChunkingError>
Extract text chunks with a custom token counter.
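To illustrate why a pluggable counter matters, here is a small sketch in which the same chunking loop yields different chunk boundaries depending on how tokens are counted. The single-method `TokenCounter` trait below is a hypothetical stand-in (the real trait's methods are not shown on this page), and `WordCounter`, `CharEstimateCounter`, and `chunk_with_counter` are invented names.

```rust
// Hypothetical stand-in: the real `TokenCounter` trait's methods are not
// shown on this page, so this single-method shape is an assumption.
trait TokenCounter {
    fn count(&self, text: &str) -> usize;
}

/// Counts whitespace-separated words as tokens.
struct WordCounter;

impl TokenCounter for WordCounter {
    fn count(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }
}

/// Rough heuristic: one token per four characters, rounded up.
struct CharEstimateCounter;

impl TokenCounter for CharEstimateCounter {
    fn count(&self, text: &str) -> usize {
        (text.chars().count() + 3) / 4
    }
}

/// Greedily packs words into chunks whose counted size stays within
/// `max_chunk_size`. A single word that exceeds the limit on its own
/// still becomes its own (oversized) chunk.
fn chunk_with_counter<C: TokenCounter>(
    text: &str,
    max_chunk_size: usize,
    counter: &C,
) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let candidate = if current.is_empty() {
            word.to_string()
        } else {
            format!("{current} {word}")
        };
        if !current.is_empty() && counter.count(&candidate) > max_chunk_size {
            // Adding this word would exceed the limit: close the current chunk.
            chunks.push(std::mem::take(&mut current));
            current = word.to_string();
        } else {
            current = candidate;
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Because the counter is passed as a generic `&C`, swapping in a tokenizer-backed implementation would change the boundaries without changing the loop.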
Auto Trait Implementations§
impl !RefUnwindSafe for ExtractTextChunksPipeline
impl !UnwindSafe for ExtractTextChunksPipeline
impl Freeze for ExtractTextChunksPipeline
impl Send for ExtractTextChunksPipeline
impl Sync for ExtractTextChunksPipeline
impl Unpin for ExtractTextChunksPipeline
impl UnsafeUnpin for ExtractTextChunksPipeline
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.