pub struct DocumentChunk {
    pub base: DataPoint,
    pub text: String,
    pub chunk_size: usize,
    pub chunk_index: usize,
    pub cut_type: String,
    pub document_id: Uuid,
    pub is_part_of: Option<Uuid>,
    pub contains: Vec<Value>,
}
A chunk of text extracted from a document during the cognify pipeline.
Extends DataPoint (via #[serde(flatten)]) following the same pattern
used by Entity, EntityType, and EdgeType.
Python equivalent: the DocumentChunk subclass of cognee.infrastructure.engine.models.DataPoint,
with metadata = {"index_fields": ["text"]}.
Fields
base: DataPoint
    Base data point fields (id, timestamps, metadata, type, etc.)
text: String
    The chunk text content.
chunk_size: usize
    Token count (word count by default).
chunk_index: usize
    Sequential index within the parent document, starting at 0.
cut_type: String
    How the chunk boundary was determined (e.g. "paragraph_end", "sentence_end").
document_id: Uuid
    ID of the parent document this chunk belongs to (convenience field).
is_part_of: Option<Uuid>
    Document ID for graph edge (mirrors Python's is_part_of relationship).
contains: Vec<Value>
    Entity refs populated during graph extraction (mirrors Python's contains list).
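As a rough illustration of the chunk_size and chunk_index fields, the sketch below counts whitespace-separated words and numbers chunks from 0. The word_count helper and the whitespace-splitting rule are assumptions made for this example; the page does not say how the pipeline actually counts words.

```rust
/// Hypothetical helper (not part of the documented API): counts
/// whitespace-separated words, one plausible reading of "word count".
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    // Two made-up chunk texts from the same parent document.
    let chunks = ["First paragraph of the document.", "Second one."];

    // enumerate() yields indices starting at 0, matching chunk_index.
    for (chunk_index, text) in chunks.iter().enumerate() {
        let chunk_size = word_count(text);
        println!("chunk_index={chunk_index} chunk_size={chunk_size}");
    }
}
```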
Implementations
impl DocumentChunk
pub fn new(
    id: Uuid,
    text: String,
    chunk_size: usize,
    chunk_index: usize,
    cut_type: String,
    document_id: Uuid,
) -> Self
Create a new DocumentChunk with a deterministic ID.
Sets:
- base.data_type = "DocumentChunk"
- base.metadata["index_fields"] = ["text"]
- base.id = the provided deterministic UUID
- is_part_of = Some(document_id)
- contains = empty
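The defaults listed above can be sketched with std-only stand-ins. This is a minimal approximation, not the crate's source: Uuid is replaced by u128, DataPoint is cut down to three fields, metadata is modelled as a HashMap, and contains holds String instead of Value. All of those are assumptions made so the example compiles without external crates.

```rust
use std::collections::HashMap;

// Stand-in for uuid::Uuid so the sketch builds with std only (assumption).
type Uuid = u128;

// Hypothetical, trimmed-down DataPoint; the real struct carries more fields.
#[derive(Debug, Clone)]
pub struct DataPoint {
    pub id: Uuid,
    pub data_type: String,
    pub metadata: HashMap<String, Vec<String>>,
}

#[derive(Debug, Clone)]
pub struct DocumentChunk {
    pub base: DataPoint,
    pub text: String,
    pub chunk_size: usize,
    pub chunk_index: usize,
    pub cut_type: String,
    pub document_id: Uuid,
    pub is_part_of: Option<Uuid>,
    // The real field is Vec<Value>; String is a placeholder here.
    pub contains: Vec<String>,
}

impl DocumentChunk {
    pub fn new(
        id: Uuid,
        text: String,
        chunk_size: usize,
        chunk_index: usize,
        cut_type: String,
        document_id: Uuid,
    ) -> Self {
        // Mark "text" as the only indexed field.
        let mut metadata = HashMap::new();
        metadata.insert("index_fields".to_string(), vec!["text".to_string()]);

        Self {
            base: DataPoint {
                id,
                data_type: "DocumentChunk".to_string(),
                metadata,
            },
            text,
            chunk_size,
            chunk_index,
            cut_type,
            document_id,
            // The graph edge points back at the parent document.
            is_part_of: Some(document_id),
            // Entity refs are filled in later, during graph extraction.
            contains: Vec::new(),
        }
    }
}

fn main() {
    let chunk = DocumentChunk::new(
        1,
        "Hello world".to_string(),
        2,
        0,
        "paragraph_end".to_string(),
        42,
    );
    assert_eq!(chunk.base.data_type, "DocumentChunk");
    assert_eq!(chunk.is_part_of, Some(42));
    assert!(chunk.contains.is_empty());
    println!("{chunk:?}");
}
```

Note that document_id and is_part_of carry the same value after construction; the first is a plain convenience field, the second is what the graph edge is built from.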
Trait Implementations
impl Clone for DocumentChunk
fn clone(&self) -> DocumentChunk
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for DocumentChunk
impl<'de> Deserialize<'de> for DocumentChunk
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
impl HasDataPoint for DocumentChunk
fn data_point(&self) -> &DataPoint
Returns a reference to the DataPoint of this container.
fn data_point_mut(&mut self) -> &mut DataPoint
Returns a mutable reference to the DataPoint of this container.
fn for_each_child_mut(&mut self, _visit: &mut dyn FnMut(&mut dyn HasDataPoint))
Visits each child of this container that is itself a HasDataPoint.
Default: no children. Override on container types whose fields
own (rather than reference by Uuid) another HasDataPoint.
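To make the default-versus-override distinction concrete, here is a small sketch of a trait with the same three method signatures. The trait body, the version field on DataPoint, the Leaf and Container types, and the bump_all helper are all hypothetical; only the method names and signatures come from this page.

```rust
// Hypothetical minimal DataPoint with a single field to mutate.
pub struct DataPoint {
    pub version: u32,
}

// Sketch of a trait shaped like HasDataPoint (the body is an assumption).
pub trait HasDataPoint {
    fn data_point(&self) -> &DataPoint;
    fn data_point_mut(&mut self) -> &mut DataPoint;

    // Default: no children, so the visitor is never called.
    fn for_each_child_mut(&mut self, _visit: &mut dyn FnMut(&mut dyn HasDataPoint)) {}
}

// A leaf type that references other nodes by ID only keeps the default.
pub struct Leaf {
    pub base: DataPoint,
}

impl HasDataPoint for Leaf {
    fn data_point(&self) -> &DataPoint {
        &self.base
    }
    fn data_point_mut(&mut self) -> &mut DataPoint {
        &mut self.base
    }
}

// A container that owns its children overrides the visitor hook.
pub struct Container {
    pub base: DataPoint,
    pub children: Vec<Leaf>,
}

impl HasDataPoint for Container {
    fn data_point(&self) -> &DataPoint {
        &self.base
    }
    fn data_point_mut(&mut self) -> &mut DataPoint {
        &mut self.base
    }
    fn for_each_child_mut(&mut self, visit: &mut dyn FnMut(&mut dyn HasDataPoint)) {
        for child in &mut self.children {
            visit(child);
        }
    }
}

// Walks a node and everything it owns, bumping each version by one.
pub fn bump_all(node: &mut dyn HasDataPoint) {
    node.data_point_mut().version += 1;
    node.for_each_child_mut(&mut |child: &mut dyn HasDataPoint| bump_all(child));
}

fn main() {
    let mut root = Container {
        base: DataPoint { version: 0 },
        children: vec![
            Leaf { base: DataPoint { version: 0 } },
            Leaf { base: DataPoint { version: 0 } },
        ],
    };
    bump_all(&mut root);
    assert_eq!(root.data_point().version, 1);
    assert!(root.children.iter().all(|c| c.data_point().version == 1));
}
```

Under this reading, DocumentChunk behaves like Leaf: it refers to its parent document by Uuid and keeps the default, so a traversal stops at the chunk itself.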