Trait VectorExtractor

pub trait VectorExtractor: Send + Sync {
    // Required method
    fn extract_document(
        &self,
        base_path: &Path,
        file_path: &Path,
        frontmatter: &Value,
        content: &str,
    ) -> Result<VectorDocument>;

    // Provided methods
    fn content_glob(&self) -> &str { ... }
    fn name(&self) -> &str { ... }
}

Trait for extracting vector documents from domain-specific content.

Each knowledge domain (music theory, math, etc.) implements this trait to define how its markdown files with frontmatter are transformed into VectorDocument instances. The key responsibility is text composition: deciding what content should be embedded.

§Lifecycle

For each content file, VectorIndexBuilder calls:

extract_document(): parses the file and composes the text for embedding

The returned VectorDocument.text is what gets embedded by the EmbeddingProvider.
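The lifecycle above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `Value`, `Result`, and `VectorDocument` are simplified stand-ins (only the `text` field is named in these docs), and `NotesExtractor` and `extract_all` are hypothetical names introduced here to play the roles of an implementor and the builder's per-file loop.

```rust
use std::collections::BTreeMap;
use std::path::Path;

// Stand-ins for the crate's real types (assumptions made for this sketch).
type Value = BTreeMap<String, String>;
type Result<T> = std::result::Result<T, String>;

#[derive(Debug, Clone, PartialEq)]
pub struct VectorDocument {
    /// The text handed to the embedding provider.
    pub text: String,
}

// Reduced copy of the trait, limited to the required method.
pub trait VectorExtractor: Send + Sync {
    fn extract_document(
        &self,
        base_path: &Path,
        file_path: &Path,
        frontmatter: &Value,
        content: &str,
    ) -> Result<VectorDocument>;
}

/// Hypothetical extractor: embeds the title followed by the trimmed body.
pub struct NotesExtractor;

impl VectorExtractor for NotesExtractor {
    fn extract_document(
        &self,
        _base_path: &Path,
        file_path: &Path,
        frontmatter: &Value,
        content: &str,
    ) -> Result<VectorDocument> {
        let title = frontmatter
            .get("title")
            .ok_or_else(|| format!("{}: missing `title`", file_path.display()))?;
        Ok(VectorDocument {
            text: format!("{} | {}", title, content.trim()),
        })
    }
}

/// Hypothetical stand-in for the builder's loop: one `extract_document`
/// call per content file, stopping at the first error.
pub fn extract_all(
    extractor: &dyn VectorExtractor,
    base_path: &Path,
    files: &[(&Path, Value, &str)],
) -> Result<Vec<VectorDocument>> {
    files
        .iter()
        .map(|(path, frontmatter, body)| {
            extractor.extract_document(base_path, path, frontmatter, body)
        })
        .collect()
}
```

Each resulting `text` would then be passed to the embedding provider; that step is omitted here because its API is not described on this page.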

Required Methods§

fn extract_document(&self, base_path: &Path, file_path: &Path, frontmatter: &Value, content: &str) -> Result<VectorDocument>

Extract a vector document from a content file.

§Arguments

base_path - Root directory for content
file_path - Full path to the file being processed
frontmatter - Parsed YAML frontmatter as generic Value
content - Markdown body (after frontmatter)
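One plausible use of the two path arguments is to derive a stable, root-relative identifier for the file. The docs do not say how implementors should use them, so the helper below, `relative_id`, is a hypothetical illustration built only on the standard library.

```rust
use std::path::Path;

/// Hypothetical helper: the path of `file_path` relative to `base_path`,
/// without its extension and joined with forward slashes.
/// Returns `None` when `file_path` does not live under `base_path`.
pub fn relative_id(base_path: &Path, file_path: &Path) -> Option<String> {
    let relative = file_path.strip_prefix(base_path).ok()?;
    // Passing an empty extension removes the existing one ("major.md" -> "major").
    let without_ext = relative.with_extension("");
    let parts: Vec<String> = without_ext
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();
    Some(parts.join("/"))
}
```

For example, with `base_path` set to `content` and `file_path` set to `content/scales/major.md`, this yields `scales/major`.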

§Text Composition

The implementation should compose the text field with all content that should influence semantic similarity. A common pattern is:

title | description | key terms | body content
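The pattern above could be implemented along these lines. Treat this as a sketch under assumptions: `compose_text` is a hypothetical helper, reading `|` as a literal separator is one interpretation of the pattern, and skipping empty parts is a design choice made here rather than documented behavior.

```rust
/// Hypothetical helper: joins the non-empty parts with " | " in the order
/// title, description, key terms, body. Key terms are comma-separated.
pub fn compose_text(title: &str, description: &str, key_terms: &[&str], body: &str) -> String {
    let terms = key_terms.join(", ");
    [title, description, terms.as_str(), body]
        .iter()
        .map(|part| part.trim())
        // Drop missing fields so the result has no empty segments.
        .filter(|part| !part.is_empty())
        .collect::<Vec<_>>()
        .join(" | ")
}
```

Putting the title and description first keeps the most descriptive text at the front of the string, which may help if the embedding model truncates long inputs.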

Provided Methods§

fn content_glob(&self) -> &str

Returns the content glob pattern for this domain.

Used by VectorIndexBuilder to discover content files. Default: "**/*.md" (all markdown files recursively).

fn name(&self) -> &str

Returns the name of this extractor for logging/debugging.
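Overriding the provided methods might look like the sketch below. The trait here is a reduced stand-in containing only the two provided methods. The `"**/*.md"` default comes from the docs above, while the `"unnamed"` default for `name` is a placeholder (the real default is not stated on this page). `DefaultsOnly` and `TheoryExtractor` are hypothetical types.

```rust
// Reduced stand-in for the trait: only the provided methods are shown.
pub trait VectorExtractor: Send + Sync {
    /// Documented default: all markdown files, recursively.
    fn content_glob(&self) -> &str {
        "**/*.md"
    }

    /// Placeholder default; the crate's actual default is not shown here.
    fn name(&self) -> &str {
        "unnamed"
    }
}

/// Hypothetical extractor that keeps both defaults.
pub struct DefaultsOnly;

impl VectorExtractor for DefaultsOnly {}

/// Hypothetical extractor restricted to one subdirectory.
pub struct TheoryExtractor;

impl VectorExtractor for TheoryExtractor {
    fn content_glob(&self) -> &str {
        "theory/**/*.md"
    }

    fn name(&self) -> &str {
        "music-theory"
    }
}
```

Narrowing the glob is useful when several domains share one content root and each extractor should only see its own files.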

Implementors§

impl VectorExtractor for MockVectorExtractor