chunkshop-rs 0.4.3

Standalone ingest-to-pgvector: source -> chunker -> embedder -> extractor -> table. int8 BGE by default; bakeoff matrix evaluator built in. Cross-language wire-format compatible with the Python `chunkshop` package.
//! Source-side shared types. `Document` is the unit yielded by every source.
//!
//! Mirrors `python/src/chunkshop/sources/base.py`. Per-source impls live in
//! sibling files (files.rs, json_corpus.rs, pg_table.rs, http.rs, s3.rs).

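/// One document produced by a source, before it reaches the chunker.
#[derive(Debug, Clone)]
pub struct Document {
    /// Identifier for the document, as assigned by the source.
    pub id: String,
    /// Text content of the document.
    pub content: String,
    /// Title, if the source provides one.
    pub title: Option<String>,
    /// Arbitrary JSON metadata attached to the document.
    pub metadata: serde_json::Value,
}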
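
A minimal sketch of how a source might populate `Document`, written to compile with the standard library alone. Everything beyond the four field names is an assumption made for illustration: `docs_from_pairs` is a hypothetical helper that is not part of chunkshop-rs, the `mem:` id prefix and the `position` metadata key are invented, and `metadata` is a `BTreeMap<String, String>` stand-in for the real `serde_json::Value` field.

```rust
use std::collections::BTreeMap;

// Stand-in for the crate's `Document`.
// ASSUMPTION: the real `metadata` field is `serde_json::Value`; a string map
// is used here only so the sketch has no external dependencies.
#[derive(Debug, Clone)]
pub struct Document {
    pub id: String,
    pub content: String,
    pub title: Option<String>,
    pub metadata: BTreeMap<String, String>,
}

// Hypothetical in-memory source: turns (name, body) pairs into documents.
// An empty name is treated as "no title".
pub fn docs_from_pairs(pairs: &[(&str, &str)]) -> Vec<Document> {
    pairs
        .iter()
        .enumerate()
        .map(|(i, (name, body))| {
            // Record the input position as illustrative metadata.
            let mut metadata = BTreeMap::new();
            metadata.insert("position".to_string(), i.to_string());
            Document {
                id: format!("mem:{name}"),
                content: body.to_string(),
                title: if name.is_empty() {
                    None
                } else {
                    Some(name.to_string())
                },
                metadata,
            }
        })
        .collect()
}

fn main() {
    let pairs = [("intro", "Hello, world."), ("", "Untitled body.")];
    for doc in docs_from_pairs(&pairs) {
        println!("{} title={:?} bytes={}", doc.id, doc.title, doc.content.len());
    }
}
```

Keeping `title` as an `Option<String>` lets a source that has no natural title say so explicitly instead of emitting an empty string.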