Solo storage: SQLite + SQLCipher persistence layer.
§Concurrency invariants (per ADR-0003)
- Writes go through WriteHandle; reads go through ReaderPool. Direct connection access is an anti-pattern outside the actor + pool.
- The writer connection opens once and is owned by the writer thread for the daemon’s lifetime.
- The read pool’s post_create hook binds the raw SQLCipher key on each new connection.
- pending_index ordering is always SQL COMMIT → HNSW.add → drain row. Never reverse.
- Arc<dyn VectorIndex + Send + Sync> is shared between the writer and the read pool; concurrency is provided by the impl (e.g., hnsw_rs’s internal parking_lot::RwLock), not by application-level locks.
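The invariants above can be illustrated with a minimal, std-only sketch of a single-writer actor. Everything here is hypothetical (SketchCommand, SketchDb, SketchIndex, SketchHandle, spawn_writer); the real WriterActor, WriteHandle and WriteCommand have different shapes, and the "database" is just an in-memory struct. The point is only the ownership model and the COMMIT → add → drain ordering.

```rust
use std::sync::mpsc::{self, Sender};
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Hypothetical command type; the real WriteCommand is not reproduced here.
enum SketchCommand {
    Remember { id: u64, vector: Vec<f32>, done: Sender<()> },
    Shutdown,
}

/// Stand-in for the SQL side: committed rows plus the pending_index queue.
#[derive(Default)]
struct SketchDb {
    committed: Vec<u64>,
    pending_index: Vec<u64>,
}

/// Stand-in for a vector index that handles its own internal locking.
#[derive(Default)]
struct SketchIndex {
    points: RwLock<Vec<(u64, Vec<f32>)>>,
}

impl SketchIndex {
    fn add(&self, id: u64, vector: Vec<f32>) {
        self.points.write().unwrap().push((id, vector));
    }
    fn len(&self) -> usize {
        self.points.read().unwrap().len()
    }
}

/// Cloneable handle: the only way callers reach the writer thread.
#[derive(Clone)]
struct SketchHandle {
    tx: Sender<SketchCommand>,
}

impl SketchHandle {
    /// Sends a write and blocks until the writer thread acknowledges it.
    fn remember(&self, id: u64, vector: Vec<f32>) {
        let (done, ack) = mpsc::channel();
        self.tx.send(SketchCommand::Remember { id, vector, done }).unwrap();
        ack.recv().unwrap();
    }
    fn shutdown(&self) {
        let _ = self.tx.send(SketchCommand::Shutdown);
    }
}

fn spawn_writer(index: Arc<SketchIndex>) -> (SketchHandle, JoinHandle<SketchDb>) {
    let (tx, rx) = mpsc::channel();
    let join = thread::spawn(move || {
        // The "connection" is created on this thread and never leaves it.
        let mut db = SketchDb::default();
        for cmd in rx {
            match cmd {
                SketchCommand::Remember { id, vector, done } => {
                    // 1. SQL COMMIT: the row and its pending_index entry land first.
                    db.committed.push(id);
                    db.pending_index.push(id);
                    // 2. HNSW.add: only after the commit.
                    index.add(id, vector);
                    // 3. Drain the pending_index row last.
                    db.pending_index.retain(|p| *p != id);
                    let _ = done.send(());
                }
                SketchCommand::Shutdown => break,
            }
        }
        db
    });
    (SketchHandle { tx }, join)
}
```

With this ordering, a crash between steps 1 and 2 leaves a pending_index entry behind, which is presumably what replay_pending_index picks up at startup; reversing the order would allow a vector in the index with no committed row.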
§Module layout
Commit 1.1 — solo init building blocks:
- path_validation: refuse cloud-sync data dirs.
- key_material: Argon2id passphrase → 32-byte SQLCipher key.
- config: solo.config.toml (salt + embedder identity).
- migration: runner + the v0 schema (migrations/0001_initial.sql).
- lockfile: RAII solo.lock to serialize concurrent runs.
- init: the orchestrator, solo_storage::init(params).
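As an illustration of the RAII lockfile idea in the list above, here is a minimal std-only sketch. SketchLockfile and its acquire method are hypothetical and are not the crate's Lockfile API; stale-lock detection after a crash is deliberately left out.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical RAII guard: holds solo.lock in a data dir until dropped.
struct SketchLockfile {
    path: PathBuf,
    file: Option<File>,
}

impl SketchLockfile {
    fn acquire(data_dir: &Path) -> io::Result<Self> {
        let path = data_dir.join("solo.lock");
        // create_new fails with AlreadyExists if the file is already present,
        // which gives O_EXCL-style mutual exclusion between processes.
        let mut file = OpenOptions::new()
            .write(true)
            .create_new(true)
            .open(&path)?;
        // Record the owner's PID to help a human diagnose a stuck lock.
        writeln!(file, "{}", std::process::id())?;
        Ok(Self { path, file: Some(file) })
    }
}

impl Drop for SketchLockfile {
    fn drop(&mut self) {
        // Close the handle first, then remove the file to release the lock.
        drop(self.file.take());
        let _ = fs::remove_file(&self.path);
    }
}
```

A second acquire on the same directory fails until the first guard goes out of scope, which is the serialization property the module list describes.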
Commit 1.2 — single-writer actor + read pool:
- writer: WriterActor, WriteHandle, WriteCommand.
- reader: ReaderPool (deadpool-sqlite + post_create raw-key).
Commit 1.3 — HNSW backing for solo_core::VectorIndex + snapshot I/O:
- vector_index: HnswIndex (hnsw_rs wrapper), HnswFactory.
- snapshot: atomic two-file save (live/_bak/_tmp basenames) + load/load_bak per ADR-0003 §“Startup file-existence decision tree”.
- recovery: replay_pending_index, detect_drift. Used by the daemon-main startup chain (commit 1.5).
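The snapshot item above describes a save that goes through a temporary file and keeps a previous-version backup. A minimal sketch of that pattern for a single file follows. The function names and the `_bak`/`_tmp` suffix scheme are assumptions; the actual values of LIVE_BASENAME, BAK_BASENAME and TMP_BASENAME, and the way the two hnsw_rs files are coordinated, are defined by the crate and ADR-0003 and are not reproduced here.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` as `<basename>`, keeping the
/// previous version as `<basename>_bak`.
fn save_with_backup(dir: &Path, basename: &str, bytes: &[u8]) -> io::Result<()> {
    let live = dir.join(basename);
    let bak = dir.join(format!("{basename}_bak"));
    let tmp = dir.join(format!("{basename}_tmp"));

    // Step 1: write the new contents to the temp file and fsync it,
    // so a crash never leaves a half-written live file.
    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?;
    drop(f);

    // Step 2: demote the current live file to backup, then promote the temp file.
    if live.exists() {
        fs::rename(&live, &bak)?;
    }
    fs::rename(&tmp, &live)?;
    Ok(())
}

/// Hypothetical loader: prefer the live file, fall back to the backup.
fn load_with_fallback(dir: &Path, basename: &str) -> io::Result<Vec<u8>> {
    match fs::read(dir.join(basename)) {
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            fs::read(dir.join(format!("{basename}_bak")))
        }
        other => other,
    }
}
```

A production version would likely also fsync the containing directory after the renames; that step is omitted to keep the sketch portable.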
Embedder impls:
- embedder::stub: StubEmbedder, deterministic hash-based F32 embedder for tests + offline development.
- embedder::ollama: OllamaEmbedder, real semantic embeddings via a local Ollama daemon (/api/embeddings). The recommended production backend since v0.5.1; default for new deployments.
(v0.5.x also shipped a BGE-M3 / candle-transformers backend; it was
deprecated in v0.5.0 and removed in v0.6.0. The replacement is
OllamaEmbedder.)
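To make "deterministic hash-based embedder" concrete, here is one way such a stub could work. This is a sketch under assumptions, not StubEmbedder's actual algorithm: SketchStubEmbedder is a hypothetical type that seeds a SplitMix64 generator from an FNV-1a hash of the text and L2-normalizes the result.

```rust
/// Hypothetical stand-in for a hash-based test embedder.
struct SketchStubEmbedder {
    dim: usize,
}

impl SketchStubEmbedder {
    /// Same text always yields the same vector; no model or network needed.
    fn embed(&self, text: &str) -> Vec<f32> {
        // FNV-1a (64-bit) over the input bytes seeds the generator.
        let mut state: u64 = 0xcbf2_9ce4_8422_2325;
        for b in text.bytes() {
            state ^= u64::from(b);
            state = state.wrapping_mul(0x0000_0100_0000_01b3);
        }

        let mut out = Vec::with_capacity(self.dim);
        for _ in 0..self.dim {
            // One SplitMix64 step per output component.
            state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            z ^= z >> 31;
            // Map the top 24 bits to [-1, 1).
            let unit = (z >> 40) as f32 / (1u64 << 24) as f32;
            out.push(unit * 2.0 - 1.0);
        }

        // L2-normalize so cosine and dot-product similarity agree.
        let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut out {
                *x /= norm;
            }
        }
        out
    }
}
```

Vectors produced this way carry no semantic meaning; they are useful only for exercising storage, indexing and recovery paths without a running Ollama daemon.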
Commit 1.5+ (daemon main + signal handlers) lands in subsequent files; the surfaces here are stable for that wiring.
Re-exports§
pub use audit::AuditEvent;
pub use audit::AuditOperation;
pub use audit::AuditResult;
pub use audit::AuditWriter;
pub use audit::AuditWriterShutdown;
pub use audit::insert_audit_admin_row;
pub use audit::insert_audit_row_in_tx;
pub use audit::purge_older_than;
pub use backup::DEFAULT_BACKUP_PAGES_PER_STEP;
pub use backup::backup_database;
pub use backup::backup_from_connection;
pub use backup::paths_refer_to_same_file;
pub use config::AuditSettings;
pub use config::AuthSettings;
pub use config::CustomRedactionPattern;
pub use config::DocumentConfig;
pub use config::EmbedderConfig;
pub use config::IdentityConfig;
pub use config::LlmSettings;
pub use config::RedactionConfig;
pub use config::SamplingConfig;
pub use config::SamplingConfigDiagnostic;
pub use config::SoloConfig;
pub use config::StewardSettings;
pub use config::TriplesConfig;
pub use gdpr::ForgetReport;
pub use gdpr::estimate_forget_scope;
pub use gdpr::forget_principal;
pub use redaction::RedactionMatch;
pub use redaction::RedactionRegistry;
pub use redaction::RedactionResult;
pub use steward_factory::McpSamplingStewardFactory;
pub use steward_factory::StaticStewardFactory;
pub use steward_factory::StewardFactory;
pub use tenant_backup::BackupReport;
pub use tenant_backup::RestoreReport;
pub use tenant_backup::backup_tenant;
pub use tenant_backup::restore_tenant;
pub use document::ChunkConfig;
pub use document::ChunkSpec;
pub use document::ParseError;
pub use document::ParsedDocument;
pub use document::chunk_text;
pub use document::parse_file;
pub use embedder::OllamaEmbedder;
pub use embedder::StubEmbedder;
pub use embedder::build_embedder_from_env;
pub use embedder::probe_embedder_config_from_env;
pub use embedder::BUNDLED_EMBEDDER_DIM;
pub use embedder::BUNDLED_EMBEDDER_NAME;
pub use embedder::BUNDLED_EMBEDDER_VERSION;
pub use embedder::BundledEmbedder;
pub use embedder_registry::EmbedderIdentity;
pub use embedder_registry::get_or_insert_embedder_id;
pub use hnsw_id::HNSW_CHUNK_BIT;
pub use hnsw_id::HnswIdKind;
pub use hnsw_id::chunk_hnsw_id;
pub use hnsw_id::decode_hnsw_id;
pub use hnsw_id::episode_hnsw_id;
pub use init::InitOutcome;
pub use init::InitParams;
pub use init::default_data_dir;
pub use init::default_embedder;
pub use init::init;
pub use init::open_sqlcipher;
pub use key_material::KeyMaterial;
pub use lockfile::Lockfile;
pub use merge_candidates::MergeCandidateStats;
pub use merge_candidates::count_existing_merge_candidates;
pub use migration::current_tenants_index_version;
pub use migration::current_version;
pub use migration::run_migrations;
pub use migration::run_tenants_index_migrations;
pub use path_validation::validate_data_dir;
pub use reader::DEFAULT_POOL_SIZE;
pub use reader::ReaderPool;
pub use recovery::DriftReport;
pub use recovery::RebuildReport;
pub use recovery::ReplayReport;
pub use recovery::detect_drift;
pub use recovery::rebuild_hnsw_from_sql;
pub use recovery::replay_pending_index;
pub use snapshot::BAK_BASENAME;
pub use snapshot::LIVE_BASENAME;
pub use snapshot::TMP_BASENAME;
pub use startup::StartupOutcome;
pub use startup::StartupParams;
pub use startup::run as startup_run;
pub use tenants::TENANTS_INDEX_FILENAME;
pub use tenants::TENANTS_SUBDIR;
pub use tenants::TenantCostNumbers;
pub use tenants::TenantHandle;
pub use tenants::TenantOpenParams;
pub use tenants::TenantRecord;
pub use tenants::TenantRegistry;
pub use tenants::TenantRegistryParams;
pub use tenants::TenantStatus;
pub use tenants::TenantsIndex;
pub use tenants::migrate_v071_to_v080;
pub use triples_batch::TriplesBatchSignal;
pub use vector_index::HnswFactory;
pub use vector_index::HnswIndex;
pub use vector_index::HnswParams;
pub use writer::AttachAbstractionBatchReport;
pub use writer::DEFAULT_CHANNEL_CAPACITY;
pub use writer::DEFAULT_INGEST_MAX_BYTES;
pub use writer::ConsolidationReport;
pub use writer::ConsolidationScope;
pub use writer::ForgetDocumentReport;
pub use writer::IngestReport;
pub use writer::MAX_REMEMBER_BATCH_SIZE;
pub use writer::NormalizeReport;
pub use writer::ReembedReport;
pub use writer::ReembedScope;
pub use writer::ResolveContradictionReport;
pub use writer::WriteCommand;
pub use writer::WriteHandle;
pub use writer::WriterActor;
pub use writer::WriterSpawn;
pub use writer::resolve_ingest_max_bytes;
Modules§
- audit: Per-tenant audit log infrastructure (v0.8.0 P4).
- backup: Online SQLCipher backup.
- config: solo.config.toml reader/writer.
- document: Document parsing + chunking for v0.7.0 RAG/document memory.
- embedder: Embedder implementations behind the solo_core::Embedder trait.
- embedder_registry: embedders table registry. Every embedder model that produces vectors in this database has a row keyed by (name, version) with its dim + dtype + first-seen timestamp.
- gdpr: GDPR right-to-erasure (v0.8.0 P6); hard-deletes every row tied to a principal subject in one tenant.
- hnsw_id: Kind-discriminated rowid encoding for the shared HNSW namespace.
- hnsw_rebuild: Shared HNSW-tombstone-rebuild helpers.
- init: solo init creates a fresh Solo data directory.
- key_material: KeyMaterial holds the raw 32-byte SQLCipher key derived once at startup from the user passphrase via Argon2id.
- llm: Production LlmClient backends.
- lockfile: solo.lock, an O_EXCL-style mutex that prevents two daemons (or two solo init invocations) from racing on the same data dir.
- merge_candidates: Read-side helper for solo doctor that counts existing-cluster pairs that the existing-vs-existing merge pass would coalesce on the next consolidate --force-merge (or --force-merge-on-timer daemon cycle).
- migration: SQL schema migrations. Runs once at startup against the SQLCipher database after PRAGMA key has been bound.
- path_validation: Refuse to initialize Solo inside a cloud-sync folder.
- reader: ReaderPool, a pool of read-only SQLite connections backed by deadpool-sqlite. Each newly-created connection has its raw SQLCipher key bound via a post_create hook (PBKDF2 cost paid once per connection, not per query). See ADR-0003 §“Trait shapes” and §P8-A/P8-B.
- recovery: Startup recovery for the HNSW index. Two pieces:
- redaction: Opt-in PII redaction registry (v0.8.0 P5).
- snapshot: HNSW snapshot save/load. ADR-0003 §P8-C: hnsw_rs writes a pair of files (*.hnsw.data + *.hnsw.graph); we drive an atomic two-step save with fsync and a previous-version backup.
- startup: Daemon startup orchestration. Per ADR-0003 §O6 (“Startup ordering: linear await chain in main()”) and §“Startup file-existence decision tree”.
- steward_factory: StewardFactory trait, which abstracts how a per-tenant Arc<Steward> is built at registry-open time.
- tenant_backup: Per-tenant SQLCipher backup + restore (v0.8.0 P6).
- tenants: Tenant registry for v0.8.0 multi-tenancy.
- triples_batch: Daemon-side background batch driver for triple extraction (v0.9.0 P4c).
- vector_index: HnswIndex, the solo_core::VectorIndex implementation backed by hnsw_rs.
- writer: WriterActor, WriteCommand, WriteHandle; a single-writer actor on a dedicated OS thread. See ADR-0003 §“Trait shapes” and §“Operational invariants”.
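The hnsw_id entry in the list above mentions a kind-discriminated rowid encoding for a shared HNSW namespace. One common way to do this is to reserve a tag bit, sketched below. All names (SketchKind, SKETCH_CHUNK_BIT, sketch_episode_id, sketch_chunk_id, sketch_decode) are hypothetical, and the choice of bit 63 is an assumption; the crate's real HNSW_CHUNK_BIT value and the signatures of chunk_hnsw_id, episode_hnsw_id and decode_hnsw_id may differ.

```rust
/// Hypothetical kind tag for ids stored in one shared index.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SketchKind {
    Episode,
    Chunk,
}

/// Assumed tag bit: set for chunk ids, clear for episode ids.
const SKETCH_CHUNK_BIT: u64 = 1 << 63;

/// Episode rowids are stored unchanged, provided they leave the tag bit free.
fn sketch_episode_id(rowid: u64) -> Option<u64> {
    ((rowid & SKETCH_CHUNK_BIT) == 0).then_some(rowid)
}

/// Chunk rowids get the tag bit set; rowids that already use it are rejected.
fn sketch_chunk_id(rowid: u64) -> Option<u64> {
    ((rowid & SKETCH_CHUNK_BIT) == 0).then_some(rowid | SKETCH_CHUNK_BIT)
}

/// Recover the kind and the original table rowid from an index id.
fn sketch_decode(id: u64) -> (SketchKind, u64) {
    if (id & SKETCH_CHUNK_BIT) != 0 {
        (SketchKind::Chunk, id & !SKETCH_CHUNK_BIT)
    } else {
        (SketchKind::Episode, id)
    }
}
```

With a scheme like this, rows from two SQL tables can share one vector index without id collisions, and a search hit can be routed back to the right table by decoding its id.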