# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
---
## [0.13.0] - 2026-05-21
### Added
- **QuaQue versioning** (`src/version/`) — based on arXiv:2603.18654
- Bitstring validity model: each entity/relation row carries a `validity` INTEGER; membership test is `(validity & (1 << bit_slot)) != 0`
- Up to 64 concurrent named versions via reclaimable `bit_slot` allocation (slots freed on delete, lowest free slot always reused)
- `create_version` / `delete_version` / `get_version` / `list_versions` — version CRUD
- `version_add_entity` / `version_remove_entity` / `version_add_relation` / `version_remove_relation` — snapshot membership management
- `version_snapshot_entities` / `version_snapshot_relations` — bulk snapshot of all current rows into a version
- `version_entities` / `version_relations` / `version_neighbors` — version-filtered queries with optional pagination
- `version_compare` — diff two versions: returns `VersionDiff` with added/removed/common entities and relations
- `version_entity_history` — all versions a given entity belongs to (newest first)
- `version_merge` — merge one or more source versions into a new target version with Union or Intersection strategy
- `Version`, `VersionDiff`, `MergeStrategy` public types
- **Schema migration v4** — non-destructive; adds `kg_versions` table and `validity` column to `kg_entities` / `kg_relations`
- `kg_versions.bit_slot CHECK (bit_slot BETWEEN 0 AND 63)` — DB-level guard against out-of-range slot values
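
The bitstring validity model and slot reuse listed above can be sketched in memory as follows. This is a minimal illustration, not the crate's code: the `u32` slot type, the `is_member` and `lowest_free_slot` helpers, and the `used` mask are assumptions made for the example. Only the membership test and the `Option<i64>` return type of `bit_from_slot` come from this release's entries.

```rust
/// Mask for a version slot; `None` when the slot is outside 0..=63.
/// (The `u32` parameter type is an assumption for this sketch.)
fn bit_from_slot(slot: u32) -> Option<i64> {
    // `checked_shl` returns None when the shift amount is >= 64.
    1i64.checked_shl(slot)
}

/// Membership test from the changelog: `(validity & (1 << bit_slot)) != 0`.
fn is_member(validity: i64, slot: u32) -> bool {
    bit_from_slot(slot).map_or(false, |bit| (validity & bit) != 0)
}

/// Lowest free slot, given a mask whose set bits mark slots in use.
/// (Hypothetical helper; the crate tracks slots in `kg_versions`.)
fn lowest_free_slot(used: u64) -> Option<u32> {
    let slot = (!used).trailing_zeros();
    if slot < 64 { Some(slot) } else { None }
}

fn main() {
    let mut used: u64 = 0b0111; // slots 0, 1 and 2 are taken
    used &= !(1u64 << 1); // deleting the version in slot 1 frees it

    // The freed slot is the lowest free one, so it is reused first.
    let slot = lowest_free_slot(used).expect("a free slot exists");
    let mut validity: i64 = 0;
    validity |= bit_from_slot(slot).expect("slot is in range");

    println!("reused slot {}, member: {}", slot, is_member(validity, slot));
}
```

Because `validity` is a signed 64-bit integer, slot 63 maps to the sign bit; the bitwise test still works, but the stored value is negative.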
### Fixed
- `bit_from_slot` now returns `Option<i64>` — previously relied on `debug_assert!` (omitted in release builds); out-of-range slots now surface as `Error::CorruptBitSlot` instead of panicking on `1 << slot`
- `create_version` detects duplicate names via structured `rusqlite::Error::SqliteFailure` extended error code + `table.column` identifier instead of brittle English message substring matching
- `load_relations` diff reuses a single prepared statement across all IDs (eliminated per-iteration SQL recompile)
- `load_entities` diff batches via `WHERE id IN (...)` queries (chunked at 900 IDs, removes N+1 pattern)
- `get_version` doc comment corrected: function returns `Error::VersionNotFound`, not `None`
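
The chunked `WHERE id IN (...)` batching mentioned above might look roughly like this. It is a sketch under assumptions: `in_clause_queries` and the selected columns are hypothetical, and the 900-ID chunk size is presumably chosen to stay below SQLite's historical default limit of 999 bound parameters.

```rust
/// Chunk size taken from the changelog entry.
const CHUNK: usize = 900;

/// Build one parameterised query per chunk of IDs, paired with the IDs
/// that should be bound to its placeholders.
/// (Hypothetical helper; the column list is an assumption.)
fn in_clause_queries(ids: &[i64]) -> Vec<(String, &[i64])> {
    ids.chunks(CHUNK)
        .map(|chunk| {
            // One `?` per ID keeps the values out of the SQL text.
            let placeholders = vec!["?"; chunk.len()].join(", ");
            let sql = format!(
                "SELECT id, name FROM kg_entities WHERE id IN ({})",
                placeholders
            );
            (sql, chunk)
        })
        .collect()
}

fn main() {
    let ids: Vec<i64> = (1..=2000).collect();
    for (sql, chunk) in in_clause_queries(&ids) {
        println!("{} ids -> {} bytes of SQL", chunk.len(), sql.len());
    }
}
```

For 2,000 IDs this issues three queries instead of 2,000, which is the N+1 pattern the fix removes.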
### Technical
- Schema version: v4
- 177 unit tests passing
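
One way to picture `version_compare` and `version_merge` on top of the validity bitstrings is as per-row bit tests. The sketch below is illustrative only: `classify`, `merged_membership`, and `DiffKind` are hypothetical names, and treating "added" as "present in the second version but not the first" is an assumption about the diff direction.

```rust
#[derive(Debug, PartialEq)]
enum DiffKind {
    Added,
    Removed,
    Common,
    Absent,
}

/// Classify one row's validity mask against two version bits.
fn classify(validity: i64, from_bit: i64, to_bit: i64) -> DiffKind {
    match ((validity & from_bit) != 0, (validity & to_bit) != 0) {
        (false, true) => DiffKind::Added,
        (true, false) => DiffKind::Removed,
        (true, true) => DiffKind::Common,
        (false, false) => DiffKind::Absent,
    }
}

#[derive(Clone, Copy)]
enum MergeStrategy {
    Union,
    Intersection,
}

/// Decide whether a row belongs in the merged target version.
fn merged_membership(validity: i64, sources: &[i64], strategy: MergeStrategy) -> bool {
    match strategy {
        // Union: the row is in at least one source version.
        MergeStrategy::Union => sources.iter().any(|&bit| (validity & bit) != 0),
        // Intersection: the row is in every source version.
        MergeStrategy::Intersection => {
            !sources.is_empty() && sources.iter().all(|&bit| (validity & bit) != 0)
        }
    }
}

fn main() {
    let (v0, v1, target) = (1i64 << 0, 1i64 << 1, 1i64 << 2);
    let mut validity = v0; // the row belongs to version 0 only

    println!("v0 -> v1: {:?}", classify(validity, v0, v1));
    if merged_membership(validity, &[v0, v1], MergeStrategy::Union) {
        validity |= target; // add the row to the merged version
    }
    println!("validity after union merge: {:#b}", validity);
}
```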
---
## [0.12.0] - 2026-05-20
### Added
- **SmartVector: self-aware vector embeddings** (`src/vector/confidence.rs`, `src/graph/ripple.rs`, `src/rag/smart_retrieval.rs`) — based on arXiv:2604.20598
- `ConfidenceEngine` — Ebbinghaus forgetting-curve formula: `base·exp(-λt) + access_bonus·ln(1+access) + feedback_sum`
- `ConfidenceParams` — tuneable λ (decay rate) and access_bonus; `decay_rate` per entity overrides the default
- `KnowledgeGraph::smart_search(query, k)` — four-signal retrieval: cosine × temporal × live confidence × graph importance
- `KnowledgeGraph::set_retrieval_weights(w)` / `retrieval_weights()` — configure blending weights at runtime
- `SmartRetrieval` / `RetrievalWeights` / `SmartSearchResult` — public types for direct use
- `ripple::propagate(conn, origin_id, base_penalty)` — BFS confidence penalty propagation up to 2 hops with 0.5× attenuation per hop
- `ripple::add_dependency(conn, source_id, target_id, dep_type)` — explicit entity dependency edges
- **Schema migration v3** — non-destructive, all new columns have defaults
- `kg_entities`: `confidence`, `access_count`, `last_accessed`, `valid_from`, `valid_until`, `base_confidence`, `decay_rate`
- `kg_relations`: `confidence`, `valid_from`, `valid_until`
- `kg_dependencies` table — dependency edge graph with FK cascades
- `kg_confidence_log` table — full audit trail of confidence changes with FK cascade and composite index `(entity_id, reason)`
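
The forgetting-curve formula listed for `ConfidenceEngine` can be written out as a small function. This is a sketch, not the engine itself: the free function `confidence`, its argument list, and the time unit of `elapsed` are assumptions, and the `[-1, 1]` clamp on the feedback sum follows the wording of the fixes in this release.

```rust
/// Tunable parameters named in the changelog.
struct ConfidenceParams {
    lambda: f64,       // default decay rate λ
    access_bonus: f64, // weight of the access-count term
}

/// base·exp(-λt) + access_bonus·ln(1 + access) + feedback_sum
/// (Hypothetical signature; `elapsed` is in whatever unit λ is tuned for.)
fn confidence(
    params: &ConfidenceParams,
    base: f64,
    decay_rate: Option<f64>, // per-entity override of λ
    elapsed: f64,
    access_count: u32,
    feedback_sum: f64,
) -> f64 {
    // Fall back to the configured λ when the entity has no override.
    let lambda = decay_rate.unwrap_or(params.lambda);
    base * (-lambda * elapsed).exp()
        + params.access_bonus * (1.0 + f64::from(access_count)).ln()
        + feedback_sum.clamp(-1.0, 1.0)
}

fn main() {
    // 0.05 matches the previously hard-coded default; 0.01 is arbitrary.
    let params = ConfidenceParams { lambda: 0.05, access_bonus: 0.01 };
    let fresh = confidence(&params, 0.8, None, 0.0, 0, 0.0);
    let aged = confidence(&params, 0.8, None, 30.0, 5, 0.1);
    println!("fresh = {:.3}, aged = {:.3}", fresh, aged);
}
```

With no elapsed time, no accesses, and no feedback, the result equals `base`, which makes a convenient sanity check.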
### Fixed
- **`ConfidenceParams.lambda` now used as fallback** — `get_confidence` previously hard-coded `0.05` via SQL `COALESCE`; it now uses `self.params.lambda` when `decay_rate IS NULL`
- **`update_confidence` log accuracy** — `kg_confidence_log.new_value` now records the actual recomputed confidence rather than the raw `old_conf + feedback`, which diverges when the feedback sum hits the `[-1, 1]` clamp
- **`update_confidence` atomicity** — INSERT log + UPDATE entity now execute inside a single transaction
- **`apply_penalty` confidence floor** — result is clamped to `[0.0, 1.0]` so confidence can never go negative
- **`load_indegrees` N+1 queries** — replaced per-entity `SELECT COUNT(*)` loop with a single `GROUP BY` query
- **`retrieve` uses live confidence** — replaced stale `cached_confidence` column read with `ConfidenceEngine::get_confidence` so the w3 signal reflects decay and feedback
- **Async API: remove redundant `unsafe impl Send/Sync`** — `Arc<Mutex<KnowledgeGraph>>` derives `Send + Sync` automatically; the explicit `unsafe impl` blocks were unnecessary and masked potential future regressions
- **Async API: Python code injection** — `AsyncEmbeddingGenerator` now passes `model_name` via the `KG_MODEL_NAME` environment variable instead of interpolating it directly into the Python source string
- **Async API: `BrokenPipe` error wrapping** — stdin `write_all` errors are now propagated directly via `From<std::io::Error>` instead of being wrapped and losing the original error context
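
The `apply_penalty` clamp fixed above combines with the two-hop ripple propagation roughly as in the following in-memory sketch. Several details are assumptions: the real `ripple::propagate` works on a database connection, and here the origin is taken to receive the full `base_penalty`, with each further hop receiving half of the previous hop's penalty along outgoing dependency edges.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

const MAX_HOPS: u32 = 2;
const ATTENUATION: f64 = 0.5;

/// Subtract a penalty and keep the result inside [0.0, 1.0].
fn apply_penalty(confidence: f64, penalty: f64) -> f64 {
    (confidence - penalty).clamp(0.0, 1.0)
}

/// Breadth-first propagation over an adjacency map (hypothetical signature).
fn propagate(
    deps: &HashMap<u32, Vec<u32>>,
    conf: &mut HashMap<u32, f64>,
    origin: u32,
    base_penalty: f64,
) {
    let mut seen = HashSet::from([origin]);
    let mut queue = VecDeque::from([(origin, 0u32)]);

    while let Some((id, hop)) = queue.pop_front() {
        // Penalty halves with every hop away from the origin.
        let penalty = base_penalty * ATTENUATION.powi(hop as i32);
        if let Some(c) = conf.get_mut(&id) {
            *c = apply_penalty(*c, penalty);
        }
        if hop == MAX_HOPS {
            continue; // do not expand beyond two hops
        }
        for &next in deps.get(&id).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back((next, hop + 1));
            }
        }
    }
}

fn main() {
    // Chain 1 -> 2 -> 3 -> 4; node 4 is three hops away and stays untouched.
    let deps = HashMap::from([(1, vec![2]), (2, vec![3]), (3, vec![4])]);
    let mut conf = HashMap::from([(1, 1.0), (2, 1.0), (3, 0.05), (4, 1.0)]);
    propagate(&deps, &mut conf, 1, 0.4);
    println!("{:?}", conf);
}
```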
### Technical
- Schema version: v3
- 133 unit tests passing
---
## [0.11.1] - 2026-04-08
### Fixed
- **CLI: support global `--db` flag before subcommand** — Running `sqlite-kg --db foo.db stats` previously failed with "Unknown command: --db". The CLI now accepts `--db <path>` in any position (before or after the subcommand).
- **CLI: add `--help` / `-h` / `help` command** — Previously these were unrecognized; now they print usage and exit cleanly.
---
## [0.11.0] - 2026-04-06
### Added
- **Async API** (`src/async_kg/`) - Tokio-based async wrapper, feature-gated behind `features = ["async"]`
- `AsyncKnowledgeGraph` - Full async coverage of all `KnowledgeGraph` methods via `spawn_blocking`
- `AsyncEmbeddingGenerator` - Non-blocking Python subprocess embedding using `tokio::process::Command`
- `KnowledgeGraph::into_async()` - Convenience conversion from sync to async
- `dispatch!` macro - Internal boilerplate reducer for spawn_blocking dispatch
- `Error::TaskPanicked` - New error variant for async task failures
- **End-to-end async demo** (`examples/async_demo.rs`)
- Concurrent paper ingestion, PageRank, Louvain, semantic search, shortest path
- **CI async coverage** - Build, test, and clippy with `--features async` in GitHub Actions
### Technical
- Optional dependency: `tokio` (rt, rt-multi-thread, macros, process, io-util)
- Thread safety: `Arc<Mutex<KnowledgeGraph>>` with `unsafe impl Send/Sync`
- 122 unit tests + 11 async integration tests + 3 sync integration tests passing
- Zero overhead for users not enabling the `async` feature
---
## [0.10.3] - 2026-04-02
### Changed
- **Schema auto-migration** - `ensure_schema()` migrator with `kg_schema_version` table
- Supports incremental upgrade from any prior version
- `create_schema()` remains backward compatible
- **Cache invalidation upgrade** - `kg_turboquant_cache` adds `vectors_checksum` column (`SUM(entity_id)`)
- Dual validation: count + checksum prevents stale cache on same-count different-vector scenarios
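
The dual validation can be illustrated with a small comparison of cache stamps. This is a sketch under assumptions: `CacheStamp`, `stamp`, and `cache_is_fresh` are hypothetical names, and the checksum is modelled as a plain sum of entity IDs to mirror `SUM(entity_id)`.

```rust
/// What the cache row remembers about the vectors it was built from.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CacheStamp {
    vector_count: i64,
    vectors_checksum: i64, // mirrors SUM(entity_id)
}

/// Compute the stamp for the entity IDs that currently have vectors.
fn stamp(entity_ids: &[i64]) -> CacheStamp {
    CacheStamp {
        vector_count: entity_ids.len() as i64,
        vectors_checksum: entity_ids.iter().sum(),
    }
}

/// The cached index is reusable only if both count and checksum match.
fn cache_is_fresh(cached: CacheStamp, current: CacheStamp) -> bool {
    cached == current
}

fn main() {
    let cached = stamp(&[1, 2, 3]);
    // Same number of vectors, but entity 3 was replaced by entity 4.
    let current = stamp(&[1, 2, 4]);
    println!("count-only check passes: {}", cached.vector_count == current.vector_count);
    println!("dual check passes: {}", cache_is_fresh(cached, current));
}
```

A plain sum can still collide (IDs {1, 4} and {2, 3} both sum to 5), so in this sketch the checksum narrows the stale-cache window rather than closing it.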
### Technical
- Schema version: v2
- All P0/P1/P2 issues resolved
---
## [0.10.2] - 2026-04-01
### Performance
- **Persistent TurboQuant Index** - Eliminated per-query index rebuild in RAG Stage 1
- Index serialized as JSON BLOB and stored in new `kg_turboquant_cache` table
- Cache invalidated automatically when `kg_vectors` row count changes
- Same database with repeated RAG queries now builds the index only once
- `TurboQuantIndex::to_bytes()` / `from_bytes()` added for BLOB round-trip
### Technical
- New table: `kg_turboquant_cache` (singleton row, `id = 1`)
- 95 tests passing
---
## [0.10.1] - 2026-03-31
### Fixed
- All P0/P1/P2 quality issues resolved
- Louvain Phase 2 super-node aggregation implemented (`P1-1`)
- Remaining bare `.unwrap()` calls replaced with `map_err(...)?` (`P0-5`)
### Technical
- 95 tests passing
---
## [0.10.0] - 2026-03-31
### Added
- **Paper-driven two-stage RAG Engine** (`src/rag/mod.rs`)
- Stage 1 (MemRL): TurboQuant ANN fast candidate retrieval
- Stage 2 (MemRL): exact cosine rerank
- RAPO: BFS graph-neighbour expansion
- SuperLocalMemory: quality threshold filtering
- Memex(RL): context entity BFS attachment
- `RagEngine`, `RagConfig`, `RagResult` public types
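
The second stage and the quality threshold can be sketched as an exact cosine rerank over candidates that a first-stage ANN search is assumed to have already produced. The `cosine` and `rerank` functions and their parameters are hypothetical and do not reflect the `RagEngine` API.

```rust
/// Cosine similarity; returns 0.0 when either vector has zero norm.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Score candidates exactly, drop those below `min_score`, keep the top `k`.
fn rerank(
    query: &[f32],
    candidates: &[(i64, Vec<f32>)],
    k: usize,
    min_score: f32,
) -> Vec<(i64, f32)> {
    let mut scored: Vec<(i64, f32)> = candidates
        .iter()
        .map(|(id, v)| (*id, cosine(query, v)))
        .filter(|&(_, score)| score >= min_score)
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}

fn main() {
    let candidates = vec![
        (1, vec![1.0, 0.0]),
        (2, vec![0.0, 1.0]),
        (3, vec![1.0, 1.0]),
    ];
    for (id, score) in rerank(&[1.0, 0.0], &candidates, 2, 0.5) {
        println!("entity {} scored {:.3}", id, score);
    }
}
```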
### References
- MemRL (2601.03192), RAPO (2603.02958), Memex (2603.03561)
- SuperLocalMemory (2602.13398), NN-RAG (2511.20333)
### Technical
- 95 tests passing
---
## [0.9.0] - 2026-03-26
### Added
- **Vector Embedding Generation** - Real embeddings with sentence-transformers
- `EmbeddingGenerator` - Generate embeddings using `all-MiniLM-L6-v2` (384 dimensions)
- `sqlite-kg embed` CLI command - Generate embeddings for papers and skills
- Batch processing (100 entities/batch)
- Incremental mode - Skip entities with existing real embeddings
- `--force` flag - Regenerate all embeddings
### Changed
- **Search Command Fixed** - Now uses real query embeddings instead of dummy vectors
- Semantic search similarity scores now in 0.7-0.8+ range (previously 0.05-0.07)
- Results are highly relevant to query
### Technical
- New module: `src/embed.rs` (~400 lines)
- Python integration via subprocess for sentence-transformers
- 20 new unit tests for embedding functionality
- Integration test with dependency check
- Total: 60 tests passing
### Usage
```bash
# Generate embeddings
sqlite-kg embed --db kg.db
# Generate for papers only
sqlite-kg embed --db kg.db --papers
# Force regenerate all
sqlite-kg embed --db kg.db --force
# Semantic search
sqlite-kg search "brain network" --k 5 --db kg.db
```
### Dependencies
- Requires `sentence-transformers` Python package
- Virtual environment recommended: `python3 -m venv .venv && source .venv/bin/activate && pip install sentence-transformers`
---
## [0.8.0] - 2026-03-25
### Added
- **TurboQuant Vector Indexing** - Near-optimal vector quantization for instant search
- `TurboQuantIndex` - Fast approximate nearest neighbor search
- `TurboQuantConfig` - Configurable dimension, bit_width, and seed
- `KnowledgeGraph::create_turboquant_index()` - Create new index
- `KnowledgeGraph::build_turboquant_index()` - Build from existing vectors
- **Benefits:**
- Instant indexing (no training required)
- 6x memory compression
- Near-zero accuracy loss
- Up to 184,000x faster indexing vs Product Quantization
### Technical
- Added `rand` and `ndarray` dependencies
- New module: `src/vector/turboquant.rs`
- 4 new tests for TurboQuant functionality
- Total: 43 tests passing
### References
- Based on arXiv:2504.19874 (ICLR 2026)
- Google Research: "TurboQuant: Redefining AI efficiency with extreme compression"
---
## [0.7.0] - 2026-03-25
### Added
- **More Extension Functions** - Extended SQLite extension with graph algorithm functions
- `kg_pagerank(damping, max_iterations, tolerance)` - PageRank algorithm with configurable parameters
- `kg_louvain()` - Louvain community detection
- `kg_bfs(start_id, max_depth)` - BFS traversal from starting entity
- `kg_shortest_path(from_id, to_id, max_depth)` - Shortest path between entities
- `kg_connected_components()` - Find connected components in graph
- All functions support multiple parameter overloads
- Returns JSON with algorithm info and parameters
### Changed
- Updated `src/extension.rs` with new SQL functions
- Updated README.md with new extension function documentation
- Updated Implementation Status table
### Technical
- 34 tests passing (33 unit + 1 extension test)
- Extension functions use sqlite-loadable crate
- Functions support optional parameters with defaults
---
## [0.6.0] - 2026-03-25
### Added
- **SQLite Extension Support** (experimental)
- `src/extension.rs` - Extension entry points for macOS/Linux
- Can be compiled as loadable extension (.dylib/.so)
- Functions: `kg_version`, `kg_stats`, `kg_search`, `kg_bfs`, `kg_shortest_path`, `kg_pagerank`, `kg_louvain`, `kg_connected_components`
### Known Issues
- Extension loading may cause SIGSEGV on some platforms
- Recommend using CLI tool (`sqlite-kg`) or Rust API instead
### Technical
- Added `load_extension` feature to rusqlite
- Compiled extension at: `target/release/libsqlite_knowledge_graph.dylib`
---
## [0.5.0] - 2026-03-25
### Added
- **Graph Algorithms Module** (`src/algorithms/`)
- `pagerank()` - PageRank centrality with configurable damping
- `louvain_communities()` - Community detection via modularity optimization
- `connected_components()` - Weakly connected components
- `strongly_connected_components()` - Kosaraju's SCC algorithm
- `analyze_graph()` - Full graph analysis (PageRank + Louvain + Components)
- **New Types**
- `PageRankConfig` - PageRank configuration (damping, iterations, tolerance)
- `CommunityResult` - Community memberships and modularity score
- `GraphAnalysis` - Complete graph analysis results
- **KnowledgeGraph API**
- `kg_pagerank()` - Compute centrality scores
- `kg_louvain()` - Detect communities
- `kg_connected_components()` - Find connected components
- `kg_analyze()` - Run full analysis
### Technical
- 38 tests passing (33 unit + 5 integration)
- Full graph algorithm coverage
---
## [0.4.0] - 2026-03-25
### Added
- **Graph Traversal Module** (`src/graph/traversal.rs`)
- `bfs_traversal()` - Breadth-first search with depth tracking
- `dfs_traversal()` - Depth-first search with depth tracking
- `find_shortest_path()` - BFS-based shortest path between entities
- `compute_graph_stats()` - Graph statistics (entities, relations, density)
- **New Types**
- `TraversalNode` - Node with depth information
- `TraversalPath` - Complete path with steps
- `PathStep` - Edge information in path
- `GraphStats` - Graph statistics
- `Direction` - Traversal direction (Outgoing/Incoming/Both)
- `TraversalQuery` - Query parameters for traversal
- **KnowledgeGraph API**
- `kg_bfs_traversal()` - BFS from entity
- `kg_dfs_traversal()` - DFS from entity
- `kg_shortest_path()` - Shortest path between entities
- `kg_graph_stats()` - Get graph statistics
### Technical
- 32 tests passing (27 unit + 5 integration)
- New traversal module with comprehensive test coverage
---
## [0.3.0] - 2026-03-25
### Added
- **RAG Integration Module** (`src/rag/`)
- `kg_semantic_search()` - Semantic search with vector similarity
- `kg_get_context()` - Get entity context with related entities
- `kg_hybrid_search()` - Combine keyword and semantic search
- **CLI Tool** (`sqlite-kg`)
- `migrate` command - Migrate data from knowledge.db
- `search` command - Search entities
- `stats` command - Show knowledge graph statistics
- **Data Migration**
- Migrated 2,497 papers from knowledge.db
- Migrated 122 skills from knowledge.db
- Built 1,480,951 relations between entities
### Changed
- Enhanced entity storage with metadata support
- Improved relation storage with confidence scores
### Technical
- 27 tests passing (22 unit + 5 integration)
- Vector storage using placeholder zero vectors
---
## [0.2.0] - 2026-03-24
### Added
- **Entity Storage Module** (`src/graph/entity.rs`)
- `kg_create_entity()` - Create entities
- `kg_get_entity()` - Get entity by ID
- `kg_update_entity()` - Update entity properties
- `kg_delete_entity()` - Delete entity
- `kg_list_entities()` - List entities with filters
- **Relation Storage Module** (`src/graph/relation.rs`)
- `kg_create_relation()` - Create relations
- `kg_get_relation()` - Get relation by ID
- `kg_delete_relation()` - Delete relation
- `kg_get_related_entities()` - Get related entities
- **Vector Storage Module** (`src/vector/store.rs`)
- `kg_set_embedding()` - Set entity embedding
- `kg_get_embedding()` - Get entity embedding
- `kg_find_similar()` - Find similar entities
- **Database Schema** (`src/schema.rs`)
- Entities table with metadata JSON
- Relations table with confidence scores
- Embeddings table for vector storage
- **SQLite Custom Functions** (`src/functions.rs`)
- All functions registered as SQLite UDFs
### Changed
- Fixed compilation errors (rusqlite features, module declarations)
- Added proper error handling with `thiserror`
### Technical
- 24 tests passing (19 unit + 5 integration)
- Production-ready core modules
---
## [0.1.0] - 2026-03-24
### Added
- Project initialization
- Rust project scaffolding with Cargo.toml
- Module structure: `graph/`, `vector/`, `rag/`
- Basic SQLite function registration
- MIT License
- README.md with project overview
- DEVLOG.md for development tracking
- Technical research report (`research.md`)
### Technical
- 1 test passing
- Project compiles successfully
---
## Version History
| Version | Date | Highlights |
|---------|------|------------|
| 0.13.0 | 2026-05-21 | QuaQue bitstring versioning, version diff/merge, schema v4 |
| 0.12.0 | 2026-05-20 | SmartVector: temporal confidence, four-signal retrieval, ripple propagation |
| 0.11.1 | 2026-04-08 | CLI fixes: global `--db` flag in any position, `--help` / `-h` / `help` |
| 0.11.0 | 2026-04-06 | Async API (tokio spawn_blocking) |
| 0.10.3 | 2026-04-02 | Schema auto-migration, cache invalidation upgrade |
| 0.10.2 | 2026-04-01 | Persistent TurboQuant index (SQLite cache) |
| 0.10.1 | 2026-03-31 | All P0/P1/P2 quality issues resolved |
| 0.10.0 | 2026-03-31 | Paper-driven two-stage RAG engine |
| 0.9.0 | 2026-03-26 | Vector embedding generation (sentence-transformers) |
| 0.8.0 | 2026-03-25 | TurboQuant vector indexing (ANN) |
| 0.7.0 | 2026-03-25 | More extension functions (PageRank, Louvain, BFS, Shortest Path) |
| 0.6.0 | 2026-03-25 | SQLite extension support |
| 0.5.0 | 2026-03-25 | Graph algorithms (PageRank, Louvain, Connected Components) |
| 0.4.0 | 2026-03-25 | Graph traversal (BFS, DFS, Shortest Path) |
| 0.3.0 | 2026-03-25 | RAG integration, data migration |
| 0.2.0 | 2026-03-24 | Core modules (entity, relation, vector) |
| 0.1.0 | 2026-03-24 | Project initialization |
---
[Unreleased]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.13.0...HEAD
[0.13.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.12.0...v0.13.0
[0.12.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.11.1...v0.12.0
[0.11.1]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.11.0...v0.11.1
[0.11.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.10.3...v0.11.0
[0.10.3]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.10.2...v0.10.3
[0.10.2]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.10.1...v0.10.2
[0.10.1]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.10.0...v0.10.1
[0.10.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.9.0...v0.10.0
[0.9.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.8.0...v0.9.0
[0.8.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.7.0...v0.8.0
[0.7.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.6.0...v0.7.0
[0.6.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.5.0...v0.6.0
[0.5.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.4.0...v0.5.0
[0.4.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/compare/v0.3.0...v0.4.0
[0.3.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/releases/tag/v0.3.0
[0.2.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/releases/tag/v0.2.0
[0.1.0]: https://github.com/hiyenwong/sqlite-knowledge-graph/releases/tag/v0.1.0