Feather DB
Embedded vector database + living context engine
Part of Hawky.ai — AI-Native Digital Marketing OS
Feather DB is "SQLite for Vectors" — a fast, zero-server, file-based vector store with a built-in knowledge graph and adaptive memory decay. No separate database server required.
What's Inside (v0.5.0)
| Capability | Description |
|---|---|
| ANN Search | Approximate nearest-neighbor search via HNSW, typically 0.5–1.5 ms per query at k=10 |
| Multimodal Pockets | Text, image, audio vectors stored per entity under a single ID |
| Context Graph | Typed + weighted edges, reverse index, auto-link by similarity |
| Living Context | Recall-count-based sticky memory — frequently accessed items resist decay |
| Namespace / Entity / Attributes | Generic partition + subject + KV metadata for any domain |
| Graph Visualizer | Self-contained D3 force-graph HTML — fully offline, no CDN |
| Single-file persistence | .feather binary format (v5); v3/v4 files load transparently |
Installation
CLI (Rust):
Build from source:
Quick Start
# Open or create a database
=
# Add a vector with metadata
=
=
= 0.9
# Semantic search
=
Core Features
Multimodal Pockets
Each named modality gets its own independent HNSW index with its own dimensionality. A single entity ID can hold text, visual, and audio vectors simultaneously.
# 768-dim
# 512-dim
# 256-dim
=
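To make the "one index per modality, one ID across modalities" idea concrete, here is a minimal pure-Python sketch. It is not Feather DB's implementation: `ToyPockets` and its methods are hypothetical names, and it uses brute-force L2 search where Feather DB uses HNSW. It only illustrates that each modality fixes its own dimensionality and that the same entity ID can appear in several modalities.

```python
import math


class ToyPockets:
    """Hypothetical stand-in for per-modality vector storage (brute force, no HNSW)."""

    def __init__(self):
        # modality name -> (dimensionality, {entity_id: vector})
        self._pockets = {}

    def add(self, entity_id, vector, modality="text"):
        # The first vector added to a modality fixes that modality's dimensionality.
        dim, vectors = self._pockets.setdefault(modality, (len(vector), {}))
        if len(vector) != dim:
            raise ValueError(f"{modality!r} expects {dim}-dim vectors, got {len(vector)}")
        vectors[entity_id] = list(vector)

    def search(self, query, k=5, modality="text"):
        dim, vectors = self._pockets[modality]
        if len(query) != dim:
            raise ValueError(f"{modality!r} expects {dim}-dim queries, got {len(query)}")
        # Exact nearest neighbours by Euclidean distance, closest first.
        ranked = sorted(vectors.items(), key=lambda kv: math.dist(query, kv[1]))
        return [(eid, math.dist(query, vec)) for eid, vec in ranked[:k]]


pockets = ToyPockets()
pockets.add(1, [1.0, 0.0, 0.0], modality="text")    # 3-dim text vector
pockets.add(1, [0.0, 1.0], modality="visual")       # same ID, 2-dim visual vector
pockets.add(2, [0.0, 0.0, 1.0], modality="text")
nearest_text = pockets.search([1.0, 0.0, 0.0], k=1, modality="text")
```

Searching one modality never touches the others, which is why their dimensionalities can differ.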
Context Graph
Typed, weighted edges between records. Nine built-in relationship types plus free-form strings.
# Link records with typed relationships
# Query graph structure
= # outgoing edges
= # reverse index
# Auto-create edges by vector similarity
Built-in relationship types: related_to, derived_from, caused_by, contradicts, supports, precedes, part_of, references, multimodal_of.
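The edge model described above (typed, weighted edges, a reverse index, and auto-linking by similarity) can be sketched in plain Python. This is an illustration under stated assumptions, not Feather DB's code: `ToyEdge`, `ToyEdgeStore`, `auto_link`, and the `threshold` parameter are hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyEdge:
    target_id: int
    rel_type: str
    weight: float


class ToyEdgeStore:
    """Hypothetical in-memory edge store with a reverse index."""

    def __init__(self):
        self._out = defaultdict(list)  # source_id -> [ToyEdge]
        self._in = defaultdict(list)   # target_id -> [(source_id, rel_type, weight)]

    def link(self, source_id, target_id, rel_type="related_to", weight=1.0):
        # Every edge is written twice so incoming lookups need no full scan.
        self._out[source_id].append(ToyEdge(target_id, rel_type, weight))
        self._in[target_id].append((source_id, rel_type, weight))

    def edges(self, source_id):
        return list(self._out.get(source_id, []))

    def incoming(self, target_id):
        return list(self._in.get(target_id, []))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def auto_link(store, vectors, threshold=0.9):
    """Link every pair whose cosine similarity reaches the (assumed) threshold."""
    ids = sorted(vectors)
    created = 0
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                store.link(a, b, "related_to", sim)
                created += 1
    return created


store = ToyEdgeStore()
store.link(1, 2, "derived_from", 0.8)
store.link(3, 2, "supports", 0.5)
created = auto_link(store, {10: [1.0, 0.0], 11: [1.0, 0.1], 12: [0.0, 1.0]})
```

The all-pairs loop is quadratic; a real implementation would more likely reuse the ANN index to find candidate neighbours.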
Context Chain (Vector Search + Graph Expansion)
One call that combines semantic vector search with n-hop BFS graph traversal:
=
Score = similarity × hop_decay × importance × stickiness
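As a rough illustration of how the score above could be applied during graph expansion, the sketch below runs a breadth-first traversal from a set of search hits. Several details are assumptions, since the README does not specify them: `hop_decay` is modelled as `decay ** hops`, expanded nodes inherit the similarity of the seed they were reached from, stickiness follows the `1 + log(1 + recall_count)` formula from the Living Context section, and the function and parameter names are hypothetical.

```python
import math
from collections import deque


def context_chain_sketch(seeds, graph, importance, recall_count, max_hops=2, decay=0.5):
    """seeds: {id: similarity}; graph: {id: [neighbour ids]}. Returns [(id, score)] best first."""
    scores = {}
    for seed, similarity in seeds.items():
        seen = {seed}
        queue = deque([(seed, 0)])
        while queue:
            node, hops = queue.popleft()
            stickiness = 1.0 + math.log1p(recall_count.get(node, 0))
            score = similarity * (decay ** hops) * importance.get(node, 1.0) * stickiness
            # Keep the best score if a node is reachable from several seeds.
            scores[node] = max(scores.get(node, 0.0), score)
            if hops < max_hops:
                for neighbour in graph.get(node, []):
                    if neighbour not in seen:
                        seen.add(neighbour)
                        queue.append((neighbour, hops + 1))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


graph = {1: [2], 2: [3], 3: [4]}
ranked = context_chain_sketch(
    seeds={1: 0.9},
    graph=graph,
    importance={1: 1.0, 2: 0.5, 3: 1.0},
    recall_count={},
    max_hops=2,
)
```

With `max_hops=2`, node 4 sits three hops from the seed and is never scored.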
Namespace / Entity / Attributes
Generic partitioning for multi-tenant, multi-domain use:
# Build metadata with domain profile
=
=
# Filter by namespace + entity + attribute
=
=
Works for any domain — healthcare, e-commerce, finance — by subclassing DomainProfile.
Living Context / Adaptive Decay
Records accessed more frequently resist temporal decay:
=
=
Formula:
stickiness = 1 + log(1 + recall_count)
effective_age = age_in_days / stickiness
recency = 0.5 ^ (effective_age / half_life_days)
final_score = ((1 - time_weight) * similarity + time_weight * recency) * importance
touch() is called automatically on every search hit. Call db.touch(id) manually to boost salience.
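The formula can be transcribed directly into Python to see how recall count slows decay. This is a sketch, not the library's code: it assumes `log` is the natural logarithm, and the default values for `half_life_days` and `time_weight` are placeholders chosen for illustration.

```python
import math


def stickiness(recall_count):
    # Assumes natural log; grows slowly, so heavy recall gives diminishing returns.
    return 1.0 + math.log(1.0 + recall_count)


def final_score(similarity, age_in_days, recall_count,
                importance=1.0, half_life_days=30.0, time_weight=0.3):
    effective_age = age_in_days / stickiness(recall_count)
    recency = 0.5 ** (effective_age / half_life_days)
    return ((1.0 - time_weight) * similarity + time_weight * recency) * importance


# Same record, same age: never recalled vs. recalled 20 times.
cold = final_score(0.8, age_in_days=30, recall_count=0, time_weight=0.5)
warm = final_score(0.8, age_in_days=30, recall_count=20, time_weight=0.5)
```

For the never-recalled record, stickiness is 1, so after exactly one half-life its recency is 0.5; the frequently recalled record has a smaller effective age and therefore scores higher.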
Graph Visualization
Exports a self-contained, offline D3 force-graph HTML — no CDN, no server:
# Interactive HTML force graph
# JSON for D3 / Cytoscape (namespace-filtered)
=
Import / Export
# D3 / Cytoscape-compatible JSON
=
# Raw vector retrieval
=
=
# Metadata update without touching HNSW index
Filtered Search
=
Metadata Fields
=
= # Unix timestamp
= 0.9 # [0.0–1.0]
= # FACT | PREFERENCE | EVENT | CONVERSATION
=
=
=
= # partition key
= # subject key
# safe KV setter (use this, not meta.attributes['k']=v)
=
Rust CLI
# Add a record
# Search
# Link two records
Performance
| Metric | Value |
|---|---|
| Add rate | 2,000–5,000 vectors/sec |
| Search latency (k=10) | 0.5–1.5 ms |
| Max vectors per modality | 1,000,000 by default (raise max_elements in get_or_create_index) |
| HNSW params | M=16, ef_construction=200 |
| File format | Binary .feather v5 |
SIMD (AVX2/AVX512) optimizations are available in space_l2.h. Enable with -DUSE_AVX -march=native in setup.py.
File Format
[magic: 4B = "FEAT"] [version: 4B = 5]
[meta_count: 4B]
  for each record:
    [id: 8B] [serialized Metadata including namespace/entity/attributes/edges]
[modal_count: 4B]
  for each modality:
    [name_len: 2B] [name: N bytes]
    [dim: 4B] [element_count: 4B]
    for each element:
      [id: 8B] [float32 vector: dim * 4 bytes]
v3 and v4 files load transparently — missing fields default to empty.
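The fixed-width parts of this layout map naturally onto Python's `struct` module. The sketch below round-trips a header and one modality block in memory. It rests on assumptions the README does not confirm: little-endian byte order, unsigned integers, the magic stored as the four raw bytes `FEAT`, and UTF-8 modality names. The serialized Metadata section is skipped because its encoding is not described here, so this does not read real .feather files.

```python
import struct

MAGIC = b"FEAT"


def pack_header(version, meta_count):
    # 4-byte magic, 4-byte version, 4-byte record count (assumed little-endian).
    return struct.pack("<4sII", MAGIC, version, meta_count)


def unpack_header(buf):
    magic, version, meta_count = struct.unpack_from("<4sII", buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic: not a .feather file")
    return version, meta_count


def pack_modality(name, dim, vectors):
    """vectors: {id: [float, ...]}, each of length dim."""
    raw_name = name.encode("utf-8")
    parts = [struct.pack("<H", len(raw_name)), raw_name,
             struct.pack("<II", dim, len(vectors))]
    for vec_id, vec in vectors.items():
        if len(vec) != dim:
            raise ValueError("vector length does not match dim")
        parts.append(struct.pack(f"<Q{dim}f", vec_id, *vec))
    return b"".join(parts)


def unpack_modality(buf, offset=0):
    (name_len,) = struct.unpack_from("<H", buf, offset)
    offset += 2
    name = buf[offset:offset + name_len].decode("utf-8")
    offset += name_len
    dim, count = struct.unpack_from("<II", buf, offset)
    offset += 8
    vectors = {}
    for _ in range(count):
        vec_id, *vec = struct.unpack_from(f"<Q{dim}f", buf, offset)
        vectors[vec_id] = vec
        offset += 8 + 4 * dim  # 8-byte id plus dim float32 values
    return name, dim, vectors, offset


header = pack_header(version=5, meta_count=3)
block = pack_modality("text", 2, {7: [0.5, 0.25], 9: [1.0, -2.0]})
name, dim, vectors, end = unpack_modality(block)
```

The example values are exactly representable as float32, so they survive the round trip unchanged; arbitrary Python floats would be rounded to single precision.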
Examples
| File | Description |
|---|---|
| examples/context_graph_demo.py | Full context graph demo: auto-link, context_chain, D3 HTML export |
| examples/marketing_living_context.py | Multi-brand namespace/entity/attribute filtering + importance feedback |
| examples/feather_inspector.py | Local HTTP inspector: force graph, PCA scatter, edit, delete |
Run any example:
Architecture
[Generic Core — C++17]
feather::DB
├── modality_indices_ (unordered_map<string, ModalityIndex>) — one HNSW per modality
├── metadata_store_ (unordered_map<uint64_t, Metadata>) — shared metadata by ID
└── Methods: add, search, link, context_chain, auto_link, export_graph_json ...
[Python Layer]
feather_db (pybind11)
├── DB, Metadata, ContextType, ScoringConfig
├── Edge, IncomingEdge, ContextNode, ContextEdge, ContextChainResult
├── FilterBuilder — fluent search filter helper
├── DomainProfile — generic namespace/entity/attributes base class
├── MarketingProfile — digital marketing typed adapter
├── RelType — standard relationship type constants
└── graph.visualize() — D3 force-graph HTML exporter
[Rust CLI]
feather-db-cli (FFI via extern "C" from src/feather_core.cpp)
Known Limitations
| Issue | Detail |
|---|---|
| No concurrent writes | HNSW is not thread-safe for simultaneous adds |
| No physical vector deletion | HNSW only marks records as deleted; the data stays in the file until compaction |
| Max 1M vectors/modality | Hardcoded in get_or_create_index; increase max_elements to raise |
| meta.attributes['k'] = v silent no-op | pybind11 map copy; use meta.set_attribute(k, v) |
| tags_json is raw string | Tag filtering uses substring search, not proper JSON parsing |
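The tags_json limitation in the table above is easy to trip over, so here is a small sketch of the failure mode and a caller-side workaround. It assumes `tags_json` holds a JSON array of strings; the helper names are hypothetical, and the substring check only approximates what the table describes rather than reproducing Feather DB's filter code.

```python
import json


def has_tag_substring(tags_json, tag):
    # Mirrors the described behaviour: a plain substring test on the raw string.
    return tag in tags_json


def has_tag_parsed(tags_json, tag):
    # Caller-side post-filter: parse the JSON and compare whole tags.
    try:
        tags = json.loads(tags_json)
    except json.JSONDecodeError:
        return False
    return isinstance(tags, list) and tag in tags


tags_json = '["summer_sale", "video"]'
false_positive = has_tag_substring(tags_json, "sale")  # matches inside "summer_sale"
exact_only = has_tag_parsed(tags_json, "sale")         # no tag is exactly "sale"
```

Until tag filtering parses JSON, one option is to over-fetch with the built-in filter and re-check results with a parsed comparison like `has_tag_parsed`.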
Contributing
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
See CONTRIBUTING.md for details.
License
MIT — see LICENSE