iridium-db 0.3.0

A high-performance vector-graph hybrid storage and indexing engine.

Here are a few additional **superpowers** of Iridium that stand out in real-world scenarios. Each comes with a concise motivating use case that highlights something different from the fraud detection (read-heavy hybrid) and video recommendations (write-heavy social graph) examples we already covered.

### 1. Superpower: Extremely low-latency incremental updates + point-in-time views  
   **Use case: Live supply-chain risk monitoring & disruption prediction**

**Context**  
A global logistics company tracks ~120 million shipments annually across 1.2 million suppliers, ports, vessels, trucks, and warehouses. Every minute, thousands of events arrive: GPS pings, customs delays, weather alerts, port congestion reports, geopolitical news signals.

**Requirements**  
- Ingest and reflect every new event in <5 ms p99 (no batching delay).  
- Answer “What-if” questions instantly: “Show me the impact if this Suez Canal blockage lasts 72 more hours” → traverse affected supply chains and re-rank risk using updated vectors (disruption embeddings).  
- Support **point-in-time queries** (“What did the risk graph look like 3 hours ago?”) for auditing & forensic analysis.

**Why Iridium shines here**  
- **Delta append model** → every GPS ping / news signal = one tiny `EdgeDelta` + `VectorDelta` → sub-millisecond ingest.  
- **LSM multi-versioning** → natural point-in-time snapshots (query with a timestamp filter on version; no extra copy-on-write).  
- **Graph-aware compaction** prioritizes high-risk chains (e.g., nodes with sudden degree spikes from rerouting) → keeps hot paths consolidated.  
- **Vector co-location** ensures disruption embeddings for connected ports/suppliers stay prefetchable.
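
To make the delta-append and point-in-time ideas concrete, here is a minimal in-memory sketch. It is not Iridium's API or storage format: `DeltaLog`, `append_edge`, `append_vector`, `neighbors_at`, and `vector_at` are hypothetical names, and the only names taken from the text above are `EdgeDelta` and `VectorDelta`. It assumes a single monotonically increasing version counter shared by edge and vector deltas, so a historical view is simply "replay every delta with version <= V".

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class EdgeDelta:
    version: int          # commit version (assumed monotonically increasing)
    src: str
    dst: str
    deleted: bool = False  # tombstone marker for a removed edge


@dataclass(frozen=True)
class VectorDelta:
    version: int
    node: str
    vector: Tuple[float, ...]


class DeltaLog:
    """Toy append-only log (hypothetical; not Iridium's on-disk layout)."""

    def __init__(self) -> None:
        self._edges: List[EdgeDelta] = []
        self._vectors: List[VectorDelta] = []
        self._next_version = 1

    def _take_version(self) -> int:
        version = self._next_version
        self._next_version += 1
        return version

    def append_edge(self, src: str, dst: str, deleted: bool = False) -> int:
        # Ingest is a pure append: no existing record is rewritten.
        version = self._take_version()
        self._edges.append(EdgeDelta(version, src, dst, deleted))
        return version

    def append_vector(self, node: str, vector: Tuple[float, ...]) -> int:
        version = self._take_version()
        self._vectors.append(VectorDelta(version, node, vector))
        return version

    def neighbors_at(self, src: str, version: int) -> Set[str]:
        """Out-neighbors of `src` as of `version` (point-in-time view)."""
        result: Set[str] = set()
        for delta in self._edges:
            if delta.version > version:
                break  # both lists are ordered by version
            if delta.src != src:
                continue
            if delta.deleted:
                result.discard(delta.dst)
            else:
                result.add(delta.dst)
        return result

    def vector_at(self, node: str, version: int) -> Optional[Tuple[float, ...]]:
        """Latest vector written for `node` at or before `version`."""
        latest = None
        for delta in self._vectors:
            if delta.version > version:
                break
            if delta.node == node:
                latest = delta.vector
        return latest
```

A rerouting event would then be two appends (a tombstone for the old edge and a new edge), and an audit query passes an older version to `neighbors_at` to see the route as it was. A real engine would index deltas rather than scan them linearly; the scan here only keeps the sketch short.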

**Contrast to alternatives** (e.g., Weaviate / Neo4j + Kafka)  
- Most systems either batch events (10–60 s windows, so ingest lag of the same order) or rely on eventually consistent indexes (roughly 100–1000 ms ingest lag).  
- Point-in-time queries require expensive snapshots or a separate time-series DB.  
- Iridium: **<5 ms ingest + instant historical views** → analysts see live “what-if” rerouting simulations in seconds.

### 2. Superpower: Native multi-vector support + vector-grouped traversal  
   **Use case: Multi-modal enterprise knowledge search & compliance discovery**

**Context**  
A large law firm or regulatory agency ingests millions of documents daily: contracts, emails, chat logs, scanned PDFs, audio transcripts, images of whiteboards/handwritten notes.

Each document has **multiple embeddings**:
- text content (E5-large)
- summary / key entities (short-context model)
- legal-risk vector (fine-tuned on regulations)
- OCR-extracted image embeddings (CLIP-like)
- speaker-tone embeddings from audio

**Requirements**  
- Search across any combination of modalities (“find contracts similar in text + high legal risk + suspicious tone in attached call recording”).  
- Traverse document → cited documents → people → related emails → attachments (6–8 hops).  
- Return ranked results with per-vector scores.

**Why Iridium shines here**  
- **Multi-vector per node** (up to 4–8 inline descriptors) → one node holds all modalities.  
- **Vector-grouped traversal** → filter traversal by vector space (“only follow edges if the similarity score in the legal_risk_vector space is > 0.85”).  
- **Co-location during compaction** keeps a contract + its attachments + cited docs + people physically close → fast multi-modal expansion.  
- **Hybrid scoring** natively combines cosine scores from different spaces.
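
The sketch below illustrates what vector-conditioned traversal and hybrid scoring could look like, under stated assumptions. `Graph`, `conditioned_traverse`, and `hybrid_score` are hypothetical names, not Iridium's API; the traversal is a plain breadth-first search that only expands into neighbors whose vector in the chosen space clears a cosine threshold, and the hybrid score is assumed to be a weighted average of per-space cosine similarities.

```python
import math
from collections import deque
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Graph:
    """Toy graph where each node carries several named vectors."""

    def __init__(self) -> None:
        self.vectors: Dict[str, Dict[str, Vector]] = {}
        self.edges: Dict[str, List[str]] = {}

    def add_node(self, node_id: str, **vectors: Vector) -> None:
        self.vectors[node_id] = dict(vectors)
        self.edges.setdefault(node_id, [])

    def add_edge(self, src: str, dst: str) -> None:
        self.edges.setdefault(src, []).append(dst)


def conditioned_traverse(graph: Graph, start: str, space: str, query: Vector,
                         threshold: float, max_hops: int) -> List[str]:
    """BFS from `start`; a neighbor is visited only if its vector in `space`
    has cosine similarity >= `threshold` to `query`. The start node itself
    is always included without being checked."""
    seen = {start}
    order = [start]
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        for nxt in graph.edges.get(node, []):
            if nxt in seen:
                continue
            vec = graph.vectors.get(nxt, {}).get(space)
            if vec is None or cosine(vec, query) < threshold:
                continue  # condition not met: do not follow this edge
            seen.add(nxt)
            order.append(nxt)
            frontier.append((nxt, hops + 1))
    return order


def hybrid_score(graph: Graph, node: str, queries: Dict[str, Vector],
                 weights: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (weighted average score, per-space cosine scores)."""
    per_space = {}
    for space, query in queries.items():
        vec = graph.vectors[node].get(space)
        per_space[space] = cosine(vec, query) if vec is not None else 0.0
    total_weight = sum(weights.get(space, 0.0) for space in queries)
    if not total_weight:
        return 0.0, per_space
    combined = sum(weights.get(s, 0.0) * per_space[s] for s in per_space)
    return combined / total_weight, per_space
```

Returning the per-space scores alongside the combined one matches the "ranked results with per-vector scores" requirement; the weighting scheme itself is an illustrative choice, and other combinations (max, learned rankers) would fit the same interface.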

**Contrast to alternatives** (Weaviate, Pinecone + graph DB)  
- Weaviate supports multi-vector per object but traversal is limited to cross-refs (not vector-conditioned).  
- Bolt-on setups require separate indexes per modality → painful joins across 6–8 hops.  
- Iridium: **multiple vectors on one graph node + conditioned traversal** → feels like one coherent knowledge graph.

### 3. Superpower: Predictable p99 under mixed read/write + bursty traffic  
   **Use case: Live event-driven cybersecurity threat hunting**

**Context**  
A cybersecurity SOC monitors 10,000+ enterprise endpoints + cloud workloads. Every second: endpoint telemetry, network flows, log lines, threat-intel feeds.

**Requirements**  
- Ingest bursts of 500k–2M events/min during attacks (e.g., ransomware propagation).  
- Simultaneously run analyst queries: “Show me all endpoints similar to this compromised host (behavior vector), then traverse lateral movement paths (processes → connections → files) up to 8 hops.”  
- p99 < 120 ms even during 10× traffic spikes.

**Why Iridium shines here**  
- **LSM separation** — writes append to memtable/WAL → no read blocking.  
- **Graph-aware compaction** detects bursty lateral-movement chains (sudden degree spikes) → early consolidation of hot attack paths.  
- **Buffer pool pinning + prefetch** keeps active investigations (analyst sessions) pinned during bursts.  
- **Dynamic Bloom filters + fence pointers** skip irrelevant SSTables quickly → stable p99.
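
As a rough illustration of why Bloom filters and fence pointers keep reads cheap while writes keep landing in the memtable, here is a toy LSM read path. Everything in it is an assumption made for the sketch: `BloomFilter`, `SSTable`, and `LSMStore` are hypothetical classes, the Bloom filter is fixed-size rather than dynamic, and the "fence" is simplified to one min/max key pair per table (real engines typically keep fence pointers per block).

```python
import hashlib
from bisect import bisect_left
from typing import Dict, Iterator, List, Optional


class BloomFilter:
    """Fixed-size Bloom filter stored in a Python int used as a bitset."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: str) -> Iterator[int]:
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True may be a false positive.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable sorted run with a key-range fence and a Bloom filter."""

    def __init__(self, items: Dict[str, str]) -> None:
        self.keys = sorted(items)
        self.values = [items[k] for k in self.keys]
        self.min_key, self.max_key = self.keys[0], self.keys[-1]
        self.bloom = BloomFilter()
        for key in self.keys:
            self.bloom.add(key)

    def get(self, key: str) -> Optional[str]:
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


class LSMStore:
    def __init__(self) -> None:
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []  # newest first
        self.tables_probed = 0             # counts actual table lookups

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value  # writes never touch existing SSTables

    def flush(self) -> None:
        if self.memtable:
            self.sstables.insert(0, SSTable(self.memtable))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for table in self.sstables:
            if key < table.min_key or key > table.max_key:
                continue  # fence check: key is outside this table's range
            if not table.bloom.might_contain(key):
                continue  # Bloom check: key is definitely not in this table
            self.tables_probed += 1
            value = table.get(key)
            if value is not None:
                return value
        return None
```

The `tables_probed` counter exists only to show the effect: a lookup for a key outside every table's range returns without searching any table, which is the mechanism behind the stable tail latency claimed above.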

**Contrast to alternatives**  
- Many systems (Weaviate, Neo4j) can see p99 latency spike 5–20× during write bursts because of index contention or compaction storms.  
- Iridium’s LSM + graph-aware tuning keeps p99 **stable within 2–3× of baseline** even under extreme load.

### Quick Summary Table of Superpowers & Use Cases

| Superpower                              | Highlighted Use Case                          | Key Iridium Advantage vs Alternatives |
|-----------------------------------------|-----------------------------------------------|---------------------------------------|
| Low-latency incremental + PIT views     | Live supply-chain risk monitoring             | Sub-5 ms ingest + instant snapshots   |
| Native multi-vector + grouped traversal | Multi-modal enterprise knowledge search       | One node, conditioned multi-hop       |
| Predictable p99 under bursty mixed load | Live cybersecurity threat hunting             | Stable latency during attack spikes   |

Any of these resonate as a strong third example to round out the motivating story?  
- The supply-chain one emphasizes the **real-time + historical** duality.  
- The knowledge-search one shows **multi-modal richness**.  
- The cyber one demonstrates **mixed-workload resilience**.

Which one (or combination) would you like to flesh out into a full draft like the previous two? Or do you have another domain in mind (e.g., scientific literature graph, autonomous vehicle sensor fusion, etc.)?