Phago — Biological Computing Primitives

Status: Beta / Production-Ready

A framework that maps cellular biology mechanisms to computational operations. Agents self-organize, consume documents, build a Hebbian knowledge graph, share vocabulary, detect anomalies, and exhibit emergent collective behavior — all without top-down orchestration.

Latest Results (Production Release)

Metric	Before	After	Change
Tests passing	32/34	99/99	+67 passing, 100% pass rate
Graph edges (100 docs)	255,888	4,472	98.3% fewer edges
Best P@5	0.658 (TF-IDF)	0.742 (Hybrid)	+12.8%
Best MRR	0.714 (Graph)	0.800 (Hybrid)	+12.0%
Genome parameters	5	8	+3 wiring strategy params
Query types	1	5	BFS, Hybrid, Path, Centrality, Bridge
MCP tools	0	3	remember, recall, explore

What It Does

Feed the colony documents. Agents digest them into concepts, wire a knowledge graph through co-activation (Hebbian learning), share vocabulary across agent boundaries (horizontal gene transfer), and detect anomalies (negative selection). The graph structure IS the memory — frequently used connections strengthen, unused ones decay.

Documents → Agents digest → Concepts extracted → Graph wired → Knowledge emerges
                ↑                                      ↓
                └──── Transfer, Symbiosis, Dissolution ─┘

Quick Start

Run the Demos

# Build
cargo build

# Run the proof-of-concept (120-tick simulation)
cargo run --bin phago-poc

# Run all tests (99 tests)
cargo test --workspace

# Open the interactive visualization generated by the POC (macOS; use xdg-open on Linux)
open output/phago-colony.html

Use as a Library

Add to your Cargo.toml:

[dependencies]
phago = { git = "https://github.com/Clemens865/Phago_Project.git" }

Basic usage with the prelude:

use phago::prelude::*;

fn main() {
    let mut colony = Colony::new();

    // Ingest documents
    colony.ingest_document("doc1", "Cell membrane transport proteins", Position::new(0.0, 0.0));
    colony.ingest_document("doc2", "Protein folding and membrane insertion", Position::new(1.0, 0.0));

    // Spawn digesters and run
    colony.spawn(Box::new(Digester::new(Position::new(0.0, 0.0)).with_max_idle(30)));
    colony.run(30);

    // Query with hybrid scoring
    let results = hybrid_query(&colony, "membrane protein", &HybridConfig {
        alpha: 0.5, max_results: 5, candidate_multiplier: 3,
    });

    for r in results {
        println!("{} (score: {:.3})", r.label, r.final_score);
    }
}

See docs/INTEGRATION_GUIDE.md for complete examples and API reference.

Production Features

Single import: use phago::prelude::* gives you everything
Structured errors: Result<T, PhagoError> with typed error categories
Deterministic testing: Digester::with_seed(pos, seed) for reproducible simulations
Session persistence: Save/restore colony state across sessions (JSON or SQLite)
SQLite persistence: ColonyBuilder with auto-save for production deployments
Async runtime: AsyncColony with TickTimer for real-time visualization
MCP adapter: Ready for external LLM/agent integration
Semantic embeddings: Vector-based concept extraction (optional semantic feature)

SQLite Persistence (Phase 10)

Enable durable storage with automatic save/load:

[dependencies]
phago-runtime = { version = "0.1", features = ["sqlite"] }

use phago_runtime::prelude::*;

// Create colony with persistent storage
let mut colony = ColonyBuilder::new()
    .with_persistence("knowledge.db")  // SQLite file
    .auto_save(true)                   // Save on drop
    .build()?;

// Use normally — persistence is automatic
colony.ingest_document("title", "content", Position::new(0.0, 0.0));
colony.run(100);
colony.save()?;  // Explicit save (also happens on drop)

// Later: reload with full state preserved
let colony2 = ColonyBuilder::new()
    .with_persistence("knowledge.db")
    .build()?;

Async Runtime (Phase 10)

Enable controlled-rate simulation for visualization:

[dependencies]
phago-runtime = { version = "0.1", features = ["async"] }

use phago_runtime::prelude::*;
use phago_runtime::async_runtime::{run_in_local, TickTimer};

#[tokio::main]
async fn main() {
    let colony = Colony::new();

    // Fast async simulation
    run_in_local(colony, |ac| async move {
        ac.run_async(100).await
    }).await;

    // Or controlled tick rate for visualization
    let colony2 = Colony::new();
    run_in_local(colony2, |ac| async move {
        let mut timer = TickTimer::new(100);  // 100ms per tick
        timer.run_timed(&ac, 50).await;
    }).await;
}

Semantic Embeddings (Phase 9)

Enable vector embeddings for semantic understanding:

[dependencies]
phago = { version = "0.1", features = ["semantic"] }

use phago::prelude::*;
use std::sync::Arc;

// Create an embedder (SimpleEmbedder or API-backed)
let embedder: Arc<dyn Embedder> = Arc::new(SimpleEmbedder::new(256));

// SemanticDigester uses embeddings for concept extraction
let mut digester = SemanticDigester::new(Position::new(0.0, 0.0), embedder.clone());
let concepts = digester.digest_text("The mitochondria is the powerhouse of the cell.".into());

// Find semantically similar concepts
let similar = digester.find_similar("cellular energy", 5);

The semantic feature adds:

SimpleEmbedder — Hash-based embeddings (no dependencies)
SemanticDigester — Embedding-backed agent for semantic concept extraction
Chunker — Document chunking with configurable overlap
Similarity functions — cosine_similarity, euclidean_distance, normalize_l2
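
As a rough illustration of what the three similarity helpers conventionally compute, here is a minimal sketch. The signatures (f32 slices, in-place normalization, 0.0 for zero-norm input) are assumptions made for this example; the functions exported by the semantic feature may differ.

```rust
/// Dot product over the overlapping prefix of two vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Cosine similarity in [-1, 1]. Returns 0.0 when either vector has zero norm
/// (an assumption of this sketch, chosen to avoid dividing by zero).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let norm_a = dot(a, a).sqrt();
    let norm_b = dot(b, b).sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot(a, b) / (norm_a * norm_b)
}

/// Straight-line (L2) distance between two vectors.
pub fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Scale a vector in place to unit L2 length; zero vectors are left unchanged.
pub fn normalize_l2(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}
```

Once vectors are L2-normalized, cosine similarity reduces to a plain dot product, which is why the two helpers are often used together.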

LLM Integration (Phase 9.2)

Enable LLM-backed concept extraction:

[dependencies]
# Pick ONE of the lines below; a manifest cannot declare phago twice.

# Local LLM (Ollama)
phago = { version = "0.1", features = ["llm-local"] }

# Cloud APIs (Claude, OpenAI)
phago = { version = "0.1", features = ["llm-api"] }

# All backends
phago = { version = "0.1", features = ["llm-full"] }

use phago::prelude::*;

// The calls below use .await?, so run them inside an async fn that returns a Result.

// Local Ollama backend (no API key needed)
let ollama = OllamaBackend::localhost().with_model("llama3.2");
let concepts = ollama.extract_concepts("Cell membrane transport").await?;

// Claude backend
let claude = ClaudeBackend::new("sk-ant-...").sonnet();
let concepts = claude.extract_concepts("Cell membrane transport").await?;

// OpenAI backend
let openai = OpenAiBackend::new("sk-...").gpt4o_mini();
let concepts = openai.extract_concepts("Cell membrane transport").await?;

The llm features add:

OllamaBackend — Local LLM via Ollama (no API key needed)
ClaudeBackend — Anthropic Claude API
OpenAiBackend — OpenAI GPT API
LlmBackend trait — Common interface for all backends
Concept extraction — Extract structured concepts from text
Relationship identification — Find relationships between concepts
Query expansion — Expand queries for better recall

The Ten Biological Primitives

Primitive	Biological Analog	What It Does
DIGEST	Phagocytosis	Consume input, extract fragments, present to graph
APOPTOSE	Programmed cell death	Self-assess health, gracefully self-terminate
SENSE	Chemotaxis	Detect signals, follow gradients
TRANSFER	Horizontal gene transfer	Export/import vocabulary between agents
EMERGE	Quorum sensing	Detect threshold, activate collective behavior
WIRE	Hebbian learning	Strengthen used connections, prune unused
SYMBIOSE	Endosymbiosis	Integrate another agent as permanent symbiont
STIGMERGE	Stigmergy	Coordinate through environmental traces
NEGATE	Negative selection	Learn self-model, detect anomalies by exclusion
DISSOLVE	Holobiont boundary	Modulate agent-substrate boundaries

Agent Types

Digester — Consumes documents, extracts keywords, presents concepts to the knowledge graph. Implements DIGEST + SENSE + APOPTOSE + TRANSFER + SYMBIOSE + DISSOLVE.
Synthesizer — Dormant until quorum reached, then identifies bridge concepts and topic clusters. Implements EMERGE + SENSE + APOPTOSE.
Sentinel — Learns what "normal" looks like, flags anomalies by deviation from self-model. Implements NEGATE + SENSE + APOPTOSE.
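
To make the Sentinel's "detect anomalies by exclusion" idea concrete, the sketch below learns a set of known terms from normal documents and flags text whose share of unknown terms exceeds a threshold. This is an illustrative toy, not the Sentinel's actual self-model: the `SelfModel` type, the tokenizer, and the threshold are all assumptions.

```rust
use std::collections::HashSet;

/// Hypothetical self-model: the set of terms seen in "normal" documents.
#[derive(Default)]
pub struct SelfModel {
    known: HashSet<String>,
}

/// Lowercased alphanumeric tokens (a deliberately naive tokenizer).
fn tokens(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|s| !s.is_empty())
        .map(str::to_lowercase)
        .collect()
}

impl SelfModel {
    /// Absorb a normal document into the self-model.
    pub fn learn(&mut self, text: &str) {
        for t in tokens(text) {
            self.known.insert(t);
        }
    }

    /// Fraction of tokens that the self-model has never seen (0.0 for empty text).
    pub fn deviation(&self, text: &str) -> f64 {
        let toks = tokens(text);
        if toks.is_empty() {
            return 0.0;
        }
        let unknown = toks.iter().filter(|t| !self.known.contains(*t)).count();
        unknown as f64 / toks.len() as f64
    }

    /// Flag text as anomalous when its deviation exceeds the given threshold.
    pub fn is_anomalous(&self, text: &str, threshold: f64) -> bool {
        self.deviation(text) > threshold
    }
}
```

The key property of negative selection is visible here: the model never stores examples of anomalies, only a description of "self", and anything sufficiently outside it is flagged.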

Research Branches

Four falsifiable hypotheses, each with a working prototype, benchmark, visualization, and papers.

1. Bio-RAG — Self-Reinforcing Retrieval

Hebbian-reinforced knowledge graph retrieval with hybrid scoring (TF-IDF + graph re-ranking).

cargo run --bin phago-bio-rag-demo

Metric	Graph-only	TF-IDF	Hybrid
P@5	0.280	0.742	0.742
MRR	0.650	0.775	0.800
NDCG@10	0.357	0.404	0.410
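
For readers unfamiliar with the metrics in this table, the sketch below computes P@k and MRR from per-query relevance flags. It assumes the benchmark uses the standard definitions; the function names and the boolean-list input are choices made for this example, not the benchmark's own code.

```rust
/// Precision at k: the fraction of the top-k ranked results that are relevant.
/// `relevant[i]` says whether the result at rank i + 1 is relevant.
pub fn precision_at_k(relevant: &[bool], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let hits = relevant.iter().take(k).filter(|&&r| r).count();
    hits as f64 / k as f64
}

/// Reciprocal rank: 1 / (1-based rank of the first relevant result), or 0.0 if none.
pub fn reciprocal_rank(relevant: &[bool]) -> f64 {
    relevant
        .iter()
        .position(|&r| r)
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}

/// Mean reciprocal rank across queries (0.0 for an empty query set).
pub fn mean_reciprocal_rank(queries: &[Vec<bool>]) -> f64 {
    if queries.is_empty() {
        return 0.0;
    }
    let total: f64 = queries.iter().map(|q| reciprocal_rank(q)).sum();
    total / queries.len() as f64
}
```

P@5 rewards having many relevant results near the top, while MRR only cares how early the first relevant result appears, so two systems can tie on one metric and differ on the other.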

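Key insight: The graph's value is not in replacing TF-IDF but in re-ranking TF-IDF candidates using structural context. In the table above, Hybrid ties TF-IDF on P@5 (0.742) and beats it on MRR (0.800 vs 0.775) and NDCG@10 (0.410 vs 0.404), so the first relevant result tends to rank higher.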
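
One way to picture the re-ranking step is a two-stage pipeline: take an over-sized candidate pool by TF-IDF, then reorder it with a blend of lexical and graph scores. The sketch below assumes a linear blend `alpha * tfidf + (1 - alpha) * graph` and a pool of `max_results * candidate_multiplier` candidates; the parameter names mirror HybridConfig, but the formula, the `Candidate` type, and `hybrid_rerank` are assumptions, not the crate's implementation.

```rust
/// Hypothetical candidate with both scores assumed to be normalized to [0, 1].
#[derive(Debug, Clone)]
pub struct Candidate {
    pub label: String,
    pub tfidf: f64, // lexical score
    pub graph: f64, // structural score from the knowledge graph
}

/// Two-stage hybrid ranking: TF-IDF selects the pool, the blended score orders it.
pub fn hybrid_rerank(
    mut candidates: Vec<Candidate>,
    alpha: f64,
    max_results: usize,
    candidate_multiplier: usize,
) -> Vec<(String, f64)> {
    // Stage 1: keep the best max_results * candidate_multiplier by TF-IDF alone.
    candidates.sort_by(|a, b| b.tfidf.total_cmp(&a.tfidf));
    candidates.truncate(max_results * candidate_multiplier);

    // Stage 2: blend lexical and structural scores, then keep the top results.
    let mut scored: Vec<(String, f64)> = candidates
        .into_iter()
        .map(|c| {
            let score = alpha * c.tfidf + (1.0 - alpha) * c.graph;
            (c.label, score)
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(max_results);
    scored
}
```

Under this scheme the graph can only promote documents that TF-IDF already surfaced, which matches the observation that the graph helps by re-ranking rather than by replacing lexical retrieval.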

2. Agent Evolution — Evolutionary Agents Through Apoptosis

Agents evolving through intrinsic selection pressure (death + mutation + inheritance) produce richer knowledge graphs.

cargo run --bin phago-agent-evolution-demo

Metric (tick 300)	Evolved	Static	Random
Nodes	1,582	864	1,191
Edges	101,824	8,769	38,399
Clustering coeff.	0.969	0.948	0.970
Spawns / Generations	140 / 135	0 / 0	144 / 144

3. KG Training — Knowledge Graph to Training Data

Hebbian-weighted triples with curriculum ordering for language model fine-tuning.

cargo run --bin phago-kg-training-demo

Metric	Value
Communities detected	548
NMI vs ground truth	0.170
Triples exported	252,641
Foundation coherence	100% same-community
Weight ratio (foundation/periphery)	1.3x

4. Agentic Memory — Persistent Code Knowledge

Self-organizing code knowledge graph that persists across sessions.

cargo run --bin phago-agentic-memory-demo

Metric	Value
Code elements extracted	830
Graph nodes / edges	659 / 33,490
Session persistence	100% fidelity
Graph P@5	0.140

New Features (Ralph Loop Phase 1)

Hebbian LTP Model (Tentative Edge Wiring)

First co-occurrence creates edge at 0.1 weight (tentative)
Subsequent co-occurrences reinforce: weight += 0.1
Single-document edges decay quickly under synaptic pruning
Cross-document reinforced edges survive
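
The four rules above can be sketched with a weighted edge map. The 0.1 initial weight and 0.1 reinforcement come from the list; the weight cap, the multiplicative decay, the pruning threshold, and the `HebbianGraph` type itself are assumptions made for illustration.

```rust
use std::collections::HashMap;

const TENTATIVE_WEIGHT: f64 = 0.1; // first co-occurrence (from the list above)
const REINFORCEMENT: f64 = 0.1; // each later co-occurrence (from the list above)
const MAX_WEIGHT: f64 = 1.0; // assumed cap, not stated in the source

/// Hypothetical undirected concept graph keyed by an ordered label pair.
#[derive(Default)]
pub struct HebbianGraph {
    edges: HashMap<(String, String), f64>,
}

impl HebbianGraph {
    /// Order the pair so (a, b) and (b, a) address the same edge.
    fn key(a: &str, b: &str) -> (String, String) {
        if a <= b {
            (a.to_string(), b.to_string())
        } else {
            (b.to_string(), a.to_string())
        }
    }

    /// Create a tentative edge on first co-occurrence, reinforce it afterwards.
    pub fn co_activate(&mut self, a: &str, b: &str) {
        self.edges
            .entry(Self::key(a, b))
            .and_modify(|w| *w = (*w + REINFORCEMENT).min(MAX_WEIGHT))
            .or_insert(TENTATIVE_WEIGHT);
    }

    /// Shrink every weight by `decay` (a fraction) and drop edges below `threshold`.
    pub fn decay_and_prune(&mut self, decay: f64, threshold: f64) {
        self.edges.retain(|_, w| {
            *w *= 1.0 - decay;
            *w >= threshold
        });
    }

    /// Current weight of the edge between two concepts, if it exists.
    pub fn weight(&self, a: &str, b: &str) -> Option<f64> {
        self.edges.get(&Self::key(a, b)).copied()
    }
}
```

With a decay of 0.5 and a threshold of 0.08, an edge seen once (0.1) is pruned after one pass while an edge seen twice (0.2) survives, which is the behavior the last two bullets describe.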

Multi-Objective Fitness

Evolution scores each agent on four weighted dimensions:

30% Productivity — concepts + edges per tick
30% Novelty — novel concepts / total concepts
20% Quality — strong edges (co_act ≥ 2) / total edges
20% Connectivity — bridge edges / total edges
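
Read as a weighted sum, the four dimensions combine as in the sketch below. The weights come from the list above; treating each dimension as a value already normalized to [0, 1], and the `FitnessInputs` and `ratio` names, are assumptions of this example.

```rust
/// Hypothetical per-agent measurements, each assumed to be normalized to [0, 1].
pub struct FitnessInputs {
    pub productivity: f64, // concepts + edges per tick, normalized
    pub novelty: f64,      // novel concepts / total concepts
    pub quality: f64,      // strong edges / total edges
    pub connectivity: f64, // bridge edges / total edges
}

/// Safe ratio helper: returns 0.0 instead of dividing by zero.
pub fn ratio(numerator: usize, denominator: usize) -> f64 {
    if denominator == 0 {
        0.0
    } else {
        numerator as f64 / denominator as f64
    }
}

/// Weighted sum using the 30/30/20/20 split from the list above.
pub fn fitness(f: &FitnessInputs) -> f64 {
    0.30 * f.productivity + 0.30 * f.novelty + 0.20 * f.quality + 0.20 * f.connectivity
}
```

Because the weights sum to 1.0, the combined fitness stays in [0, 1] whenever the inputs do, which keeps scores comparable across generations.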

Structural Queries

// Path queries — "What connects A to B?"
graph.shortest_path(&from, &to) -> Option<(Vec<NodeId>, f64)>

// Centrality queries — "What's most important?"
graph.betweenness_centrality(100) -> Vec<(NodeId, f64)>

// Bridge queries — "What concepts connect domains?"
graph.bridge_nodes(10) -> Vec<(NodeId, f64)>

// Component queries — "How many disconnected regions?"
graph.connected_components() -> usize
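
As an illustration of what the component query counts, here is a standalone breadth-first sketch over a plain edge list. It uses u32 as a stand-in for NodeId and a free function instead of a graph method; both are assumptions, and the crate's own implementation may work differently.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Count connected components of an undirected graph.
/// `nodes` lists every node (so isolated nodes count), `edges` lists pairs.
pub fn connected_components(nodes: &[u32], edges: &[(u32, u32)]) -> usize {
    // Build an adjacency list with both directions of every edge.
    let mut adjacency: HashMap<u32, Vec<u32>> = HashMap::new();
    for &(a, b) in edges {
        adjacency.entry(a).or_default().push(b);
        adjacency.entry(b).or_default().push(a);
    }

    let mut seen: HashSet<u32> = HashSet::new();
    let mut count = 0;
    for &start in nodes {
        // Every unvisited node starts a new component.
        if !seen.insert(start) {
            continue;
        }
        count += 1;
        let mut queue = VecDeque::new();
        queue.push_back(start);
        while let Some(node) = queue.pop_front() {
            if let Some(neighbors) = adjacency.get(&node) {
                for &next in neighbors {
                    if seen.insert(next) {
                        queue.push_back(next);
                    }
                }
            }
        }
    }
    count
}
```

A result of 1 means every concept is reachable from every other; larger values indicate disconnected knowledge regions that no bridge concept has linked yet.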

MCP Integration

External LLMs and agents can interact with the colony through a typed request/response API:

phago_remember(title, content, ticks) — ingest document
phago_recall(query, max_results, alpha) — hybrid query
phago_explore(type: path|centrality|bridges|stats) — structural queries
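
A typed request layer for these three tools might look like the sketch below. The tool names and parameter names come from the list above; the `Request` and `ExploreKind` types, the field types, and `tool_name` are hypothetical and are not the adapter's actual API.

```rust
use std::str::FromStr;

/// The four explore modes listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExploreKind {
    Path,
    Centrality,
    Bridges,
    Stats,
}

impl FromStr for ExploreKind {
    type Err = String;

    /// Parse the `type` argument of phago_explore; unknown values are rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "path" => Ok(ExploreKind::Path),
            "centrality" => Ok(ExploreKind::Centrality),
            "bridges" => Ok(ExploreKind::Bridges),
            "stats" => Ok(ExploreKind::Stats),
            other => Err(format!("unknown explore type: {}", other)),
        }
    }
}

/// Hypothetical typed request covering the three MCP tools (field types assumed).
#[derive(Debug, Clone, PartialEq)]
pub enum Request {
    Remember { title: String, content: String, ticks: u64 },
    Recall { query: String, max_results: usize, alpha: f64 },
    Explore { kind: ExploreKind },
}

/// Map a typed request back to the tool name it corresponds to.
pub fn tool_name(request: &Request) -> &'static str {
    match request {
        Request::Remember { .. } => "phago_remember",
        Request::Recall { .. } => "phago_recall",
        Request::Explore { .. } => "phago_explore",
    }
}
```

Modeling requests as an enum lets the compiler check that every tool is handled, and rejecting unknown explore types at parse time keeps malformed calls out of the colony.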

Architecture

crates/
├── phago/            # Unified facade crate (use this!)
├── phago-cli/        # Command-line interface (ingest, query, stats, session)
├── phago-core/       # Traits (10 primitives) + shared types + error handling
├── phago-runtime/    # Colony, substrate, topology, corpus, sessions, SQLite, async
├── phago-agents/     # Digester, Sentinel, Synthesizer, SemanticDigester, genome, evolution
├── phago-embeddings/ # Vector embeddings (SimpleEmbedder, OnnxEmbedder, API providers)
├── phago-llm/        # LLM integration (Ollama, Claude, OpenAI)
├── phago-rag/        # Query engine, scoring, baselines, hybrid, MCP adapter
├── phago-viz/        # Self-contained HTML visualization (D3.js)
└── phago-wasm/       # WASM integration (future)
poc/
├── knowledge-ecosystem/   # Original proof of concept
├── bio-rag-demo/          # Branch 1: self-reinforcing RAG
├── agent-evolution-demo/  # Branch 2: evolutionary agents
├── kg-training-demo/      # Branch 3: KG → training data
├── agentic-memory-demo/   # Branch 4: persistent code knowledge
└── data/corpus/           # 100-doc test corpus (4 topics × 25 docs)
docs/papers/               # White papers + explainers for each branch

Colony Lifecycle (per tick)

Sense — All agents observe substrate (signals, documents, traces)
Act — Colony processes agent actions (move, digest, present, wire)
Transfer — Agents export/integrate vocabulary, attempt symbiosis
Dissolve — Mature agents modulate boundaries, reinforce graph nodes
Death — Remove agents that self-assessed for termination
Decay — Signals, traces, and edge weights decay; weak edges pruned
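
The fixed phase order can be sketched as a loop over an enum. Everything below is a toy: `ToyColony`, `ToyAgent`, the idle-counter death rule, and the 0.9 signal decay factor are assumptions chosen to show the ordering, and most phases are left as no-ops.

```rust
/// The six per-tick phases, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Sense,
    Act,
    Transfer,
    Dissolve,
    Death,
    Decay,
}

pub const TICK_ORDER: [Phase; 6] = [
    Phase::Sense,
    Phase::Act,
    Phase::Transfer,
    Phase::Dissolve,
    Phase::Death,
    Phase::Decay,
];

/// Hypothetical stand-in for an agent: it only tracks how long it has idled.
pub struct ToyAgent {
    pub idle_ticks: u32,
    pub max_idle: u32,
}

/// Hypothetical colony holding agents and a single scalar signal.
pub struct ToyColony {
    pub agents: Vec<ToyAgent>,
    pub signal: f64,
    pub ticks_run: u64,
}

impl ToyColony {
    /// Run one tick, visiting every phase in TICK_ORDER.
    pub fn tick(&mut self) {
        for &phase in TICK_ORDER.iter() {
            match phase {
                // The toy substrate holds no documents, so acting means idling.
                Phase::Act => {
                    for agent in self.agents.iter_mut() {
                        agent.idle_ticks += 1;
                    }
                }
                // Agents that idled too long are removed (assumed death rule).
                Phase::Death => self.agents.retain(|a| a.idle_ticks < a.max_idle),
                // Signals fade by an assumed factor of 0.9 per tick.
                Phase::Decay => self.signal *= 0.9,
                // Left as no-ops in this sketch.
                Phase::Sense | Phase::Transfer | Phase::Dissolve => {}
            }
        }
        self.ticks_run += 1;
    }
}
```

Running death before decay means an agent removed this tick no longer contributes to the graph state that decay then prunes.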

Key Design Choices

Rust ownership = biological resource management. move semantics model consumption (you can't eat something twice). Drop models apoptosis. No garbage collector = deterministic death.
The graph IS the memory. No separate storage layer. The topology of the knowledge graph, shaped by Hebbian learning, encodes all accumulated knowledge.
No LLMs in the core loop. The v0.1 primitives must prove emergence without external intelligence. LLM-backed concept extraction is opt-in through the llm-* features (see LLM Integration), and the framework is designed for LLM-backed agents in future versions.
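
The ownership analogy in the first design choice can be shown in a few lines. The sketch below uses hypothetical `Document` and `ToyCell` types (not Phago's own): `digest` takes the document by value, so it cannot be consumed twice, and `Drop` runs at a deterministic point when the cell goes out of scope.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical input that can be consumed exactly once.
pub struct Document {
    pub text: String,
}

/// Hypothetical agent that records its own death in a shared counter.
pub struct ToyCell {
    pub digested: Vec<String>,
    deaths: Rc<Cell<u32>>,
}

impl ToyCell {
    pub fn new(deaths: Rc<Cell<u32>>) -> Self {
        ToyCell { digested: Vec::new(), deaths }
    }

    /// Takes `doc` by value: after this call the caller no longer owns it,
    /// so passing the same document again is a compile-time error.
    pub fn digest(&mut self, doc: Document) -> usize {
        let before = self.digested.len();
        self.digested
            .extend(doc.text.split_whitespace().map(str::to_lowercase));
        self.digested.len() - before
    }
}

impl Drop for ToyCell {
    /// Runs exactly once, at the moment the cell goes out of scope.
    fn drop(&mut self) {
        self.deaths.set(self.deaths.get() + 1);
    }
}
```

Calling `cell.digest(doc)` a second time with the same `doc` binding is rejected by the borrow checker as a use of a moved value, which is the "you can't eat something twice" rule enforced at compile time.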

Quantitative Proof (Phase 5)

Running cargo run --bin phago-poc produces metrics proving the model works:

Metric	What It Proves
Transfer Effect	Vocabulary sharing across agents (shared terms ratio, export/integration counts)
Dissolution Effect	Boundary modulation reinforces knowledge (concept vs non-concept access ratio)
Graph Richness	Colony builds meaningful structure (density, clustering coefficient, bridge concepts)
Vocabulary Spread	Knowledge propagates across agents (Gini coefficient of vocabulary sizes)
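
The Gini coefficient used for Vocabulary Spread measures how unevenly vocabulary is distributed across agents: 0 means every agent holds the same number of terms, and values near 1 mean a few agents hold almost everything. The sketch below uses the standard sorted-rank formula; the function name and the u64 input are assumptions, and the POC's own metric code may differ.

```rust
/// Gini coefficient of a set of non-negative sizes.
/// With values sorted ascending as x_1..x_n:
///   G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
/// Returns 0.0 for an empty input or when all sizes are zero.
pub fn gini(sizes: &[u64]) -> f64 {
    let count = sizes.len();
    let total: u64 = sizes.iter().sum();
    if count == 0 || total == 0 {
        return 0.0;
    }

    let mut sorted = sizes.to_vec();
    sorted.sort_unstable();

    // Sum of rank-weighted values, with ranks starting at 1.
    let weighted: f64 = sorted
        .iter()
        .enumerate()
        .map(|(i, &x)| (i as f64 + 1.0) * x as f64)
        .sum();

    let n = count as f64;
    2.0 * weighted / (n * total as f64) - (n + 1.0) / n
}
```

For example, four agents with 5 terms each give a Gini of 0, while one agent holding all 10 terms among four gives 0.75, so a falling Gini over time suggests that transfer is spreading vocabulary across the colony.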

The POC also generates output/phago-colony.html — an interactive D3.js visualization with:

Force-directed knowledge graph
Agent spatial canvas
Event timeline
Metrics dashboard with tick slider

Implementation Status

Phase	Status	Description
0 — Scaffold	✅ Done	Workspace, 10 primitive traits, shared types
1 — First Cell	✅ Done	Digester agent, apoptosis, colony lifecycle
2 — Self-Organization	✅ Done	Chemotaxis, document ingestion, Hebbian wiring
3 — Emergence	✅ Done	Synthesizer (quorum sensing), Sentinel (negative selection)
4 — Cooperation	✅ Done	Transfer, Symbiosis, Dissolution
5 — Prove It Works	✅ Done	Metrics, visualization, hardening tests, performance optimization
6 — Research Branches	✅ Done	4 branches with prototypes, benchmarks, papers
7 — Production Ready	✅ Done	Facade crate, preludes, error types, deterministic testing
8 — Distribution	✅ Done	Published to crates.io, CLI tool with all commands
9.1 — Embeddings	✅ Done	phago-embeddings crate, SemanticDigester agent
9.2 — LLM Integration	✅ Done	phago-llm crate (Ollama, Claude, OpenAI)
9.3 — Vector Wiring	✅ Done	SemanticWiringConfig, similarity-based edge weights
10.1 — Agent Serialization	✅ Done	SerializableAgent trait, session persistence with agents
10.2 — SQLite Persistence	✅ Done	ColonyBuilder, auto-save, WAL mode, full roundtrip
10.3 — Async Runtime	✅ Done	AsyncColony, TickTimer, run_in_local, spawn_simulation_local

Tests

# All tests
cargo test --workspace

# With the sqlite and async features enabled
cargo test --workspace --features "sqlite,async"

# By category
cargo test --test transfer_tests       # Vocabulary export/import
cargo test --test symbiosis_tests      # Agent absorption
cargo test --test dissolution_tests    # Boundary modulation
cargo test --test phase4_integration   # Full colony integration
cargo test -p phago-runtime metrics    # Quantitative metrics
cargo test -p phago-viz                # HTML visualization

# Benchmarks (with features)
cargo test --release --features "sqlite,async" -p phago-runtime --test benchmarks -- --nocapture

Phase 10 Benchmark Results

Category	Metric	Result
Throughput	Ticks/sec (small colony)	733
SQLite	Save/load time	<1ms
Async	Overhead vs sync	<5%
Serialization	200 agents	8µs
Semantic wiring	Overhead	~11%

Documentation

docs/INTEGRATION_GUIDE.md — How to use Phago — installation, examples, API reference
docs/papers/phago-whitepaper-v2.md — Main whitepaper (v2.0) — comprehensive technical paper
docs/EXECUTIVE_SUMMARY.md — Latest results and roadmap
docs/COMPETITIVE_ANALYSIS.md — Where Phago wins vs traditional approaches
docs/USE_CASES.md — Practical applications
docs/WHITEPAPER.md — Original theoretical foundation
docs/PRD.md — Product requirements and specifications
docs/BUILD_PLAN.md — Phased implementation roadmap

Research Papers

Branch	White Paper	Explainer
Bio-RAG	`bio-rag-whitepaper.md`	`bio-rag-explainer.md`
Agent Evolution	`agent-evolution-whitepaper.md`	`agent-evolution-explainer.md`
KG Training	`kg-training-whitepaper.md`	`kg-training-explainer.md`
Agentic Memory	`agentic-memory-whitepaper.md`	`agentic-memory-explainer.md`

License

MIT

phago 0.2.0