colonylib 0.5.13

A library implementing the Colony metadata framework on Autonomi

colonylib

A Rust library implementing the Colony metadata framework for the Autonomi decentralized network. This library provides the core infrastructure for creating, managing, and searching metadata about files stored on Autonomi using a semantic RDF-based approach.

Note: This is a library for developers. If you're looking for an end-user application to easily upload/download/search files from Autonomi, look at the following applications based on your needs:

  • Colony (in progress) - Cross-platform GUI application
  • Colony Daemon and CLI - Command line interface and background daemon for headless servers
  • Mutant - Rust GUI application

Overview

Core Concepts

Pods are the fundamental building blocks of colonylib. A pod consists of:

  • An Autonomi pointer that serves as the pod's address
  • A scratchpad containing RDF metadata about files and other pods
  • Semantic metadata written in RDF (TriG format) using the schema.org vocabulary

This architecture enables:

  • Rich metadata storage: File types, sizes, names, descriptions, and custom properties
  • Knowledge graphs: Pods can reference other pods, creating interconnected networks of metadata
  • Semantic search: Query data using SPARQL via the integrated oxigraph database
  • Decentralized discovery: Traverse pod references to discover related content across the network

Key Features

  • Deterministic key derivation: Uses 12-word mnemonic seed phrases to reproducibly generate all pod addresses
  • Offline-first operation: Build and search your metadata locally without network access or cryptocurrency
  • Cross-device synchronization: Access your pods from any device using the same seed phrase
  • Network integration: Upload/download pods to/from the Autonomi network when ready
  • Semantic interoperability: Standardized RDF schemas enable data sharing between applications

Scope

Colonylib focuses on metadata management and semantic search. It does not handle actual file uploads/downloads - those operations use the standard Autonomi API. Think of colonylib as a sophisticated indexing and discovery layer on top of Autonomi's storage primitives.

Status

Current Capabilities ✅

  • Local pod management: Create, modify, and cache pods in platform-appropriate data directories (Windows, Mac, Linux)
  • Secure key management: Password-protected keystore with deterministic key derivation from mnemonic phrases
  • Network synchronization: Upload modified pods and download updates from the Autonomi network
  • Cache management: Populate, repair, and maintain local pod caches with automatic conflict resolution
  • RDF graph database: Store and query semantic metadata using oxigraph with SPARQL support
  • Pod references: Create interconnected networks of pods with configurable traversal depth
  • Semantic search: Query pods by content, type, properties, and relationships
  • Advanced search features: Relevance ranking by number of matches and pod depth, fuzzy matching
  • Automatic scratchpad overflow: Automatically splits up pod data into multiple scratchpads for large metadata collections (>4MB)
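
The scratchpad overflow behaviour listed above can be pictured as plain chunking of the serialized pod data. The following is a minimal sketch, not colonylib's implementation: the function name, the exact 4 MiB constant, and the idea that pieces are filled sequentially are all assumptions made for illustration.

```rust
/// Assumed per-scratchpad capacity (4 MiB). The exact limit used by
/// colonylib may differ; this constant is only for illustration.
const SCRATCHPAD_LIMIT: usize = 4 * 1024 * 1024;

/// Split serialized pod data into pieces that each fit in one scratchpad.
/// Hypothetical helper, not part of the colonylib API.
fn split_into_scratchpads(data: &[u8], limit: usize) -> Vec<&[u8]> {
    assert!(limit > 0, "limit must be non-zero");
    data.chunks(limit).collect()
}

fn main() {
    // 9 MiB of placeholder metadata does not fit in two 4 MiB pieces,
    // so it is spread over three.
    let data = vec![0u8; 9 * 1024 * 1024];
    let parts = split_into_scratchpads(&data, SCRATCHPAD_LIMIT);
    println!("{} scratchpads needed", parts.len());
}
```

Concatenating the pieces in order restores the original bytes, which is the property any real overflow scheme has to preserve.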

Roadmap 🚧

  • Improve Autonomi error handling (auto retry on certain failures, library specific errors)
  • Performance optimizations for large-scale pod networks (threading Autonomi fetch operations)

Library Architecture

Colonylib is organized into four core modules that work together to provide a complete metadata management system:

1. KeyStore (key.rs)

Purpose: Cryptographic key management and derivation

  • Mnemonic-based: Generate deterministic keys from 12-word seed phrases
  • Secure storage: Password-encrypted keystore files using the Cocoon library
  • Key derivation: Separate key spaces for pointers, scratchpads, and wallet operations
  • Cross-device sync: Same mnemonic produces identical keys across devices
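
To illustrate what deterministic derivation with separate key spaces means, here is a toy sketch. It is not how colonylib derives keys: FNV-1a is a non-cryptographic hash used here only to keep the example dependency-free, and the KeySpace enum, the derive function, and the label strings are hypothetical. The point is only that the same mnemonic, key space, and index always yield the same value, while changing the key space or index yields a different one.

```rust
/// The three key spaces named in the KeyStore description (hypothetical type).
#[derive(Clone, Copy)]
enum KeySpace {
    Pointer,
    Scratchpad,
    Wallet,
}

/// FNV-1a 64-bit hash. NOT cryptographic; a stand-in for a real
/// key-derivation function so the sketch needs no external crates.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Derive a toy "key" from a mnemonic, a key space, and an index.
fn derive(mnemonic: &str, space: KeySpace, index: u32) -> u64 {
    let label = match space {
        KeySpace::Pointer => "pointer",
        KeySpace::Scratchpad => "scratchpad",
        KeySpace::Wallet => "wallet",
    };
    fnv1a(format!("{}/{}/{}", mnemonic, label, index).as_bytes())
}

fn main() {
    let mnemonic = "abandon abandon abandon abandon abandon abandon \
                    abandon abandon abandon abandon abandon about";
    // Same inputs on any device give the same value.
    println!("pointer 0:    {:016x}", derive(mnemonic, KeySpace::Pointer, 0));
    println!("scratchpad 0: {:016x}", derive(mnemonic, KeySpace::Scratchpad, 0));
    println!("wallet 0:     {:016x}", derive(mnemonic, KeySpace::Wallet, 0));
}
```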

2. DataStore (data.rs)

Purpose: Local file system operations and pod caching

  • Platform-aware: Uses OS-appropriate data directories (~/.local/share/colony on Linux)
  • Organized storage: Separate directories for pointers, scratchpads, and pod references
  • Cache management: Track upload queues, handle file operations, manage local state
  • Atomic operations: Safe concurrent access to pod files
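
As a rough picture of platform-aware storage, the sketch below resolves a per-user data directory from an OS name and a home directory. Only the Linux location (~/.local/share/colony) comes from this document; the macOS and Windows locations follow common platform conventions and are assumptions, as is the colony_data_dir function itself.

```rust
use std::path::PathBuf;

/// Resolve a per-user data directory for the given OS name.
/// Hypothetical helper; only the Linux path is stated in this document.
fn colony_data_dir(os: &str, home: &str) -> PathBuf {
    let base = PathBuf::from(home);
    match os {
        // Assumed conventional locations on macOS and Windows.
        "macos" => base.join("Library").join("Application Support").join("colony"),
        "windows" => base.join("AppData").join("Roaming").join("colony"),
        // Linux (and anything else in this sketch).
        _ => base.join(".local").join("share").join("colony"),
    }
}

fn main() {
    // HOME is normally set on Linux and macOS; fall back to the current
    // directory elsewhere so the sketch still runs.
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
    let dir = colony_data_dir(std::env::consts::OS, &home);
    println!("{}", dir.display());
}
```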

3. Graph (graph.rs)

Purpose: RDF semantic database and SPARQL query engine

  • Oxigraph integration: High-performance RDF store with SPARQL 1.1 support
  • Ontology independent: The library supports any ontology or schema (schema.org is preferred for portability)
  • Named graphs: Isolate pod data while enabling cross-pod queries
  • JSON-LD metadata entry: Write JSON-LD metadata for subjects within pods
  • Query interface: Execute SPARQL queries across all local and referenced pods and return results in JSON format
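
The named-graph idea can be shown without a real RDF store. In the sketch below each statement carries the name of the graph (here standing in for a pod) it belongs to, so a lookup can be restricted to one pod or run across all of them. This is a simplified in-memory model, not oxigraph or the Graph module; the Quad struct, the quad and lookup helpers, and the pod names are hypothetical.

```rust
/// One RDF statement plus the named graph (here: the pod) it belongs to.
struct Quad {
    graph: String,
    subject: String,
    predicate: String,
    object: String,
}

/// Convenience constructor for the sketch.
fn quad(graph: &str, subject: &str, predicate: &str, object: &str) -> Quad {
    Quad {
        graph: graph.to_string(),
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
    }
}

/// Return (subject, object) pairs for `predicate`.
/// `Some(graph)` restricts the lookup to one pod; `None` searches every pod.
fn lookup<'a>(
    quads: &'a [Quad],
    graph: Option<&str>,
    predicate: &str,
) -> Vec<(&'a str, &'a str)> {
    quads
        .iter()
        .filter(|q| graph.map_or(true, |g| q.graph == g))
        .filter(|q| q.predicate == predicate)
        .map(|q| (q.subject.as_str(), q.object.as_str()))
        .collect()
}

fn main() {
    let quads = vec![
        quad("pod-a", "ant://file1", "http://schema.org/name", "ant_girl.png"),
        quad("pod-b", "ant://file2", "http://schema.org/name", "BegBlag.mp3"),
    ];
    let name = "http://schema.org/name";
    println!("pod-a only: {:?}", lookup(&quads, Some("pod-a"), name));
    println!("all pods:   {:?}", lookup(&quads, None, name));
}
```

In the real library this role is played by SPARQL queries over oxigraph, where the graph name is part of the query rather than a function argument.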

4. PodManager (pod.rs)

Purpose: High-level pod operations and network coordination

  • Unified interface: Coordinates between KeyStore, DataStore, Graph, and Autonomi network
  • Pod lifecycle: Create, modify, upload, download, and synchronize pods
  • Reference traversal: Discover and cache interconnected pod networks
  • Search operations: Execute semantic queries across local and referenced pods
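
Reference traversal with a depth limit is essentially a breadth-first walk over the pod reference graph. The sketch below models that walk on an in-memory map and is not the PodManager implementation. It assumes that depth is counted in hops from the user's own pods (depth 0), which this document does not confirm, and the discover function and pod names are hypothetical.

```rust
use std::collections::{HashMap, VecDeque};

/// Collect every pod reachable from `roots` within `max_depth` hops,
/// together with the depth at which each pod was first seen.
fn discover(
    refs: &HashMap<&str, Vec<&str>>,
    roots: &[&str],
    max_depth: u64,
) -> HashMap<String, u64> {
    let mut depth_of: HashMap<String, u64> = HashMap::new();
    let mut queue: VecDeque<(String, u64)> = VecDeque::new();

    for root in roots {
        depth_of.insert(root.to_string(), 0);
        queue.push_back((root.to_string(), 0));
    }

    while let Some((pod, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not follow references beyond the limit
        }
        if let Some(children) = refs.get(pod.as_str()) {
            for child in children {
                // Skip pods already seen so reference cycles terminate.
                if !depth_of.contains_key(*child) {
                    depth_of.insert(child.to_string(), depth + 1);
                    queue.push_back((child.to_string(), depth + 1));
                }
            }
        }
    }
    depth_of
}

fn main() {
    let mut refs = HashMap::new();
    refs.insert("main", vec!["data", "analysis"]);
    refs.insert("data", vec!["archive"]);
    refs.insert("archive", vec!["main"]); // a cycle back to the start

    let found = discover(&refs, &["main"], 1);
    println!("{} pods within one hop", found.len());
}
```

Recording the depth alongside each pod is what later allows results from nearby pods to be ranked ahead of results from distant ones.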

Public API

The main entry point for colonylib is the PodManager struct, which provides a high-level interface for all pod operations. Here are the key methods:

Wallet Key Management

// Add a new wallet key with a name
async fn add_wallet_key(&mut self, name: &str, wallet_key: &str) -> Result<(), Error>

// Retrieve a specific wallet key by name
async fn get_wallet_key(&self, name: &str) -> Result<String, Error>

// Retrieve all wallet keys
fn get_wallet_keys(&self) -> HashMap<String, String>

// Set the active wallet and persist to local storage
fn set_active_wallet(&mut self, name: &str) -> Result<(String, String), Error>

// Get the currently active wallet from local storage
fn get_active_wallet(&self) -> Result<(String, String), Error>

Core Pod Operations

// Create a new pod with a given name
async fn add_pod(&mut self, pod_name: &str) -> Result<(String, String), Error>

// Remove a pod and all its associated data
async fn remove_pod(&mut self, pod_address: &str) -> Result<(), Error>

// Rename an existing pod
async fn rename_pod(&mut self, pod_address: &str, new_name: &str) -> Result<(), Error>

// Create a reference from one pod to another
fn add_pod_ref(&mut self, pod_address: &str, referenced_pod_address: &str) -> Result<(), Error>

// Add metadata for a specific subject (file/resource) to a pod using JSON-LD syntax
async fn put_subject_data(&mut self, pod_address: &str, subject_address: &str, metadata: &str) -> Result<(), Error>

// Get metadata for a specific subject and return a JSON string
async fn get_subject_data(&mut self, subject_address: &str) -> Result<String, Error>

Network Synchronization

// Upload all local changes to the Autonomi network
async fn upload_all(&mut self) -> Result<(), Error>

// Upload a specific pod to the Autonomi network
async fn upload_pod(&mut self, address: &str) -> Result<(), Error>

// Download updates for user-created pods
async fn refresh_cache(&mut self) -> Result<(), Error>

// Download referenced pods up to specified depth
async fn refresh_ref(&mut self, depth: u64) -> Result<(), Error>

// Get the current list of pods that need to be uploaded in JSON format
fn get_update_list(&self) -> Result<serde_json::Value, Error>

Pod Discovery and Listing

// List all pods owned by the user
fn list_my_pods(&self) -> Result<serde_json::Value, Error>

// List all subjects (resources) within a specific pod
fn list_pod_subjects(&self, pod_address: &str) -> Result<Vec<String>, Error>

Search and Query

// Search pods using various criteria (text, type, properties)
async fn search(&mut self, query: serde_json::Value) -> Result<serde_json::Value, Error>

Initialization

// Create a new PodManager instance
async fn new(
    client: Client,           // Autonomi network client
    wallet: &Wallet,          // Payment wallet
    data_store: &mut DataStore,   // Local storage
    key_store: &mut KeyStore,     // Cryptographic keys
    graph: &mut Graph,            // RDF database
) -> Result<PodManager, Error>

Installation

Add colonylib to your Rust project:

[dependencies]
colonylib = "0.5.13"
autonomi = "0.4.6"
tokio = "1.44"
serde_json = "1.0"

Or use cargo:

cargo add colonylib autonomi tokio serde_json

Examples

The repository includes three examples that demonstrate colonylib's capabilities. They are designed to be run in sequence against a local Autonomi testnet. Running them on the main or Alpha network is possible but requires code changes.

Prerequisites

Before running the examples, you need:

  1. Rust toolchain (1.70 or later)

  2. Autonomi network access:

    • Local testnet: Run a local Autonomi node for development. See the Autonomi documentation for setup instructions.
    • Alpha network: Connect to the Autonomi alpha testnet (change the init_client function in each example)
    • Main network: Connect to the live Autonomi network (change the init_client function in each example)
  3. Wallet with tokens (needed for Alpha/Main network operations; creating the local testnet handles this for you):

    • ETH for gas fees
    • ANT tokens for storage payments

Example 1: Setup (examples/setup.rs)

Purpose: Initialize the colonylib environment and verify network connectivity.

This example:

  • Creates the local data directory structure
  • Initializes or loads an encrypted keystore
  • Sets up the RDF graph database
  • Connects to the Autonomi network
  • Displays wallet balances

Run it:

# For local testnet
cargo run --example setup

# The example uses these default settings:
# - Network: local testnet
# - Mnemonic: "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
# - Password: "password"

What it does:

  • Creates ~/.local/share/colony/ (Linux) or equivalent on other platforms
  • Generates deterministic keys from the mnemonic phrase
  • Verifies network connectivity and wallet balance
  • Prepares the environment for pod operations

Example 2: Adding Pods (examples/add_pods.rs)

Purpose: Create pods with sample metadata and upload them to the network.

This example demonstrates:

  • Creating multiple pods with descriptive names
  • Adding rich JSON-LD metadata for different file types
  • Uploading pods to the Autonomi network
  • Handling network costs and replication delays

Run it:

cargo run --example add_pods

What it creates:

  • Pod 1: Metadata for an image file (ant_girl.png)
  • Pod 2: Metadata for an audio file (BegBlag.mp3)

Each pod contains structured metadata using Schema.org vocabularies:

{
  "@context": {"schema": "http://schema.org/"},
  "@type": "schema:MediaObject",
  "@id": "ant://[file-address]",
  "schema:name": "filename.ext",
  "schema:description": "File description",
  "schema:contentSize": "2MB"
}

Example 3: Search (examples/search.rs)

Purpose: Demonstrate various search capabilities across the pod network.

This example shows:

  • Simple text search across all metadata
  • Type-based queries (find all MediaObjects)
  • Property-based queries (find items with specific attributes)
  • Advanced multi-criteria searches
  • Subject data retrieval

Run it:

cargo run --example search

Search types demonstrated:

  1. Text search: Find pods containing specific words
  2. Type search: Query by RDF type (e.g., MediaObject, Document)
  3. Predicate search: Find resources with specific properties
  4. Browse: List all subjects with their name, type, and description, ordered by pod depth
  5. Advanced search: Combine multiple criteria
  6. Subject retrieval: Get complete metadata for specific resources
  7. Pod listing: Enumerate all user pods and their contents
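
Ranking by number of matches and pod depth can be pictured as a two-key sort. The sketch below assumes that more matches rank first and that, among equal match counts, results from shallower pods rank first. The actual weighting used by colonylib is not described in this document, and the Hit struct and rank function are hypothetical.

```rust
/// One search result in this sketch (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    subject: String,
    matches: u32,   // how many search terms matched
    pod_depth: u32, // 0 = the user's own pod, 1 = directly referenced, ...
}

/// Order hits by match count (descending), then by pod depth (ascending).
fn rank(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_by(|a, b| {
        b.matches
            .cmp(&a.matches)
            .then(a.pod_depth.cmp(&b.pod_depth))
    });
    hits
}

fn main() {
    let ranked = rank(vec![
        Hit { subject: "ant://a".to_string(), matches: 1, pod_depth: 0 },
        Hit { subject: "ant://b".to_string(), matches: 3, pod_depth: 2 },
        Hit { subject: "ant://c".to_string(), matches: 3, pod_depth: 1 },
    ]);
    for hit in &ranked {
        println!("{} (matches={}, depth={})", hit.subject, hit.matches, hit.pod_depth);
    }
}
```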

Running the Examples

Complete workflow:

# 1. Initialize the environment
cargo run --example setup

# 2. Create sample pods with metadata
cargo run --example add_pods

# 3. Search and query the pods
cargo run --example search

Network Configuration:

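To use a different network, change the value assigned to the environment variable in each example (this is a Rust variable in the example source, not an OS environment variable):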

// Local testnet (default)
let environment = "local".to_string();

// Alpha testnet (needs test tokens)
let environment = "alpha".to_string();

// Main network (needs real tokens)
let environment = "autonomi".to_string();
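
If you would rather not edit the source to switch networks, one option is to validate the string up front and read it from the process environment. The sketch below is an assumption about how that could look, not code from the examples: the Network enum, the parse_environment function, and the COLONY_ENV variable name are hypothetical. Only the three strings "local", "alpha", and "autonomi" come from the snippet above.

```rust
/// Networks selectable in the examples (hypothetical type).
#[derive(Debug, PartialEq, Clone, Copy)]
enum Network {
    Local,
    Alpha,
    Main,
}

/// Map the environment string used by the examples to a typed value,
/// rejecting anything unexpected instead of silently falling through.
fn parse_environment(value: &str) -> Result<Network, String> {
    match value {
        "local" => Ok(Network::Local),
        "alpha" => Ok(Network::Alpha),
        "autonomi" => Ok(Network::Main),
        other => Err(format!("unknown environment: {}", other)),
    }
}

fn main() {
    // COLONY_ENV is a hypothetical variable name; default to the local testnet.
    let value = std::env::var("COLONY_ENV").unwrap_or_else(|_| "local".to_string());
    match parse_environment(&value) {
        Ok(network) => println!("using {:?}", network),
        Err(message) => eprintln!("{}", message),
    }
}
```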

Wallet Configuration:

The examples use a hardcoded private key for local testing. For production use:

  1. Generate a secure private key
  2. Fund the wallet with ETH and ANT tokens
  3. Update the LOCAL_PRIVATE_KEY constant

Data Persistence:

  • All examples use the same data directory
  • Keystores and graph databases persist between runs
  • You can safely re-run examples to see updated results
  • Delete the data directory and run setup.rs to reset the environment if needed

NOTE! Resetting is a destructive operation: it discards the local data directory and recreates it. If you have anything you want to keep, make sure you have uploaded it to Autonomi before resetting.

Usage Patterns

Basic Workflow

use autonomi::{Client, Wallet};
use colonylib::{PodManager, DataStore, KeyStore, Graph};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
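    // 1. Initialize components
    // (private_key, keystore_exists, file, password, mnemonic, and FILE_ADDRESS
    // are placeholders that your application must define)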
    let client = Client::init_local().await?;
    let wallet = &Wallet::new_from_private_key(client.evm_network(), private_key)?;
    let data_store = &mut DataStore::create()?;

    // 2. Set up keystore
    let key_store = &mut if keystore_exists {
        KeyStore::from_file(&mut file, password)?
    } else {
        KeyStore::from_mnemonic(mnemonic)?
    };

    // 3. Initialize graph database
    let graph = &mut Graph::open(&data_store.get_graph_path())?;

    // 4. Create pod manager
    let mut pod_manager = PodManager::new(client, wallet, data_store, key_store, graph).await?;

    // 5. Create and populate pods
    let (pod_addr, _) = pod_manager.add_pod("My Collection").await?;

    let metadata = json!({
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": "Research Data",
        "description": "Important research findings"
    });

    pod_manager.put_subject_data(&pod_addr, FILE_ADDRESS, &metadata.to_string()).await?;

    // 6. Upload to network
    pod_manager.upload_all().await?;

    // 7. Search and query
    let results = pod_manager.search(json!("research")).await?;
    println!("Found: {}", results);

    Ok(())
}

Advanced Pod Operations

Uploading Individual Pods

For more granular control over network operations, you can upload specific pods instead of all pending changes:

// Create and populate a pod
let (pod_addr, _) = pod_manager.add_pod("Research Data").await?;

let metadata = json!({
    "@context": "http://schema.org/",
    "@type": "Dataset",
    "name": "Climate Research",
    "description": "Temperature data from weather stations"
});

pod_manager.put_subject_data(&pod_addr, subject_address, &metadata.to_string()).await?;

// Upload only this specific pod
pod_manager.upload_pod(&pod_addr).await?;

Discovering and Managing Pods

List all your pods and explore their contents:

// Get all user pods
let pods_result = pod_manager.list_my_pods()?;

if let Some(bindings) = pods_result["results"]["bindings"].as_array() {
    for pod in bindings {
        let pod_address = pod["pod"]["value"].as_str().unwrap();
        let pod_name = pod["name"]["value"].as_str().unwrap();

        println!("Pod: {} ({})", pod_name, pod_address);

        // List all subjects in this pod
        let subjects = pod_manager.list_pod_subjects(pod_address)?;
        println!("  Contains {} subjects:", subjects.len());

        for subject_address in subjects {
            // Get detailed metadata for each subject
            let metadata = pod_manager.get_subject_data(&subject_address).await?;
            let metadata_json: serde_json::Value = serde_json::from_str(&metadata)?;

            // Extract subject name from metadata
            if let Some(bindings) = metadata_json["results"]["bindings"].as_array() {
                for binding in bindings {
                    if binding["predicate"]["value"].as_str() == Some("http://schema.org/name") {
                        if let Some(name) = binding["object"]["value"].as_str() {
                            println!("    - {} ({})", name, subject_address);
                        }
                    }
                }
            }
        }
    }
}

Working with Pod References

Create interconnected pod networks:

// Create related pods
let (main_pod, _) = pod_manager.add_pod("Research Collection").await?;
let (data_pod, _) = pod_manager.add_pod("Raw Data").await?;
let (analysis_pod, _) = pod_manager.add_pod("Analysis Results").await?;

// Create references between pods
pod_manager.add_pod_ref(&main_pod, &data_pod)?;
pod_manager.add_pod_ref(&main_pod, &analysis_pod)?;

// Upload all pods
pod_manager.upload_all().await?;

// Later, refresh with references to discover connected pods
pod_manager.refresh_ref(2).await?; // Depth 2 to include referenced pods

Pod Management Operations

Rename and remove pods as needed:

// Create a pod with an initial name
let (pod_addr, _) = pod_manager.add_pod("Temporary Research").await?;

// Add some metadata
let metadata = json!({
  "@context": {"schema": "http://schema.org/"},
  "@type": "schema:SoftwareApplication",
  "@id": "ant://cca4e991284bfd22005bd29884079154817c7f0c3ae09c1685ffa3764c6c1e83",
  "schema:name": "colony-daemon",
  "schema:description": "colony-daemon v0.1.2 x86_64 linux binary",
  "schema:operatingSystem": "Linux",
  "schema:applicationCategory": "Application"
});

pod_manager.put_subject_data(&pod_addr, subject_address, &metadata.to_string()).await?;

// Rename the pod to be more descriptive
pod_manager.rename_pod(&pod_addr, "Climate Research Dataset").await?;

// Upload the changes
pod_manager.upload_all().await?;

// Later, if the pod is no longer needed, remove it
pod_manager.remove_pod(&pod_addr).await?;

// Upload the removal to the network
pod_manager.upload_all().await?;

Important Notes:

  • Renaming: Only changes the display name; the pod address remains the same
  • Removal: Completely removes the pod and all associated scratchpads
  • Configuration pod: The configuration pod cannot be removed (it's protected)
  • Irreversible: Once uploaded to the network, removals cannot be undone

Offline-First User Support

Colonylib supports offline usage - you can create, reference, and search pods without performing any Autonomi network write operations:

// Create pods locally
let (pod_addr, _) = pod_manager.add_pod("Offline Pod").await?;
pod_manager.put_subject_data(&pod_addr, subject, metadata).await?;

// Search works immediately
let results = pod_manager.search(json!("offline")).await?;

// Upload when ready (requires network + tokens)
pod_manager.upload_all().await?;

The caveat here is that the data is only stored on your computer. There is no way to recover it if you lose your computer. Uploading to the network is necessary to ensure data persistence, cross-device synchronization, and the ability to share your data with others.
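
The offline-first flow implies some bookkeeping of which pods have local changes that are not yet on the network. The sketch below models that as a set of pending pod addresses that is emptied on upload. It is an illustration only: the UploadQueue type and its methods are hypothetical and do not reflect how colonylib tracks its update list internally.

```rust
use std::collections::BTreeSet;

/// Tracks pods with local changes that have not been uploaded yet.
#[derive(Default)]
struct UploadQueue {
    pending: BTreeSet<String>,
}

impl UploadQueue {
    /// Record that a pod changed locally; repeated edits are stored once.
    fn mark_modified(&mut self, pod_address: &str) {
        self.pending.insert(pod_address.to_string());
    }

    /// Pods that still need to be uploaded, in sorted order.
    fn update_list(&self) -> Vec<&str> {
        self.pending.iter().map(String::as_str).collect()
    }

    /// Hand over everything that is pending and leave the queue empty,
    /// as an "upload all" step would after a successful upload.
    fn drain_for_upload(&mut self) -> Vec<String> {
        std::mem::take(&mut self.pending).into_iter().collect()
    }
}

fn main() {
    let mut queue = UploadQueue::default();
    queue.mark_modified("pod-b");
    queue.mark_modified("pod-a");
    queue.mark_modified("pod-b"); // second edit to the same pod

    println!("pending: {:?}", queue.update_list());
    let uploaded = queue.drain_for_upload();
    println!("uploaded {} pods, {} left", uploaded.len(), queue.update_list().len());
}
```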

Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

Development Setup

git clone https://github.com/zettawatt/colonylib.git
cd colonylib
cargo build
cargo test

Running Tests

# Run all tests
cargo test

# Run specific module tests
cargo test key_tests
cargo test pod_tests
cargo test graph_tests

License

This project is licensed under the GPL-3.0-only License - see the LICENSE file for details.

Links