colonylib 0.2.2

A library implementing the Colony metadata framework on Autonomi
# colonylib

A Rust library implementing the Colony metadata framework for the [Autonomi](https://autonomi.com) decentralized network. This library provides the core infrastructure for creating, managing, and searching metadata about files stored on Autonomi using a semantic RDF-based approach.

> **Note**: This is a library for developers. If you're looking for an end-user application to easily upload/download/search files from Autonomi, see [Colony](https://github.com/zettawatt/colony) (in progress) or [Mutant](https://github.com/champii/mutant).

## Overview

### Core Concepts

**Pods** are the fundamental building blocks of colonylib. A pod consists of:
- An **Autonomi pointer** that serves as the pod's address
- A **scratchpad** containing RDF metadata about files and other pods
- **Semantic metadata** written in a standardized [RDF](https://www.w3.org/RDF/) schema

This architecture enables:
- **Rich metadata storage**: File types, sizes, names, descriptions, and custom properties
- **Knowledge graphs**: Pods can reference other pods, creating interconnected networks of metadata
- **Semantic search**: Query data using SPARQL via the integrated [oxigraph](https://docs.rs/oxigraph/latest/oxigraph/index.html) database
- **Decentralized discovery**: Traverse pod references to discover related content across the network

### Key Features

- **Deterministic key derivation**: Uses 12-word mnemonic seed phrases to reproducibly generate all pod addresses
- **Offline-first operation**: Build and search your metadata locally without network access or cryptocurrency
- **Cross-device synchronization**: Access your pods from any device using the same seed phrase
- **Network integration**: Upload/download pods to/from the Autonomi network when ready
- **Semantic interoperability**: Standardized RDF schemas enable data sharing between applications

### Scope

Colonylib focuses on **metadata management** and **semantic search**. It does not handle the file uploads and downloads themselves; those operations use the standard Autonomi API. Think of colonylib as an indexing and discovery layer on top of Autonomi's storage primitives.

## Status

### Current Capabilities ✅

- **Local pod management**: Create, modify, and cache pods in platform-appropriate data directories (Windows, Mac, Linux)
- **Secure key management**: Password-protected keystore with deterministic key derivation from mnemonic phrases
- **Network synchronization**: Upload modified pods and download updates from the Autonomi network
- **Cache management**: Populate, repair, and maintain local pod caches with automatic conflict resolution
- **RDF graph database**: Store and query semantic metadata using oxigraph with SPARQL support
- **Pod references**: Create interconnected networks of pods with configurable traversal depth
- **Semantic search**: Query pods by content, type, properties, and relationships

### Roadmap 🚧

- Improve Autonomi error handling (automatic retry on certain failures, library-specific errors)
- Automatic scratchpad overflow handling for large metadata collections (>4MB)
- Advanced search features (faceted search, relevance ranking by pod depth, fuzzy matching)
- Performance optimizations for large-scale pod networks (threading Autonomi fetch operations)

## Library Architecture

Colonylib is organized into four core modules that work together to provide a complete metadata management system:

### 1. KeyStore (`key.rs`)
**Purpose**: Cryptographic key management and derivation

- **Mnemonic-based**: Generate deterministic keys from 12-word seed phrases
- **Secure storage**: Password-encrypted keystore files using the Cocoon library
- **Key derivation**: Separate key spaces for pointers, scratchpads, and wallet operations
- **Cross-device sync**: Same mnemonic produces identical keys across devices
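
To make the idea of separate key spaces concrete, here is a toy sketch. It is not colonylib's derivation scheme, which this README does not describe. It uses a hand-written FNV-1a hash as a stand-in for a real cryptographic key derivation function, and the key-space labels and the `derive_toy_key` helper are hypothetical names chosen for illustration. It only demonstrates two properties: the same mnemonic always yields the same value, and each key space yields a different value.

```rust
/// 64-bit FNV-1a hash. NOT cryptographic; a stand-in for a real key derivation function.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Derive a toy "key" from a mnemonic, a key-space label, and an index.
/// The label acts as a domain separator, so each key space produces its own sequence.
fn derive_toy_key(mnemonic: &str, key_space: &str, index: u32) -> u64 {
    let material = format!("{mnemonic}\n{key_space}\n{index}");
    fnv1a64(material.as_bytes())
}

fn main() {
    // The well-known test mnemonic used by the examples in this README
    let mnemonic = "abandon abandon abandon abandon abandon abandon \
                    abandon abandon abandon abandon abandon about";
    // Hypothetical key-space labels, mirroring the three spaces listed above
    for space in ["pointer", "scratchpad", "wallet"] {
        println!("{space:>10} #0 -> {:016x}", derive_toy_key(mnemonic, space, 0));
    }
}
```

Running this twice, or on two machines, prints the same three values, which is the property that lets a single seed phrase reproduce every pod address. A real keystore would replace `fnv1a64` with a proper mnemonic-to-seed function and hierarchical key derivation.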

### 2. DataStore (`data.rs`)
**Purpose**: Local file system operations and pod caching

- **Platform-aware**: Uses OS-appropriate data directories (`~/.local/share/colony` on Linux)
- **Organized storage**: Separate directories for pointers, scratchpads, and pod references
- **Cache management**: Track upload queues, handle file operations, manage local state
- **Atomic operations**: Safe concurrent access to pod files
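
As a rough illustration of what "platform-aware" means on Linux, the sketch below resolves a data directory following the XDG convention, which is where `~/.local/share` comes from. The function name `linux_data_dir` is hypothetical, and whether colonylib honors `XDG_DATA_HOME` is an assumption; the README only states the resulting default path.

```rust
use std::env;
use std::path::PathBuf;

/// Resolve a Linux-style data directory for Colony from explicit environment values.
/// Uses `$XDG_DATA_HOME/colony` when set, otherwise `$HOME/.local/share/colony`.
fn linux_data_dir(xdg_data_home: Option<&str>, home: Option<&str>) -> Option<PathBuf> {
    match (xdg_data_home, home) {
        // An explicit, non-empty XDG_DATA_HOME takes priority
        (Some(xdg), _) if !xdg.is_empty() => Some(PathBuf::from(xdg).join("colony")),
        // Otherwise fall back to the XDG default under the home directory
        (_, Some(home)) if !home.is_empty() => {
            Some(PathBuf::from(home).join(".local/share").join("colony"))
        }
        // Neither variable is usable
        _ => None,
    }
}

fn main() {
    let xdg = env::var("XDG_DATA_HOME").ok();
    let home = env::var("HOME").ok();
    match linux_data_dir(xdg.as_deref(), home.as_deref()) {
        Some(dir) => println!("data directory: {}", dir.display()),
        None => println!("could not determine a data directory"),
    }
}
```

Passing the environment values in as parameters keeps the function easy to test without touching the process environment.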

### 3. Graph (`graph.rs`)
**Purpose**: RDF semantic database and SPARQL query engine

- **Oxigraph integration**: High-performance RDF store with SPARQL 1.1 support
- **Ontology independent**: Library supports any ontology or schema (using [schema.org](https://schema.org/) is preferred for portability)
- **Named graphs**: Isolate pod data while enabling cross-pod queries
- **JSON-LD metadata entry**: Write JSON-LD metadata for subjects within pods
- **Query interface**: Execute SPARQL queries across all local and referenced pods and return results in JSON format
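
The named-graph idea can be sketched without any RDF library: each statement carries the graph it belongs to, so a query can either stay inside one pod or span all of them. This is a simplified in-memory model, not how oxigraph stores data, and the `Quad` type, the `subjects_matching` helper, and the `pod-a`/`pod-b` addresses are all hypothetical.

```rust
/// One RDF statement plus the named graph (here: the pod address) it belongs to.
#[derive(Debug, Clone, PartialEq)]
struct Quad {
    subject: String,
    predicate: String,
    object: String,
    graph: String,
}

/// Convenience constructor for the sketch.
fn quad(subject: &str, predicate: &str, object: &str, graph: &str) -> Quad {
    Quad {
        subject: subject.to_string(),
        predicate: predicate.to_string(),
        object: object.to_string(),
        graph: graph.to_string(),
    }
}

/// Subjects with the given predicate and object, optionally restricted to one named graph.
fn subjects_matching<'a>(
    quads: &'a [Quad],
    predicate: &str,
    object: &str,
    graph: Option<&str>,
) -> Vec<&'a str> {
    quads
        .iter()
        .filter(|q| q.predicate == predicate && q.object == object)
        // `None` means "search every graph"; `Some(g)` isolates a single pod
        .filter(|q| graph.map_or(true, |g| q.graph == g))
        .map(|q| q.subject.as_str())
        .collect()
}

fn main() {
    let store = vec![
        quad("ant://file-1", "rdf:type", "schema:MediaObject", "pod-a"),
        quad("ant://file-2", "rdf:type", "schema:MediaObject", "pod-b"),
        quad("ant://file-3", "rdf:type", "schema:Dataset", "pod-b"),
    ];
    // Cross-pod query: every graph is searched
    println!("{:?}", subjects_matching(&store, "rdf:type", "schema:MediaObject", None));
    // Isolated query: only pod-b's graph is searched
    println!("{:?}", subjects_matching(&store, "rdf:type", "schema:MediaObject", Some("pod-b")));
}
```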

### 4. PodManager (`pod.rs`)
**Purpose**: High-level pod operations and network coordination

- **Unified interface**: Coordinates between KeyStore, DataStore, Graph, and Autonomi network
- **Pod lifecycle**: Create, modify, upload, download, and synchronize pods
- **Reference traversal**: Discover and cache interconnected pod networks
- **Search operations**: Execute semantic queries across local and referenced pods

## Public API

The main entry point for colonylib is the `PodManager` struct, which provides a high-level interface for all pod operations. Here are the key methods:

### Core Pod Operations

```rust
// Create a new pod with a given name
async fn add_pod(&mut self, pod_name: &str) -> Result<(String, String), Error>

// Create a reference from one pod to another
fn add_pod_ref(&mut self, pod_address: &str, referenced_pod_address: &str) -> Result<(), Error>

// Add metadata for a specific subject (file/resource) to a pod using JSON-LD syntax
async fn put_subject_data(&mut self, pod_address: &str, subject_address: &str, metadata: &str) -> Result<(), Error>

// Get metadata for a specific subject and return a JSON string
async fn get_subject_data(&mut self, subject_address: &str) -> Result<String, Error>
```

### Network Synchronization

```rust
// Upload all local changes to the Autonomi network
async fn upload_all(&mut self) -> Result<(), Error>

// Download updates for user-created pods
async fn refresh_cache(&mut self) -> Result<(), Error>

// Download referenced pods up to specified depth
async fn refresh_ref(&mut self, depth: u64) -> Result<(), Error>
```
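
The `depth` argument bounds how far references are followed. One way to picture this is a breadth-first walk with a hop limit, sketched below over an in-memory map. This is an illustration of the concept rather than colonylib's implementation; `pods_within_depth` and the `pod-a`/`pod-b`/`pod-c` addresses are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect every pod reachable from `roots` by following at most `max_depth` reference hops.
/// `refs` maps a pod address to the addresses it references (a stand-in for pod metadata).
fn pods_within_depth(
    refs: &HashMap<String, Vec<String>>,
    roots: &[&str],
    max_depth: u64,
) -> Vec<String> {
    let mut seen: HashSet<String> = roots.iter().map(|r| r.to_string()).collect();
    let mut queue: VecDeque<(String, u64)> =
        roots.iter().map(|r| (r.to_string(), 0)).collect();
    let mut found = Vec::new();

    while let Some((addr, depth)) = queue.pop_front() {
        if depth == max_depth {
            continue; // do not follow references past the limit
        }
        if let Some(targets) = refs.get(&addr) {
            for next in targets {
                // `seen` guards against reference cycles and duplicate visits
                if seen.insert(next.clone()) {
                    found.push(next.clone());
                    queue.push_back((next.clone(), depth + 1));
                }
            }
        }
    }
    found
}

fn main() {
    let mut refs = HashMap::new();
    refs.insert("pod-a".to_string(), vec!["pod-b".to_string()]);
    // pod-b points back at pod-a, forming a cycle
    refs.insert("pod-b".to_string(), vec!["pod-c".to_string(), "pod-a".to_string()]);

    println!("depth 1: {:?}", pods_within_depth(&refs, &["pod-a"], 1));
    println!("depth 2: {:?}", pods_within_depth(&refs, &["pod-a"], 2));
}
```

The visited set matters because pods can reference each other in cycles; without it a mutual reference would be walked repeatedly until the depth limit is reached.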

### Search and Query

```rust
// Search pods using various criteria (text, type, properties)
async fn search(&mut self, query: serde_json::Value) -> Result<serde_json::Value, Error>
```

### Initialization

```rust
// Create a new PodManager instance
async fn new(
    client: Client,           // Autonomi network client
    wallet: &Wallet,          // Payment wallet
    data_store: &mut DataStore,   // Local storage
    key_store: &mut KeyStore,     // Cryptographic keys
    graph: &mut Graph,            // RDF database
) -> Result<PodManager, Error>
```

## Installation

Add colonylib to your Rust project:

```toml
[dependencies]
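colonylib = "0.2.2"
autonomi = "0.4.6"
# `#[tokio::main]` needs the `macros` and `rt-multi-thread` features
tokio = { version = "1.44", features = ["macros", "rt-multi-thread"] }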
serde_json = "1.0"
```

Or use cargo:

```bash
cargo add colonylib autonomi serde_json
cargo add tokio --features macros,rt-multi-thread
```

## Examples

The repository includes three comprehensive examples that demonstrate colonylib's capabilities. These examples are designed to be run in sequence and work on a local Autonomi testnet. Running on main or the Alpha network is possible, but requires code changes.

### Prerequisites

Before running the examples, you need:

1. **Rust toolchain** (1.70 or later)
2. **Autonomi network access**:
   - **Local testnet**: Run a local Autonomi node for development. See the [Autonomi documentation](https://autonomi.com/docs) for setup instructions.
   - **Alpha network**: Connect to the Autonomi alpha testnet (change the `init_client` function in each example)
   - **Main network**: Connect to the live Autonomi network (change the `init_client` function in each example)

3. **Wallet with tokens** (for Alpha/Main network operations only; creating the local testnet handles this for you):
   - ETH for gas fees
   - ANT tokens for storage payments

### Example 1: Setup (`examples/setup.rs`)

**Purpose**: Initialize the colonylib environment and verify network connectivity.

This example:
- Creates the local data directory structure
- Initializes or loads an encrypted keystore
- Sets up the RDF graph database
- Connects to the Autonomi network
- Displays wallet balances

**Run it:**
```bash
# For local testnet
cargo run --example setup

# The example uses these default settings:
# - Network: local testnet
# - Mnemonic: "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
# - Password: "password"
```

**What it does:**
- Creates `~/.local/share/colony/` (Linux) or equivalent on other platforms
- Generates deterministic keys from the mnemonic phrase
- Verifies network connectivity and wallet balance
- Prepares the environment for pod operations

### Example 2: Adding Pods (`examples/add_pods.rs`)

**Purpose**: Create pods with sample metadata and upload them to the network.

This example demonstrates:
- Creating multiple pods with descriptive names
- Adding rich JSON-LD metadata for different file types
- Uploading pods to the Autonomi network
- Handling network costs and replication delays

**Run it:**
```bash
cargo run --example add_pods
```

**What it creates:**
- **Pod 1**: Metadata for an image file (`ant_girl.png`)
- **Pod 2**: Metadata for an audio file (`BegBlag.mp3`)

Each pod contains structured metadata using Schema.org vocabularies:
```json
{
  "@context": {"schema": "http://schema.org/"},
  "@type": "schema:MediaObject",
  "@id": "ant://[file-address]",
  "schema:name": "filename.ext",
  "schema:description": "File description",
  "schema:contentSize": "2MB"
}
```

### Example 3: Search (`examples/search.rs`)

**Purpose**: Demonstrate various search capabilities across the pod network.

This example shows:
- Simple text search across all metadata
- Type-based queries (find all MediaObjects)
- Property-based queries (find items with specific attributes)
- Advanced multi-criteria searches
- Subject data retrieval

**Run it:**
```bash
cargo run --example search
```

**Search types demonstrated:**
1. **Text search**: Find pods containing specific words
2. **Type search**: Query by RDF type (e.g., MediaObject, Document)
3. **Predicate search**: Find resources with specific properties
4. **Advanced search**: Combine multiple criteria
5. **Subject retrieval**: Get complete metadata for specific resources
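
For readers new to SPARQL, the sketch below builds the kind of query text that a type search or a text search could correspond to on a SPARQL 1.1 engine. How colonylib's `search` method translates its JSON input into SPARQL is not shown in this README, so these queries and the helper names (`type_query`, `text_query`, `escape_literal`) are illustrative assumptions.

```rust
/// Escape a value for use inside a double-quoted SPARQL string literal.
fn escape_literal(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
        .replace('\r', "\\r")
}

/// Type search: subjects of a given RDF type, together with the named graph they live in.
/// `type_iri` is assumed to be a trusted, well-formed IRI.
fn type_query(type_iri: &str) -> String {
    format!("SELECT ?graph ?subject WHERE {{ GRAPH ?graph {{ ?subject a <{type_iri}> }} }}")
}

/// Text search: subjects with any literal value containing `term`, ignoring case.
fn text_query(term: &str) -> String {
    let needle = escape_literal(&term.to_lowercase());
    format!(
        "SELECT DISTINCT ?graph ?subject WHERE {{ GRAPH ?graph {{ ?subject ?p ?o . \
         FILTER(isLiteral(?o) && CONTAINS(LCASE(STR(?o)), \"{needle}\")) }} }}"
    )
}

fn main() {
    println!("{}", type_query("http://schema.org/MediaObject"));
    println!("{}", text_query("ant girl"));
}
```

The `GRAPH ?graph` pattern is what lets one query span every pod while still reporting which pod each match came from.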

### Running the Examples

**Complete workflow:**
```bash
# 1. Initialize the environment
cargo run --example setup

# 2. Create sample pods with metadata
cargo run --example add_pods

# 3. Search and query the pods
cargo run --example search
```

**Network Configuration:**

To use different networks, modify the `environment` variable in each example:

```rust
// Local testnet (default)
let environment = "local".to_string();

// Alpha testnet (needs test tokens)
let environment = "alpha".to_string();

// Main network (needs real tokens)
let environment = "autonomi".to_string();
```
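
If you would rather not edit the string by hand, the three values can be parsed into an enum so that typos are caught early. This is a sketch, not part of the examples: the `Environment` type and its `needs_funded_wallet` method are hypothetical, and only the three string values come from the snippet above.

```rust
use std::env;
use std::str::FromStr;

/// The three networks the examples can target.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Environment {
    Local,
    Alpha,
    Main,
}

impl FromStr for Environment {
    type Err = String;

    /// Accepts exactly the strings used by the examples: "local", "alpha", "autonomi".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "local" => Ok(Environment::Local),
            "alpha" => Ok(Environment::Alpha),
            "autonomi" => Ok(Environment::Main),
            other => Err(format!("unknown environment: {other}")),
        }
    }
}

impl Environment {
    /// The local testnet funds the wallet for you; Alpha and Main need tokens you supply.
    fn needs_funded_wallet(self) -> bool {
        !matches!(self, Environment::Local)
    }
}

fn main() {
    // Take the environment from the first command-line argument, defaulting to "local"
    let name = env::args().nth(1).unwrap_or_else(|| "local".to_string());
    match Environment::from_str(&name) {
        Ok(environment) => println!(
            "{environment:?}: funded wallet required = {}",
            environment.needs_funded_wallet()
        ),
        Err(message) => eprintln!("{message}"),
    }
}
```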

**Wallet Configuration:**

The examples use a hardcoded private key for local testing. For production use:

1. Generate a secure private key
2. Fund the wallet with ETH and ANT tokens
3. Update the `LOCAL_PRIVATE_KEY` constant

**Data Persistence:**

- All examples use the same data directory
- Keystores and graph databases persist between runs
- You can safely re-run examples to see updated results
- Delete the data directory and run `setup.rs` to reset the environment if needed

**Note**: Resetting the environment is a destructive operation: the local data directory is overwritten and recreated.
If you have anything you want to keep, make sure you have uploaded everything to Autonomi before resetting.

## Usage Patterns

### Basic Workflow

```rust
use autonomi::{Client, Wallet};
use colonylib::{PodManager, DataStore, KeyStore, Graph};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
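    // 1. Initialize components
    // (`private_key`, `keystore_exists`, `file`, `password`, `mnemonic`, and `FILE_ADDRESS`
    // are placeholders that your application must define)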
    let client = Client::init_local().await?;
    let wallet = &Wallet::new_from_private_key(client.evm_network(), private_key)?;
    let data_store = &mut DataStore::create()?;

    // 2. Set up keystore
    let key_store = &mut if keystore_exists {
        KeyStore::from_file(&mut file, password)?
    } else {
        KeyStore::from_mnemonic(mnemonic)?
    };

    // 3. Initialize graph database
    let graph = &mut Graph::open(&data_store.get_graph_path())?;

    // 4. Create pod manager
    let mut pod_manager = PodManager::new(client, wallet, data_store, key_store, graph).await?;

    // 5. Create and populate pods
    let (pod_addr, _) = pod_manager.add_pod("My Collection").await?;

    let metadata = json!({
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": "Research Data",
        "description": "Important research findings"
    });

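    // `put_subject_data` takes the JSON-LD document as a `&str`, so serialize the value first
    pod_manager.put_subject_data(&pod_addr, FILE_ADDRESS, &metadata.to_string()).await?;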

    // 6. Upload to network
    pod_manager.upload_all().await?;

    // 7. Search and query
    let results = pod_manager.search(json!("research")).await?;
    println!("Found: {}", results);

    Ok(())
}
```

### Offline-First User Support

Colonylib supports offline usage - you can create, reference, and search pods without performing any
Autonomi network write operations:

```rust
// Create pods locally
let (pod_addr, _) = pod_manager.add_pod("Offline Pod").await?;
pod_manager.put_subject_data(&pod_addr, subject, metadata).await?;

// Search works immediately
let results = pod_manager.search(json!("offline")).await?;

// Upload when ready (requires network + tokens)
pod_manager.upload_all().await?;
```

The caveat here is that the data is only stored on your computer. There is no way to recover it if
you lose your computer. Uploading to the network is necessary to ensure data persistence, cross-device
synchronization, and the ability to share your data with others.

## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

### Development Setup

```bash
git clone https://github.com/zettawatt/colonylib.git
cd colonylib
cargo build
cargo test
```

### Running Tests

```bash
# Run all tests
cargo test

# Run specific module tests
cargo test key_tests
cargo test pod_tests
cargo test graph_tests
```

## License

This project is licensed under the GPL-3.0-only License - see the [LICENSE](LICENSE) file for details.

## Links

- **Documentation**: [docs.rs/colonylib](https://docs.rs/colonylib)
- **Repository**: [github.com/zettawatt/colonylib](https://github.com/zettawatt/colonylib)
- **Issues**: [github.com/zettawatt/colonylib/issues](https://github.com/zettawatt/colonylib/issues)
- **Autonomi Network**: [autonomi.com](https://autonomi.com)
- **Colony App**: [github.com/zettawatt/colony](https://github.com/zettawatt/colony)