colonylib 0.6.0

A library implementing the Colony metadata framework on Autonomi
# colonylib

A Rust library implementing the Colony metadata framework for the [Autonomi](https://autonomi.com) decentralized network. This library provides the core infrastructure for creating, managing, and searching metadata about files stored on Autonomi using a semantic RDF-based approach.

> **Note**: This is a library for developers. If you're looking for an end-user application to easily upload/download/search files from Autonomi, look at the following applications based on your needs:
> - [Colony](https://github.com/zettawatt/colony) (IN PROGRESS) - Cross platform GUI applications
> - [Colony Daemon and CLI](https://github.com/zettawatt/colony-utils) - Command line interface and background daemon for headless servers
> - [Mutant](https://github.com/champii/mutant) - Rust GUI application

## Overview

### Core Concepts

**Pods** are the fundamental building blocks of colonylib. A pod consists of:
- An **Autonomi pointer** that serves as the pod's address
- A **scratchpad** containing RDF metadata about files and other pods
- **Semantic metadata** written in [RDF](https://www.w3.org/RDF/) in the [TriG format](https://www.w3.org/TR/trig/) using the [schema.org](https://schema.org/) schema

This architecture enables:
- **Rich metadata storage**: File types, sizes, names, descriptions, and custom properties
- **Knowledge graphs**: Pods can reference other pods, creating interconnected networks of metadata
- **Semantic search**: Query data using SPARQL via the integrated [oxigraph](https://docs.rs/oxigraph/latest/oxigraph/index.html) database
- **Decentralized discovery**: Traverse pod references to discover related content across the network
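
To make the "RDF in TriG format" idea concrete, here is a minimal, std-only sketch of how a few statements about a file could be rendered as a TriG named graph. The `Triple` type and the `to_trig` and `escape_literal` helpers are hypothetical names used only for illustration; they are not part of colonylib, which delegates RDF handling to oxigraph.

```rust
/// One RDF statement with a literal object.
/// Hypothetical helper type for illustration; not part of colonylib.
pub struct Triple {
    pub subject: String,   // e.g. "ant://<file-address>"
    pub predicate: String, // e.g. "http://schema.org/name"
    pub object: String,    // literal value
}

/// Escape a value so it can be written as a TriG string literal.
fn escape_literal(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            other => out.push(other),
        }
    }
    out
}

/// Render triples as one TriG named graph block: `<graph> { <s> <p> "o" . }`.
/// In this sketch the graph IRI stands in for the pod address.
pub fn to_trig(graph_iri: &str, triples: &[Triple]) -> String {
    let mut out = format!("<{}> {{\n", graph_iri);
    for t in triples {
        out.push_str(&format!(
            "    <{}> <{}> \"{}\" .\n",
            t.subject,
            t.predicate,
            escape_literal(&t.object)
        ));
    }
    out.push_str("}\n");
    out
}
```

Each pod maps naturally onto one named graph, which is why TriG (Turtle plus named graphs) is a convenient serialization for pod contents.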

### Key Features

- **Deterministic key derivation**: Uses 12-word mnemonic seed phrases to reproducibly generate all pod addresses
- **Offline-first operation**: Build and search your metadata locally without network access or cryptocurrency
- **Cross-device synchronization**: Access your pods from any device using the same seed phrase
- **Network integration**: Upload/download pods to/from the Autonomi network when ready
- **Semantic interoperability**: Standardized RDF schemas enable data sharing between applications
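
The property that matters in deterministic key derivation is that the same seed phrase and index always yield the same address. The sketch below demonstrates only that property. It is an assumption-laden stand-in: `derive_address` is a hypothetical function, and `DefaultHasher` is not cryptographic and is not what colonylib uses for real key derivation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a stand-in "address" from a seed phrase and a pod index.
///
/// ILLUSTRATION ONLY: `DefaultHasher` is not a cryptographic function and its
/// algorithm may change between Rust releases. Real key derivation must use a
/// proper scheme; this only shows that the mapping is reproducible.
pub fn derive_address(seed_phrase: &str, index: u32) -> String {
    let mut hasher = DefaultHasher::new();
    seed_phrase.hash(&mut hasher);
    index.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Derive the first `count` addresses for a seed phrase.
pub fn derive_addresses(seed_phrase: &str, count: u32) -> Vec<String> {
    (0..count).map(|i| derive_address(seed_phrase, i)).collect()
}
```

Because nothing but the phrase and the index feeds the derivation, a second device that knows the phrase can regenerate the same list of addresses without any shared state.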

### Scope

Colonylib focuses on **metadata management** and **semantic search**. It does not handle actual file uploads/downloads - those operations use the standard Autonomi API. Think of colonylib as a sophisticated indexing and discovery layer on top of Autonomi's storage primitives.

## Status

### Current Capabilities ✅

- **Local pod management**: Create, modify, and cache pods in platform-appropriate data directories (Windows, Mac, Linux)
- **Secure key management**: Password-protected keystore with deterministic key derivation from mnemonic phrases
- **Network synchronization**: Upload modified pods and download updates from the Autonomi network
- **Cache management**: Populate, repair, and maintain local pod caches with automatic conflict resolution
- **RDF graph database**: Store and query semantic metadata using oxigraph with SPARQL support
- **Pod references**: Create interconnected networks of pods with configurable traversal depth
- **Semantic search**: Query pods by content, type, properties, and relationships
- **Advanced search features**: Relevance ranking by number of matches and pod depth, fuzzy matching
- **Automatic scratchpad overflow**: Splits pod data across multiple scratchpads when a metadata collection grows large (>4MB)
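
The overflow behavior can be pictured as chunking a serialized pod into fixed-size pieces. This is a simplified sketch, assuming a plain byte-wise split; the constant `MAX_SCRATCHPAD_BYTES` and both functions are hypothetical, and colonylib's actual splitting and linking strategy may differ.

```rust
/// Size limit based on the ">4MB" figure above; the exact byte value is an assumption.
pub const MAX_SCRATCHPAD_BYTES: usize = 4 * 1024 * 1024;

/// Split serialized pod data into pieces of at most `max_bytes` each.
/// Empty input still yields one (empty) piece so a pod always has a scratchpad.
pub fn split_into_scratchpads(data: &[u8], max_bytes: usize) -> Vec<Vec<u8>> {
    assert!(max_bytes > 0, "max_bytes must be non-zero");
    if data.is_empty() {
        return vec![Vec::new()];
    }
    data.chunks(max_bytes).map(|chunk| chunk.to_vec()).collect()
}

/// Reassemble the original data by concatenating the pieces in order.
pub fn join_scratchpads(parts: &[Vec<u8>]) -> Vec<u8> {
    parts.concat()
}
```

Splitting on bytes rather than on characters is safe here because the pieces are only interpreted after they have been joined again.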

### Roadmap 🚧

- Improve Autonomi error handling (auto retry on certain failures, library specific errors)
- Performance optimizations for large-scale pod networks (threading Autonomi fetch operations)

## Library Architecture

Colonylib is organized into four core modules that work together to provide a complete metadata management system:

### 1. KeyStore (`key.rs`)
**Purpose**: Cryptographic key management and derivation

- **Mnemonic-based**: Generate deterministic keys from 12-word seed phrases
- **Secure storage**: Password-encrypted keystore files using the Cocoon library
- **Key derivation**: Separate key spaces for pointers, scratchpads, and wallet operations
- **Cross-device sync**: Same mnemonic produces identical keys across devices

### 2. DataStore (`data.rs`)
**Purpose**: Local file system operations and pod caching

- **Platform-aware**: Uses OS-appropriate data directories (`~/.local/share/colony` on Linux)
- **Organized storage**: Separate directories for pointers, scratchpads, and pod references
- **Cache management**: Track upload queues, handle file operations, manage local state
- **Atomic operations**: Safe concurrent access to pod files
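
As a rough illustration of platform-aware directory selection, the sketch below resolves a per-user data directory from the OS name and environment variables. Only the Linux path (`~/.local/share/colony`) comes from this README; the Windows and macOS locations are assumptions based on common platform conventions, and `data_dir` is a hypothetical function rather than the DataStore API.

```rust
use std::path::PathBuf;

/// Resolve a per-user data directory for an application named "colony".
///
/// `os` is normally `std::env::consts::OS`, and `env` looks up an environment
/// variable. Both are parameters so the logic can be exercised without
/// touching the real process environment.
pub fn data_dir(os: &str, env: impl Fn(&str) -> Option<String>) -> Option<PathBuf> {
    let base = match os {
        // Assumed convention: %APPDATA% on Windows.
        "windows" => PathBuf::from(env("APPDATA")?),
        // Assumed convention: ~/Library/Application Support on macOS.
        "macos" => PathBuf::from(env("HOME")?)
            .join("Library")
            .join("Application Support"),
        // Linux and other Unix-likes: XDG_DATA_HOME, falling back to ~/.local/share.
        _ => match env("XDG_DATA_HOME") {
            Some(dir) if !dir.is_empty() => PathBuf::from(dir),
            _ => PathBuf::from(env("HOME")?).join(".local").join("share"),
        },
    };
    Some(base.join("colony"))
}
```

In an application this could be called as `data_dir(std::env::consts::OS, |key| std::env::var(key).ok())`.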

### 3. Graph (`graph.rs`)
**Purpose**: RDF semantic database and SPARQL query engine

- **Oxigraph integration**: High-performance RDF store with SPARQL 1.1 support
- **Ontology independent**: Library supports any ontology or schema (using [schema.org](https://schema.org/) is preferred for portability)
- **Named graphs**: Isolate pod data while enabling cross-pod queries
- **JSON-LD metadata entry**: Write JSON-LD metadata for subjects within pods
- **Query interface**: Execute SPARQL queries across all local and referenced pods and return results in JSON format
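
The named-graph idea can be sketched without a real RDF store: keep each pod's triples under its own graph name, then query either one graph or all of them. `QuadStore` and its methods are hypothetical and exist only to illustrate the isolation model; in colonylib this role is played by oxigraph and SPARQL.

```rust
use std::collections::HashMap;

/// (subject, predicate, object) as plain strings.
pub type Triple = (String, String, String);

/// A toy quad store: one list of triples per named graph (one graph per pod).
#[derive(Default)]
pub struct QuadStore {
    graphs: HashMap<String, Vec<Triple>>,
}

impl QuadStore {
    /// Add a triple to the named graph, creating the graph if needed.
    pub fn insert(&mut self, graph: &str, s: &str, p: &str, o: &str) {
        self.graphs
            .entry(graph.to_string())
            .or_default()
            .push((s.to_string(), p.to_string(), o.to_string()));
    }

    /// Objects of `predicate` inside a single named graph only.
    pub fn objects_in_graph(&self, graph: &str, predicate: &str) -> Vec<&str> {
        self.graphs
            .get(graph)
            .map(|triples| {
                triples
                    .iter()
                    .filter(|t| t.1 == predicate)
                    .map(|t| t.2.as_str())
                    .collect()
            })
            .unwrap_or_default()
    }

    /// (graph, object) pairs for `predicate` across every graph, sorted.
    pub fn objects_everywhere(&self, predicate: &str) -> Vec<(&str, &str)> {
        let mut hits: Vec<(&str, &str)> = self
            .graphs
            .iter()
            .flat_map(|(graph, triples)| {
                triples
                    .iter()
                    .filter(move |t| t.1 == predicate)
                    .map(move |t| (graph.as_str(), t.2.as_str()))
            })
            .collect();
        hits.sort();
        hits
    }
}
```

Keeping the graph name alongside each result is what lets a cross-pod query report which pod a match came from.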

### 4. PodManager (`pod.rs`)
**Purpose**: High-level pod operations and network coordination

- **Unified interface**: Coordinates between KeyStore, DataStore, Graph, and Autonomi network
- **Pod lifecycle**: Create, modify, upload, download, and synchronize pods
- **Reference traversal**: Discover and cache interconnected pod networks
- **Search operations**: Execute semantic queries across local and referenced pods
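
Reference traversal with a depth limit is essentially a breadth-first walk over the pod reference graph. The following sketch assumes that depth 0 means the starting pod itself and that each reference adds one hop; `reachable_pods` is a hypothetical function, and colonylib's own depth semantics may be defined differently.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect every pod reachable from `start` within `max_depth` hops,
/// paired with the depth at which it was first seen (breadth-first order).
///
/// `refs` maps a pod address to the addresses it references.
pub fn reachable_pods(
    refs: &HashMap<String, Vec<String>>,
    start: &str,
    max_depth: u64,
) -> Vec<(String, u64)> {
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<(String, u64)> = VecDeque::new();
    let mut found = Vec::new();

    seen.insert(start.to_string());
    queue.push_back((start.to_string(), 0));

    while let Some((pod, depth)) = queue.pop_front() {
        found.push((pod.clone(), depth));
        if depth == max_depth {
            continue; // do not follow references beyond the depth limit
        }
        if let Some(children) = refs.get(&pod) {
            for child in children {
                // `insert` returns false for pods already seen, which breaks cycles.
                if seen.insert(child.clone()) {
                    queue.push_back((child.clone(), depth + 1));
                }
            }
        }
    }
    found
}
```

Tracking visited pods matters because pods can reference each other in cycles; without it, a depth-limited walk would still revisit the same pods repeatedly.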

## Public API

The main entry point for colonylib is the `PodManager` struct, which provides a high-level interface for all pod operations. Here are the key methods:

### Wallet Key Management

```rust
// Add a new wallet key with a name
async fn add_wallet_key(&mut self, name: &str, wallet_key: &str) -> Result<(), Error>

// Retrieve a specific wallet key by name
async fn get_wallet_key(&self, name: &str) -> Result<String, Error>

// Retrieve all wallet keys
fn get_wallet_keys(&self) -> HashMap<String, String>

// Set the active wallet and persist to local storage
fn set_active_wallet(&mut self, name: &str) -> Result<(String, String), Error>

// Get the currently active wallet from local storage
fn get_active_wallet(&self) -> Result<(String, String), Error>
```

### Core Pod Operations

```rust
// Create a new pod with a given name
async fn add_pod(&mut self, pod_name: &str) -> Result<(String, String), Error>

// Remove a pod and all its associated data
async fn remove_pod(&mut self, pod_address: &str) -> Result<(), Error>

// Rename an existing pod
async fn rename_pod(&mut self, pod_address: &str, new_name: &str) -> Result<(), Error>

// Create a reference from one pod to another
fn add_pod_ref(&mut self, pod_address: &str, referenced_pod_address: &str) -> Result<(), Error>

// Add metadata for a specific subject (file/resource) to a pod using JSON-LD syntax
async fn put_subject_data(&mut self, pod_address: &str, subject_address: &str, metadata: &str) -> Result<(), Error>

// Get metadata for a specific subject and return a JSON string
async fn get_subject_data(&mut self, subject_address: &str) -> Result<String, Error>
```

### Network Synchronization

```rust
// Upload all local changes to the Autonomi network
async fn upload_all(&mut self) -> Result<(), Error>

// Upload a specific pod to the Autonomi network
async fn upload_pod(&mut self, address: &str) -> Result<(), Error>

// Download updates for user-created pods
async fn refresh_cache(&mut self) -> Result<(), Error>

// Download referenced pods up to specified depth
async fn refresh_ref(&mut self, depth: u64) -> Result<(), Error>

// Get the current list of pods that need to be uploaded in JSON format
fn get_update_list(&self) -> Result<serde_json::Value, Error>
```

### Pod Discovery and Listing

```rust
// List all pods owned by the user
fn list_my_pods(&self) -> Result<serde_json::Value, Error>

// List all subjects (resources) within a specific pod
fn list_pod_subjects(&self, pod_address: &str) -> Result<Vec<String>, Error>
```

### Search and Query

```rust
// Search pods using various criteria (text, type, properties)
async fn search(&mut self, query: serde_json::Value) -> Result<serde_json::Value, Error>
```

### Initialization

```rust
// Create a new PodManager instance
async fn new(
    client: Client,           // Autonomi network client
    wallet: &Wallet,          // Payment wallet
    data_store: &mut DataStore,   // Local storage
    key_store: &mut KeyStore,     // Cryptographic keys
    graph: &mut Graph,            // RDF database
) -> Result<PodManager, Error>
```

## Installation

Add colonylib to your Rust project:

```toml
[dependencies]
colonylib = "0.3.0"
autonomi = "0.4.6"
tokio = { version = "1.44", features = ["macros", "rt-multi-thread"] }
serde_json = "1.0"
```

Or use cargo:

```bash
cargo add colonylib autonomi serde_json
cargo add tokio --features macros,rt-multi-thread
```

## Examples

The repository includes three examples that demonstrate colonylib's capabilities. They are designed to be run in sequence against a local Autonomi testnet. Running them on the main or Alpha network is possible but requires code changes.

### Prerequisites

Before running the examples, you need:

1. **Rust toolchain** (1.70 or later)
2. **Autonomi network access**:
   - **Local testnet**: Run a local Autonomi node for development. See the [Autonomi documentation](https://autonomi.com/docs) for setup instructions.
   - **Alpha network**: Connect to the Autonomi alpha testnet (change the `init_client` function in each example)
   - **Main network**: Connect to the live Autonomi network (change the `init_client` function in each example)

3. **Wallet with tokens** (for Alpha/Main network operations, creating the local testnet will handle this for you):
   - ETH for gas fees
   - ANT tokens for storage payments

### Example 1: Setup (`examples/setup.rs`)

**Purpose**: Initialize the colonylib environment and verify network connectivity.

This example:
- Creates the local data directory structure
- Initializes or loads an encrypted keystore
- Sets up the RDF graph database
- Connects to the Autonomi network
- Displays wallet balances

**Run it:**
```bash
# For local testnet
cargo run --example setup

# The example uses these default settings:
# - Network: local testnet
# - Mnemonic: "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about"
# - Password: "password"
```

**What it does:**
- Creates `~/.local/share/colony/` (Linux) or equivalent on other platforms
- Generates deterministic keys from the mnemonic phrase
- Verifies network connectivity and wallet balance
- Prepares the environment for pod operations

### Example 2: Adding Pods (`examples/add_pods.rs`)

**Purpose**: Create pods with sample metadata and upload them to the network.

This example demonstrates:
- Creating multiple pods with descriptive names
- Adding rich JSON-LD metadata for different file types
- Uploading pods to the Autonomi network
- Handling network costs and replication delays

**Run it:**
```bash
cargo run --example add_pods
```

**What it creates:**
- **Pod 1**: Metadata for an image file (`ant_girl.png`)
- **Pod 2**: Metadata for an audio file (`BegBlag.mp3`)

Each pod contains structured metadata using Schema.org vocabularies:
```json
{
  "@context": {"schema": "http://schema.org/"},
  "@type": "schema:MediaObject",
  "@id": "ant://[file-address]",
  "schema:name": "filename.ext",
  "schema:description": "File description",
  "schema:contentSize": "2MB"
}
```
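
Since `put_subject_data` takes the metadata as a JSON-LD string, a small builder can produce a document in the shape shown above. This is a std-only sketch; `media_object_jsonld` and `json_escape` are hypothetical helpers, and in practice `serde_json::json!` followed by `.to_string()` achieves the same result with less code.

```rust
/// Escape a value for use inside a JSON string literal.
fn json_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a JSON-LD document describing a media file stored at `file_address`.
pub fn media_object_jsonld(
    file_address: &str,
    name: &str,
    description: &str,
    content_size: &str,
) -> String {
    let fields = [
        ("@type", "schema:MediaObject".to_string()),
        ("@id", format!("ant://{}", file_address)),
        ("schema:name", name.to_string()),
        ("schema:description", description.to_string()),
        ("schema:contentSize", content_size.to_string()),
    ];
    let mut out = String::from("{\n  \"@context\": {\"schema\": \"http://schema.org/\"}");
    for (key, value) in &fields {
        out.push_str(&format!(",\n  \"{}\": \"{}\"", key, json_escape(value)));
    }
    out.push_str("\n}");
    out
}
```

The returned string can be passed as the `metadata` argument wherever a JSON-LD `&str` is expected.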

### Example 3: Search (`examples/search.rs`)

**Purpose**: Demonstrate various search capabilities across the pod network.

This example shows:
- Simple text search across all metadata
- Type-based queries (find all MediaObjects)
- Property-based queries (find items with specific attributes)
- Advanced multi-criteria searches
- Subject data retrieval

**Run it:**
```bash
cargo run --example search
```

**Search types demonstrated:**
1. **Text search**: Find pods containing specific words
2. **Type search**: Query by RDF type (e.g., MediaObject, Document)
3. **Predicate search**: Find resources with specific properties
4. **Browse**: List all subjects with their name, type, and description, ordered by pod depth
5. **Advanced search**: Combine multiple criteria
6. **Subject retrieval**: Get complete metadata for specific resources
7. **Pod listing**: Enumerate all user pods and their contents
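
Ranking by number of matches and pod depth can be sketched as a two-key sort: more matches first, then shallower pods first. The `SearchHit` struct and `rank_hits` function are hypothetical, and the exact weighting colonylib applies is not documented here, so treat this ordering as one plausible reading.

```rust
/// One search result (hypothetical shape, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub subject: String, // address of the matching subject
    pub matches: u32,    // how many search terms matched
    pub depth: u64,      // pod depth at which the subject was found
}

/// Order hits by descending match count, then ascending pod depth.
/// The subject address is used as a final tie-breaker so the order is stable
/// across runs.
pub fn rank_hits(mut hits: Vec<SearchHit>) -> Vec<SearchHit> {
    hits.sort_by(|a, b| {
        b.matches
            .cmp(&a.matches)
            .then(a.depth.cmp(&b.depth))
            .then_with(|| a.subject.cmp(&b.subject))
    });
    hits
}
```

Preferring shallow pods favors metadata from your own pods and direct references over content discovered several hops away.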

### Running the Examples

**Complete workflow:**
```bash
# 1. Initialize the environment
cargo run --example setup

# 2. Create sample pods with metadata
cargo run --example add_pods

# 3. Search and query the pods
cargo run --example search
```

**Network Configuration:**

To use different networks, modify the `environment` variable in each example:

```rust
// Local testnet (default)
let environment = "local".to_string();

// Alpha testnet (needs test tokens)
let environment = "alpha".to_string();

// Main network (needs real tokens)
let environment = "autonomi".to_string();
```
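
If you prefer to validate the environment string before connecting, it can be parsed into an enum first. This sketch uses the three values listed above ("local", "alpha", "autonomi"); the `Environment` type and `needs_funded_wallet` method are hypothetical and do not call any Autonomi client API.

```rust
use std::str::FromStr;

/// The three network environments used by the examples.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Environment {
    Local,
    Alpha,
    Main,
}

impl FromStr for Environment {
    type Err = String;

    /// Parse the environment names used in the examples.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "local" => Ok(Environment::Local),
            "alpha" => Ok(Environment::Alpha),
            "autonomi" => Ok(Environment::Main),
            other => Err(format!("unknown environment: {}", other)),
        }
    }
}

impl Environment {
    /// The local testnet funds its own wallet; Alpha and Main need tokens.
    pub fn needs_funded_wallet(self) -> bool {
        !matches!(self, Environment::Local)
    }
}
```

Parsing up front turns a typo such as "mainnet" into an immediate error instead of a failed connection later on.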

**Wallet Configuration:**

The examples use a hardcoded private key for local testing. For production use:

1. Generate a secure private key
2. Fund the wallet with ETH and ANT tokens
3. Update the `LOCAL_PRIVATE_KEY` constant

**Data Persistence:**

- All examples use the same data directory
- Keystores and graph databases persist between runs
- You can safely re-run examples to see updated results
- To reset the environment, delete the data directory and run `setup.rs` again

**Note**: Resetting is a destructive operation: the local data directory is overwritten and recreated. If there is anything you want to keep, upload it to Autonomi before resetting.

## Usage Patterns

### Basic Workflow

```rust
use autonomi::{Client, Wallet};
use colonylib::{PodManager, DataStore, KeyStore, Graph};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
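    // 1. Initialize components
    //    (`private_key`, `password`, `mnemonic`, `file`, `keystore_exists`, and
    //    `FILE_ADDRESS` are placeholders that your application must supply)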
    let client = Client::init_local().await?;
    let wallet = &Wallet::new_from_private_key(client.evm_network(), private_key)?;
    let data_store = &mut DataStore::create()?;

    // 2. Set up keystore
    let key_store = &mut if keystore_exists {
        KeyStore::from_file(&mut file, password)?
    } else {
        KeyStore::from_mnemonic(mnemonic)?
    };

    // 3. Initialize graph database
    let graph = &mut Graph::open(&data_store.get_graph_path())?;

    // 4. Create pod manager
    let mut pod_manager = PodManager::new(client, wallet, data_store, key_store, graph).await?;

    // 5. Create and populate pods
    let (pod_addr, _) = pod_manager.add_pod("My Collection").await?;

    let metadata = json!({
        "@context": "http://schema.org/",
        "@type": "Dataset",
        "name": "Research Data",
        "description": "Important research findings"
    });

    pod_manager.put_subject_data(&pod_addr, FILE_ADDRESS, &metadata.to_string()).await?;

    // 6. Upload to network
    pod_manager.upload_all().await?;

    // 7. Search and query
    let results = pod_manager.search(json!("research")).await?;
    println!("Found: {}", results);

    Ok(())
}
```

### Advanced Pod Operations

#### Uploading Individual Pods

For more granular control over network operations, you can upload specific pods instead of all pending changes:

```rust
// Create and populate a pod
let (pod_addr, _) = pod_manager.add_pod("Research Data").await?;

let metadata = json!({
    "@context": "http://schema.org/",
    "@type": "Dataset",
    "name": "Climate Research",
    "description": "Temperature data from weather stations"
});

pod_manager.put_subject_data(&pod_addr, subject_address, &metadata.to_string()).await?;

// Upload only this specific pod
pod_manager.upload_pod(&pod_addr).await?;
```

#### Discovering and Managing Pods

List all your pods and explore their contents:

```rust
// Get all user pods
let pods_result = pod_manager.list_my_pods()?;

if let Some(bindings) = pods_result["results"]["bindings"].as_array() {
    for pod in bindings {
        let pod_address = pod["pod"]["value"].as_str().unwrap();
        let pod_name = pod["name"]["value"].as_str().unwrap();

        println!("Pod: {} ({})", pod_name, pod_address);

        // List all subjects in this pod
        let subjects = pod_manager.list_pod_subjects(pod_address)?;
        println!("  Contains {} subjects:", subjects.len());

        for subject_address in subjects {
            // Get detailed metadata for each subject
            let metadata = pod_manager.get_subject_data(&subject_address).await?;
            let metadata_json: serde_json::Value = serde_json::from_str(&metadata)?;

            // Extract subject name from metadata
            if let Some(bindings) = metadata_json["results"]["bindings"].as_array() {
                for binding in bindings {
                    if binding["predicate"]["value"].as_str() == Some("http://schema.org/name") {
                        if let Some(name) = binding["object"]["value"].as_str() {
                            println!("    - {} ({})", name, subject_address);
                        }
                    }
                }
            }
        }
    }
}
```

#### Working with Pod References

Create interconnected pod networks:

```rust
// Create related pods
let (main_pod, _) = pod_manager.add_pod("Research Collection").await?;
let (data_pod, _) = pod_manager.add_pod("Raw Data").await?;
let (analysis_pod, _) = pod_manager.add_pod("Analysis Results").await?;

// Create references between pods
pod_manager.add_pod_ref(&main_pod, &data_pod)?;
pod_manager.add_pod_ref(&main_pod, &analysis_pod)?;

// Upload all pods
pod_manager.upload_all().await?;

// Later, refresh with references to discover connected pods
pod_manager.refresh_ref(2).await?; // Depth 2 to include referenced pods
```

#### Pod Management Operations

Rename and remove pods as needed:

```rust
// Create a pod with an initial name
let (pod_addr, _) = pod_manager.add_pod("Temporary Research").await?;

// Add some metadata
let metadata = json!({
  "@context": {"schema": "http://schema.org/"},
  "@type": "schema:SoftwareApplication",
  "@id": "ant://cca4e991284bfd22005bd29884079154817c7f0c3ae09c1685ffa3764c6c1e83",
  "schema:name": "colony-daemon",
  "schema:description": "colony-daemon v0.1.2 x86_64 linux binary",
  "schema:operatingSystem": "Linux",
  "schema:applicationCategory": "Application"
});

pod_manager.put_subject_data(&pod_addr, subject_address, &metadata.to_string()).await?;

// Rename the pod to be more descriptive
pod_manager.rename_pod(&pod_addr, "Climate Research Dataset").await?;

// Upload the changes
pod_manager.upload_all().await?;

// Later, if the pod is no longer needed, remove it
pod_manager.remove_pod(&pod_addr).await?;

// Upload the removal to the network
pod_manager.upload_all().await?;
```

**Important Notes:**
- **Renaming**: Only changes the display name; the pod address remains the same
- **Removal**: Completely removes the pod and all associated scratchpads
- **Configuration pod**: The configuration pod cannot be removed (it's protected)
- **Irreversible**: Once uploaded to the network, removals cannot be undone

### Offline-First User Support

Colonylib supports offline usage - you can create, reference, and search pods without performing any
Autonomi network write operations:

```rust
// Create pods locally
let (pod_addr, _) = pod_manager.add_pod("Offline Pod").await?;
pod_manager.put_subject_data(&pod_addr, subject, metadata).await?;

// Search works immediately
let results = pod_manager.search(json!("offline")).await?;

// Upload when ready (requires network + tokens)
pod_manager.upload_all().await?;
```

The caveat here is that the data is only stored on your computer. There is no way to recover it if
you lose your computer. Uploading to the network is necessary to ensure data persistence, cross-device
synchronization, and the ability to share your data with others.
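
One way to picture the offline-first model is a local queue of modified pods that is flushed once the network is available. The `UploadQueue` type below is a hypothetical, std-only sketch of that bookkeeping; colonylib tracks its own update list internally (see `get_update_list`), and its real behavior on upload failure may differ.

```rust
use std::collections::BTreeSet;

/// Tracks which pods have local changes that are not yet on the network.
#[derive(Default)]
pub struct UploadQueue {
    pending: BTreeSet<String>,
}

impl UploadQueue {
    /// Record that a pod was created or modified locally.
    pub fn mark_modified(&mut self, pod_address: &str) {
        self.pending.insert(pod_address.to_string());
    }

    /// Pods still waiting to be uploaded, in sorted order.
    pub fn pending(&self) -> Vec<&str> {
        self.pending.iter().map(String::as_str).collect()
    }

    /// Try to upload every pending pod with `upload`.
    /// Pods whose upload fails stay queued and are returned with their error.
    pub fn flush<E>(
        &mut self,
        mut upload: impl FnMut(&str) -> Result<(), E>,
    ) -> Vec<(String, E)> {
        let mut failures = Vec::new();
        let mut still_pending = BTreeSet::new();
        for pod in std::mem::take(&mut self.pending) {
            if let Err(err) = upload(&pod) {
                still_pending.insert(pod.clone());
                failures.push((pod, err));
            }
        }
        self.pending = still_pending;
        failures
    }
}
```

Until a flush succeeds, the queued pods exist only on the local machine, which is exactly the data-loss risk described above.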

## Contributing

We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

### Development Setup

```bash
git clone https://github.com/zettawatt/colonylib.git
cd colonylib
cargo build
cargo test
```

### Running Tests

```bash
# Run all tests
cargo test

# Run specific module tests
cargo test key_tests
cargo test pod_tests
cargo test graph_tests
```

## License

This project is licensed under the GPL-3.0-only License - see the [LICENSE](LICENSE) file for details.

## Links

- **Documentation**: [docs.rs/colonylib](https://docs.rs/colonylib)
- **Repository**: [github.com/zettawatt/colonylib](https://github.com/zettawatt/colonylib)
- **Issues**: [github.com/zettawatt/colonylib/issues](https://github.com/zettawatt/colonylib/issues)
- **Autonomi Network**: [autonomi.com](https://autonomi.com)
- **Colony App**: [github.com/zettawatt/colony](https://github.com/zettawatt/colony)
- **Colony Daemon and CLI**: [github.com/zettawatt/colony-utils](https://github.com/zettawatt/colony-utils)