embeddenator-fs 0.21.0

EmbrFS: FUSE filesystem backed by holographic engrams
Documentation

embeddenator-fs

Crates.io Documentation License: MIT

A holographic filesystem implementation using Vector Symbolic Architecture (VSA) for encoding entire directory trees into high-dimensional sparse vectors with bit-perfect reconstruction guarantees.

Status: Alpha - Core functionality complete, API may change. Suitable for experimental use and research.

What is EmbrFS?

EmbrFS (Embeddenator Filesystem) is a novel approach to filesystem storage that encodes files and directories as holographic "engrams" - bundled high-dimensional sparse vectors. Unlike traditional filesystems that store files as sequential blocks, EmbrFS distributes file information across a holographic representation, enabling:

  • Bit-perfect reconstruction through algebraic correction layers
  • Holographic properties - complete information distributed across the representation
  • Hierarchical scalability - sub-engrams for bounded memory usage
  • Read-only FUSE mounting - kernel-level filesystem integration
  • Incremental operations - add/modify/remove files without full rebuilds

Realistic Scope & Limitations

What EmbrFS IS:

  • ✅ A research-grade holographic encoding system for filesystems
  • ✅ A read-only FUSE filesystem for browsing encoded directory trees
  • ✅ An experimental VSA application demonstrating bit-perfect reconstruction
  • ✅ A foundation for exploring holographic storage and retrieval patterns

What EmbrFS IS NOT:

  • ❌ A replacement for production filesystems (ext4, btrfs, ZFS)
  • ❌ A compression tool (overhead varies, typically 0-5% for correction layer)
  • ❌ A writable filesystem (holographic engrams are immutable snapshots)
  • ❌ A distributed storage system (single-machine only)

Current Limitations:

  • Read-only FUSE operations (by design - engrams are immutable)
  • No symbolic link support (returns ENOSYS)
  • No extended attributes
  • No write/modify operations through FUSE (modifications require re-encoding)
  • Alpha API stability (breaking changes possible)

Features

Core Capabilities

  • Holographic Encoding: Encodes files into SparseVec representations using VSA
  • Bit-Perfect Reconstruction: 100% accurate file recovery via correction layer
    • Primary SparseVec encoding
    • Immediate verification on encode
    • Algebraic correction store for exact differences
  • Hierarchical Architecture: Scales to large filesystems via sub-engram trees
  • Incremental Operations:
    • add_files - Add new files without full rebuild
    • modify_files - Update existing files
    • remove_files - Soft-delete files
    • compact - Hard rebuild to reclaim space
  • FUSE Integration (optional fuse feature):
    • Mount engrams as read-only filesystems
    • Standard Unix tools (ls, cat, grep) work transparently
    • Kernel-level integration via fuser library
  • Correction Strategies:
    • BitFlips - Sparse bit-level corrections
    • TritFlips - Ternary value corrections
    • BlockReplace - Contiguous region replacement
    • Verbatim - Full data storage (fallback)

Architecture Highlights

User Tools (ls, cat, etc.)
         ↓
  FUSE Kernel Interface (fuse_shim.rs)
         ↓
  Holographic Filesystem Core (embrfs.rs)
         ↓
  Correction Layer (correction.rs)
         ↓
  VSA Primitives (embeddenator-vsa)

File Structure:

  • embrfs.rs - Core filesystem logic (1,884 lines)
  • fuse_shim.rs - FUSE integration (1,263 lines)
  • correction.rs - Bit-perfect reconstruction (531 lines)

Test Coverage:

  • 20 tests covering core functionality
  • All tests passing in CI
  • Unit tests for correction logic
  • Integration tests for FUSE operations

Installation

Add to your Cargo.toml:

[dependencies]
embeddenator-fs = "0.20.0-alpha.3"

# Enable FUSE mounting support (Linux only)
embeddenator-fs = { version = "0.20.0-alpha.3", features = ["fuse"] }

Usage

Basic API

use embeddenator_fs::{EmbrFS, IngestOptions};
use std::path::Path;

// Create a new holographic filesystem
let mut fs = EmbrFS::new();

// Ingest a directory tree
let options = IngestOptions::default();
fs.ingest_directory(Path::new("/path/to/data"), &options)?;

// Save the engram
fs.save("filesystem.engram")?;

// Later: Load and extract
let fs = EmbrFS::load("filesystem.engram")?;
fs.extract_all(Path::new("/output/dir"))?;

FUSE Mounting (Linux Only)

use embeddenator_fs::{EmbrFS, fuse::mount_embrfs};

// Load an engram
let fs = EmbrFS::load("filesystem.engram")?;

// Mount as read-only filesystem
let mountpoint = Path::new("/mnt/embrfs");
mount_embrfs(fs, mountpoint, &[])?;

// Now access files normally:
// $ ls /mnt/embrfs
// $ cat /mnt/embrfs/file.txt
// $ grep "pattern" /mnt/embrfs/**/*.log

FUSE Limitations:

  • Read-only operations only (writes return EROFS)
  • No symbolic links (readlink returns ENOSYS)
  • Simplified permission model
  • Requires root or user_allow_other in /etc/fuse.conf

Hierarchical Sub-Engrams

For large filesystems, use hierarchical encoding:

let options = IngestOptions {
    max_files_per_engram: 1000,  // Split into sub-engrams
    beam_width: 10,               // Beam search for retrieval
    ..Default::default()
};

fs.ingest_directory(path, &options)?;

Performance Characteristics

Encoding Performance:

  • Time: O(N) where N = total file size
  • Space: O(chunks) + correction overhead (typically 0-5%)
  • Chunk size: 4KB default (configurable)

Retrieval Performance:

  • Beam-limited hierarchical search: O(beam_width × max_depth)
  • LRU caching reduces repeated disk I/O
  • Inverted index enables sub-linear candidate generation

Correction Overhead:

  • Observed: 0-5% typical for structured data
  • Varies with data entropy and VSA dimensionality
  • Statistics tracked per-engram

Development

# Clone and build
git clone https://github.com/tzervas/embeddenator-fs
cd embeddenator-fs
cargo build

# Run tests
cargo test

# Run tests with FUSE support
cargo test --features fuse

# Build documentation
cargo doc --open

# Check code quality
cargo clippy -- -D warnings
cargo fmt --check

For cross-component development with other Embeddenator crates:

# Add to workspace Cargo.toml
[patch.crates-io]
embeddenator-vsa = { path = "../embeddenator-vsa" }
embeddenator-retrieval = { path = "../embeddenator-retrieval" }

Documentation

Related Projects

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

This project is in active development. Expect API changes in minor versions until 1.0.

License

MIT - See LICENSE file for details.

Copyright (c) 2024-2026 Tyler Zervas