InfoTheory
1. Unified Information Estimation
Estimate core measures using both Marginal (distribution-based) and Rate (predictive-based) approaches:
- NCD (Normalized Compression Distance): Approximates information distance using compression.
- MI (Mutual Information): Quantifies shared information between sequences.
- NED (Normalized Entropy Distance): A metric distance based on mutual information.
- NTE (Normalized Transform Effort): Variation of Information (VI).
- Intrinsic Dependence: Redundancy Ratio.
- Resistance: Information preservation under noise/transform.
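To make the marginal (distribution-based) side of these measures concrete, here is a minimal sketch using plug-in byte frequencies and the textbook definitions of MI, VI, and NCD. The function names are illustrative assumptions and are not the crate's API; the crate's NED and NTE normalizations are not specified here, so they are not reproduced.

```rust
use std::collections::HashMap;

/// Shannon entropy in bits of an empirical distribution given as counts.
fn entropy_from_counts(counts: impl Iterator<Item = usize>, total: usize) -> f64 {
    let n = total as f64;
    counts
        .filter(|&c| c > 0)
        .map(|c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

/// Marginal (plug-in) entropy of a byte sequence, in bits per symbol.
fn marginal_entropy(x: &[u8]) -> f64 {
    let mut counts = [0usize; 256];
    for &b in x {
        counts[b as usize] += 1;
    }
    entropy_from_counts(counts.iter().copied(), x.len())
}

/// Joint entropy of position-aligned pairs (x[i], y[i]).
fn joint_entropy(x: &[u8], y: &[u8]) -> f64 {
    let n = x.len().min(y.len());
    let mut counts: HashMap<(u8, u8), usize> = HashMap::new();
    for i in 0..n {
        *counts.entry((x[i], y[i])).or_insert(0) += 1;
    }
    entropy_from_counts(counts.values().copied(), n)
}

/// I(X;Y) = H(X) + H(Y) - H(X,Y), over the common prefix length.
fn mutual_information(x: &[u8], y: &[u8]) -> f64 {
    let n = x.len().min(y.len());
    marginal_entropy(&x[..n]) + marginal_entropy(&y[..n]) - joint_entropy(x, y)
}

/// VI(X,Y) = 2 H(X,Y) - H(X) - H(Y), over the common prefix length.
fn variation_of_information(x: &[u8], y: &[u8]) -> f64 {
    let n = x.len().min(y.len());
    2.0 * joint_entropy(x, y) - marginal_entropy(&x[..n]) - marginal_entropy(&y[..n])
}

/// NCD(x,y) = (C(xy) - min(C(x),C(y))) / max(C(x),C(y)),
/// where `compressed_len` is any caller-supplied compressor length function.
/// The result is undefined (NaN or infinite) if both inputs compress to length 0.
fn ncd<C: Fn(&[u8]) -> usize>(x: &[u8], y: &[u8], compressed_len: C) -> f64 {
    let cx = compressed_len(x) as f64;
    let cy = compressed_len(y) as f64;
    let mut xy = x.to_vec();
    xy.extend_from_slice(y);
    let cxy = compressed_len(&xy) as f64;
    (cxy - cx.min(cy)) / cx.max(cy)
}
```

Plug-in estimates like these ignore sequential structure; the rate (predictive) variants replace the frequency tables with code lengths from a sequence model.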
2. Multi-Backend Predictive Engine
Switch between different modeling paradigms seamlessly:
- ROSA+ (Rapid Online Suffix Automaton + Witten-Bell): A fast statistical language model. Default backend.
- CTW (Context Tree Weighting): Historically standard for AIXI. Accurate bit-level Bayesian model (KT-estimator).
- RWKV (Neural Network): Highly optimized x86_64 RWKV7 CPU inference backend.
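As an illustration of the KT estimator mentioned for the CTW backend, the sketch below implements a single Krichevsky-Trofimov node and uses it to compute a sequential code length, which is the kind of quantity a rate (predictive) estimate is built from. It is a zero-order toy under assumed names (`KtEstimator`, `kt_code_length`), not the crate's CTW implementation, which mixes many such nodes over a context tree.

```rust
/// Krichevsky-Trofimov estimator for a binary source.
#[derive(Default, Clone, Copy)]
struct KtEstimator {
    zeros: u32,
    ones: u32,
}

impl KtEstimator {
    /// P(next bit = 1) = (ones + 1/2) / (zeros + ones + 1).
    fn prob_one(&self) -> f64 {
        (self.ones as f64 + 0.5) / ((self.zeros + self.ones) as f64 + 1.0)
    }

    /// Record an observed bit.
    fn update(&mut self, bit: bool) {
        if bit {
            self.ones += 1;
        } else {
            self.zeros += 1;
        }
    }
}

/// Total code length in bits when each bit is predicted before it is seen.
/// Dividing by `bits.len()` gives a bits-per-symbol rate estimate.
fn kt_code_length(bits: &[bool]) -> f64 {
    let mut kt = KtEstimator::default();
    let mut total = 0.0;
    for &b in bits {
        let p1 = kt.prob_one();
        let p = if b { p1 } else { 1.0 - p1 };
        total += -p.log2();
        kt.update(b);
    }
    total
}
```

For example, the sequence `[true, false]` is assigned probability 1/2 × 1/4 = 1/8, so its code length is 3 bits.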
3. Integrated MC-AIXI Agent
Includes a full implementation of the Monte Carlo AIXI (MC-AIXI) agent described by Hutter et al. It approximates the incomputable AIXI agent using Monte Carlo Tree Search. The agent is backend-agnostic: it can use any of the available predictive backends (ROSA, CTW, or RWKV) for universal reinforcement learning.
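The tree-search step in planners of this kind typically balances exploitation and exploration with an upper-confidence rule. The sketch below shows plain UCB1 action selection as a simplified stand-in; the type and function names are hypothetical, and the crate's actual search (including any reward normalization) may differ.

```rust
/// Running statistics for one action at a search node (hypothetical type).
struct ActionStats {
    visits: u32,
    mean_reward: f64,
}

/// Pick an action index by UCB1: mean reward plus an exploration bonus
/// of `exploration * sqrt(ln(total_visits) / visits)`.
/// Untried actions are always selected first. Returns None if there are no actions.
fn select_ucb1(stats: &[ActionStats], exploration: f64) -> Option<usize> {
    if stats.is_empty() {
        return None;
    }
    if let Some(i) = stats.iter().position(|s| s.visits == 0) {
        return Some(i);
    }
    let total: u32 = stats.iter().map(|s| s.visits).sum();
    let ln_total = (total as f64).ln();
    let mut best = 0;
    let mut best_score = f64::NEG_INFINITY;
    for (i, s) in stats.iter().enumerate() {
        let bonus = exploration * (ln_total / s.visits as f64).sqrt();
        let score = s.mean_reward + bonus;
        if score > best_score {
            best_score = score;
            best = i;
        }
    }
    Some(best)
}
```

With `exploration` set to 0 the rule is purely greedy; larger values favor rarely visited actions.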
You can use a trained RWKV7 model as a rate backend ("world model") for MC-AIXI. RWKV inference is SIMD-optimized for x86_64. On non-x86_64 systems, or on very old x86_64 CPUs without AVX2/FMA, performance may be significantly lower and support may be limited. Apple Silicon should be an exception, since it can run the x86_64 build through Rosetta 2.
Compilation & Installation
Platform Support (tested)
infotheory is currently tested on x86_64 for:
- Linux (GNU libc) (x86_64-unknown-linux-gnu)
- Linux (musl) (x86_64-unknown-linux-musl)
- macOS (Intel) (x86_64-apple-darwin)
- FreeBSD (x86_64-unknown-freebsd)
- OpenBSD (x86_64-unknown-openbsd)
- NetBSD (x86_64-unknown-netbsd)
Apple Silicon (AArch64) Macs can run this program using Rosetta 2.
Build Prerequisites
- Rust toolchain (stable): rustup recommended.
- C/C++ toolchain: clang + lld recommended on Unix-like systems.
- For local repository builds with VM support available: clone recursively (--recurse-submodules) so nyx-lite is present.
Build the CLI
Enable the cli feature (the binary is feature-gated):
Output binary:
- ./target/release/infotheory (host target)
- ./target/<target-triple>/release/infotheory (cross target)
Build as a library
Add the dependency in your Cargo.toml:
[dependencies]
infotheory = { path = "." } # Or a git source; the usual Cargo options apply.
Building nyx-lite
The VM backend is optional (--features vm) and depends on nyx-lite (and its vendored submodule code). Build it with:
Notes:
- VM is Linux/KVM-oriented (/dev/kvm required).
- Some nyx-lite tests also require VM image artifacts under nyx-lite/vm_image.
Additional notes
Platform caveats:
- OpenBSD/NetBSD: kernel W^X policies can break ZPAQ JIT at runtime. Set CARGO_FEATURE_NOJIT=true.
- NetBSD: release LTO is problematic in common toolchains; disable release LTO if needed (see .cargo/config.toml comments).
- macOS: fully supported, natively on Intel and through Rosetta 2 on Apple Silicon.
Optional tooling used by some tests/workflows:
- docker (for tests, or if you want to use it for rootfs generation)
- cpio
- wget (for tests, or to download the provided kernel; alternatively, edit download_kernel.sh by hand to use curl)
- cmake (for VM feature, firecracker needs it)
- Lean4 (Toolchain Version 4.14.0)
CLI Usage
The infotheory binary provides a powerful interface for file analysis.
Primitives
# Calculate Mutual Information (ROSA backend, order 8)
# Use CTW backend for NTE (Normalized Transform Effort)
# Calculate NCD with custom ZPAQ method
AIXI Agent Mode
# Run the AIXI agent using config-specified backend
AIXI Agent Mode (VM via Nyx-Lite)
# VM-backed environment using high-performance Firecracker (Nyx-Lite)
VM config highlights:
- Environment: Use "environment": "nyx-vm" or "vm" (requires vm feature).
- Core Config:
  - vm_config.kernel_image_path: Path to vmlinux kernel.
  - vm_config.rootfs_image_path: Path to rootfs.ext4.
  - vm_config.instance_id: Unique ID for the VM instance.
- Performance:
  - vm_config.shared_memory_policy: Use "snapshot" for fast resets (fork-server style).
  - vm_config.observation_policy: "shared_memory" for zero-copy observations.
- Rewards & Observations:
  - vm_reward.mode: "guest" (guest writes to specific address), "pattern", or "trace-entropy".
  - vm_observation.mode: "raw" (bytes) or hash-based.
  - observation_stream_len: Critical for planning consistency. Must match guest output.
Prerequisites:
- Linux with KVM enabled (/dev/kvm accessible).
- vmlinux kernel and rootfs.ext4 image valid for Firecracker.
- nyx-lite crate (included in workspace).
Setup:
- Ensure you have the vmlinux-6.1.58 kernel in the project root (or update config).
- Ensure nyx-lite/vm_image/dockerimage/rootfs.ext4 exists or provide your own.
- Enable the feature: cargo build --release --features vm.
Library Usage
use infotheory::*;
// Entropy rate of a sequence (uses ROSA by default)
let h = entropy_rate_bytes;
// Switch the entire thread to use CTW for all subsequent calls
set_default_ctx;
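A per-thread default of this kind can be built on thread-local storage. The following is a minimal sketch of that pattern only; `Backend`, `set_default_backend`, and `default_backend` are hypothetical names chosen for illustration and do not describe the crate's actual types or the signature of `set_default_ctx`.

```rust
use std::cell::Cell;

/// Hypothetical backend selector; variants mirror the backends named in this README.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Backend {
    Rosa,
    Ctw,
    Rwkv,
}

thread_local! {
    // Each thread starts with ROSA as its default, matching the documented default backend.
    static DEFAULT_BACKEND: Cell<Backend> = Cell::new(Backend::Rosa);
}

/// Change the default backend for the calling thread only.
fn set_default_backend(backend: Backend) {
    DEFAULT_BACKEND.with(|cell| cell.set(backend));
}

/// Read the calling thread's current default backend.
fn default_backend() -> Backend {
    DEFAULT_BACKEND.with(|cell| cell.get())
}
```

Because the setting is thread-local, changing it on one thread leaves other threads on their own defaults, which avoids locking on every call.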
Supported Primitives
| Command | Description | Domain |
|---|---|---|
| ncd | Normalized Compression Distance | Compression |
| ned | Normalized Entropy Distance | Shannon |
| nte | Variation of Information | Shannon |
| mi | Mutual Information | Shannon |
| id | Internal Redundancy | Algorithmic |
| rt | Resistance to Transform | Algorithmic |
| and more! | | |
📄 License
Apache License, Version 2.0.