Mokosh
A high-performance Rust implementation of Hierarchical Temporal Memory (HTM) algorithms, ported from the htm.core C++ library.
Overview
Mokosh provides a complete implementation of the core HTM algorithms for sequence learning, anomaly detection, and pattern recognition. HTM is a machine learning technology that aims to capture the structural and algorithmic properties of the neocortex.
Key Features
- Sparse Distributed Representations (SDR) - The fundamental data structure for HTM
- Spatial Pooler - Creates sparse, distributed representations of input patterns
- Temporal Memory - Learns sequences and makes predictions
- Anomaly Detection - Identifies unusual patterns in data streams
- 39 Encoders - Convert raw data into SDR format (scalar, categorical, temporal, audio, vision, network, biometric, financial, probabilistic) - see ENCODERS.md
- SIMD Optimizations - AVX2-accelerated operations for boost factors, duty cycles, and permanence updates
- Serialization - Save and load models in binary or JSON format
- Zero unsafe code in core algorithms - Safe Rust implementation
- Well tested - 450+ unit tests with comprehensive coverage including property-based testing
Installation
Add mokosh to your Cargo.toml:
[dependencies]
mokosh = "0.1"
Feature Flags
- std (default) - Standard library support
- serde - Serialization/deserialization support (adds serde, serde_json, bincode)
- rayon - Parallel processing support
- simd - SIMD optimizations (enabled by default on x86_64)
[dependencies]
mokosh = { version = "0.1", features = ["serde"] }
Quick Start
use *;
// Create a Spatial Pooler
let mut sp = new.unwrap;
// Create a Temporal Memory
let mut tm = new.unwrap;
// Encode input data
let encoder = new.unwrap;
// Process a value
let input_sdr = encoder.encode_to_sdr.unwrap;
// Run through Spatial Pooler
let mut active_columns = new;
sp.compute;
// Run through Temporal Memory
tm.compute;
// Get predictions and anomaly score
let predictions = tm.get_predictive_cells;
let anomaly = tm.anomaly;
println!;
Core Components
Sparse Distributed Representations (SDR)
SDRs are the fundamental data structure in HTM - binary vectors where only a small percentage of bits are active.
use Sdr;
// Create a 100-bit SDR
let mut sdr = new;
// Set active bits
sdr.set_sparse.unwrap;
// Access in different formats
let sparse = sdr.get_sparse; // Active indices: [5, 12, 23, 45, 67]
let dense = sdr.get_dense; // Full binary vector
let sum = sdr.get_sum; // Number of active bits: 5
let sparsity = sdr.get_sparsity; // Fraction active: 0.05
// Compute overlap between SDRs (SIMD-accelerated)
let other = new;
let overlap = sdr.get_overlap;
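To make the sparse format and the overlap computation concrete, here is a minimal standalone sketch using only the standard library. It does not use Mokosh's own Sdr type: `SparseSdr` and its methods are hypothetical names introduced for illustration, and the two-pointer merge over sorted indices is an assumption about how such an overlap can be computed, not a description of the library's internals.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for an SDR (not Mokosh's `Sdr`):
/// the total number of bits plus the sorted indices of the active ones.
struct SparseSdr {
    size: usize,
    active: Vec<usize>, // sorted and deduplicated
}

impl SparseSdr {
    /// Builds an SDR, returning `None` if any index is out of range.
    fn new(size: usize, mut active: Vec<usize>) -> Option<Self> {
        active.sort_unstable();
        active.dedup();
        if active.last().map_or(false, |&i| i >= size) {
            return None;
        }
        Some(Self { size, active })
    }

    /// Fraction of bits that are active.
    fn sparsity(&self) -> f64 {
        if self.size == 0 {
            return 0.0;
        }
        self.active.len() as f64 / self.size as f64
    }

    /// Number of indices active in both SDRs.
    /// Walks both sorted lists once, so the cost is O(k1 + k2).
    fn overlap(&self, other: &Self) -> usize {
        let (mut i, mut j, mut count) = (0, 0, 0);
        while i < self.active.len() && j < other.active.len() {
            match self.active[i].cmp(&other.active[j]) {
                Ordering::Less => i += 1,
                Ordering::Greater => j += 1,
                Ordering::Equal => {
                    count += 1;
                    i += 1;
                    j += 1;
                }
            }
        }
        count
    }
}
```

Because only the active indices are stored and visited, the work scales with the number of active bits rather than with the full vector size.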
Spatial Pooler
The Spatial Pooler creates sparse, distributed representations that maintain semantic similarity.
use ;
use Sdr;
let mut sp = new.unwrap;
let input = new;
let mut output = new;
// learn=true enables learning
sp.compute;
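As a rough illustration of how a sparse set of columns can be chosen, the sketch below implements a k-winners selection of the kind used for global inhibition. It is a simplified standalone example, not Mokosh's implementation: the function name `select_active_columns`, the precomputed overlap scores, and the tie-breaking rule are all assumptions made for illustration.

```rust
/// Hypothetical helper: given one overlap score per column, returns the
/// indices of the `num_active` columns with the highest scores.
/// Ties are broken in favor of the lower column index (an assumption).
fn select_active_columns(overlaps: &[u32], num_active: usize) -> Vec<usize> {
    let mut order: Vec<usize> = (0..overlaps.len()).collect();
    // Highest overlap first; equal overlaps keep the lower index first.
    order.sort_by(|&a, &b| overlaps[b].cmp(&overlaps[a]).then(a.cmp(&b)));
    order.truncate(num_active);
    // Return the winners as sorted indices, the usual sparse SDR layout.
    order.sort_unstable();
    order
}
```

Keeping `num_active` fixed is what holds the output sparsity constant regardless of how dense the input is.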
Temporal Memory
The Temporal Memory learns sequences and makes predictions about future inputs.
use ;
use Sdr;
let mut tm = new.unwrap;
let active_columns = new;
// Process input (learn=true)
tm.compute;
// Get results
let active_cells = tm.get_active_cells;
let predictive_cells = tm.get_predictive_cells;
let winner_cells = tm.get_winner_cells;
let anomaly_score = tm.anomaly;
// Reset for new sequence
tm.reset;
Encoders
Mokosh includes 39 encoders across multiple domains. See ENCODERS.md for comprehensive documentation.
Scalar Encoders
use ;
let encoder = new.unwrap;
let sdr = encoder.encode_to_sdr.unwrap;
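For readers new to scalar encoding, the following standalone sketch shows the general idea: a value is mapped to a contiguous run of active bits whose position tracks the value, so nearby values share bits. `SimpleScalarEncoder` and its parameters are hypothetical and are not Mokosh's ScalarEncoder API; clipping out-of-range input is a choice made for this example.

```rust
/// Hypothetical minimal scalar encoder (not Mokosh's `ScalarEncoder`).
struct SimpleScalarEncoder {
    min: f64,
    max: f64,
    size: usize,        // total number of bits in the output
    active_bits: usize, // number of contiguous active bits
}

impl SimpleScalarEncoder {
    /// Returns `None` for an empty range or an impossible bit budget.
    fn new(min: f64, max: f64, size: usize, active_bits: usize) -> Option<Self> {
        if !(min < max) || active_bits == 0 || active_bits > size {
            return None;
        }
        Some(Self { min, max, size, active_bits })
    }

    /// Returns the sorted indices of the active bits for `value`.
    /// Values outside [min, max] are clipped to the nearest bound.
    fn encode(&self, value: f64) -> Vec<usize> {
        let clipped = value.clamp(self.min, self.max);
        let fraction = (clipped - self.min) / (self.max - self.min);
        // The run can start anywhere from bit 0 to bit (size - active_bits).
        let start = (fraction * (self.size - self.active_bits) as f64).round() as usize;
        (start..start + self.active_bits).collect()
    }
}
```

With this layout, two values that are close together produce runs that overlap heavily, while values far apart share no bits at all.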
Random Distributed Scalar Encoder (RDSE)
use ;
let encoder = new.unwrap;
Date Encoder
use ;
let encoder = new.unwrap;
Additional Encoder Categories
| Category | Encoders |
|---|---|
| Categorical | CategoryEncoder, BooleanEncoder, OrdinalEncoder, HierarchicalCategoryEncoder, SetEncoder, DeltaEncoder |
| Temporal | DateEncoder, CoordinateEncoder, GeospatialEncoder, GridCellEncoder |
| Text/NLP | SimHashDocumentEncoder, WordEmbeddingEncoder, LlmEmbeddingEncoder, CharacterEncoder, NGramEncoder |
| Audio | SpectrogramEncoder, WaveformEncoder, PitchEncoder |
| Vision | PatchEncoder, ColorEncoder, EdgeOrientationEncoder |
| Network | IpAddressEncoder, MacAddressEncoder, GraphNodeEncoder |
| Biometric | HrvEncoder, EcgEncoder, AccelerometerEncoder |
| Financial | PriceEncoder, CurrencyPairEncoder, OrderBookEncoder |
| Probabilistic | DistributionEncoder, ConfidenceIntervalEncoder |
| Composite | MultiEncoder, VecMultiEncoder, PassThroughEncoder |
Anomaly Detection
use ;
use Sdr;
// Raw anomaly score
let anomaly = new;
let active = new;
let predicted = new;
let score = anomaly.compute;
// Anomaly likelihood (statistical)
let mut likelihood = new;
let prob = likelihood.anomaly_probability;
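The raw anomaly score is commonly defined in HTM as the fraction of currently active columns that were not predicted at the previous step. The sketch below computes that quantity on plain index slices. The function name `raw_anomaly_score` is hypothetical, the inputs are assumed to contain distinct indices, and returning 0.0 for an empty active set is a convention chosen for this example rather than a statement about Mokosh's behavior.

```rust
use std::collections::HashSet;

/// Hypothetical helper: fraction of active columns that were not predicted.
/// 0.0 means every active column was predicted; 1.0 means none were.
fn raw_anomaly_score(active: &[usize], predicted: &[usize]) -> f64 {
    if active.is_empty() {
        // Nothing is active, so there is nothing to be surprised by.
        return 0.0;
    }
    let predicted_set: HashSet<usize> = predicted.iter().copied().collect();
    let hits = active
        .iter()
        .filter(|column| predicted_set.contains(column))
        .count();
    1.0 - hits as f64 / active.len() as f64
}
```

An anomaly likelihood stage typically sits on top of this raw score and models its recent distribution, which makes the output less sensitive to noisy individual steps.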
SDR Classifier
use ;
use Sdr;
let mut classifier = new;
let pattern = new;
classifier.learn;
let probabilities = classifier.infer;
Serialization
use *;
use ;
// Save to file
sp.save_to_file?;
sp.save_to_file?;
// Load from file
let sp2 = load_from_file?;
Performance
Mokosh includes SIMD optimizations for critical hot paths:
SIMD-Accelerated Operations
| Operation | Speedup | Description |
|---|---|---|
| Boost factors | ~30% faster | Fast exp() approximation with AVX2 |
| Duty cycle updates | ~5% faster | Vectorized exponential moving average |
| Permanence updates | ~2-4x faster | Batch clamping with SIMD min/max |
| SDR overlap | Optimized | Cache-efficient two-pointer merge |
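For orientation, here are scalar reference versions of the first three operations in the table, written as plain loops that a SIMD path would vectorize. The formulas follow the conventional HTM definitions (moving-average duty cycles, exponential boosting, permanences clamped to [0, 1]). The function names and signatures are hypothetical, and this is a sketch under those assumptions rather than Mokosh's actual code.

```rust
/// Moving-average update: each duty cycle drifts toward 1.0 when its
/// column is active and toward 0.0 otherwise, over roughly `period` steps.
fn update_duty_cycles(duty: &mut [f32], active: &[bool], period: f32) {
    for (d, &is_active) in duty.iter_mut().zip(active) {
        let sample = if is_active { 1.0 } else { 0.0 };
        *d = (*d * (period - 1.0) + sample) / period;
    }
}

/// Exponential boosting: columns active less often than `target_density`
/// get a factor above 1.0, overactive columns get a factor below 1.0.
fn boost_factors(duty: &[f32], target_density: f32, strength: f32) -> Vec<f32> {
    duty.iter()
        .map(|&d| ((target_density - d) * strength).exp())
        .collect()
}

/// Adds `delta` to every permanence and clamps the result to [0.0, 1.0].
fn adjust_permanences(perms: &mut [f32], delta: f32) {
    for p in perms.iter_mut() {
        *p = (*p + delta).clamp(0.0, 1.0);
    }
}
```

Each loop touches its elements independently, which is what makes these operations natural candidates for vectorization.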
Benchmarking
# Run criterion benchmarks
# Run quick performance tests
# Save baseline and compare
# ... make changes ...
Design Optimizations
- Efficient SDR operations - O(k) for sparse operations where k is active bits
- Cache-friendly - Data structures optimized for CPU cache utilization
- SmallVec - Stack allocation for small segment/synapse collections
- AHash - Fast hash map implementation for connection lookups
- Zero-copy where possible - Minimizes memory allocations
- Optional parallelism - Enable the rayon feature for parallel processing
Architecture
mokosh/
├── src/
│ ├── lib.rs # Library root, prelude, error types
│ ├── types/
│ │ ├── mod.rs
│ │ ├── primitives.rs # Real, UInt, Permanence, etc.
│ │ └── sdr.rs # Sparse Distributed Representation
│ ├── algorithms/
│ │ ├── mod.rs
│ │ ├── connections.rs # Synaptic connections management
│ │ ├── spatial_pooler.rs
│ │ ├── temporal_memory.rs
│ │ ├── anomaly.rs # Anomaly & AnomalyLikelihood
│ │ └── sdr_classifier.rs
│ ├── encoders/ # 39 encoder implementations
│ │ ├── mod.rs
│ │ ├── base.rs # Encoder trait
│ │ ├── scalar.rs, rdse.rs, date.rs, simhash.rs
│ │ └── ... (domain-specific encoders)
│ ├── utils/
│ │ ├── mod.rs
│ │ ├── random.rs # Deterministic RNG
│ │ ├── topology.rs # Spatial topology utilities
│ │ ├── sdr_metrics.rs # SDR analysis metrics
│ │ └── simd.rs # SIMD utilities and optimizations
│ └── serialization.rs # Serde-based serialization
├── benches/
│ └── simd_benchmarks.rs # Criterion benchmarks
├── tests/
│ └── simd_correctness.rs # Property-based correctness tests
└── src/bin/
└── perf_test.rs # Performance testing tool
Testing
# Run all tests (450+)
# Run with serde feature
# Run SIMD correctness tests (property-based)
# Run specific test module
# Run with output
Comparison with htm.core
Mokosh is a faithful port of the core HTM algorithms from htm.core:
| Component | Status | Notes |
|---|---|---|
| SDR | Complete | Full sparse/dense/coordinate support |
| Spatial Pooler | Complete | Global and local inhibition, SIMD boost |
| Temporal Memory | Complete | All learning rules |
| Connections | Complete | Segment and synapse management |
| Anomaly Detection | Complete | Raw and likelihood modes |
| SDR Classifier | Complete | Multi-step prediction |
| Encoders | Extended | 39 encoders (more than htm.core) |
| Serialization | Complete | Binary and JSON formats |
| SIMD | New | AVX2 optimizations not in htm.core |
Not Included
- Network Engine - Multi-region network orchestration
- Region Framework - Plugin-based region implementations
- REST API - HTTP interface
Examples
Anomaly Detection in Time Series
use *;
Sequence Learning
use *;
Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.
License
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0), the same license as htm.core.
Acknowledgments
- Numenta for creating the HTM theory and algorithms
- htm.core contributors for the reference C++ implementation
- The HTM community for ongoing research and development