Module inference

Hydra ML inference for intelligent algorithm routing.

The Hydra SLM is a small language model optimized for M2M protocol tasks.

§Inference Backends

  • Native (safetensors): Pure Rust inference from safetensors weights
  • ONNX Runtime: Optional; requires the onnx feature flag
  • Heuristic fallback: Rule-based fallback used when the model is unavailable
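The relationship between the three backends can be pictured as a fallback chain. The sketch below is illustrative only: the `Backend` enum, the `select_backend` function, and the priority order (native first, then ONNX, then heuristics) are assumptions made for this example and are not taken from the crate's API.

```rust
/// Hypothetical backend identifier (not part of the crate's documented API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    /// Pure Rust inference from safetensors weights.
    Native,
    /// ONNX Runtime, only usable when the `onnx` feature is compiled in.
    Onnx,
    /// Rule-based fallback that needs no model file.
    Heuristic,
}

/// Pick a backend from what is available.
/// The priority order here is an assumption for illustration.
pub fn select_backend(
    has_safetensors: bool,
    onnx_enabled: bool,
    has_onnx_model: bool,
) -> Backend {
    if has_safetensors {
        Backend::Native
    } else if onnx_enabled && has_onnx_model {
        Backend::Onnx
    } else {
        // Nothing loadable: fall back to rules so callers still get an answer.
        Backend::Heuristic
    }
}
```

With no weights on disk and the ONNX feature disabled, `select_backend(false, false, false)` yields `Backend::Heuristic`, which matches the "fallback when the model is unavailable" behaviour described in the list.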

§Tokenizers

Hydra supports multiple tokenizer backends:

  • Llama 3 (128K vocab): Primary tokenizer for the open-source ecosystem
  • o200k_base (200K vocab): OpenAI GPT-4o, o1, o3
  • cl100k_base (100K vocab): OpenAI GPT-3.5, GPT-4
  • Fallback: Byte-level tokenizer used when no other tokenizer is available
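To make the byte-level fallback concrete, here is a minimal sketch of what such a tokenizer could look like: every UTF-8 byte becomes one token id in the range 0..=255. The `ByteTokenizer` type and its methods are hypothetical and written for this example; they are not the signatures of `FallbackTokenizer` or `HydraByteTokenizer`.

```rust
/// Illustrative byte-level tokenizer: one token id per UTF-8 byte.
/// Hypothetical type, not the crate's own implementation.
pub struct ByteTokenizer {
    /// Maximum number of tokens produced by `encode` (truncation limit).
    max_len: usize,
}

impl ByteTokenizer {
    pub fn new(max_len: usize) -> Self {
        Self { max_len }
    }

    /// Map each byte of `text` to its numeric value, truncating at `max_len`.
    pub fn encode(&self, text: &str) -> Vec<u32> {
        text.bytes().take(self.max_len).map(u32::from).collect()
    }

    /// Rebuild text from ids, skipping ids outside the byte range.
    /// Invalid UTF-8 (for example after truncation) is replaced lossily.
    pub fn decode(&self, ids: &[u32]) -> String {
        let bytes: Vec<u8> = ids
            .iter()
            .filter(|&&id| id <= 255)
            .map(|&id| id as u8)
            .collect();
        String::from_utf8_lossy(&bytes).into_owned()
    }
}
```

A byte-level scheme needs no vocabulary file, which is why it works as a last resort, at the cost of producing more tokens per input than a learned vocabulary would.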

§Tasks

  • Compression selection: Predicts the optimal algorithm (None/BPE/Brotli/Zlib)
  • Security detection: Classifies prompt injection and jailbreak attempts
  • Token estimation: Fast approximate token counting
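As an illustration of what "fast approximate token counting" can mean, the sketch below uses a fixed bytes-per-token ratio. Both the `estimate_tokens` function and the ratio of 4 are assumptions chosen for this example (4 bytes per token is a common rule of thumb for English text); the estimator Hydra actually uses is not described here.

```rust
/// Rough token estimate from byte length.
/// The 4-bytes-per-token ratio is an assumed rule of thumb, not a Hydra constant.
pub fn estimate_tokens(text: &str) -> usize {
    const BYTES_PER_TOKEN: usize = 4;
    // Ceiling division so that any non-empty input counts as at least one token.
    (text.len() + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}
```

An estimate like this runs in constant time on a `&str`, so it suits routing decisions where running a full tokenizer would cost more than the decision is worth.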

§Model Architecture

vocab_size: 128000 (Llama 3), hidden_size: 192, num_layers: 4, num_experts: 4
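These parameters largely determine the size of the model on disk. As a back-of-the-envelope check, the sketch below computes the size of a token embedding table, assuming it has shape vocab_size × hidden_size and stores one 4-byte float32 per weight. The `embedding_bytes` helper is hypothetical and exists only for this calculation.

```rust
/// Size in bytes of an embedding table of shape vocab_size x hidden_size.
/// Hypothetical helper for a size estimate; not part of the crate.
pub fn embedding_bytes(vocab_size: u64, hidden_size: u64, bytes_per_weight: u64) -> u64 {
    vocab_size * hidden_size * bytes_per_weight
}

/// Convert a byte count to decimal megabytes (1 MB = 1_000_000 bytes).
pub fn to_megabytes(bytes: u64) -> f64 {
    bytes as f64 / 1_000_000.0
}
```

For vocab_size = 128000 and hidden_size = 192 with float32 weights, `embedding_bytes(128_000, 192, 4)` is 98,304,000 bytes, or about 98.3 MB, so under these assumptions the embedding table accounts for most of the model's footprint.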

The Hydra model is a Mixture of Experts classifier:

  • 4 MoE layers with top-2 expert routing
  • Heterogeneous expert architectures (different depths/widths)
  • ~100MB model size (float32 weights with 128K vocab)
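Top-2 expert routing can be sketched as follows: a gate produces one logit per expert, the two largest are selected, and their weights are renormalized with a softmax over just those two. This is a generic formulation of top-2 gating under that assumption; the `top2_route` function is hypothetical and does not claim to mirror the `bitnet` module's internals.

```rust
/// Select the two highest-scoring experts and return (index, weight) pairs.
/// Weights are a softmax restricted to the two selected logits, so they sum to 1.
/// Returns None when fewer than two experts are available.
pub fn top2_route(gate_logits: &[f32]) -> Option<[(usize, f32); 2]> {
    if gate_logits.len() < 2 {
        return None;
    }

    // Track indices of the best and second-best logits seen so far.
    let (mut first, mut second) = if gate_logits[1] > gate_logits[0] {
        (1, 0)
    } else {
        (0, 1)
    };
    for (i, &logit) in gate_logits.iter().enumerate().skip(2) {
        if logit > gate_logits[first] {
            second = first;
            first = i;
        } else if logit > gate_logits[second] {
            second = i;
        }
    }

    // Two-way softmax, written relative to the larger logit for stability:
    // w_first = 1 / (1 + exp(l_second - l_first)).
    let ratio = (gate_logits[second] - gate_logits[first]).exp();
    let w_first = 1.0 / (1.0 + ratio);
    let w_second = ratio / (1.0 + ratio);

    Some([(first, w_first), (second, w_second)])
}
```

The layer output would then be the weighted sum of the two selected experts' outputs, so with 4 experts per layer only half of them run for any given token.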

§Download

huggingface-cli download infernet/hydra --local-dir ./models/hydra

§Example

use m2m::inference::{HydraModel, CompressionDecision, Llama3Tokenizer};

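// Load the model from the download directory
let model = HydraModel::load("./models/hydra")?;

// `content` is the payload to classify; `Algorithm` must also be in scope.
let decision = model.predict_compression(&content)?;
match decision.algorithm {
    Algorithm::Brotli => { /* use Brotli */ }
    _ => { /* use another algorithm */ }
}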

Re-exports§

pub use bitnet::HydraBitNet;
pub use tokenizer::boxed;
pub use tokenizer::load_tokenizer;
pub use tokenizer::load_tokenizer_by_type;
pub use tokenizer::BoxedTokenizer;
pub use tokenizer::FallbackTokenizer;
pub use tokenizer::HydraByteTokenizer;
pub use tokenizer::HydraTokenizer;
pub use tokenizer::Llama3Tokenizer;
pub use tokenizer::TiktokenTokenizer;
pub use tokenizer::TokenizerType;
pub use tokenizer::MAX_SEQUENCE_LENGTH;

Modules§

bitnet
Native BitNet MoE model implementation for Hydra.
tokenizer
Tokenizer infrastructure for Hydra model.

Structs§

CompressionDecision
Compression decision from the model
HydraModel
Hydra model wrapper
SecurityDecision
Security decision from the model

Enums§

ThreatType
Types of security threats

Constants§

DEFAULT_MODEL_PATH
Default model path (safetensors format)
DEFAULT_TOKENIZER_PATH
Default tokenizer path
MODEL_VERSION
Model version