Crate llmix_rs

§llmix-rs

llmix-rs is the Rust binding for the current LLMix orchestration contract.

The crate is usable today but should still be treated as beta. Its public surface is aligned with the current Python and TypeScript bindings, but the Rust binding has seen less downstream production adoption so far.

§What the crate covers

Async call orchestration through a user-supplied dispatch callback
Two-tier response cache with shared cache-key fixture parity
Circuit breaker, kill switch, retry, singleflight, key-pool rotation, and AIMD semaphore
Provider kwargs normalization for OpenAI, Anthropic, Gemini, OpenRouter, and sno-gpu
Thinking-token stripping
Direct .mda loading and runtime primitives for preset configs
Low-level MDA Source Mode config helpers with load_config and load_config_preset
Optional provider modules for OpenAI-compatible, Anthropic, Gemini, and sno-gpu HTTP dispatch
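As one illustration of the primitives listed above, the following is a minimal sketch of the additive-increase/multiplicative-decrease (AIMD) idea behind an adaptive concurrency limit. It is not the crate's AdaptiveSemaphore: the AimdLimit type, its method names, and the step and factor values are hypothetical and chosen only for illustration.

```rust
/// Hypothetical AIMD limit tracker (illustration only, not llmix_rs API).
#[derive(Debug, Clone, PartialEq)]
pub struct AimdLimit {
    current: f64,
    min: f64,
    max: f64,
    increase_step: f64,   // added after each success (assumed value: 1.0)
    decrease_factor: f64, // multiplied in after a rate limit (assumed value: 0.5)
}

impl AimdLimit {
    /// Starts at `initial`, clamped into `[min, max]`. Requires `min <= max`.
    pub fn new(initial: f64, min: f64, max: f64) -> Self {
        Self {
            current: initial.clamp(min, max),
            min,
            max,
            increase_step: 1.0,
            decrease_factor: 0.5,
        }
    }

    /// Additive increase: grow the limit by a fixed step, capped at `max`.
    pub fn on_success(&mut self) {
        self.current = (self.current + self.increase_step).min(self.max);
    }

    /// Multiplicative decrease: shrink the limit by a factor, floored at `min`.
    pub fn on_rate_limited(&mut self) {
        self.current = (self.current * self.decrease_factor).max(self.min);
    }

    /// Number of whole permits currently allowed.
    pub fn permits(&self) -> usize {
        self.current.floor() as usize
    }
}
```

The asymmetry is the point of the design: the limit backs off quickly when a provider signals rate limiting and recovers slowly as calls succeed.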

§Feature flags

Default features: core crate only
redis: enable Redis-backed L2 response cache
providers-openai: enable the OpenAI chat provider adapter
providers-sno-gpu: enable the sno-gpu provider adapter
providers-anthropic: enable the Anthropic chat provider adapter
providers-gemini: enable the Gemini chat provider adapter
helpers-*: legacy aliases that forward to the matching providers-* feature

§Minimal example

use llmix_rs::{
    CallInput, CallPipeline, DispatchContext, KeyPool, LlmUsage, PipelineConfig, ProviderResult,
};
use serde_json::json;

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let pipeline = CallPipeline::new(PipelineConfig::new(|ctx: DispatchContext| async move {
        let prompt = ctx
            .messages
            .last()
            .and_then(|message| message.get("content"))
            .and_then(|value| value.as_str())
            .unwrap_or("hello");

        Ok(ProviderResult {
            content: format!("echo: {prompt}"),
            model: ctx.model,
            usage: LlmUsage {
                input_tokens: 1,
                output_tokens: 2,
                total_tokens: 3,
            },
            headers: None,
            tool_calls: None,
        })
    }))?;
    pipeline.set_key_pool("openai", KeyPool::new(vec!["demo-key".to_owned()])?);

    let response = pipeline
        .call(CallInput {
            config: json!({
                "provider": "openai",
                "model": "gpt-4o-mini"
            }),
            messages: vec![json!({
                "role": "user",
                "content": "hello"
            })],
            singleflight_key: None,
        })
        .await;

    assert!(response.success);
    println!("{}", response.content);

    pipeline.close().await;
    Ok(())
}

The same callback shape works when you replace the inline closure with your own async function or an adapter around a provider-specific client.
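The pattern of swapping a closure for a named async function can be sketched without the crate at all. The block below uses only the standard library; Ctx, Reply, Pipeline, echo_dispatch, and block_on are hypothetical stand-ins, not llmix_rs types, and the tiny busy-polling executor exists only so the sketch runs without an async runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-ins for a dispatch context and a provider result.
pub struct Ctx {
    pub model: String,
    pub prompt: String,
}

pub struct Reply {
    pub model: String,
    pub content: String,
}

/// Hypothetical pipeline that is generic over any async dispatch callback.
pub struct Pipeline<F> {
    dispatch: F,
}

impl<F, Fut> Pipeline<F>
where
    F: Fn(Ctx) -> Fut,
    Fut: Future<Output = Result<Reply, String>>,
{
    pub fn new(dispatch: F) -> Self {
        Self { dispatch }
    }

    pub async fn call(&self, ctx: Ctx) -> Result<Reply, String> {
        (self.dispatch)(ctx).await
    }
}

/// A named async function with the same shape as an inline closure.
pub async fn echo_dispatch(ctx: Ctx) -> Result<Reply, String> {
    Ok(Reply {
        content: format!("echo: {}", ctx.prompt),
        model: ctx.model,
    })
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor; only suitable for futures that never wait on I/O.
pub fn block_on<T>(fut: impl Future<Output = T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

With these stand-ins, Pipeline::new(echo_dispatch) and Pipeline::new(|ctx| async move { ... }) are interchangeable, because both satisfy the same Fn(Ctx) -> Future bound.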

§Config Registry

The public secure registry flow is defined by the root LLMix docs:

config/llm/
  source/
    <module>/
      <preset>.mda
  current.json
  compiled/

Use the MDA CLI for validation, integrity, signing, verification, and release gates. Use the official LLMix publisher against config/llm to generate current.json and compiled/. Do not build a Rust-local compiler, publisher, or custom directory layout.

load_config and load_config_preset remain available as low-level helpers for editing tools, tests, and migration work. They now hard-require .mda files and reject legacy .yaml / .yml preset paths.
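To make the layout and the extension rule concrete, here is a minimal standard-library sketch. It assumes the source/<module>/<preset>.mda layout shown above and an exact, case-sensitive extension match; preset_source_path, check_preset_extension, and PresetPathError are hypothetical names and do not reflect how load_config is implemented.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch (not an llmix_rs error).
#[derive(Debug, PartialEq, Eq)]
pub enum PresetPathError {
    /// Path uses a legacy `.yaml` or `.yml` extension.
    LegacyYaml,
    /// Path has some other extension, or none at all.
    NotMda,
}

/// Builds `<root>/source/<module>/<preset>.mda`, following the documented layout.
pub fn preset_source_path(root: &Path, module: &str, preset: &str) -> PathBuf {
    root.join("source")
        .join(module)
        .join(format!("{preset}.mda"))
}

/// Accepts only `.mda` paths and reports legacy YAML paths separately.
pub fn check_preset_extension(path: &Path) -> Result<(), PresetPathError> {
    match path.extension().and_then(|ext| ext.to_str()) {
        Some("mda") => Ok(()),
        Some("yaml") | Some("yml") => Err(PresetPathError::LegacyYaml),
        _ => Err(PresetPathError::NotMda),
    }
}
```

Reporting legacy YAML paths as their own error variant lets a migration tool print a targeted message instead of a generic "unsupported file" error.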

§Optional provider modules

The core crate does not require provider SDKs. If you want a batteries-included HTTP adapter, enable one of the providers-* features and wire the corresponding helper into PipelineConfig::new(...).

These provider adapters are deliberately optional because the LLMix contract is the orchestration layer, not a mandatory SDK abstraction.

§Shared parity tests

The Rust crate consumes the same fixtures used by the rest of the repo under fixtures/.

cargo test --all-features runs the local parity and unit suites
Live provider tests are opt-in and run only when LLMIX_RUN_LIVE_TESTS=1 is set
Provider-specific live tests also require the matching provider credentials in the environment
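The opt-in rule above can be sketched as a small gate that a live test checks before doing any network work. This is an assumption about how such a gate might look, not the crate's test harness: the function names are hypothetical, the flag is assumed to match the exact string 1, and the credential variable name is passed in by the caller because the source does not name it.

```rust
use std::env;

/// True only when the flag value is exactly "1" (assumed matching rule).
pub fn live_tests_enabled(flag: Option<&str>) -> bool {
    flag == Some("1")
}

/// A live test runs only if the flag is set and a non-empty credential is present.
pub fn should_run_live_test(flag: Option<&str>, credential: Option<&str>) -> bool {
    live_tests_enabled(flag) && credential.map_or(false, |value| !value.is_empty())
}

/// Reads LLMIX_RUN_LIVE_TESTS and a caller-chosen credential variable from the environment.
pub fn should_run_from_env(credential_var: &str) -> bool {
    let flag = env::var("LLMIX_RUN_LIVE_TESTS").ok();
    let credential = env::var(credential_var).ok();
    should_run_live_test(flag.as_deref(), credential.as_deref())
}
```

Keeping the decision in a pure function that takes Option<&str> values makes the gate itself testable without mutating the process environment.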

§Monorepo location

The crate lives at packages/llmix/rust/ in the main LLMix monorepo alongside the Python and TypeScript bindings.

Re-exports§

pub use adaptive_semaphore::parse_openai_ratelimit_headers;
pub use adaptive_semaphore::AdaptiveSemaphore;
pub use adaptive_semaphore::RateLimitHeaders;
pub use config::load_config;
pub use config::load_config_preset;
pub use config::load_config_preset_with_options;
pub use config::load_config_with_options;
pub use config::resolve_config_dir;
pub use config::validate_module;
pub use config::validate_preset;
pub use config::validate_version;
pub use config::ConfigDirSource;
pub use config::LlmixPathConfig;
pub use config::MdaConfigLoadOptions;
pub use config::ResolvedConfigDir;
pub use config_registry::load_llmix_trust_manifest;
pub use config_registry::registry_root_options_from_trust_manifest;
pub use config_registry::registry_root_options_from_trust_manifest_with_hooks;
pub use config_registry::ConfigRegistryManager;
pub use config_registry::ConfigRegistryOpenOptions;
pub use config_registry::ConfigRegistryPublishOptions;
pub use config_registry::ConfigRegistryPublisher;
pub use config_registry::LlmixTrustManifest;
pub use config_registry::LlmixTrustManifestRegistryRoot;
pub use config_registry::LlmixTrustManifestReleasePlan;
pub use config_registry::PublishedRevision;
pub use config_registry::RegistryRootCurrentBinding;
pub use config_registry::RegistryRootEnvelope;
pub use config_registry::RegistryRootFileDigest;
pub use config_registry::RegistryRootHighWatermark;
pub use config_registry::RegistryRootManifestBinding;
pub use config_registry::RegistryRootPayload;
pub use config_registry::RegistryRootSignature;
pub use config_registry::RegistryRootSigner;
pub use config_registry::RegistryRootSigningInput;
pub use config_registry::RegistryRootSigningOptions;
pub use config_registry::RegistryRootVerificationOptions;
pub use config_registry::LLMIX_TRUST_MANIFEST_KIND;
pub use config_registry::LLMIX_TRUST_MANIFEST_VERSION;
pub use dispatch::DispatchFn;
pub use error::AdaptiveSemaphoreClosedError;
pub use error::CircuitOpenError;
pub use error::ConfigAccessError;
pub use error::ConfigNotFoundError;
pub use error::InvalidConfigError;
pub use error::KeyPoolExhaustedError;
pub use error::KillSwitchActiveError;
pub use error::LlmixError;
pub use error::LlmixResult;
pub use error::ProviderError;
pub use error::SecurityError;
pub use key_pool::load_keys_from_env;
pub use key_pool::KeyPool;
pub use pipeline::CallPipeline;
pub use pipeline::PipelineConfig;
pub use provider_kwargs::apply_transform_kwargs;
pub use provider_kwargs::gemini_transform_kwargs;
pub use provider_kwargs::is_reasoning_model;
pub use provider_kwargs::openai_transform_kwargs;
pub use provider_kwargs::openrouter_transform_kwargs;
pub use provider_kwargs::provider_kwargs_callback;
pub use provider_kwargs::sno_gpu_transform_kwargs;
pub use provider_kwargs::TransformKwargsCallback;
pub use provider_kwargs::TransformKwargsContext;
pub use provider_kwargs::PROVIDER_KWARGS_REGISTRY;
pub use providers::anthropic::AnthropicChatHelper; // requires feature providers-anthropic
pub use providers::gemini::GeminiChatHelper; // requires feature providers-gemini
pub use providers::openai::OpenAiChatHelper; // requires feature providers-openai
pub use providers::sno_gpu::SnoGpuChatHelper; // requires feature providers-sno-gpu
pub use resilience::calculate_delay;
pub use resilience::is_retryable;
pub use resilience::parse_retry_after;
pub use resilience::resolve_state_dir;
pub use resilience::CircuitBreaker;
pub use resilience::CircuitState;
pub use resilience::FileLock;
pub use resilience::KillSwitch;
pub use resilience::RetryPolicy;
pub use resilience::RetryPolicyOptions;
pub use resilience::SharedCallResult;
pub use resilience::Singleflight;
pub use response_cache::generate_cache_key;
pub use response_cache::is_response_cache_strategy;
pub use response_cache::resolve_response_cache_strategy;
pub use response_cache::should_skip_cache;
pub use response_cache::CacheKeyParams;
pub use response_cache::CacheResult;
pub use response_cache::TwoTierCache;
pub use response_cache::TwoTierCacheConfig;
pub use response_cache::CACHE_KEY_PREFIX;
pub use thinking::strip_thinking;
pub use thinking::StripThinkingResult;
pub use types::CacheHitTier;
pub use types::CachingStrategy;
pub use types::CallInput;
pub use types::CallResponse;
pub use types::DispatchContext;
pub use types::LlmUsage;
pub use types::ProviderResult;
pub use types::ResponseCacheStats;
pub use types::ResponseCacheStrategy;

Modules§

adaptive_semaphore
canonical_json
config
config_registry
dispatch
error
key_pool
pipeline
provider_kwargs
providers (available with feature providers-anthropic, providers-gemini, providers-openai, or providers-sno-gpu)
resilience
response_cache
thinking
types

Structs§

DidWebVerificationInput
MdaConfigError
RekorPolicy
TrustPolicy

Enums§

TrustedSigner

Traits§

DidWebVerifier
RekorClient
SigstoreVerifier

Type Aliases§

MdaConfigResult
