omni_search

Warning: this library is under active development and is not yet recommended for production use.

omni_search is a Rust SDK for multimodal embedding and similarity search over local ONNX model directories.
Current scope:
- load a flat local model directory with a root-level model_config.json;
- compute text embeddings;
- compute image embeddings;
- compare embeddings with cosine similarity;
- expose runtime snapshots with requested device, planned providers, registered providers, and effective provider;
- manually unload ONNX Runtime sessions.
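The comparison step above is plain cosine similarity over embedding vectors. The helper below is an illustrative, self-contained sketch of that metric, not the crate's actual API:

```rust
/// Cosine similarity between two equal-length embedding vectors.
/// Returns a value in [-1.0, 1.0]; 0.0 when either vector is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

fn main() {
    // Identical directions score 1.0; orthogonal directions score 0.0.
    println!("{}", cosine_similarity(&[1.0, 0.0], &[2.0, 0.0])); // → 1
    println!("{}", cosine_similarity(&[1.0, 0.0], &[0.0, 1.0])); // → 0
}
```

Because the score depends only on direction, it is insensitive to embedding magnitude, which is why it is the usual choice for comparing text and image embeddings from CLIP-style models.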
Supported families:
- chinese_clip
- fg_clip
- open_clip
The published crate does not bundle ONNX models or sample images. Point it at your own local assets with OMNI_BUNDLE_DIR and OMNI_SAMPLES_DIR.
The CLI binaries automatically load a repo-root .env file when present. Existing shell environment variables still win, so .env works as a local default layer.
Quickstart:
- create a .env from .env.example, or edit the existing local .env defaults, when you want to pin a bundle, sample directory, or test fixtures;
- set OMNI_BUNDLE_DIR to a local model directory that contains model_config.json plus flat root-level assets;
- set OMNI_SAMPLES_DIR to a directory containing one or more .jpg, .jpeg, .png, .webp, or .bmp images;
- build with cargo build --features avif when you need .avif inputs; the avif feature is opt-in and enables AVIF decode support;
- on Windows, the avif feature additionally requires a native dav1d library; provide it through pkg-config, or set SYSTEM_DEPS_DAV1D_NO_PKG_CONFIG=1, SYSTEM_DEPS_DAV1D_SEARCH_NATIVE, and SYSTEM_DEPS_DAV1D_LIB;
- see docs/avif-setup.md for platform-specific avif feature setup on Windows, Linux, and macOS;
- build SDK instances with OmniSearch::builder() when you only want to override part of the runtime config and keep the rest at defaults;
- run cargo run --bin omni_search --release to scan all images in OMNI_SAMPLES_DIR with the default query "山" ("mountain");
- run cargo run --bin omni_search --release -- "海边" to scan all images with a custom query ("海边" means "seaside");
- run cargo run --bin omni_search --release -- "海边" 20 to print a different top-k;
- run cargo run --bin omni_search --release -- "海边" "/absolute/path/to/query.jpg" to run image_to_image with a specific query image;
- set OMNI_DEVICE to auto, cpu, or gpu to control execution provider selection; the default is auto;
- set OMNI_PROVIDER_POLICY to auto, interactive, or service when you want to change GPU provider priority without pinning a single provider;
- call sdk.runtime_snapshot() when you need to inspect or display the current execution provider state;
- set OMNI_ORT_DYLIB_PATH when you build with runtime-dynamic and want to point omni_search at a specific onnxruntime .dll/.so/.dylib;
- set OMNI_ORT_PROVIDER_DIR, OMNI_CUDA_BIN_DIR, OMNI_CUDNN_BIN_DIR, and OMNI_TENSORRT_LIB_DIR when you need omni_search to preload or register NVIDIA provider libraries from explicit directories;
- set OMNI_PRELOAD_RUNTIME_LIBRARIES=false when you want to disable eager runtime DLL preloading while still keeping the path hints in config;
- the default RuntimeConfig::intra_threads value also resolves to the host physical core count;
- set OMNI_INTRA_THREADS to auto or a positive integer to override the ONNX Runtime intra-op thread count; auto resolves to the host physical core count;
- set OMNI_INTER_THREADS to a positive integer when you need to override the ONNX Runtime inter-op thread count while benchmarking or tuning;
- set OMNI_FGCLIP_MAX_PATCHES to cap FG-CLIP2 image preprocessing at a smaller bucket without changing the exported model directory;
- recommended OMNI_FGCLIP_MAX_PATCHES values are 128, 256, 576, 784, or 1024;
- run cargo test --test quickstart -- --ignored --nocapture to execute the smoke test after setting OMNI_TEST_BUNDLE_DIR and OMNI_TEST_SAMPLE_IMAGE.
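Several knobs above follow an "auto or positive integer" convention (OMNI_INTRA_THREADS being the clearest case: auto resolves to the host physical core count). A minimal sketch of that parsing rule follows; parse_intra_threads and its signature are illustrative, not the crate's real internals:

```rust
/// Resolve an OMNI_INTRA_THREADS-style value: "auto" (or unset/empty) maps
/// to a core count supplied by the caller; anything else must parse as a
/// positive integer. Illustrative sketch of the documented convention only.
fn parse_intra_threads(raw: Option<&str>, physical_cores: usize) -> Result<usize, String> {
    match raw.map(str::trim) {
        None | Some("") | Some("auto") => Ok(physical_cores),
        Some(s) => match s.parse::<usize>() {
            Ok(n) if n > 0 => Ok(n),
            _ => Err(format!(
                "OMNI_INTRA_THREADS must be 'auto' or a positive integer, got {s:?}"
            )),
        },
    }
}

fn main() {
    println!("{:?}", parse_intra_threads(Some("auto"), 8)); // → Ok(8)
    println!("{:?}", parse_intra_threads(Some("4"), 8));    // → Ok(4)
    println!("{:?}", parse_intra_threads(Some("0"), 8).is_err()); // → true
}
```

Rejecting 0 matters here because ONNX Runtime interprets thread counts strictly; treating 0 as "auto" silently would mask typos in the environment.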
Example .env:
OMNI_DEVICE=auto
OMNI_PROVIDER_POLICY=auto
OMNI_BUNDLE_DIR=models/fgclip2_flat
OMNI_SAMPLES_DIR=samples
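The .env behavior described earlier (shell environment wins, .env only supplies defaults) amounts to a two-level lookup. In the sketch below, HashMaps stand in for both sources so the example stays deterministic; resolve_var is illustrative, not the crate's actual loader:

```rust
use std::collections::HashMap;

/// Shell-environment-wins lookup: a key set in the shell environment
/// overrides the same key's .env default. Both maps stand in for the real
/// sources; this is a sketch, not the crate's loader.
fn resolve_var(
    key: &str,
    shell_env: &HashMap<String, String>,
    dotenv: &HashMap<String, String>,
) -> Option<String> {
    shell_env.get(key).or_else(|| dotenv.get(key)).cloned()
}

fn main() {
    let dotenv: HashMap<String, String> =
        [("OMNI_DEVICE".to_string(), "auto".to_string())].into();
    let mut shell: HashMap<String, String> = HashMap::new();

    // Only the .env default exists, so it applies.
    println!("{:?}", resolve_var("OMNI_DEVICE", &shell, &dotenv)); // → Some("auto")

    // An explicit shell value overrides the .env default.
    shell.insert("OMNI_DEVICE".to_string(), "gpu".to_string());
    println!("{:?}", resolve_var("OMNI_DEVICE", &shell, &dotenv)); // → Some("gpu")
}
```

This ordering is what makes .env safe as a local default layer: exporting a variable in the shell for one run never requires editing the file.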
Device selection notes:
- all current model families load standard ONNX graphs, so GPU support is determined by the ONNX Runtime execution provider rather than by a model-specific code path;
- the default build is runtime-bundled + directml + coreml, which keeps Windows DirectML and Apple CoreML enabled while still using the bundled ONNX Runtime loading mode;
- enable cargo build --features nvidia when you want TensorRT -> CUDA ahead of the platform fallback provider on supported Windows/Linux x64 targets;
- build with cargo build --no-default-features --features runtime-dynamic,directml,nvidia when you want a Windows/NVIDIA variant that uses system-provided ONNX Runtime, CUDA, cuDNN, and TensorRT libraries instead of bundling them into the application;
- OMNI_PROVIDER_POLICY=service prefers steady-state throughput and currently tries TensorRT -> CUDA -> DirectML -> CPU on Windows with --features nvidia;
- OMNI_PROVIDER_POLICY=interactive prefers lower warmup cost and currently tries CUDA -> DirectML -> TensorRT -> CPU on Windows with --features nvidia;
- OMNI_FORCE_PROVIDER remains available as a diagnostics-only override when you need to pin one execution provider for benchmarking or debugging;
- on Apple Silicon/macOS, gpu is wired to the CoreML execution provider;
- on Linux, the current crate build does not yet wire a GPU provider; AMD GPU support would require a dedicated ROCm or WebGPU path;
- auto first tries the configured GPU chain and falls back to CPU if acceleration is unavailable;
- gpu requires at least one GPU execution provider to register successfully; it does not silently fall back to CPU;
- runtime_snapshot() now separates compiled_providers, planned_providers, registered_providers, and issues, so upper layers can distinguish feature-gated providers from missing runtime libraries or dependency-chain failures;
- the Windows build output includes DirectML.dll; application packaging should ship that file with the executable.
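The policy-dependent provider ordering above can be sketched as a preference list filtered by what was compiled in. The orderings mirror the documented Windows + --features nvidia chains; provider_order itself is an illustrative sketch, not the crate's selector:

```rust
/// Compute the provider attempt order for a policy, keeping only providers
/// that were actually compiled into the build. Illustrative sketch: the
/// "service" chain favors steady-state throughput, the default chain
/// favors cheaper warmup, matching the documented behavior.
fn provider_order(policy: &str, compiled: &[&str]) -> Vec<&'static str> {
    let preferred: &[&'static str] = match policy {
        "service" => &["TensorRT", "CUDA", "DirectML", "CPU"],
        _ => &["CUDA", "DirectML", "TensorRT", "CPU"],
    };
    preferred
        .iter()
        .filter(|p| compiled.contains(p))
        .copied()
        .collect()
}

fn main() {
    // Assume all four providers were compiled in (--features nvidia on Windows).
    let compiled = ["TensorRT", "CUDA", "DirectML", "CPU"];
    println!("{:?}", provider_order("service", &compiled));
    // → ["TensorRT", "CUDA", "DirectML", "CPU"]
    println!("{:?}", provider_order("interactive", &compiled));
    // → ["CUDA", "DirectML", "TensorRT", "CPU"]
}
```

Filtering against the compiled set is what lets runtime_snapshot() distinguish feature-gated providers (never in the plan) from providers that were planned but failed to register.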
Legacy migration (flatten existing bundle directories into the flat layout with scripts/flatten_bundle_to_flat.py):
python .\scripts\flatten_bundle_to_flat.py --input .\models\chinese_clip_bundle --mode hardlink
python .\scripts\flatten_bundle_to_flat.py --input .\models\fgclip2_bundle --mode hardlink
Direct exporters:
The exporter scripts live in D:\code\vl-embedding-test and default to writing flat model
directories into D:\code\omni_search\models.
uv run D:\code\vl-embedding-test\export_openclip_flat.py --id timm/MobileCLIP2-S2-OpenCLIP --output D:\code\omni_search\models\mobileclip2 --force
uv run D:\code\vl-embedding-test\export_chinese_clip_flat.py --model-dir D:\models\chinese-clip-vit-base-patch16 --output D:\code\omni_search\models\chinese_clip_flat --force
uv run D:\code\vl-embedding-test\export_fgclip2_flat.py --model-dir D:\models\fg-clip2-base --output D:\code\omni_search\models\fgclip2_flat --force
Builder example (the argument values below are illustrative placeholders; substitute your own paths and settings, and check the crate docs for the exact parameter types):

use omni_search::OmniSearch;

let sdk = OmniSearch::builder()
    .from_local_model_dir("models/fgclip2_flat")
    .device("auto")
    .provider_policy("auto")
    .provider_dir("C:/onnxruntime/providers")
    .cuda_bin_dir("C:/cuda/bin")
    .cudnn_bin_dir("C:/cudnn/bin")
    .tensorrt_lib_dir("C:/tensorrt/lib")
    .intra_threads(8)
    .fgclip_max_patches(576)
    .session_policy(Default::default())
    .graph_optimization_level(Default::default())
    .build()?;

let snapshot = sdk.runtime_snapshot();
println!("requested device: {:?}", snapshot.requested_device);
println!("planned providers: {:?}", snapshot.planned_providers);
println!("registered providers: {:?}", snapshot.registered_providers);
println!("effective provider: {:?}", snapshot.effective_provider);