§ELID - Embedding Locality IDentifier
ELID enables vector search without a vector store by encoding high-dimensional embeddings into sortable string IDs that preserve locality. Similar vectors produce similar IDs, allowing you to use standard database indexes for similarity search.
ELID also includes a complete suite of fast, zero-dependency string similarity algorithms.
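To make the idea of a sortable, locality-preserving ID concrete, here is a minimal sketch of a Z-order (Morton) style encoding. It is not the crate's implementation: the function names `quantize` and `morton_id`, the assumed input range of [-1.0, 1.0], and the 32-character hex output are all illustrative assumptions.

```rust
/// Quantize a value to `bits` bits (expects 1..=31 bits).
/// Assumption: inputs lie in [-1.0, 1.0]; anything outside is clamped.
fn quantize(x: f32, bits: u32) -> u32 {
    let levels = (1u32 << bits) - 1;
    let unit = (x.clamp(-1.0, 1.0) + 1.0) / 2.0; // map to [0, 1]
    (unit * levels as f32).round() as u32
}

/// Interleave the bits of the quantized coordinates, most significant
/// bit first, and format the result as fixed-width lowercase hex so that
/// string order matches numeric order.
fn morton_id(coords: &[f32], bits: u32) -> String {
    assert!(coords.len() as u32 * bits <= 128, "code must fit in 128 bits");
    let cells: Vec<u32> = coords.iter().map(|&x| quantize(x, bits)).collect();
    let mut code: u128 = 0;
    for bit in (0..bits).rev() {
        for &cell in &cells {
            code = (code << 1) | ((cell >> bit) & 1) as u128;
        }
    }
    format!("{:032x}", code)
}
```

Because the output is zero-padded, an ordinary B-tree index over the string column sorts IDs in Z-order, so a prefix or range scan returns points from the same region of the quantized space. Z-order locality is imperfect (neighbouring points can land far apart across cell boundaries), which is one reason a Hilbert curve may be preferred.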
§Feature Sets
§Embedding Encoding (embeddings feature)
Convert embeddings from any ML model into compact, sortable identifiers:
- Mini128: 128-bit SimHash using signed random projections (fast, Hamming distance)
- Morton10x10: Z-order curve encoding (database range queries)
- Hilbert10x10: Hilbert curve encoding (maximum locality preservation)
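The signed-random-projection idea behind a profile like Mini128 can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the names `simhash128_sketch` and `hamming128`, the SplitMix64 generator, and the use of random +1/-1 vectors are all choices made for this example.

```rust
/// SplitMix64 step: a small deterministic generator, so the same seed
/// always yields the same projection vectors.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Bit i of the result is set when the embedding has a positive dot
/// product with the i-th pseudo-random vector of +1/-1 entries.
fn simhash128_sketch(embedding: &[f32], seed: u64) -> u128 {
    let mut state = seed;
    let mut bits: u128 = 0;
    for i in 0..128 {
        let mut dot = 0.0f32;
        for &x in embedding {
            // Draw a random sign for this component of the projection.
            let sign: f32 = if splitmix64(&mut state) & 1 == 1 { 1.0 } else { -1.0 };
            dot += sign * x;
        }
        if dot > 0.0 {
            bits |= 1u128 << i;
        }
    }
    bits
}

/// Number of differing bits between two 128-bit fingerprints.
fn hamming128(a: u128, b: u128) -> u32 {
    (a ^ b).count_ones()
}
```

Two embeddings must be hashed with the same seed for their fingerprints to be comparable. Vectors pointing in similar directions fall on the same side of most projections and therefore differ in few bits, while opposite vectors differ in nearly all of them.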
§String Similarity (strings feature, default)
- Levenshtein Distance: Classic edit distance algorithm
- Normalized Levenshtein: Returns similarity as a value between 0.0 and 1.0
- Jaro-Winkler Similarity: Better for short strings like names
- Hamming Distance: For equal-length strings
- Optimal String Alignment (OSA): Levenshtein with transpositions
- SimHash: Locality-sensitive hashing for string similarity queries
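As a reference point for the edit-distance entries above, here is a minimal two-row dynamic-programming sketch of Levenshtein distance and one common way to normalize it. The `_sketch` function names are hypothetical, and the normalization `1 - distance / max_len` is an assumption; the crate's own definition may differ.

```rust
/// Levenshtein distance over Unicode scalar values, keeping only two
/// rows of the dynamic-programming table.
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of a and first j of b.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr: Vec<usize> = vec![0; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitute (or match)
                .min(prev[j + 1] + 1)      // delete from a
                .min(curr[j] + 1);         // insert into a
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Similarity in [0.0, 1.0]; two empty strings count as identical.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein_sketch(a, b) as f64 / max_len as f64
}
```

For "kitten" and "sitting" this gives a distance of 3 and, under the assumed normalization, a similarity of 1 - 3/7.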
§Feature Flags
- strings (default): Zero-dependency string similarity algorithms
- embeddings (default): Vector encoding with Mini128, Morton, and Hilbert profiles
- models: Base ONNX model support using tract-onnx (WASM compatible)
- models-text: Text embedding models (Model2Vec potion-base-8M)
- models-image: Image embedding models (MobileNetV3-Small)
- wasm: WebAssembly bindings (includes embeddings)
- python: Python bindings via PyO3 (includes embeddings + numpy)
- ffi: C FFI bindings
§Embedding Encoding Example
use elid::embeddings::{encode, Profile, hamming_distance};
// Get embeddings from your ML model
let embedding1 = model.embed("Hello, world!")?;
let embedding2 = model.embed("Hello, universe!")?;
// Encode to sortable ELIDs
let profile = Profile::default(); // Mini128
let elid1 = encode(&embedding1, &profile)?;
let elid2 = encode(&embedding2, &profile)?;
// Compare via Hamming distance (lower = more similar)
let distance = hamming_distance(&elid1, &elid2)?;
§String Similarity Example
use elid::{levenshtein, normalized_levenshtein, jaro_winkler, simhash, simhash_similarity};
let distance = levenshtein("kitten", "sitting");
assert_eq!(distance, 3);
let similarity = normalized_levenshtein("kitten", "sitting");
assert!(similarity > 0.5 && similarity < 0.7);
let jw_similarity = jaro_winkler("martha", "marhta");
assert!(jw_similarity > 0.9);
// SimHash for numeric database queries
let hash1 = simhash("iPhone 14");
let hash2 = simhash("iPhone 15");
let sim = simhash_similarity("iPhone 14", "iPhone 15");
assert!(sim > 0.8);
Modules§
- embeddings: Embedding encoding module for ELID
Structs§
- SimilarityOpts: Options for configuring string similarity algorithms
Functions§
- best_match: Compute the best matching similarity between two strings using multiple algorithms and return the highest score.
- find_best_match: Find the best match for a query string in a list of candidates.
- find_matches_above_threshold: Find all matches above a threshold score.
- find_similar_hashes: Find all items within a given SimHash distance threshold.
- hamming: Compute the Hamming distance between two strings.
- jaro: Compute the Jaro similarity between two strings.
- jaro_winkler: Compute the Jaro-Winkler similarity between two strings.
- jaro_winkler_with_prefix: Compute the Jaro-Winkler similarity with a custom prefix scale.
- levenshtein: Compute the Levenshtein distance between two strings.
- levenshtein_with_opts: Compute Levenshtein distance with configurable options.
- normalized_hamming: Compute the normalized Hamming similarity between two strings.
- normalized_levenshtein: Compute the normalized Levenshtein similarity between two strings.
- normalized_osa: Compute the normalized OSA similarity between two strings.
- osa_distance: Compute the Optimal String Alignment distance between two strings.
- simhash: Compute the SimHash fingerprint of a string.
- simhash_distance: Compute the Hamming distance between two SimHash values.
- simhash_similarity: Compute the normalized SimHash similarity between two strings.
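The threshold search described for `find_similar_hashes` can be pictured with the sketch below. It does not reproduce the crate's signatures: `hash_distance` and `similar_hashes_sketch` are hypothetical names, and the 64-bit fingerprint width is an assumption made for this example.

```rust
/// Hamming distance between two fingerprints.
/// Assumption: fingerprints are 64-bit values.
fn hash_distance(a: u64, b: u64) -> u32 {
    (a ^ b).count_ones()
}

/// Return the indices of every candidate whose fingerprint differs from
/// `query` in at most `max_distance` bits.
fn similar_hashes_sketch(query: u64, candidates: &[u64], max_distance: u32) -> Vec<usize> {
    candidates
        .iter()
        .enumerate()
        .filter_map(|(i, &h)| (hash_distance(query, h) <= max_distance).then_some(i))
        .collect()
}
```

A linear scan like this is fine for small candidate sets. For larger ones, the fingerprints would typically be stored in an indexed numeric column and pre-filtered before the exact bit count is applied.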