ELID - Efficient Levenshtein and String Similarity Library
A fast, zero-dependency Rust library for computing string similarity metrics with bindings for Python, JavaScript (WASM), and C.
Algorithms
| Algorithm | Type | Best For |
|---|---|---|
| Levenshtein | Edit distance | General-purpose comparison, spell checking |
| Normalized Levenshtein | Similarity (0-1) | When you need a percentage match |
| Jaro | Similarity (0-1) | Short strings |
| Jaro-Winkler | Similarity (0-1) | Names and record linkage |
| Hamming | Distance | Fixed-length strings, DNA, error codes |
| OSA | Edit distance | Typo detection (counts transpositions) |
| SimHash | LSH fingerprint | Database-queryable similarity, near-duplicate detection |
| Best Match | Composite (0-1) | When unsure which algorithm fits |
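To make the SimHash row more concrete, here is a minimal sketch of how a 64-bit fingerprint can be built and compared. It is not ELID's implementation: the whitespace tokenization, the use of the standard library's `DefaultHasher`, and the names `hash_token`, `simhash_sketch`, and `simhash_similarity_sketch` are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash one token to 64 bits (assumption: std's DefaultHasher; its output
/// is not guaranteed to be stable across Rust releases).
fn hash_token(token: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    token.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical SimHash over whitespace-separated tokens.
/// Each token votes +1 or -1 on every bit position; the sign of the
/// running total decides the corresponding fingerprint bit.
fn simhash_sketch(text: &str) -> u64 {
    let mut counts = [0i32; 64];
    for token in text.split_whitespace() {
        let h = hash_token(token);
        for (bit, count) in counts.iter_mut().enumerate() {
            if (h >> bit) & 1 == 1 {
                *count += 1;
            } else {
                *count -= 1;
            }
        }
    }
    let mut fingerprint = 0u64;
    for (bit, &count) in counts.iter().enumerate() {
        if count > 0 {
            fingerprint |= 1u64 << bit;
        }
    }
    fingerprint
}

/// Similarity as the fraction of the 64 bits on which two fingerprints agree.
fn simhash_similarity_sketch(a: u64, b: u64) -> f64 {
    1.0 - (a ^ b).count_ones() as f64 / 64.0
}
```

Because the fingerprint is a plain integer, it can be stored in an ordinary database column and compared by counting differing bits, which is what makes this kind of similarity queryable.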
Installation
Rust
[dependencies]
elid = "0.1.0"
Python
JavaScript (WASM)
C/C++
Build with `cargo build --release --features ffi` to get `libelid.so` and `elid.h`.
Quick Start
use elid::*;
// Edit distance
let distance = levenshtein("kitten", "sitting"); // 3
// Normalized similarity (0.0 to 1.0)
let similarity = normalized_levenshtein("hello", "hallo"); // 0.8
// Name matching
let similarity = jaro_winkler("martha", "marhta"); // 0.961
// SimHash for database queries
let hash = simhash;
let sim = simhash_similarity; // ~0.92
// Find best match in a list
let candidates = vec!;
let = find_best_match;
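For readers curious how a name-matching score such as 0.961 arises, the following is a self-contained sketch of the textbook Jaro and Jaro-Winkler formulas. It is an illustration only, not ELID's code: the names `jaro_sketch` and `jaro_winkler_sketch`, the prefix scale of 0.1, and the prefix cap of 4 characters are assumptions taken from the standard definition.

```rust
/// Textbook Jaro similarity (illustrative, not the library's implementation).
fn jaro_sketch(a: &str, b: &str) -> f64 {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    if a.is_empty() && b.is_empty() {
        return 1.0;
    }
    if a.is_empty() || b.is_empty() {
        return 0.0;
    }

    // Characters match only if they are equal and lie within this window.
    let window = (a.len().max(b.len()) / 2).saturating_sub(1);
    let mut a_matched = vec![false; a.len()];
    let mut b_matched = vec![false; b.len()];
    let mut matches = 0usize;
    for i in 0..a.len() {
        let lo = i.saturating_sub(window);
        let hi = (i + window + 1).min(b.len());
        for j in lo..hi {
            if !b_matched[j] && a[i] == b[j] {
                a_matched[i] = true;
                b_matched[j] = true;
                matches += 1;
                break;
            }
        }
    }
    if matches == 0 {
        return 0.0;
    }

    // Walk both match sequences in order; mismatched pairs are half-transpositions.
    let mut half_transpositions = 0usize;
    let mut j = 0;
    for i in 0..a.len() {
        if !a_matched[i] {
            continue;
        }
        while !b_matched[j] {
            j += 1;
        }
        if a[i] != b[j] {
            half_transpositions += 1;
        }
        j += 1;
    }

    let m = matches as f64;
    let t = (half_transpositions / 2) as f64;
    (m / a.len() as f64 + m / b.len() as f64 + (m - t) / m) / 3.0
}

/// Jaro-Winkler: boost the Jaro score for a shared prefix of up to 4 characters.
fn jaro_winkler_sketch(a: &str, b: &str) -> f64 {
    let jaro = jaro_sketch(a, b);
    let prefix = a
        .chars()
        .zip(b.chars())
        .take(4)
        .take_while(|(x, y)| x == y)
        .count();
    jaro + prefix as f64 * 0.1 * (1.0 - jaro)
}
```

With the classic pair "martha" and "marhta", all six characters match with one transposition, giving a Jaro score of about 0.944; the shared prefix "mar" lifts it to about 0.961.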
Python
# 3
# 0.961
# 0.922
JavaScript
import init from 'elid';
await init();
; // 3
; // 0.961
; // 0.922
Configuration
Use SimilarityOpts for case-insensitive or whitespace-trimmed comparisons:
use ;
let opts = SimilarityOpts ;
let distance = levenshtein_with_opts; // 0
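As a rough picture of what options like these do, the sketch below normalizes a string before it is handed to a metric. The struct `NormalizeOpts`, its field names, and the function `normalize` are hypothetical and chosen for this example; they are not the fields of `SimilarityOpts`, which are not shown here.

```rust
/// Hypothetical options struct for illustration; not ELID's SimilarityOpts.
#[derive(Debug, Clone, Copy, Default)]
struct NormalizeOpts {
    case_insensitive: bool,
    trim_whitespace: bool,
}

/// Apply the selected normalizations before comparing strings.
fn normalize(input: &str, opts: NormalizeOpts) -> String {
    let trimmed = if opts.trim_whitespace {
        input.trim()
    } else {
        input
    };
    if opts.case_insensitive {
        trimmed.to_lowercase()
    } else {
        trimmed.to_string()
    }
}
```

Two inputs that differ only in case or surrounding whitespace normalize to the same string, so any edit distance computed afterwards is 0.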
Performance
- Zero external dependencies for core algorithms
- O(min(m,n)) space-optimized Levenshtein
- 1.4M+ string comparisons per second (Python benchmarks)
- ~96KB WASM binary
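The O(min(m,n)) space bound comes from keeping a single row of the dynamic-programming table, laid out along the shorter string. The sketch below shows one common way to do this; it is an assumption about the general technique rather than a copy of ELID's code, and the names `levenshtein_sketch` and `normalized_levenshtein_sketch` are made up for the example.

```rust
/// Single-row Levenshtein distance (illustrative sketch).
/// The DP buffer holds min(m, n) + 1 entries; the char vectors collected
/// here for simplicity still take O(m + n).
fn levenshtein_sketch(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // Put the shorter string along the row to minimize the buffer.
    let (long, short) = if a.len() >= b.len() { (a, b) } else { (b, a) };

    // row[j] starts as the distance from the empty prefix to short[..j].
    let mut row: Vec<usize> = (0..=short.len()).collect();
    for (i, &lc) in long.iter().enumerate() {
        let mut diagonal = row[0]; // previous row, column j
        row[0] = i + 1;
        for (j, &sc) in short.iter().enumerate() {
            let cost = if lc == sc { 0 } else { 1 };
            let above = row[j + 1]; // previous row, column j + 1
            row[j + 1] = (diagonal + cost) // substitution or match
                .min(above + 1) // deletion
                .min(row[j] + 1); // insertion
            diagonal = above;
        }
    }
    row[short.len()]
}

/// Similarity in [0, 1]: one minus the distance divided by the longer length.
fn normalized_levenshtein_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0;
    }
    1.0 - levenshtein_sketch(a, b) as f64 / max_len as f64
}
```

For "kitten" and "sitting" this yields a distance of 3, and for "hello" and "hallo" a normalized similarity of 0.8.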
Building
License
Dual-licensed under MIT or Apache-2.0 at your option.