elid 0.1.0

A fast and efficient string similarity library

ELID - Efficient Levenshtein and String Similarity Library

A fast, zero-dependency Rust library for computing string similarity metrics with bindings for Python, JavaScript (WASM), and C.

Algorithms

Algorithm               Type              Best For
Levenshtein             Edit distance     General-purpose comparison, spell checking
Normalized Levenshtein  Similarity (0-1)  When you need a percentage match
Jaro                    Similarity (0-1)  Short strings
Jaro-Winkler            Similarity (0-1)  Names and record linkage
Hamming                 Distance          Fixed-length strings, DNA, error codes
OSA                     Edit distance     Typo detection (counts transpositions)
SimHash                 LSH fingerprint   Database-queryable similarity, near-duplicate detection
Best Match              Composite (0-1)   When unsure which algorithm fits
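
To make the difference between Levenshtein and OSA concrete, here is a minimal standalone sketch of optimal string alignment distance. It is not taken from the elid source; the function name osa_distance is hypothetical and the full-matrix layout is chosen for readability, not speed.

```rust
/// Optimal string alignment (restricted Damerau-Levenshtein) distance.
/// Illustrative sketch only; not the elid implementation.
fn osa_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let (m, n) = (a.len(), b.len());

    // d[i][j] = distance between the first i chars of a and the first j chars of b.
    let mut d = vec![vec![0usize; n + 1]; m + 1];
    for i in 0..=m {
        d[i][0] = i;
    }
    for j in 0..=n {
        d[0][j] = j;
    }

    for i in 1..=m {
        for j in 1..=n {
            let cost = if a[i - 1] == b[j - 1] { 0 } else { 1 };
            let mut best = (d[i - 1][j] + 1) // deletion
                .min(d[i][j - 1] + 1) // insertion
                .min(d[i - 1][j - 1] + cost); // substitution or match

            // Swap of two adjacent characters counts as a single edit.
            if i > 1 && j > 1 && a[i - 1] == b[j - 2] && a[i - 2] == b[j - 1] {
                best = best.min(d[i - 2][j - 2] + 1);
            }
            d[i][j] = best;
        }
    }
    d[m][n]
}
```

With this sketch, "teh" versus "the" has distance 1 because the swap is a single transposition, whereas plain Levenshtein needs two substitutions for the same pair.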

Installation

Rust

[dependencies]
elid = "0.1.0"

Python

pip install elid

JavaScript (WASM)

npm install elid-wasm

C/C++

Build with cargo build --release --features ffi to get libelid.so and elid.h.

Quick Start

use elid::*;

// Edit distance
let distance = levenshtein("kitten", "sitting"); // 3

// Normalized similarity (0.0 to 1.0)
let similarity = normalized_levenshtein("hello", "hallo"); // 0.8

// Name matching
let similarity = jaro_winkler("Martha", "Marhta"); // 0.961

// SimHash for database queries
let hash = simhash("iPhone 14");
let sim = simhash_similarity("iPhone 14", "iPhone 15"); // ~0.92

// Find best match in a list
let candidates = vec!["apple", "application", "apply"];
let (idx, score) = find_best_match("app", &candidates);
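
For readers new to SimHash, the sketch below shows the general idea behind a fingerprint that can be stored and compared as an integer. It is an assumption-laden illustration, not elid's algorithm: the names simhash_sketch and fingerprint_similarity are hypothetical, the features are character trigrams, and the hash is the standard library's DefaultHasher, so the scores it produces will not match the values quoted above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Build a 64-bit SimHash-style fingerprint from overlapping character windows.
/// Illustrative sketch only; feature choice and hash function are assumptions.
fn simhash_sketch(text: &str) -> u64 {
    let chars: Vec<char> = text.chars().collect();
    // Use 3-char windows, or shorter windows when the input has fewer than 3 chars.
    let width = chars.len().clamp(1, 3);

    // Each feature votes +1 or -1 on every bit position.
    let mut votes = [0i32; 64];
    for window in chars.windows(width) {
        let mut hasher = DefaultHasher::new();
        window.hash(&mut hasher);
        let h = hasher.finish();
        for bit in 0..64 {
            if (h >> bit) & 1 == 1 {
                votes[bit] += 1;
            } else {
                votes[bit] -= 1;
            }
        }
    }

    // A bit is set in the fingerprint when its votes are net positive.
    let mut fingerprint = 0u64;
    for bit in 0..64 {
        if votes[bit] > 0 {
            fingerprint |= 1u64 << bit;
        }
    }
    fingerprint
}

/// Similarity in [0.0, 1.0]: the fraction of the 64 bits that agree.
fn fingerprint_similarity(a: u64, b: u64) -> f64 {
    1.0 - (a ^ b).count_ones() as f64 / 64.0
}
```

Because the fingerprint is a plain integer, it can be stored in a database column and candidate rows compared by the Hamming distance between fingerprints, which is what makes this family of hashes database-queryable.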

Python

import elid

elid.levenshtein("kitten", "sitting")  # 3
elid.jaro_winkler("martha", "marhta")  # 0.961
elid.simhash_similarity("iPhone 14", "iPhone 15")  # 0.922

JavaScript

import init, { levenshtein, jaroWinkler, simhashSimilarity } from 'elid-wasm';

await init();
levenshtein("kitten", "sitting");  // 3
jaroWinkler("martha", "marhta");   // 0.961
simhashSimilarity("iPhone 14", "iPhone 15");  // 0.922

Configuration

Use SimilarityOpts for case-insensitive or whitespace-trimmed comparisons:

use elid::{levenshtein_with_opts, SimilarityOpts};

let opts = SimilarityOpts {
    case_sensitive: false,
    trim_whitespace: true,
    ..Default::default()
};
let distance = levenshtein_with_opts("  HELLO  ", "hello", &opts); // 0
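
Conceptually, options like these amount to normalizing both inputs before the comparison runs. The sketch below shows that idea with a hypothetical PreprocessOpts struct and preprocess helper; neither is part of elid, and how the library applies its options internally is an assumption here.

```rust
/// Hypothetical stand-in for comparison options; not the elid type.
struct PreprocessOpts {
    case_sensitive: bool,
    trim_whitespace: bool,
}

/// Normalize one input string according to the options.
fn preprocess(s: &str, opts: &PreprocessOpts) -> String {
    // Drop leading and trailing whitespace when requested.
    let s = if opts.trim_whitespace { s.trim() } else { s };
    // Fold case when the comparison should ignore it.
    if opts.case_sensitive {
        s.to_string()
    } else {
        s.to_lowercase()
    }
}
```

Under these assumptions, "  HELLO  " and "hello" both normalize to "hello", which is why the distance in the example above is 0.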

Performance

  • Zero external dependencies for core algorithms
  • O(min(m,n)) space-optimized Levenshtein
  • 1.4M+ string comparisons per second (Python benchmarks)
  • ~96KB WASM binary
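
The O(min(m,n)) space bound comes from keeping a single row of the edit-distance table, sized to the shorter string. The following is a minimal sketch of that technique, not the elid source; the names levenshtein_single_row and normalized_sketch are hypothetical, and normalizing by the longer string's length is an assumption that happens to agree with the 0.8 shown in Quick Start.

```rust
/// Levenshtein distance using one row of min(m, n) + 1 cells.
/// Illustrative sketch only; not the elid implementation.
fn levenshtein_single_row(a: &str, b: &str) -> usize {
    // Put the shorter string on the inner axis so the row stays small.
    let (long, short) = if a.chars().count() >= b.chars().count() {
        (a, b)
    } else {
        (b, a)
    };
    let short_len = short.chars().count();

    // row[j] starts as the distance from the empty prefix of `long` to short[..j].
    let mut row: Vec<usize> = (0..=short_len).collect();

    for (i, lc) in long.chars().enumerate() {
        let mut diag = row[0]; // previous row, previous column
        row[0] = i + 1;
        for (j, sc) in short.chars().enumerate() {
            let cost = if lc == sc { 0 } else { 1 };
            let next = (row[j + 1] + 1) // deletion: previous row, same column
                .min(row[j] + 1) // insertion: current row, previous column
                .min(diag + cost); // substitution or match
            diag = row[j + 1]; // save the old value before overwriting it
            row[j + 1] = next;
        }
    }
    row[short_len]
}

/// Similarity in [0.0, 1.0], assuming normalization by the longer length.
fn normalized_sketch(a: &str, b: &str) -> f64 {
    let max_len = a.chars().count().max(b.chars().count());
    if max_len == 0 {
        return 1.0; // two empty strings are treated as identical
    }
    1.0 - levenshtein_single_row(a, b) as f64 / max_len as f64
}
```

For "hello" and "hallo" the distance is 1 and the longer length is 5, so this sketch gives 1 - 1/5 = 0.8.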

Building

git clone https://forge.blackleafdigital.com/BlackLeafDigital/ELID.git
cd ELID

cargo build --release
cargo test
cargo bench
cargo run --example basic_usage

License

Dual-licensed under MIT or Apache-2.0 at your option.