nekosearch-0.0.1

A Rust toolkit for text search, fuzzy matching and intent detection: tokenization, normalization, TF-IDF, Jaccard, Levenshtein, and ranking pipelines.


Source code size: 3.32 kB (the summed size of all the files inside the crates.io package for this release).
Average build duration, all releases: 26s (successful builds in releases after 2024-10-23).

nekosearch-0.0.1 is not a library.

Visit the last successful build: nekosearch-0.0.5

nekosearch

A Rust toolkit for text search, fuzzy matching, and intent detection.
From minimal, dependency-free matching to full ranking pipelines with TF-IDF, Jaccard, and Levenshtein.

📋 Feature Checklist

🔹 Core (std-only)

Simple tokenization (split_whitespace)
Basic normalization (lowercase, trim)
Exact equality comparison
Word-by-word comparison (overlap count)
Set similarity (basic Jaccard)
Character similarity (Hamming, if lengths match)
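
The core items above can be sketched with the standard library alone. This is a minimal illustration, not the crate's API: the function names `normalize`, `tokenize`, `overlap_count`, `jaccard`, and `hamming` are hypothetical, and scoring two empty inputs as 1.0 is an assumption made here.

```rust
use std::collections::HashSet;

/// Basic normalization: trim and lowercase (hypothetical helper).
fn normalize(text: &str) -> String {
    text.trim().to_lowercase()
}

/// Simple tokenization: normalize, then split on whitespace.
fn tokenize(text: &str) -> Vec<String> {
    normalize(text).split_whitespace().map(str::to_owned).collect()
}

/// Word-by-word comparison: number of distinct words present in both inputs.
fn overlap_count(a: &str, b: &str) -> usize {
    let sa: HashSet<String> = tokenize(a).into_iter().collect();
    let sb: HashSet<String> = tokenize(b).into_iter().collect();
    sa.intersection(&sb).count()
}

/// Basic Jaccard: |A ∩ B| / |A ∪ B| over word sets.
/// Assumption: two empty inputs are treated as identical (1.0).
fn jaccard(a: &str, b: &str) -> f64 {
    let sa: HashSet<String> = tokenize(a).into_iter().collect();
    let sb: HashSet<String> = tokenize(b).into_iter().collect();
    let union = sa.union(&sb).count();
    if union == 0 {
        return 1.0;
    }
    sa.intersection(&sb).count() as f64 / union as f64
}

/// Hamming distance counted in characters; `None` when the lengths differ.
fn hamming(a: &str, b: &str) -> Option<usize> {
    let ca: Vec<char> = a.chars().collect();
    let cb: Vec<char> = b.chars().collect();
    if ca.len() != cb.len() {
        return None;
    }
    Some(ca.iter().zip(&cb).filter(|(x, y)| x != y).count())
}
```

Returning `Option` from `hamming` makes the "if lengths match" condition explicit instead of panicking or silently truncating.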

🔹 Normalization & Preprocessing

Remove punctuation
Remove stopwords (customizable list)
Unicode normalization (NFC/NFD)
Accent stripping (configurable)
Stemming or lemmatization (at least English/Portuguese)
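
As a rough sketch of the first two preprocessing items, punctuation removal and stopword filtering need only the standard library. The names `strip_punctuation` and `preprocess` are hypothetical, and replacing punctuation with a space (rather than deleting it) is an assumption made here so that `a,b` still splits into two tokens. Unicode normalization, accent stripping, and stemming generally need lookup tables or external crates and are not shown.

```rust
use std::collections::HashSet;

/// Replace every character that is neither alphanumeric nor whitespace
/// with a space, so adjacent words stay separated.
fn strip_punctuation(text: &str) -> String {
    text.chars()
        .map(|c| if c.is_alphanumeric() || c.is_whitespace() { c } else { ' ' })
        .collect()
}

/// Strip punctuation, lowercase, tokenize, and drop words found in the
/// caller-supplied stopword list.
fn preprocess(text: &str, stopwords: &HashSet<&str>) -> Vec<String> {
    strip_punctuation(text)
        .to_lowercase()
        .split_whitespace()
        .filter(|t| !stopwords.contains(*t))
        .map(str::to_owned)
        .collect()
}
```

Passing the stopword set as a parameter keeps the list customizable, as the checklist item asks.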

🔹 Similarity Metrics

Levenshtein distance
Damerau–Levenshtein (transpositions)
Sørensen–Dice coefficient
Cosine similarity (with TF-IDF vectors)
Advanced Jaccard (n-grams)
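
Two of these metrics are compact enough to sketch here. The block below shows the classic dynamic-programming Levenshtein distance (keeping only two rows) and a Sørensen–Dice coefficient over sets of character bigrams. The function names are hypothetical and are not taken from the crate; the handling of strings too short to have bigrams is an assumption.

```rust
use std::collections::HashSet;

/// Levenshtein distance: minimum number of single-character insertions,
/// deletions, and substitutions needed to turn `a` into `b`.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] holds the distance between the first i chars of a and first j of b.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, cb) in b.iter().enumerate() {
            let cost = if ca == cb { 0 } else { 1 };
            curr[j + 1] = (prev[j] + cost) // substitution or match
                .min(prev[j + 1] + 1)      // deletion
                .min(curr[j] + 1);         // insertion
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Set of adjacent character pairs in `s`.
fn bigrams(s: &str) -> HashSet<(char, char)> {
    let chars: Vec<char> = s.chars().collect();
    chars.windows(2).map(|w| (w[0], w[1])).collect()
}

/// Sørensen–Dice coefficient: 2|X ∩ Y| / (|X| + |Y|) over bigram sets.
/// Assumption: when neither string has a bigram, fall back to exact equality.
fn dice(a: &str, b: &str) -> f64 {
    let (x, y) = (bigrams(a), bigrams(b));
    let total = x.len() + y.len();
    if total == 0 {
        return if a == b { 1.0 } else { 0.0 };
    }
    2.0 * x.intersection(&y).count() as f64 / total as f64
}
```

Damerau–Levenshtein extends the same table with one extra case for adjacent transpositions.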

🔹 Indexing & Search

Basic inverted index (word → docs)
Ranking by TF (Term Frequency)
Ranking by TF-IDF
Approximate search (configurable threshold)
Typo-tolerant search
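
A basic inverted index with TF-IDF ranking might look like the sketch below. The `Index` type and its methods are hypothetical, not the crate's API. The weighting used here is raw term count times ln(N / df), which is one common TF-IDF variant and an assumption on our part; ties are broken by document id.

```rust
use std::collections::HashMap;

/// Inverted index: term -> (document id -> occurrences of the term).
struct Index {
    postings: HashMap<String, HashMap<usize, usize>>,
    doc_count: usize,
}

impl Index {
    /// Build the index from documents; a document's id is its slice position.
    fn build(docs: &[&str]) -> Self {
        let mut postings: HashMap<String, HashMap<usize, usize>> = HashMap::new();
        for (id, doc) in docs.iter().enumerate() {
            for term in doc.to_lowercase().split_whitespace() {
                *postings
                    .entry(term.to_owned())
                    .or_default()
                    .entry(id)
                    .or_insert(0) += 1;
            }
        }
        Index { postings, doc_count: docs.len() }
    }

    /// Score every document containing at least one query term and
    /// return (doc id, score) pairs, best first.
    fn search(&self, query: &str) -> Vec<(usize, f64)> {
        let mut scores: HashMap<usize, f64> = HashMap::new();
        for term in query.to_lowercase().split_whitespace() {
            if let Some(docs) = self.postings.get(term) {
                // Rare terms get a higher inverse document frequency.
                let idf = (self.doc_count as f64 / docs.len() as f64).ln();
                for (&id, &tf) in docs {
                    *scores.entry(id).or_insert(0.0) += tf as f64 * idf;
                }
            }
        }
        let mut ranked: Vec<(usize, f64)> = scores.into_iter().collect();
        ranked.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
        ranked
    }
}
```

With the documents `"the cat sat"`, `"the dog barked"`, and `"cat cat dog"`, a query for `cat` ranks the third document first because the term occurs there twice.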

🔹 Fuzzy Matching

N-grams (2-gram, 3-gram, etc.)
Fast Levenshtein approximation
Fuzzy ranking (normalized score 0–1)
Partial matching (relevant substrings)
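
To illustrate n-grams and a normalized 0–1 fuzzy score, the sketch below extracts character n-grams and ranks candidates by Jaccard similarity over those n-gram sets. This is one plausible reading of the checklist rather than the crate's implementation; `ngrams`, `ngram_similarity`, and `fuzzy_rank` are hypothetical names.

```rust
use std::collections::HashSet;

/// Set of character n-grams of `text`; empty when `n` is 0 or the text is too short.
fn ngrams(text: &str, n: usize) -> HashSet<String> {
    let chars: Vec<char> = text.chars().collect();
    if n == 0 || chars.len() < n {
        return HashSet::new();
    }
    chars.windows(n).map(|w| w.iter().collect::<String>()).collect()
}

/// Jaccard similarity over n-gram sets, always in the range 0.0..=1.0.
/// Assumption: inputs with no n-grams score 0.0.
fn ngram_similarity(a: &str, b: &str, n: usize) -> f64 {
    let (x, y) = (ngrams(a, n), ngrams(b, n));
    let union = x.union(&y).count();
    if union == 0 {
        return 0.0;
    }
    x.intersection(&y).count() as f64 / union as f64
}

/// Score each candidate against the query and sort best first.
fn fuzzy_rank<'a>(query: &str, candidates: &[&'a str], n: usize) -> Vec<(&'a str, f64)> {
    let mut scored: Vec<(&'a str, f64)> = candidates
        .iter()
        .map(|&c| (c, ngram_similarity(query, c, n)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Because the score is a ratio of set sizes, a configurable threshold (for example, keep only scores above 0.5) can be applied directly to the ranked output.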

🔹 Advanced Features

Compound queries (AND, OR, NOT)
Custom weighting support
Query expansion (synonyms, related terms)
Intent detection pipeline
Parallel indexing/search (optional rayon feature)
Serialization of index (optional serde feature)
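
Compound queries can be modelled as a small expression tree evaluated against posting sets. The sketch below is an assumption about how AND, OR, and NOT might be combined, not a description of the crate; `Query`, `build_index`, and `eval` are hypothetical names, and NOT is interpreted as the complement within the set of all document ids.

```rust
use std::collections::{BTreeSet, HashMap};

/// Boolean query tree (hypothetical type).
enum Query {
    Term(String),
    And(Box<Query>, Box<Query>),
    Or(Box<Query>, Box<Query>),
    Not(Box<Query>),
}

/// Map each lowercase word to the ids of the documents containing it,
/// and also return the set of all document ids (needed for NOT).
fn build_index(docs: &[&str]) -> (HashMap<String, BTreeSet<usize>>, BTreeSet<usize>) {
    let mut index: HashMap<String, BTreeSet<usize>> = HashMap::new();
    for (id, doc) in docs.iter().enumerate() {
        for term in doc.to_lowercase().split_whitespace() {
            index.entry(term.to_owned()).or_default().insert(id);
        }
    }
    (index, (0..docs.len()).collect())
}

/// Evaluate a query to the set of matching document ids.
fn eval(
    q: &Query,
    index: &HashMap<String, BTreeSet<usize>>,
    all: &BTreeSet<usize>,
) -> BTreeSet<usize> {
    match q {
        Query::Term(t) => index.get(t).cloned().unwrap_or_default(),
        Query::And(a, b) => &eval(a, index, all) & &eval(b, index, all), // intersection
        Query::Or(a, b) => &eval(a, index, all) | &eval(b, index, all),  // union
        Query::Not(a) => all - &eval(a, index, all),                     // complement
    }
}
```

For the documents `"rust search engine"`, `"python search"`, and `"rust web"`, the query `rust AND NOT search` evaluates to the single document id 2.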