terraphim_automata
Fast text matching and autocomplete engine for knowledge graphs.
Overview
terraphim_automata provides high-performance text processing using Aho-Corasick automata and finite state transducers (FST). It powers Terraphim's autocomplete and knowledge graph linking features, with response times in the low-millisecond range (see Performance).
Features
- ⚡ Fast Autocomplete: FST-based prefix search with ~1-2ms response time
- 🔍 Fuzzy Matching: Levenshtein and Jaro-Winkler distance algorithms
- 🔗 Link Generation: Convert terms to Markdown, HTML, or Wiki links
- 📝 Text Processing: Multi-pattern matching with Aho-Corasick
- 🌐 WASM Support: Browser-compatible with TypeScript bindings
- 🚀 Async Loading: HTTP-based thesaurus loading (optional)
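To make the prefix-search idea concrete, here is a minimal sketch that uses a sorted vector and a binary search as a stand-in for an FST. It is not the crate's implementation; `build_sorted_terms` and `prefix_search` are hypothetical names used only for illustration.

```rust
/// Lowercase, sort, and deduplicate the terms so prefix ranges are contiguous.
/// (Hypothetical helper, not part of terraphim_automata.)
fn build_sorted_terms(terms: &[&str]) -> Vec<String> {
    let mut sorted: Vec<String> = terms.iter().map(|t| t.to_lowercase()).collect();
    sorted.sort();
    sorted.dedup();
    sorted
}

/// Return up to `limit` terms that start with `prefix`.
/// `sorted_terms` must be sorted in byte order, as produced above.
fn prefix_search<'a>(sorted_terms: &'a [String], prefix: &str, limit: usize) -> Vec<&'a str> {
    // Binary search for the first term that is >= prefix.
    let start = sorted_terms.partition_point(|t| t.as_str() < prefix);
    // All terms sharing the prefix follow that position contiguously.
    sorted_terms[start..]
        .iter()
        .take_while(|t| t.starts_with(prefix))
        .take(limit)
        .map(|t| t.as_str())
        .collect()
}
```

An FST answers the same kind of query while sharing prefixes and suffixes between terms, which is presumably why the crate uses one; the sorted vector is only the simplest structure with the same lookup behaviour.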
Installation
[dependencies]
terraphim_automata = "1.0.0"
With remote loading support:
[dependencies]
terraphim_automata = { version = "1.0.0", features = ["remote-loading", "tokio-runtime"] }
For WASM/browser usage:
[dependencies]
terraphim_automata = { version = "1.0.0", features = ["wasm", "typescript"] }
Quick Start
Autocomplete with Fuzzy Matching
use ;
use ;
// Create a thesaurus
let mut thesaurus = new;
thesaurus.insert;
thesaurus.insert;
// Build autocomplete index
let index = build_autocomplete_index.unwrap;
// Fuzzy search (handles typos)
let results = fuzzy_autocomplete_search.unwrap;
println!;
Text Matching and Link Generation
use ;
let json = r#"{
"name": "programming",
"data": {
"rust": {
"id": 1,
"nterm": "rust programming",
"url": "https://rust-lang.org"
}
}
}"#;
let thesaurus = load_thesaurus_from_json.unwrap;
let text = "I love rust programming!";
// Replace with Markdown links
let linked = replace_matches.unwrap;
println!;
// Output: "I love [rust](https://rust-lang.org) programming!"
// Or HTML links
let html = replace_matches.unwrap;
// Output: 'I love <a href="https://rust-lang.org">rust</a> programming!'
// Or Wiki links
let wiki = replace_matches.unwrap;
// Output: "I love [[rust]] programming!"
Loading Thesaurus Files
use ;
#
# async
Performance
- Autocomplete: ~1-2ms for 10,000+ terms
- Fuzzy Search: ~5-10ms with Jaro-Winkler
- Text Matching: O(n + z) per search with Aho-Corasick, after an O(m) automaton build (n = text length, m = total length of all patterns, z = number of matches)
- Memory: ~100KB per 1,000 terms in FST
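The linear-time matching bound comes from how an Aho-Corasick automaton scans the text once, following failure links instead of restarting. The sketch below is a compact, standard-library-only version of the algorithm, written for illustration under the assumption of non-empty patterns; it is not the crate's code, and the `AhoCorasick` struct here is a hypothetical stand-in.

```rust
use std::collections::{HashMap, VecDeque};

/// A minimal Aho-Corasick automaton over bytes (illustrative only).
struct AhoCorasick {
    edges: Vec<HashMap<u8, usize>>, // trie transitions per state
    fail: Vec<usize>,               // failure link per state
    out: Vec<Vec<usize>>,           // indices of patterns ending at each state
    lens: Vec<usize>,               // byte length of each pattern
}

impl AhoCorasick {
    fn new(patterns: &[&str]) -> Self {
        let mut edges: Vec<HashMap<u8, usize>> = vec![HashMap::new()];
        let mut out: Vec<Vec<usize>> = vec![Vec::new()];
        // Phase 1: insert every pattern into a trie.
        for (idx, pat) in patterns.iter().enumerate() {
            let mut state = 0;
            for &b in pat.as_bytes() {
                state = match edges[state].get(&b).copied() {
                    Some(next) => next,
                    None => {
                        edges.push(HashMap::new());
                        out.push(Vec::new());
                        let next = edges.len() - 1;
                        edges[state].insert(b, next);
                        next
                    }
                };
            }
            out[state].push(idx);
        }
        // Phase 2: breadth-first pass to compute failure links.
        let mut fail = vec![0usize; edges.len()];
        let mut queue: VecDeque<usize> = edges[0].values().copied().collect();
        while let Some(state) = queue.pop_front() {
            let children: Vec<(u8, usize)> =
                edges[state].iter().map(|(&b, &s)| (b, s)).collect();
            for (b, child) in children {
                // Walk failure links until some state has an edge on `b`.
                let mut f = fail[state];
                while f != 0 && !edges[f].contains_key(&b) {
                    f = fail[f];
                }
                fail[child] = edges[f].get(&b).copied().unwrap_or(0);
                // Patterns ending at the failure target also end here.
                let inherited = out[fail[child]].clone();
                out[child].extend(inherited);
                queue.push_back(child);
            }
        }
        let lens = patterns.iter().map(|p| p.len()).collect();
        AhoCorasick { edges, fail, out, lens }
    }

    /// Return (start, end, pattern_index) byte spans for every occurrence,
    /// including overlapping ones, in a single pass over `text`.
    fn find_all(&self, text: &str) -> Vec<(usize, usize, usize)> {
        let mut matches = Vec::new();
        let mut state = 0;
        for (i, &b) in text.as_bytes().iter().enumerate() {
            while state != 0 && !self.edges[state].contains_key(&b) {
                state = self.fail[state];
            }
            state = self.edges[state].get(&b).copied().unwrap_or(0);
            for &p in &self.out[state] {
                matches.push((i + 1 - self.lens[p], i + 1, p));
            }
        }
        matches
    }
}
```

Building the automaton costs time proportional to the total pattern length; each call to `find_all` then touches every text byte once and reports each match as it is found, which is the behaviour the bound above describes.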
WebAssembly Support
Build for the browser:
# Install wasm-pack
# Build for web
# Build for Node.js
Use in JavaScript/TypeScript:
import init, { build_autocomplete_index, fuzzy_autocomplete_search } from './pkg';
await init();
const thesaurus = {
name: "programming",
data: {
"rust": { id: 1, nterm: "rust", url: null },
"rust async": { id: 2, nterm: "rust async", url: null }
}
};
const index = build_autocomplete_index(thesaurus, null);
const results = fuzzy_autocomplete_search(index, "rast", 0.8, 5);
console.log("Matches:", results);
See wasm-test/ for a complete example.
Cargo Features
| Feature | Description |
|---|---|
| remote-loading | Enable async HTTP loading of thesaurus files |
| tokio-runtime | Add tokio runtime support (required for remote-loading) |
| typescript | Generate TypeScript definitions via tsify |
| wasm | Enable WebAssembly compilation |
API Overview
Autocomplete Functions
- build_autocomplete_index() - Build FST index from thesaurus
- autocomplete_search() - Exact prefix matching
- fuzzy_autocomplete_search() - Fuzzy matching with Jaro-Winkler
- fuzzy_autocomplete_search_levenshtein() - Fuzzy matching with Levenshtein
- serialize_autocomplete_index() / deserialize_autocomplete_index() - Index serialization
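As a rough illustration of what Levenshtein-based fuzzy matching involves, the following sketch computes edit distance with two rolling rows and ranks candidate terms against a query. The function names `levenshtein` and `fuzzy_rank` and their parameters are assumptions made for this example; the crate's own functions may score and filter differently.

```rust
/// Levenshtein edit distance between two strings, computed over chars
/// with two rolling rows (O(|a| * |b|) time, O(|b|) space).
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0usize; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        curr[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr[j + 1] = substitution.min(deletion).min(insertion);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

/// Keep terms within `max_distance` edits of `query`, closest first
/// (ties broken alphabetically), and return at most `limit` of them.
fn fuzzy_rank<'a>(
    terms: &[&'a str],
    query: &str,
    max_distance: usize,
    limit: usize,
) -> Vec<(&'a str, usize)> {
    let mut hits: Vec<(&str, usize)> = terms
        .iter()
        .map(|&t| (t, levenshtein(t, query)))
        .filter(|&(_, d)| d <= max_distance)
        .collect();
    hits.sort_by(|x, y| x.1.cmp(&y.1).then(x.0.cmp(y.0)));
    hits.truncate(limit);
    hits
}
```

Here a typo such as "rast" is one substitution away from "rust", so it survives a `max_distance` of 1. Jaro-Winkler, by contrast, yields a similarity between 0 and 1, which matches the 0.8 threshold seen in the JavaScript example.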
Text Matching Functions
- find_matches() - Find all pattern matches in text
- replace_matches() - Replace matches with links
- extract_paragraphs_from_automata() - Extract context around matches
Thesaurus Loading
- load_thesaurus() - Load from file or URL (async)
- load_thesaurus_from_json() - Parse from JSON string (sync)
Link Types
- MarkdownLinks: [term](url)
- HTMLLinks: <a href="url">term</a>
- WikiLinks: [[term]]
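The three output formats can be sketched with a small enum and a formatter. This is an illustration only: `LinkKind`, `render_link`, and `link_term` are hypothetical names, and the naive `str::replace` used here stands in for the crate's automaton-based replacement. It also does not escape HTML.

```rust
/// Output formats corresponding to the three link types listed above.
/// (Hypothetical enum for illustration; not the crate's own type.)
#[derive(Clone, Copy)]
enum LinkKind {
    Markdown,
    Html,
    Wiki,
}

/// Render a single term as a link in the requested format.
fn render_link(kind: LinkKind, term: &str, url: &str) -> String {
    match kind {
        LinkKind::Markdown => format!("[{}]({})", term, url),
        LinkKind::Html => format!("<a href=\"{}\">{}</a>", url, term),
        // Wiki links carry only the term; the URL is not used.
        LinkKind::Wiki => format!("[[{}]]", term),
    }
}

/// Replace every literal occurrence of `term` in `text` with its link.
fn link_term(text: &str, term: &str, url: &str, kind: LinkKind) -> String {
    text.replace(term, &render_link(kind, term, url))
}
```

Calling `link_term` on "I love rust programming!" with the term "rust" and the URL https://rust-lang.org reproduces the three outputs shown in the Quick Start.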
Examples
See the examples/ directory for:
- Complete autocomplete UI
- Knowledge graph linking
- WASM browser integration
- Custom thesaurus builders
Minimum Supported Rust Version (MSRV)
This crate requires Rust 1.70 or later.
License
Licensed under Apache-2.0. See LICENSE for details.
Related Crates
- terraphim_types: Core type definitions
- terraphim_rolegraph: Knowledge graph implementation
- terraphim_service: Main service layer
Support
- Discord: https://discord.gg/VPJXB6BGuY
- Discourse: https://terraphim.discourse.group
- Issues: https://github.com/terraphim/terraphim-ai/issues