Fast text matching and autocomplete engine for knowledge graphs.
terraphim_automata provides high-performance text processing using Aho-Corasick
automata and finite state transducers (FST). It powers Terraphim’s autocomplete
and knowledge graph linking features.
§Features
- Fast Autocomplete: Prefix-based search with fuzzy matching (Levenshtein/Jaro-Winkler)
- Text Matching: Find and replace terms using Aho-Corasick automata
- Link Generation: Convert matched terms to Markdown, HTML, or Wiki links
- Paragraph Extraction: Extract context around matched terms
- WASM Support: Browser-compatible autocomplete with TypeScript bindings
- Remote Loading: Async loading of thesaurus files from HTTP (feature-gated)
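To make the "prefix-based search with fuzzy matching" idea concrete, here is a minimal std-only sketch. It is not the crate's implementation: `levenshtein` and `suggest` are hypothetical helpers, the term list is a plain slice rather than an FST, and the fallback rule (use edit distance only when no prefix matches) is an assumption chosen for illustration.

```rust
/// Levenshtein edit distance, computed over Unicode scalar values.
fn levenshtein(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut curr = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b.len()]
}

/// Hypothetical helper: return exact prefix matches if there are any,
/// otherwise every term within `max_distance` edits of the query.
fn suggest<'a>(terms: &[&'a str], query: &str, max_distance: usize) -> Vec<&'a str> {
    let exact: Vec<&'a str> = terms
        .iter()
        .copied()
        .filter(|t| t.starts_with(query))
        .collect();
    if !exact.is_empty() {
        return exact;
    }
    terms
        .iter()
        .copied()
        .filter(|t| levenshtein(t, query) <= max_distance)
        .collect()
}

fn main() {
    let terms = ["rust", "rust async", "ruby"];
    println!("{:?}", suggest(&terms, "rus", 1)); // prefix matches
    println!("{:?}", suggest(&terms, "rast", 1)); // fuzzy fallback
}
```

A real index would avoid the linear scan; the sketch only shows how a misspelled query such as "rast" can still reach "rust".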
§Architecture
- Autocomplete Index: FST-based prefix search with metadata
- Aho-Corasick Matcher: Multi-pattern matching for link generation
- Thesaurus Builder: Parse knowledge graphs from JSON/Markdown
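The multi-pattern matching used for link generation can be sketched without any dependencies. The code below is a naive leftmost-longest scan, not an Aho-Corasick automaton, and `link_terms` is a hypothetical name that is not part of this crate. It only shows the input/output shape of turning matched terms into Markdown links.

```rust
/// Hypothetical helper: wrap each occurrence of a known term in a Markdown link.
/// At every position the longest matching term wins. This is a naive scan
/// that tries every term at every position, not an Aho-Corasick automaton.
fn link_terms(text: &str, terms: &[(&str, &str)]) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while !rest.is_empty() {
        // Longest non-empty term that starts at the current position, if any.
        let best = terms
            .iter()
            .filter(|(term, _)| !term.is_empty() && rest.starts_with(*term))
            .max_by_key(|(term, _)| term.len());
        match best {
            Some((term, url)) => {
                out.push_str(&format!("[{}]({})", term, url));
                rest = &rest[term.len()..];
            }
            None => {
                // No term starts here: copy one character through unchanged.
                let ch = rest.chars().next().expect("rest is non-empty");
                out.push(ch);
                rest = &rest[ch.len_utf8()..];
            }
        }
    }
    out
}

fn main() {
    let terms = [("rust", "https://rust-lang.org")];
    println!("{}", link_terms("I love rust!", &terms));
}
```

The sketch is case-sensitive and ignores word boundaries, so it would also match "rust" inside "trust". An automaton-based matcher finds all patterns in a single pass over the text instead of retrying every term at every position.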
§Cargo Features
- remote-loading: Enable async HTTP loading of thesaurus files (requires tokio)
- tokio-runtime: Add tokio runtime support
- typescript: Generate TypeScript definitions via tsify
- wasm: Enable WebAssembly compilation
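As a sketch of how these features might be enabled from a dependent crate, the fragment below uses the standard Cargo `features` syntax. The version requirement is a placeholder, not a recommendation, and whether `remote-loading` needs `tokio-runtime` enabled alongside it is an assumption.

```toml
[dependencies]
# The version is a placeholder; pin the release you actually depend on.
# Enabling tokio-runtime together with remote-loading is an assumption.
terraphim_automata = { version = "*", features = ["remote-loading", "tokio-runtime"] }
```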
§Examples
§Autocomplete with Fuzzy Matching
use terraphim_automata::{build_autocomplete_index, fuzzy_autocomplete_search};
use terraphim_types::{Thesaurus, NormalizedTermValue, NormalizedTerm};
// Create a simple thesaurus
let mut thesaurus = Thesaurus::new("programming".to_string());
thesaurus.insert(
    NormalizedTermValue::from("rust"),
    NormalizedTerm::new(1, NormalizedTermValue::from("rust"))
);
thesaurus.insert(
    NormalizedTermValue::from("rust async"),
    NormalizedTerm::new(2, NormalizedTermValue::from("rust async"))
);
// Build autocomplete index
let index = build_autocomplete_index(thesaurus, None).unwrap();
// Fuzzy search (returns Result)
let results = fuzzy_autocomplete_search(&index, "rast", 0.8, Some(5)).unwrap();
assert!(!results.is_empty());
§Text Matching and Link Generation
use terraphim_automata::{load_thesaurus_from_json, replace_matches, LinkType};
let json = r#"{
    "name": "test",
    "data": {
        "rust": {
            "id": 1,
            "nterm": "rust programming",
            "url": "https://rust-lang.org"
        }
    }
}"#;
let thesaurus = load_thesaurus_from_json(json).unwrap();
let text = "I love rust!";
// Replace matches with Markdown links
let linked = replace_matches(text, thesaurus, LinkType::MarkdownLinks).unwrap();
let result = String::from_utf8(linked).unwrap();
println!("{}", result); // "I love [rust](https://rust-lang.org)!"
§Loading Thesaurus Files
use terraphim_automata::{AutomataPath, load_thesaurus};
// Load from local file (the `.await` calls below must run inside an async context)
let local_path = AutomataPath::from_local("thesaurus.json");
let thesaurus = load_thesaurus(&local_path).await.unwrap();
// Load from remote URL (requires 'remote-loading' feature)
let remote_path = AutomataPath::from_remote("https://example.com/thesaurus.json").unwrap();
let thesaurus = load_thesaurus(&remote_path).await.unwrap();
§WASM Support
Build for WebAssembly:
wasm-pack build --target web --features wasm
See the WASM example for browser usage.
Re-exports§
pub use self::builder::Logseq;
pub use self::builder::ThesaurusBuilder;
pub use autocomplete::autocomplete_search;
pub use autocomplete::build_autocomplete_index;
pub use autocomplete::deserialize_autocomplete_index;
pub use autocomplete::fuzzy_autocomplete_search;
pub use autocomplete::fuzzy_autocomplete_search_levenshtein;
pub use autocomplete::serialize_autocomplete_index;
pub use autocomplete::AutocompleteConfig;
pub use autocomplete::AutocompleteIndex;
pub use autocomplete::AutocompleteMetadata;
pub use autocomplete::AutocompleteResult;
pub use matcher::extract_paragraphs_from_automata;
pub use matcher::find_matches;
pub use matcher::replace_matches;
pub use matcher::LinkType;
pub use matcher::Matched;
Modules§
- autocomplete
- autocomplete_helpers
- builder
- matcher
- url_protector: URL protection for text replacement.
Enums§
- AutomataPath: Path to a thesaurus/automata file, either local or remote.
- TerraphimAutomataError: Errors that can occur when working with automata and thesaurus operations.
Functions§
- load_thesaurus: Load a thesaurus from a local file only (WASM-compatible version)
- load_thesaurus_from_json: Load thesaurus from JSON string (sync version for WASM compatibility)
- load_thesaurus_from_json_and_replace: Load thesaurus from JSON string and replace terms using streaming matcher
Type Aliases§
- Result: Result type alias using TerraphimAutomataError.