Skip to main content

Crate inputx_fsa

Crate inputx_fsa 

Source
Expand description

inputx-fsa — a compact, zero-dependency ordered map from byte keys to u64, backed by a minimal acyclic finite-state automaton (MA-FSA / DAWG) plus a parallel value table addressed by lexicographic rank.

This is a clean-room implementation written for the Inputx IME to replace the fst crate on the shipped read path — same role (compressed, mmap-friendly dictionary index with get + prefix/range scan), no external dependencies. The algorithms (Revuz minimization, MA-FSA as a minimal perfect hash via per-transition right-language counts) are textbook; the code is original.

§Model

The automaton recognizes the set of keys (it carries no values inside it — that keeps it minimal regardless of the values). Each accepted key has a stable ordinal = its rank in sorted order, computed during the walk by summing, at every step, the number of keys that sort before the branch we take. Values live in a separate fixed-width array indexed by that ordinal. This is the “numbered MA-FSA / minimal perfect hash” construction.

§Format (v3, little-endian)

magic "IXFA" (4) · version u8(=3) · value_width u8 (1|2|4|8)
value_count u32 · root_off u32 · state_count u32        (18-byte header)
states:  byte-offset addressed (no offset table), post-order so every
         target precedes its source. Per state, a flags byte:
           bit0 = final, bit1 = single-transition.
         · single  → [flags, label u8, delta uvarint]   (count omitted)
         · multi   → [flags, n uvarint, (label u8, delta uvarint,
                      count uvarint) × n]
         `delta` is a back-distance to the target state; `count` is the
         target's right-language size (for the ordinal/rank walk).
values:  [value_width bytes; value_count]   (tail of the buffer)

§Example

use inputx_fsa::{Builder, Fsa};

let mut b = Builder::new();
b.insert(b"apple", 1);
b.insert(b"apply", 2);
b.insert(b"banana", 3);
let fsa = Fsa::new(b.finish()).unwrap();

assert_eq!(fsa.get(b"apple"), Some(1));
assert_eq!(fsa.get(b"grape"), None);

// lazy, composable prefix scan
let app: Vec<_> = fsa.range(b"app").collect();
assert_eq!(app, vec![(b"apple".to_vec(), 1), (b"apply".to_vec(), 2)]);

Structs§

Builder
Accumulates (key, value) pairs and serializes a minimal FSA.
Dict
Read-only two-level dictionary over a byte container.
DictBuilder
Accumulates (code, item, value) triples and serializes a Dict.
Fsa
A read-only minimal-FSA map over a byte container.
FsaIter
Lazy, allocation-light iterator over an Fsa’s (key, value) pairs in sorted order — see Fsa::iter / Fsa::range. Pre-order DFS via an explicit stack (depth = key length), so it composes with the Iterator adapters and stops early without walking the whole automaton. On a corrupt buffer it simply ends (consistent with the bounds-safe reader).

Enums§

FsaError
Error parsing an FSA buffer.