Number Words Rust
A Rust crate that explores different algorithmic strategies to solve the number word problem.
The problem involves parsing a string of digits into all possible corresponding words based on a given mapping (e.g., 1 -> 'A', 2 -> 'B', ... 26 -> 'Z').
For example, the input "123" can be parsed as:
- "ABC" (1, 2, 3)
- "AW" (1, 23)
- "LC" (12, 3)
Implementations
This crate provides three different parser implementations, each with distinct performance characteristics, serving as a pedagogical example of optimization in Rust.
1. Naive Parser (NaiveParser)
The original implementation using a functional approach with VecDeque.
- Pros: Purely functional, demonstrates iterator combinators.
- Cons: Extremely slow for large inputs due to excessive allocation and copying ($O(N^2)$ copying behavior).
2. DFS Parser (DfsParser)
A performance-optimized implementation using Depth-First Search with a single mutable buffer.
- Pros: Very fast due to cache locality and minimal heap allocation.
- Cons: Recomputes subproblems if the graph has overlapping paths.
3. Memoized Parser (MemoizedParser)
A hybrid approach using Dynamic Programming to precompute the solution graph and solution counts.
- Pros: The fastest implementation for large, highly ambiguous inputs. It pre-allocates exact memory for results, preventing costly reallocations.
- Cons: Slightly more complex implementation logic.
Usage
use ;
Development
Build
To build the project in release mode:
Test
To run the test suite:
Benchmark
This project uses divan for benchmarking. To run benchmarks:
Benchmarks
Benchmarks were run on an ambiguous input ("1" repeated $N$ times) which triggers the worst-case exponential growth of solutions.
| Input Length (N) | NaiveParser | DfsParser | MemoizedParser | Speedup (Memo vs Naive) |
|---|---|---|---|---|
| 10 | 30.1 µs | 7.6 µs | 3.0 µs | 10x |
| 20 | 4.4 ms | 0.8 ms | 0.25 ms | 17x |
| 25 | 58.3 ms | 8.6 ms | 2.7 ms | 21x |
| 30 | N/A | 100.3 ms | 32.3 ms | - |
- NaiveParser: Struggles with overhead from repeated
VecDequeallocations. - DfsParser: significantly faster by using a single stack, but still does redundant work.
- MemoizedParser: The clear winner. By pre-calculating the graph and solution counts, it avoids all redundant allocations and writes directly to the final memory location.
AI-Assisted Development
This project was significantly improved with the assistance of Gemini, an AI by Google. Gemini was used to:
- Refactor the codebase for more principled error handling using
thiserror. - Expand the test suite to cover critical edge cases and invalid inputs.
- Implement and document high-performance parsing strategies (
DfsParser,MemoizedParser). - Generate comprehensive benchmarks and technical documentation.
License
Licensed under either of
- Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.