ironstorm_lookup
Overview
This library contains the internal data structure used by the ironstrom project
To learn more about ironstorm_lookup, read this README.md and the Crate Documentation
It compiles only with the nightly version of rust due to usage of unstable features.
Design goals
- Lightning fast auto completion / type ahead lookups (~200 microseconds! per lookup)
- Not too much searchable text per entry, e.g: street names for locations or movie titles for movies
- High number of possible candidates (multiple gigabytes)
- It can be recommended, but must not be rquired to fit the whole data set into physical memory
- The LookupTable should use virtual memory and OS level optimization to handle larger data sets
- Full text search capability
- Optimized for hardly ever changing data sets, e.g.: All streets in a country
- No mulithreading if not absolutely required => Buy lookup speed with memory, not processing power!
- Optimize for returning a small number of matches, e.g: Find first 10 of 2 million movies that contain 'hero'
- Only one dimensional coarse sorting required, e.g: Fantasy books should be returnd before science fiction books
- Lazy stream/iterator based lookup implementation
Accepted drawbacks
- Creating a
LookupTable
for multiple gigabytes of data can take a few minutes - A
LookupTable
can not be modified, only recreated - No fine granular sorting possible: e.g: by lexicographical order
Basic Usage
- Create a custom type for the data you want to seacrh for, e.g.: a
Movie
struct - Implement the
Lookup
trait for your custom type. - Create an
Iterator
that will iterate over all the elements you would like to put into theLookupTable
- Create a new
LookupTable
by callingLookupTable::from_iter(myMoviesIterator)
- Call
myMoviesLookupTable.find("hero")
to get an lazy 'Iterator' over all matching elements
Example
Let's build a LookupTable
to find restaurants by name.
use FromIterator;
use ;
// 1. Create a custom struct representing a restaurant
// 2. Implement the `Lookup` trait for `Restaurant` references
// 3. Create some restaurants and the according iterator
let restaurants = vec!;
let iter = restaurants.iter;
// 4. Create the `LookupTable`
let lookup_table = from_iter;
// 5. Find restaurants containing `i`
let mut result_iter = lookup_table.find;
// two times 'Italiano pizza', because it's in the lowest bucket
// two times because it has two lower case `i` in the name
assert_eq!;
assert_eq!;
// 'Sushi House', because it's in the second lowest bucket
assert_eq!;
// 'Ami Guy' or ' India Man'
// They are in the same bucket and there is no order within the same bucket
let indian_or_american_1 = result_iter.next.unwrap.name;
assert!;
// The other one of 'Ami Guy' or ' India Man'
let indian_or_american_2 = result_iter.next.unwrap.name;
assert!;
assert!;
// No more matches
// "Brezel Hut" doesn't contain an "i" and was not part of the result.
assert!;
License
Licensed under either of
- Apache License, Version 2.0 http://www.apache.org/licenses/LICENSE-2.0
- MIT license http://opensource.org/licenses/MIT
at your option
Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.