Crate precomputed_map

Source
Expand description

§Precomputed Map

Precomputed a Map to embed in binary.

§What and Why

The tool attempts to achieve the best in terms of runtime performance (include startup time and single query time), product size, and compilation speed. and avoid any heavyweight dependency libraries and unsafe code, make it possible to be used without supply chain burden.

Under the hood, we use binary search or phf for different sizes of data. For small data, the runtime performance should be faster than a typical hashmap, and there is almost no size overhead. For large data, it only requires one additional memory access at runtime, and only 1 byte of size overhead for every 3 entries.

At same time, in order to ensure the compilation speed, the tool supports packaging a large number of strings into a single blob. this will greatly improve the compilation performance and product size.

When there are 200,000 entries, the demo has the following comparison data:

startupsingle querycompilesize (stripped)
naive (fold)3.6ms25ns / 104cycle6.9s14M
precomputed (fold/position)_23ns / 86cycle0.3s6.3M
  • Data from my Arch Linux (intel ultra 7 258V)

§Usage

For small amounts of data, it is suitable to be executed in build.rs. For large amounts of data, it is better to have a separate xtask stage to generate code without affecting development performance.

fn build(keys: &[&str], values: &[&str]) {
    // compute map
    let mapout = precomputed_map::builder::MapBuilder::<&str>::new()
        .set_seed(prev_seed)
        .set_ord(&|x, y| x.cmp(y))
        .set_hash(&|seed, &k| {
            let mut hasher = MyHasher::with_key(seed);
            k.hash(&mut hasher);
            k.finish()
        })
        .build(&keys)
        .unwrap();

    let dir = PathBuf::from("src/generated");

    // remove old file
    let _ = fs::remove_file(dir.join("mymap.u8"));
    let _ = fs::remove_file(dir.join("mymap.u32"));

    let mut u8seq = U32SeqWriter::new("PrecomputedU8".into(), dir.join("mymap.u8"));
    let mut u32seq = U32SeqWriter::new("PrecomputedU32".into(), dir.join("mymap.u32"));

    // generate code
    let mut builder = precomputed_map::builder::CodeBuilder::new(
        "MyMap".into(),
        "MyHasher".into(),
        &mut u8seq,
        &mut u32seq,
    );

    let kseq = mapout.reorder(keys).map(|s| s.as_bytes());
    let vseq = mapout.reorder(values).map(|s| s.as_bytes());
    let k = builder.create_bytes_keys("MyKeys".into(), &mapout, kseq).unwrap();
    let v = builder.create_bytes_position_seq("MyValues".into(), vseq).unwrap();
    let pair = builder.create_pair(k, v);

    mapout.create_map("MYMAP".into(), pair, &mut builder).unwrap();

    let mut codeout = fs::File::create(dir.join("mymap.rs")).unwrap();
    builder.codegen(&mut codeout).unwrap();
    u8seq.codegen(&mut code_file).unwrap();
    u32seq.codegen(&mut code_file).unwrap();

    // ...
}

§License

This project is licensed under the MIT license.

Modules§

aligned
builder
Precomputed Map builder
equivalent
fork from https://github.com/indexmap-rs/equivalent/blob/v1.0.2/src/lib.rs
phf
seq
store

Macros§

define

Structs§

MediumMap
Medium map
SmallMap
Small map
TinyMap
Tiny map