Crate hashconsing[][src]

Hash consing library.

This library is based on Type-Safe Modular Hash-Consing by Filiâtre and Conchon. It is probably less efficient as uses Rust's HashMaps, not a custom built structure.

If you are not familiar with hashconsing, see the example below or read the paper.

Provides constant time comparison and perfect (maximal) sharing assuming only one consign/factory is created for a given type. This assumption must never be falsified unless you really, really know what you are doing.

Hash consed elements are immutable and therefore thread-safe: HConsed implements Send and Sync.

The consign actually stores weak references to values. This ensures that values are dropped once they are not used anymore.

Example

Simple example for lambda calculus from the paper.

Hashconsing consists in wrapping some tree-like datatype in an immutable container, which in the case of hashconsing is HConsed. In this example we'll call the tree-like datatype ActualTerm and the hashconsed version Term.

A Term is created from an ActualTerm by a factory, which hashconsing calls a consign (see HConsign). The functions for doing so are in the HashConsign trait. The idea is that the consign is a map from actual terms to Arcs of hashconsed terms. When given an actual term, the consign checks whether there's already a hashconsed term for it. If not, then it creates one, memorizes it and returns that. Otherwise it clones the existing one. Hence subterm sharing is maximal/perfect.

A HConsed<T> is exactly two things: a unique identifier uid and an Arc to the real term it represents. (Hence, cloning a hashconsed term is cheap.) This library guarantees that two hashconsed terms refer to structurally identical real terms iff their uids are equal. Hence, equality checking is constant time.

extern crate hashconsing ;
use hashconsing::{ HConsed, HashConsign, HConsign } ;

type Term = HConsed<ActualTerm> ;

#[derive(Debug, Hash, Clone, PartialEq, Eq)]
enum ActualTerm {
    Var(usize),
    Lam(Term),
    App(Term, Term)
}
use self::ActualTerm::* ;

fn main() {
    let mut factory: HConsign<ActualTerm> = HConsign::empty() ;

    assert_eq! { factory.len(), 0 }

    let v = factory.mk( Var(0) ) ;
    assert_eq! { factory.len(), 1 }

    let v2 = factory.mk( Var(3) ) ;
    assert_eq! { factory.len(), 2 }

    let lam = factory.mk(
        Lam( v2.clone() )
    ) ;
    assert_eq! { factory.len(), 3 }

    let v3 = factory.mk( Var(3) ) ;
    // v2 is the same as v3: Var(3). Consign has not created anything new, and
    // v2 and v3 are conceptually the same term.
    assert_eq! { factory.len(), 3 }
    assert_eq! { v2.uid(), v3.uid() }
    assert_eq! { v2.get(), v3.get() }
    assert_eq! { v2,       v3       }

    let lam2 = factory.mk( Lam(v3) ) ;
    // Not new either.
    assert_eq! { factory.len(), 3 }
    assert_eq! { lam.uid(), lam2.uid() }
    assert_eq! { lam.get(), lam2.get() }
    assert_eq! { lam,       lam2       }

    let app = factory.mk( App(lam2, v) ) ;
    assert_eq! { factory.len(), 4 }
}

This library maintains the invariant stated above as long as you never create two consigns for the same type.

Users are free to use the consign however they see fit: one can create a factory directly as in the example above, but passing it around everywhere it's needed is tedious. The author recommends the following workflow instead. It relies on the consign macro which creates a lazy static factory protected by a RwLock for thread-safety. The consign and the constructors are wrapped in an appropriately named module. The consign is invisible and creating terms is easy.

#[macro_use]
extern crate hashconsing ;

pub mod term {
    use hashconsing::{ HConsed, HashConsign } ;
    pub type Term = HConsed<ActualTerm> ;
    #[derive(Debug, Hash, Clone, PartialEq, Eq)]
    pub enum ActualTerm {
        Var(usize),
        Lam(Term),
        App(Term, Term)
    }

    consign! {
        /// Factory for terms.
        let factory = consign(37) for ActualTerm ;
    }
    pub fn var(v: usize) -> Term {
        factory.mk( ActualTerm::Var(v) )
    }
    pub fn lam(t: Term) -> Term {
        factory.mk( ActualTerm::Lam(t) )
    }
    pub fn app(t_1: Term, t_2: Term) -> Term {
        factory.mk( ActualTerm::App(t_1, t_2) )
    }
}

fn main() {
    let v = term::var(0) ;
    let v2 = term::var(3) ;
    let lam = term::lam(v2) ;
    let v3 = term::var(3) ;
    let lam2 = term::lam(v3) ;
    let app = term::app(lam2, v) ;
}

Note that HConsed<T> Derefs to T, so you don't need to extend HConsed<T> using some trait to add the functions you need. Just implement them for T. Functions taking an & mut self won't work since HConsed<T> gives you access to T through an Arc.

impl ::std::fmt::Display for ActualTerm {
    fn fmt(& self, fmt: & mut ::std::fmt::Formatter) -> ::std::fmt::Result {
        match self {
            ActualTerm::Var(i) => write!(fmt, "v{}", i),
            ActualTerm::Lam(t) => write!(fmt, "({})", t.get()),
            ActualTerm::App(u, v) => write!(
                fmt, "{}.{}", u.get(), v.get()
            ),
        }
    }
}
fn main() {
    let v = term::var(0) ;
    let v3 = term::var(3) ;
    let lam2 = term::lam(v3) ;
    let app = term::app(lam2, v) ;
    assert_eq! { & format!("{}", app), "(v3).v0" }
}

Collections with trivial hash function

This library provides two special collections: HConSet and HConMap. They use the trivial hash function over hashconsed values' unique identifier. Read more.

Another way to have efficient sets/maps of/from hashconsed things is to use the BTree sets and maps from the standard library.

Re-exports

pub extern crate lazy_static;

Modules

coll

Efficient collections for hashconsed data.

Macros

consign

Creates a lazy static consign.

lazy_static

Structs

HConsed

A hashconsed value.

HConsign

The consign storing the actual hash consed elements as HConseds.

WHConsed

Weak version of HConsed (internal).

Traits

HashConsed

Internal trait used to recognize hashconsed things.

HashConsign

HConsed element creation.