CityHash-sys

Rust bindings to Google CityHash’s C++ API. CityHash-sys does not link the standard library (it is no_std).

Table of contents

  1. Introduction
  2. Usage
    1. Using Hasher
    2. Using Portable CityHash functions
    3. Using CityHash functions with CRC-32 intrinsic
    4. Using Rust convenient traits
  3. Performance
  4. For more information

Introduction

CityHash provides hash functions for strings. The functions mix the input bits thoroughly but are not suitable for cryptography. CityHash-sys is tested on little-endian architectures but should also work on big-endian ones.

Usage

Using Hasher

use cityhash_sys::CityHashBuildHasher;
use std::collections::HashMap;
const KEY: &str = "hash";
const VALUE: &str = "me!";

// Create a HashMap that uses CityHash64 to hash keys
let mut map = HashMap::with_hasher(CityHashBuildHasher::default());
map.insert(KEY, VALUE);

assert_eq!(map.get(&KEY), Some(&VALUE));

Note: CityHashBuildHasher is an alias for the builder of the 64-bit CityHash hasher, CityHash64Hasher. CityHash32Hasher and CityHash128Hasher are also available, but their results are still u64, because the Hasher trait's finish method returns u64. See the documentation for more details.
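The builder-alias pattern described in the note can be sketched with the standard library alone. This is a minimal illustration, not the crate's source: StandInHasher and StandInBuildHasher are hypothetical names, and the hasher body is FNV-1a standing in for CityHash64. It only shows how a type alias over BuildHasherDefault yields a builder that HashMap::with_hasher accepts.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// Stand-in 64-bit hasher (FNV-1a), NOT CityHash; it only illustrates the pattern.
struct StandInHasher {
    state: u64,
}

impl Default for StandInHasher {
    fn default() -> Self {
        // FNV-1a 64-bit offset basis.
        StandInHasher { state: 0xcbf2_9ce4_8422_2325 }
    }
}

impl Hasher for StandInHasher {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.state ^= u64::from(byte);
            // FNV-1a 64-bit prime.
            self.state = self.state.wrapping_mul(0x0000_0100_0000_01b3);
        }
    }

    // `finish` always returns u64, whatever the hasher's internal width.
    fn finish(&self) -> u64 {
        self.state
    }
}

/// Hypothetical alias: a "builder for default hashers" spelled with std's BuildHasherDefault.
type StandInBuildHasher = BuildHasherDefault<StandInHasher>;

/// Builds a small map whose keys are hashed through the stand-in hasher.
fn build_map() -> HashMap<&'static str, &'static str, StandInBuildHasher> {
    let mut map = HashMap::with_hasher(StandInBuildHasher::default());
    map.insert("hash", "me!");
    map
}
```

Whether CityHashBuildHasher is defined exactly this way is an assumption; the crate's type definitions only describe it as a builder for default CityHash hashers.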

Using Portable CityHash functions

The Rust bindings provide a safe interface to all of Google’s CityHash functions that do not use the x86_64 CRC-32 intrinsic:

32-bit hash

// uint32 CityHash32(const char *, size_t);
fn city_hash_32(buf: &[u8]) -> u32;

64-bit hash

// uint64 CityHash64(const char *, size_t);
fn city_hash_64(buf: &[u8]) -> u64;

// uint64 CityHash64WithSeed(const char *, size_t, uint64);
fn city_hash_64_with_seed(buf: &[u8], seed: u64) -> u64; 

// uint64 CityHash64WithSeeds(const char *, size_t, uint64, uint64);
fn city_hash_64_with_seeds(buf: &[u8], seed_0: u64, seed_1: u64) -> u64;

128-bit hash

// uint128 CityHash128(const char *, size_t);
fn city_hash_128(buf: &[u8]) -> u128;

// uint128 CityHash128WithSeed(const char *, size_t, uint128);
fn city_hash_128_with_seed(buf: &[u8], seed: u128) -> u128;

// uint64 Hash128to64(const uint128&);
fn city_hash_128_to_64(hash: u128) -> u64;

Note: Depending on your compiler and hardware, the 128-bit hash is likely faster than CityHash64() on sufficiently long strings. It is slower than necessary on shorter strings.
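For readers curious about what reducing a 128-bit value to 64 bits involves, the following is a sketch of the Murmur-inspired mixing that upstream's Hash128to64 is understood to perform. The function name hash_128_to_64_sketch is hypothetical, and both the multiplier constant and the mapping of the u128's low and high halves onto the C++ uint128 pair are assumptions that should be checked against the vendored C++ source. Use city_hash_128_to_64 from the crate for real hashing.

```rust
/// Sketch of a 128-to-64-bit reduction in the style of upstream `Hash128to64`.
/// Assumption: the low 64 bits of the `u128` correspond to the first half of
/// the C++ `uint128` pair and the high 64 bits to the second half.
fn hash_128_to_64_sketch(hash: u128) -> u64 {
    // Multiplier assumed to match the upstream constant; verify before relying on it.
    const K_MUL: u64 = 0x9ddf_ea08_eb38_2d69;

    let low = hash as u64; // truncation keeps the low 64 bits
    let high = (hash >> 64) as u64;

    // Two rounds of multiply and xor-shift, then a final multiply.
    let mut a = (low ^ high).wrapping_mul(K_MUL);
    a ^= a >> 47;
    let mut b = (high ^ a).wrapping_mul(K_MUL);
    b ^= b >> 47;
    b.wrapping_mul(K_MUL)
}
```

All arithmetic wraps on overflow, which mirrors unsigned 64-bit arithmetic in C++.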

Using CityHash functions with CRC-32 intrinsic

Some functions are available only if the target is x86_64 and supports at least the sse4.2 target feature, because they use the CRC-32 intrinsic _mm_crc32_u64. To enable those functions, compile with -C target-feature=+sse4.2 or above (avx or avx2). Note that, depending on the length of the buffer you want to hash, the non-intrinsic version can be faster. If the buffer is shorter than 900 bytes, CityHashCrc128WithSeed and CityHashCrc128 internally call CityHash128WithSeed and CityHash128, respectively; in that case it is better to call CityHash128WithSeed or CityHash128 directly.
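The availability rule and the 900-byte fallback can be turned into a small caller-side heuristic. The sketch below is illustrative only: Preferred, crc_available and preferred_for are hypothetical names that are not part of the crate, and the threshold is simply the figure quoted above. It does not call CityHash; it only decides which family of functions a caller might reach for.

```rust
/// Length below which, per the text above, the CRC variants fall back to CityHash128.
const CRC_FALLBACK_THRESHOLD: usize = 900;

/// Which family of functions a caller might prefer for a given buffer.
#[derive(Debug, PartialEq, Eq)]
enum Preferred {
    Portable,
    Crc,
}

/// True when this build targets x86_64 with sse4.2 enabled at compile time,
/// i.e. when the CRC-based functions are expected to be compiled in.
fn crc_available() -> bool {
    cfg!(all(target_arch = "x86_64", target_feature = "sse4.2"))
}

/// Hypothetical helper: pick the CRC path only when it is available and the
/// buffer is long enough that it will not just forward to the portable hash.
fn preferred_for(len: usize, crc_available: bool) -> Preferred {
    if crc_available && len >= CRC_FALLBACK_THRESHOLD {
        Preferred::Crc
    } else {
        Preferred::Portable
    }
}
```

A call such as preferred_for(buf.len(), crc_available()) gives a starting point, but since the portable version can still be faster for some lengths, measuring on the target hardware remains the more reliable guide.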

128-bit hash with CRC-32 intrinsic

// uint128 CityHashCrc128(const char *, size_t);
fn city_hash_crc_128(buf: &[u8]) -> u128;

// uint128 CityHashCrc128WithSeed(const char *, size_t, uint128);
fn city_hash_crc_128_with_seed(buf: &[u8], seed: u128) -> u128;

256-bit hash with CRC-32 intrinsic

// void CityHashCrc256(const char *, size_t, uint64 *);
fn city_hash_crc_256(buf: &[u8]) -> [u64; 4];
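The crate documentation describes the 256-bit result as four u64 words ordered from low to high bits. Assuming that ordering, the sketch below shows two ways a caller might consume such an array: as a hexadecimal string with the most significant word first, or as a pair of u128 halves. Both helper names are hypothetical and neither is provided by the crate.

```rust
/// Formats a 256-bit hash given as four u64 words (index 0 = lowest bits)
/// as one hexadecimal string, most significant word first.
fn hash_256_to_hex(words: [u64; 4]) -> String {
    words
        .iter()
        .rev()
        .map(|word| format!("{:016x}", word))
        .collect()
}

/// Splits the four words into (low, high) 128-bit halves.
fn hash_256_to_u128_pair(words: [u64; 4]) -> (u128, u128) {
    let low = (u128::from(words[1]) << 64) | u128::from(words[0]);
    let high = (u128::from(words[3]) << 64) | u128::from(words[2]);
    (low, high)
}
```

These helpers allocate or widen on the caller's side, so they need the standard library or alloc even though the crate itself is no_std.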

Using Rust convenient traits

CityHash-sys provides convenience traits for hashing.

The CityHash trait provides the hash functions that do not use the CRC-32 intrinsic.

use cityhash_sys::CityHash;

// Hash the slice with CityHash64
let hash_slice: u64 = [5u8, 4, 3, 2, 1].city_hash_64();
assert_eq!(hash_slice, 0x34EC5F7922A51496);

// Hash the str with CityHash64
let hash_str: u64 = "hash me!".city_hash_64();
assert_eq!(hash_str, 0xF04A0CC67B63A0B4);
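The "trait as syntactic sugar over free functions" idea used here can be sketched independently of the crate. In this illustration, StandInHash and stand_in_hash_64 are hypothetical names and the hash body is FNV-1a rather than CityHash. The point is only the shape: one free function over &[u8], plus trait implementations for [u8] and str that forward to it.

```rust
/// Stand-in free function (FNV-1a), NOT CityHash; it plays the role of `city_hash_64`.
fn stand_in_hash_64(buf: &[u8]) -> u64 {
    let mut state: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
    for &byte in buf {
        state ^= u64::from(byte);
        state = state.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a prime
    }
    state
}

/// Hypothetical extension trait that forwards to the free function.
trait StandInHash {
    fn stand_in_hash_64(&self) -> u64;
}

impl StandInHash for [u8] {
    fn stand_in_hash_64(&self) -> u64 {
        // Resolves to the free function, not to this method.
        stand_in_hash_64(self)
    }
}

impl StandInHash for str {
    fn stand_in_hash_64(&self) -> u64 {
        // A str is hashed through its UTF-8 bytes.
        stand_in_hash_64(self.as_bytes())
    }
}
```

With the trait in scope, "hash me!".stand_in_hash_64() and stand_in_hash_64(b"hash me!") return the same value, which is the relationship the crate documents between its traits and its city_hash_... functions.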

The CityHashCrc trait provides hash implementations for the [u8] and str types that use the x86_64 CRC-32 intrinsic. It is only available with target-feature=+sse4.2.

use cityhash_sys::CityHashCrc;

// Hash the slice with CityHashCrc128
let hash_crc_slice: u128 = [5u8, 4, 3, 2, 1].city_hash_crc_128();

// Hash the str with CityHashCrc128
let hash_crc_str: u128 = "hash me!".city_hash_crc_128();

Performance

On 64-bit hardware, CityHash is well suited to hashing short strings, such as most hash table keys; city_hash_64 in particular is faster than city_hash_128 on such inputs. On 32-bit hardware, CityHash is the nearest competitor to Murmur3 on x86.

For more information

See the Google CityHash README.

CityHash provides hash functions for strings. The functions mix the input bits thoroughly but are not suitable for cryptography. This crate is a Rust binding to Google's CityHash library.

CityHash-sys does not link the standard library (it is #![no_std]).

It provides 32-bit, 64-bit, 128-bit and 256-bit hashes. The 256-bit hash is available only for the x86_64 target_arch with sse4.2. See the README for more information.

Example

use cityhash_sys::{city_hash_128, city_hash_32, city_hash_64};

// Provides free functions
assert_eq!(city_hash_32(&[0,1,2,3,4]), 0xFE6E37D4u32);
assert_eq!(city_hash_64(&[0,1,2,3,4]), 0xB4BFA9E87732C149u64);
assert_eq!(city_hash_128(&[0,1,2,3,4]), 0xE3CB1F3F3AB9643BEF3668C150012EECu128);

// Provides trait implementation for [u8] and str
use cityhash_sys::CityHash;

assert_eq!([5u8, 4, 3, 2, 1].city_hash_64(), 0x34EC5F7922A51496);
assert_eq!("hash me!".city_hash_64(), 0xF04A0CC67B63A0B4);

// Some 128-bit hashes and the 256-bit hash need the x86_64 architecture with SSE4.2
// The free functions that need it start with city_hash_crc_...
// The trait that provides the implementation is named CityHashCrc
#[cfg(all(target_arch = "x86_64", target_feature = "sse4.2"))]
use cityhash_sys::CityHashCrc;

#[cfg(all(target_arch = "x86_64", target_feature = "sse4.2"))]
assert_eq!([0u8,1,2,3,4].city_hash_crc_128(), 0xE3CB1F3F3AB9643BEF3668C150012EECu128);

#[cfg(all(target_arch = "x86_64", target_feature = "sse4.2"))]
assert_eq!([0u8,1,2,3,4].city_hash_crc_256(), [0xA7FAC4B64C35C8B4,0xDD83C2CDF35398F6,0xEAF64F6BA6A2C9E8,0x4E72CE1685CE9077]);

Structs

CityHash32Hasher: CityHash32 hasher

CityHash64Hasher: CityHash64 hasher

CityHash128Hasher: CityHash128 hasher

Traits

The CityHash trait provides CityHash functions to types that implement it. This trait is syntactic sugar for the city_hash_... functions.

The CityHashCrc trait provides CityHash functions to types that implement it. CityHashCrc needs the CRC-32 intrinsic, which is only available for the x86_64 target_arch with sse4.2. This trait is syntactic sugar for the city_hash_crc_... functions.

Functions

city_hash_32: Retrieves a 32-bit hash of a slice of bytes.

city_hash_64: Retrieves a 64-bit hash of a slice of bytes.

city_hash_64_with_seed: Retrieves a 64-bit hash of a slice of bytes; a seed is also hashed into the result.

city_hash_64_with_seeds: Retrieves a 64-bit hash of a slice of bytes; two seeds are also hashed into the result.

city_hash_128: Retrieves a 128-bit hash of a slice of bytes.

city_hash_128_to_64: Retrieves a 64-bit hash of a 128-bit input.

city_hash_128_with_seed: Retrieves a 128-bit hash of a slice of bytes; a seed is also hashed into the result.

city_hash_crc_128: Retrieves a 128-bit hash of a slice of bytes, using the CRC-32 intrinsic.

city_hash_crc_128_with_seed: Retrieves a 128-bit hash of a slice of bytes, using the CRC-32 intrinsic; a seed is also hashed into the result.

city_hash_crc_256: Retrieves a 256-bit hash of a slice of bytes, using the CRC-32 intrinsic. The hash is an array of four u64 words, ordered from the low bits (index 0) to the high bits (index 3).

Type Definitions

A builder for default CityHash32 hashers.

A builder for default CityHash64 hashers.

A builder for default CityHash128 hashers.

A builder for default CityHash hashers.