sequence-generator-rust

64-bit IDs sequence generator based on the concepts outlined in Twitter Server's ID (formerly snowflake). Build on Rust

Installation & build
Description
Usage
Library
Support
Contributing

Installation

  git clone https://github.com/drconopoima/sequence-generator-rust.git
  cd sequence-generator-rust
  cargo build --release

The binary was generated under target/release/sequence-generator-rust

Description

You can generate sequential IDs based on timestamp, sequence number and node/worker ID (based on Twitter snowflake):

By default this package format defines:

44 bits are used to store a custom epoch with precision of 10 samples every millisecond (10^-1). That's enough to store 55 years from a custom epoch
9 bits are used to store worker and/or host information (up to 512)
11 bits are used to store a sequence number (up to 2048)
There are no bits left unused.
Custom epoch is set to the beginning of current decade (2020-01-01)

Usage

Expected output would be like the following

$ cargo run \
  0: 344656800457424896

You can just as easily generate more than one ID from the CLI (-n|--number), as well as measure the time taken by the batch -d|--debug, providing a worker id of 0 (--node-id).

$ cargo run --release -- -n 8 --node-id 505 --debug \
  0: 344680846955905529
  1: 344680846955906041
  2: 344680846955906553
  3: 344680846955907065
  4: 344680846955907577
  5: 344680846955908089
  6: 344680846955908601
  7: 344680846955909113
  It took 611 nanoseconds

Each one of the parameters for the sequence are customizable.

By default the original Twitter snowflake format defines:

1 bit left unused (sign)
41 bits are used to store a custom epoch with millisecond precision (10^3 microseconds for 69 years from a custom epoch)
10 bits are used to store worker and datacenter information (up to 1024)
12 bits are used to store a sequence number (up to 4096)
Uses a custom epoch of 1288834974657 or Nov 04 2010 01:42:54.

You can perfectly and easily recreate Twitter's snowflakes by passing the following command arguments.

$ cargo run --release -- -n 8 -d --unused-bits 1 --node-id-bits 10 --sequence-bits 12 --micros-ten-power 3 --custom-epoch '2010-11-04T01:42:54Z'  --node-id 128
0: 137870923482005632
1: 137870923482006656
2: 137870923482007680
3: 137870923482008704
4: 137870923482009728
5: 137870923482010752
6: 137870923482011776
7: 137870923482012800
It took 571 nanoseconds

I wouldn't use Twitter's values nowadays. Hardware has advanced enough that with a single thread of a laptop average processor I'm generating each ID in ~60-70 nanoseconds, even when reaching close to the maximum sequence and stalling a bit waiting for the next millisecond, while the Twitter's defaults always generate the full sequence and then stalls for 55% of the time waiting (averaging 135-145ns per ID). Twitter's snowflakes are optimal if the hardware takes 245ns per ID or more. Furthermore, having tenths of a millisecond precision means more accurate time & fast sorting if coupled with a Radix sort algorithm.

The specific structure of the integers at the binary level includes:

The first few bits (customizable, by default none) might be unused and set to 0.
The second group of bits store the timestamp in a custom exponential by microseconds (by default 44 bits and sampling every 100 mcs, equivalent to argument --micros-ten-power 2). You cannot customize number of bits of the timestamp directly, but by indirectly setting different values for other bit groups.
The third group of bits store the sequence (by default 11 bits)
The last group of bits store the host/worker ID (by default 9 bits)

You can also customize by dotenv file. Copy the file .env-example into .env

cp .env-example .env

And change the example values to your liking.

The prevalence of the dotenv values is higher than CLI parameters passed.

The only supported custom epoch format is RFC-3339/ISO-8601 both as CLI argument and from the dotenv file.

Library

use std::time::UNIX_EPOCH;
use ::sequence_generator::*;

let custom_epoch = UNIX_EPOCH;  // SystemTime object representing custom epoch time. Use checked_add(Duration) for different time
let node_id_bits = 10;          // 10-bit node/worker ID
let sequence_bits = 12;         // 12-bit sequence
let unused_bits = 1;            // unused (sign) bits at the start of the ID. 1 or 0 generally
let micros_ten_power = 3;       // Operate in milliseconds (10^3 microseconds)
let node_id = 500;              // Current worker/node ID
let cooldown_ns = 1500;         // initial time in nanoseconds for exponential backoff wait after sequence is exhausted

// Generate SequenceProperties
let properties = sequence_generator::SequenceProperties::new(
        custom_epoch,
        node_id_bits,
        node_id,
        sequence_bits,
        micros_ten_power,
        unused_bits,
        cooldown_ns,
    );

// Generate an ID
let id = sequence_generator::generate_id(&mut properties).unwrap();
// Decode ID
// Timestamp
let timestamp_micros = sequence_generator::decode_id_unix_epoch_micros(id, &properties);
// Sequence
let sequence = sequence_generator::decode_sequence_id(id, &properties);
// Node ID
let id_node = sequence_generator::decode_node_id(id, &properties);

Support

Please open an issue for support.

Contributing

Please contribute using Github Flow. Create a branch, add commits, and open a pull request.

sequence-generator-rust 0.2.2