
[English](#en) | [中文](#zh)

---

<a id="en"></a>

# sfid: Distributed Snowflake ID Generator with Auto-Allocated Process ID

## Features

- Lock-free atomic ID generation
- Configurable bit layout via `Layout` trait
- Default: 36-bit timestamp (seconds), 13-bit process ID, 15-bit sequence
- Redis-based automatic process ID allocation
- Heartbeat mechanism with auto-release on crash
- Clock drift tolerance (sequence borrowing + warning log)
- Sequence exhaustion handling (timestamp advance)
- Configurable epoch

## Installation

```sh
cargo add sfid
```

With specific features:

```sh
cargo add sfid -F snowflake,auto_pid,parse
```

## Quick Start

### Manual Process ID

```rust
use sfid::{Snowflake, EPOCH};

let sf = Snowflake::new(EPOCH, 1);
let id = sf.next();
println!("{id}");
```

### Auto-Allocated Process ID (Redis)

```rust
use sfid::{Snowflake, EPOCH};

#[tokio::main]
async fn main() -> sfid::Result<()> {
  let sf = Snowflake::auto("myapp", EPOCH).await?;
  let id = sf.next();
  println!("{id}");
  Ok(())
}
```

### Parse ID

```rust
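use sfid::{parse, Snowflake, EPOCH};

// Generate an ID first so there is something to parse.
let sf = Snowflake::new(EPOCH, 1);
let id = sf.next();

// Split the ID back into timestamp offset, process ID, and sequence.
let parsed = parse(id);
println!("ts: {}, pid: {}, seq: {}", parsed.ts, parsed.pid, parsed.seq);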
```

### Custom Bit Layout

```rust
use sfid::{Layout, Snowflake};

struct MyLayout;
impl Layout for MyLayout {
  const TS_BITS: u32 = 41;
  const PID_BITS: u32 = 10;
  const SEQ_BITS: u32 = 13;
}

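// Hypothetical custom epoch: 2024-01-01 00:00:00 UTC, in seconds.
let my_epoch: u64 = 1_704_067_200;
let sf = Snowflake::<MyLayout>::new(my_epoch, 1);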
```

## API Reference

### Traits

#### `Layout`

Configurable bit layout for ID generation.

| Constant | Description |
|----------|-------------|
| `TS_BITS` | Timestamp bits |
| `PID_BITS` | Process ID bits |
| `SEQ_BITS` | Sequence bits |
| `SEQ_MASK` | Derived: `(1 << SEQ_BITS) - 1` |
| `PID_MASK` | Derived: `(1 << PID_BITS) - 1` |
| `TS_MASK` | Derived: `(1 << TS_BITS) - 1` |
| `TS_SHIFT` | Derived: `SEQ_BITS + PID_BITS` |
| `MAX_PID` | Derived: `1 << PID_BITS` |
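The derived constants in the table follow directly from the three bit widths. As a minimal sketch of how such a trait could be written with default associated constants (the names `BitLayout` and `Default36` are hypothetical, and this is not the sfid source):

```rust
/// Hypothetical stand-in for a layout trait; only the three widths must be
/// supplied, the rest are derived with the formulas from the table above.
pub trait BitLayout {
    const TS_BITS: u32;
    const PID_BITS: u32;
    const SEQ_BITS: u32;

    const SEQ_MASK: u64 = (1u64 << Self::SEQ_BITS) - 1;
    const PID_MASK: u64 = (1u64 << Self::PID_BITS) - 1;
    const TS_MASK: u64 = (1u64 << Self::TS_BITS) - 1;
    const TS_SHIFT: u32 = Self::SEQ_BITS + Self::PID_BITS;
    const MAX_PID: u64 = 1u64 << Self::PID_BITS;
}

/// The 36-13-15 widths of the default layout.
pub struct Default36;

impl BitLayout for Default36 {
    const TS_BITS: u32 = 36;
    const PID_BITS: u32 = 13;
    const SEQ_BITS: u32 = 15;
}
```

With these widths, `Default36::SEQ_MASK` is 32767, `Default36::TS_SHIFT` is 28, and `Default36::MAX_PID` is 8192.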

### Constants

| Name | Type | Description |
|------|------|-------------|
| `EPOCH` | `u64` | Default epoch: 2025-12-22 00:00:00 UTC (seconds) |

### Structs

#### `Snowflake<L: Layout = DefaultLayout>`

ID generator with atomic state.

| Method | Description |
|--------|-------------|
| `new(epoch, pid)` | Create with manual process ID |
| `auto(app, epoch)` | Create with Redis-allocated process ID |
| `next()` | Generate next ID |

#### `DefaultLayout`

Default bit layout: 36-13-15.

#### `Pid`

Process ID handle with heartbeat. Stops heartbeat on drop.

| Method | Description |
|--------|-------------|
| `id()` | Get allocated process ID |

#### `ParsedId`

Parsed ID components.

| Field | Type | Description |
|-------|------|-------------|
| `ts` | `u64` | Timestamp offset from epoch (seconds) |
| `pid` | `u16` | Process ID |
| `seq` | `u16` | Sequence number |

### Functions

| Name | Description |
|------|-------------|
| `allocate::<L>(app)` | Allocate process ID from Redis |
| `parse(id)` | Parse ID with default layout |
| `parse_with::<L>(id)` | Parse ID with custom layout |

## ID Structure (Default Layout)

64-bit signed integer with second-precision timestamp:

```
┌───────┬──────────────────────────┬─────────────┬─────────────┐
│ 1 bit │        36 bits           │   13 bits   │   15 bits   │
│ sign  │    timestamp (sec)       │ process ID  │  sequence   │
│  (0)  │   (offset from epoch)    │  (0-8191)   │  (0-32767)  │
└───────┴──────────────────────────┴─────────────┴─────────────┘
```

- Timestamp: 2^36 seconds ≈ **2177 years** from epoch (2025-12-22 to ~4202)
- Process ID: 8192 concurrent instances
- Sequence: 32768 IDs per second per instance
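The field boundaries and the lifespan figure above can be checked with plain bit arithmetic. The following is an illustrative sketch only: `pack`, `unpack`, and `lifespan_years` are hypothetical helpers, not sfid functions, and the sketch works on `u64` values using the 36-13-15 widths.

```rust
// Default layout widths: 36-bit timestamp, 13-bit process ID, 15-bit sequence.
const TS_BITS: u32 = 36;
const PID_BITS: u32 = 13;
const SEQ_BITS: u32 = 15;

const TS_MASK: u64 = (1 << TS_BITS) - 1;
const PID_MASK: u64 = (1 << PID_BITS) - 1;
const SEQ_MASK: u64 = (1 << SEQ_BITS) - 1;
const TS_SHIFT: u32 = SEQ_BITS + PID_BITS;

/// Seconds in a Julian year (365.25 days).
const SECS_PER_YEAR: u64 = 31_557_600;

/// Combine timestamp offset (seconds since epoch), process ID, and sequence.
pub fn pack(ts: u64, pid: u16, seq: u16) -> u64 {
    ((ts & TS_MASK) << TS_SHIFT)
        | ((pid as u64 & PID_MASK) << SEQ_BITS)
        | (seq as u64 & SEQ_MASK)
}

/// Split an ID back into (timestamp offset, process ID, sequence).
pub fn unpack(id: u64) -> (u64, u16, u16) {
    (
        (id >> TS_SHIFT) & TS_MASK,
        ((id >> SEQ_BITS) & PID_MASK) as u16,
        (id & SEQ_MASK) as u16,
    )
}

/// Whole years covered by a 36-bit second counter.
pub fn lifespan_years() -> u64 {
    (1u64 << TS_BITS) / SECS_PER_YEAR
}
```

For example, `unpack(pack(123_456, 8191, 32767))` returns `(123_456, 8191, 32767)`, and `lifespan_years()` returns 2177.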

## Architecture

```mermaid
graph TD
  A[Application] --> B[Snowflake]
  B --> C{auto_pid?}
  C -->|Yes| D[allocate]
  D --> E[Redis]
  E --> F[Pid + Heartbeat]
  F --> B
  C -->|No| G[Manual PID]
  G --> B
  B --> H[next]
  H --> I[Atomic State]
  I --> J[ID]
```

## Process ID Allocation

### Redis Key Format

```
sfid:{app}:{pid_le_bytes}
```
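One way to read this format is a binary key: the `sfid:` prefix, the application name, a colon, then the process ID as little-endian bytes. The sketch below assumes a 2-byte (`u16`) process ID, matching the `pid` field type of `ParsedId`. The helper name `pid_key` and the byte width are assumptions, not taken from the sfid source.

```rust
/// Build a key of the form `sfid:{app}:{pid_le_bytes}`.
/// Assumption: the process ID is encoded as two little-endian bytes.
pub fn pid_key(app: &str, pid: u16) -> Vec<u8> {
    let mut key = Vec::with_capacity(5 + app.len() + 1 + 2);
    key.extend_from_slice(b"sfid:");
    key.extend_from_slice(app.as_bytes());
    key.push(b':');
    key.extend_from_slice(&pid.to_le_bytes());
    key
}
```

Under these assumptions, `pid_key("myapp", 1)` yields the bytes `sfid:myapp:` followed by `0x01 0x00`.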

### Heartbeat

- Interval: 3 minutes
- Expiration: 10 minutes
- Auto-release: dropping the `Pid` handle stops the heartbeat (Drop trait); after a crash, the key lapses once the 10-minute expiration passes
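The interplay of interval and expiration can be illustrated with an in-memory stand-in for the Redis keys. This is a sketch under stated assumptions: `LeaseTable` is hypothetical, time is passed in as plain seconds, and claiming the lowest free process ID is an assumed strategy, not necessarily how sfid allocates.

```rust
use std::collections::HashMap;

pub const HEARTBEAT_SECS: u64 = 3 * 60; // refresh interval
pub const EXPIRE_SECS: u64 = 10 * 60; // key lifetime without a refresh

/// In-memory stand-in for Redis keys with a TTL: pid -> expiry time (seconds).
#[derive(Default)]
pub struct LeaseTable {
    expires_at: HashMap<u16, u64>,
}

impl LeaseTable {
    /// Claim the lowest pid whose lease is missing or expired.
    pub fn allocate(&mut self, now: u64, max_pid: u16) -> Option<u16> {
        let pid = (0..max_pid)
            .find(|p| self.expires_at.get(p).map_or(true, |&t| t <= now))?;
        self.expires_at.insert(pid, now + EXPIRE_SECS);
        Some(pid)
    }

    /// Heartbeat: extend a live lease; returns false if it already lapsed.
    pub fn heartbeat(&mut self, pid: u16, now: u64) -> bool {
        match self.expires_at.get_mut(&pid) {
            Some(t) if *t > now => {
                *t = now + EXPIRE_SECS;
                true
            }
            _ => false,
        }
    }
}
```

In this model a process that stops sending heartbeats keeps its ID for at most 10 minutes, after which the next `allocate` call can reuse it. With a 3-minute interval, three consecutive heartbeats can be missed before the lease lapses.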

## Clock Drift Handling

When the clock drifts backward:
- Sequence borrowing continues from the last issued timestamp
- If the drift exceeds 1 second, a warning is logged via `tracing::warn`
- When the sequence is exhausted, the timestamp advances automatically (borrowing from future time)

This ensures ID uniqueness even under NTP adjustments or VM migrations.
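The behavior described above can be sketched as a compare-and-swap loop over a single atomic word that holds the last timestamp and sequence. This is an illustration of the technique, not the sfid implementation: `Gen` and `next_at` are hypothetical names, the current time is passed in as seconds since the epoch so the logic is easy to test, and the warning log is omitted.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const SEQ_BITS: u32 = 15;
const PID_BITS: u32 = 13;
const SEQ_MASK: u64 = (1 << SEQ_BITS) - 1;

/// Hypothetical generator: `state` packs (last timestamp << SEQ_BITS) | sequence.
pub struct Gen {
    pid: u64,
    state: AtomicU64,
}

impl Gen {
    pub fn new(pid: u64) -> Self {
        Self { pid, state: AtomicU64::new(0) }
    }

    /// Produce the next ID given `now` in seconds since the epoch.
    pub fn next_at(&self, now: u64) -> u64 {
        let mut cur = self.state.load(Ordering::Relaxed);
        loop {
            let (last_ts, last_seq) = (cur >> SEQ_BITS, cur & SEQ_MASK);
            let (ts, seq) = if now > last_ts {
                (now, 0) // clock moved forward: fresh second, sequence resets
            } else if last_seq < SEQ_MASK {
                (last_ts, last_seq + 1) // same second or clock went back: keep borrowing
            } else {
                (last_ts + 1, 0) // sequence exhausted: advance into the next second
            };
            let next = (ts << SEQ_BITS) | seq;
            match self
                .state
                .compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed)
            {
                Ok(_) => return (ts << (SEQ_BITS + PID_BITS)) | (self.pid << SEQ_BITS) | seq,
                Err(actual) => cur = actual, // another thread won the race: retry
            }
        }
    }
}
```

Because the stored timestamp never decreases, IDs from one generator stay strictly increasing even when `now` moves backward. A call with an earlier `now` simply continues the sequence of the last issued second.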

## Tech Stack

| Crate | Purpose |
|-------|---------|
| coarsetime | Fast timestamp retrieval |
| fred | Redis client |
| tokio | Async runtime |
| uuid | Unique identifier generation |
| thiserror | Error handling |
| tracing | Logging |

## Why "Process ID" Instead of "Machine ID"?

Traditional Snowflake implementations use "machine ID" or "worker ID", assuming one generator per physical machine. This assumption breaks in modern deployments:

- Containers: Multiple instances on same host
- Kubernetes: Pods scale dynamically
- Serverless: No persistent machine identity
- Microservices: Multiple services per node

"Process ID" (pid) better reflects reality: each running process needs a unique identifier, regardless of physical location.

## History

In 2010, Twitter [announced Snowflake](https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake), which composes a timestamp, a worker number, and a sequence number into a 64-bit ID.

Original Twitter bit allocation (millisecond precision):
- 41 bits timestamp: ~69 years
- 10 bits machine ID: 1024 generators
- 12 bits sequence: 4096 IDs/ms

Variations emerged:
- Discord (2015): epoch 2015-01-01
- Instagram: 41-13-10 (ms)
- Sonyflake: 39-bit timestamp in 10 ms units (~174 years), 8-bit sequence, 16-bit machine ID

**sfid uses second precision** (36-13-15) for a lifespan of ~2177 years, with sequence borrowing to handle bursts exceeding 32768 IDs per second.

---

## About

This project is an open-source component of [js0.site ⋅ Refactoring the Internet Plan](https://js0.site).

We are redefining the development paradigm of the Internet through componentization. Follow us:

* [Google Group](https://groups.google.com/g/js0-site)
* [js0site.bsky.social](https://bsky.app/profile/js0site.bsky.social)

---

<a id="zh"></a>

# sfid : 自动分配进程号的分布式雪花 ID 生成器

## 特性

- 无锁原子 ID 生成
- 可配置位布局（`Layout` trait）
- 默认：36 位时间戳（秒）、13 位进程号、15 位序列号
- 基于 Redis 自动分配进程号
- 心跳机制，进程崩溃自动释放
- 时钟回拨容错（序列号借用 + 告警日志）
- 序列号耗尽处理（时间戳推进，借用未来时间）
- 可配置纪元

## 安装

```sh
cargo add sfid
```

指定特性：

```sh
cargo add sfid -F snowflake,auto_pid,parse
```

## 快速开始

### 手动指定进程号

```rust
use sfid::{Snowflake, EPOCH};

let sf = Snowflake::new(EPOCH, 1);
let id = sf.next();
println!("{id}");
```

### 自动分配进程号 (Redis)

```rust
use sfid::{Snowflake, EPOCH};

#[tokio::main]
async fn main() -> sfid::Result<()> {
  let sf = Snowflake::auto("myapp", EPOCH).await?;
  let id = sf.next();
  println!("{id}");
  Ok(())
}
```

### 解析 ID

```rust
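use sfid::{parse, Snowflake, EPOCH};

// Generate an ID first so there is something to parse.
let sf = Snowflake::new(EPOCH, 1);
let id = sf.next();

// Split the ID back into timestamp offset, process ID, and sequence.
let parsed = parse(id);
println!("ts: {}, pid: {}, seq: {}", parsed.ts, parsed.pid, parsed.seq);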
```

### 自定义位布局

```rust
use sfid::{Layout, Snowflake};

struct MyLayout;
impl Layout for MyLayout {
  const TS_BITS: u32 = 41;
  const PID_BITS: u32 = 10;
  const SEQ_BITS: u32 = 13;
}

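// Hypothetical custom epoch: 2024-01-01 00:00:00 UTC, in seconds.
let my_epoch: u64 = 1_704_067_200;
let sf = Snowflake::<MyLayout>::new(my_epoch, 1);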
```

## API 参考

### Traits

#### `Layout`

可配置的 ID 位布局。

| 常量 | 说明 |
|------|------|
| `TS_BITS` | 时间戳位数 |
| `PID_BITS` | 进程号位数 |
| `SEQ_BITS` | 序列号位数 |
| `SEQ_MASK` | 派生：`(1 << SEQ_BITS) - 1` |
| `PID_MASK` | 派生：`(1 << PID_BITS) - 1` |
| `TS_MASK` | 派生：`(1 << TS_BITS) - 1` |
| `TS_SHIFT` | 派生：`SEQ_BITS + PID_BITS` |
| `MAX_PID` | 派生：`1 << PID_BITS` |

### 常量

| 名称 | 类型 | 说明 |
|------|------|------|
| `EPOCH` | `u64` | 默认纪元：2025-12-22 00:00:00 UTC（秒） |

### 结构体

#### `Snowflake<L: Layout = DefaultLayout>`

原子状态 ID 生成器。

| 方法 | 说明 |
|------|------|
| `new(epoch, pid)` | 手动指定进程号创建 |
| `auto(app, epoch)` | Redis 自动分配进程号创建 |
| `next()` | 生成下个 ID |

#### `DefaultLayout`

默认位布局：36-13-15。

#### `Pid`

带心跳的进程号句柄，drop 时停止心跳。

| 方法 | 说明 |
|------|------|
| `id()` | 获取分配的进程号 |

#### `ParsedId`

解析后的 ID 组件。

| 字段 | 类型 | 说明 |
|------|------|------|
| `ts` | `u64` | 相对纪元的时间戳偏移（秒） |
| `pid` | `u16` | 进程号 |
| `seq` | `u16` | 序列号 |

### 函数

| 名称 | 说明 |
|------|------|
| `allocate::<L>(app)` | 从 Redis 分配进程号 |
| `parse(id)` | 使用默认布局解析 ID |
| `parse_with::<L>(id)` | 使用自定义布局解析 ID |

## ID 结构（默认布局）

秒精度时间戳的 64 位有符号整数：

```
┌───────┬──────────────────────────┬─────────────┬─────────────┐
│ 1 bit │        36 bits           │   13 bits   │   15 bits   │
│ 符号  │      时间戳（秒）          │   进程号    │   序列号    │
│  (0)  │     (相对纪元偏移)        │  (0-8191)   │  (0-32767)  │
└───────┴──────────────────────────┴─────────────┴─────────────┘
```

- 时间戳：2^36 秒 ≈ **2177 年**（从 2025-12-22 到 ~4202 年）
- 进程号：8192 并发实例
- 序列号：每实例每秒 32768 ID

## 架构

```mermaid
graph TD
  A[应用] --> B[Snowflake]
  B --> C{auto_pid?}
  C -->|是| D[allocate]
  D --> E[Redis]
  E --> F[Pid + 心跳]
  F --> B
  C -->|否| G[手动 PID]
  G --> B
  B --> H[next]
  H --> I[原子状态]
  I --> J[ID]
```

## 进程号分配

### Redis 键格式

```
sfid:{app}:{pid_le_bytes}
```

### 心跳

- 间隔：3 分钟
- 过期：10 分钟
- 进程退出自动释放 (Drop trait)

## 时钟回拨处理

当时钟回拨时：
- 序列号借用，继续使用上次时间戳
- 回拨超过 1 秒，通过 `tracing::warn` 记录告警
- 序列号耗尽时，时间戳自动推进（借用未来时间）

确保 NTP 校时或虚拟机迁移时 ID 唯一性。

## 技术栈

| Crate | 用途 |
|-------|------|
| coarsetime | 快速时间戳获取 |
| fred | Redis 客户端 |
| tokio | 异步运行时 |
| uuid | 唯一标识生成 |
| thiserror | 错误处理 |
| tracing | 日志 |

## 为何用"进程号"而非"机器号"？

传统雪花实现使用"机器号"或"工作节点号"，假设每台物理机运行一个生成器。这一假设在现代部署中已不成立：

- 容器：同一主机运行多个实例
- Kubernetes：Pod 动态伸缩
- Serverless：无持久机器身份
- 微服务：单节点多服务

"进程号"(pid) 更贴合现实——每个运行中的进程需要唯一标识，与物理位置无关。

## 历史

2010 年，Twitter [宣布 Snowflake](https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake)——将时间戳、工作节点号、序列号组合成 64 位。

原版 Twitter 位分配（毫秒精度）：
- 41 位时间戳：约 69 年
- 10 位机器号：1024 个生成器
- 12 位序列号：每毫秒 4096 ID

衍生变体：
- Discord (2015)：纪元 2015-01-01
- Instagram：41-13-10（毫秒）
- Sonyflake：调整位分配以延长寿命

**sfid 采用秒精度**（36-13-15），寿命约 2177 年，通过序列号借用处理超过 32768/秒的突发流量。

---

## 关于

本项目为 [js0.site ⋅ 重构互联网计划](https://js0.site) 的开源组件。

我们正在以组件化的方式重新定义互联网的开发范式，欢迎关注：

* [谷歌邮件列表](https://groups.google.com/g/js0-site)
* [js0site.bsky.social](https://bsky.app/profile/js0site.bsky.social)