zset
English
- Overview
- Features
- When to Use
zset? - Design and Tech Stack
- Usage
- File Structure
- A Little Story: The Power of Hybrid Data Structures
Overview
zset is a high-performance, thread-safe sorted set data structure for Rust, inspired by Redis's popular ZSET. It combines the features of a hash map and a sorted list, mapping unique members to scores while maintaining a sorted order based on these scores.
This makes it an ideal choice for applications that require both fast O(1) lookups by member and efficient ordered traversal by score, such as leaderboards, real-time analytics, and priority queues. This implementation is generic and can be used with any data types for members and scores that satisfy the required traits.
Features
- Thread-Safe: Designed for high-concurrency scenarios, using fine-grained locking to minimize contention.
- High Performance: Leverages
DashMapfor O(1) member lookups andSortedVecfor efficient O(log N) rank queries. - Generic: Works with any member and score types that implement
Ord,Hash,Send,Sync, and other necessary traits. - Rich API: Offers a comprehensive set of operations, including adding, removing, querying scores and ranks, retrieving ranges, and performing set operations like union and intersection.
When to Use zset?
zset is particularly useful in scenarios where you need to maintain an ordered collection of unique items and frequently query them by score or rank.
- Leaderboards: Easily manage player scores and retrieve top rankings.
- Real-time Analytics: Track and sort events by timestamp or priority, like displaying the most recent unique visitors.
- Rate Limiting: Keep track of request timestamps in a sliding window to enforce rate limits.
- Priority Queues: Implement job schedulers where jobs are prioritized by a score and can be dynamically added or removed.
Design and Tech Stack
The design of zset revolves around a hybrid approach, using two core data structures to achieve both fast lookups and efficient sorting:
-
DashMap<Arc<M>, S>: A concurrent hash map that provides O(1) average-time complexity for looking up a member's score.DashMapis chosen for its excellent performance in highly contended, multi-threaded environments, as it shards the map internally to allow concurrent access.Arc<M>is used to allow members to be shared between the map and the sorted list without duplicating the data. -
parking_lot::RwLock<SortedVec<ScoreMember<M, S>>>: ASortedVecprotected by aparking_lot::RwLock.SortedVecis a vector-based data structure that keeps its elements sorted, offering O(log N) searches and O(N) insertions/deletions. Theparking_lot::RwLockis used for its high performance, especially in scenarios with many readers, providing faster and more lightweight locking thanstd::sync::RwLock.
This dual-structure approach provides a balanced performance profile:
- Score & Member Lookup: O(1) on average.
- Rank Lookup: O(log N).
- Add/Remove Operations: O(N). This complexity is a result of O(log N) for the search plus O(N) for shifting elements in the sorted vector. This is different from the O(log N) complexity of implementations based on skip lists (like Redis's) or balanced trees.
- Range Queries: O(log N + K), where K is the size of the range.
Usage
Here are some examples based on the integration tests:
1. Basic Operations (Add, Remove, Cardinality)
use ;
let zset = new;
// Add a new member. Returns `false` as it's a new element.
assert!;
assert_eq!;
// Add another new member.
assert!;
assert_eq!;
// Update the score of an existing member. Returns `true`.
assert!;
assert_eq!;
// Remove a member.
assert!;
assert_eq!;
2. Querying Score and Rank The rank is 0-based, with the lowest score having rank 0.
use ;
let zset = new;
zset.add;
zset.add;
zset.add;
// Get the score of a member.
assert_eq!;
// Get the 0-based rank of a member.
assert_eq!;
assert_eq!;
assert_eq!;
3. Working with Ranges You can retrieve ranges of members, with or without their scores.
use ;
let zset = new;
zset.add;
zset.add;
zset.add;
zset.add;
zset.add;
// Get members in rank range 0..3
let r: = zset.range.iter.map.collect;
assert_eq!;
// Get members and scores in rank range 1..3
let r_with_scores: = zset
.range_with_scores
.iter
.map
.collect;
assert_eq!;
4. Set Operations (Union & Intersection)
The library supports | (union) and & (intersection) operators. When members conflict, scores are summed.
use ;
let zset1 = new;
zset1.add;
zset1.add;
let zset2 = new;
zset2.add;
zset2.add;
// Union: combines members, sums scores for common members.
let zset_union = &zset1 | &zset2;
assert_eq!;
assert_eq!; // 2 + 3
// Intersection: keeps only common members, sums their scores.
let zset_intersection = &zset1 & &zset2;
assert_eq!;
assert_eq!;
File Structure
.
├── Cargo.toml # Package manifest
├── README.mdt # Readme template file
├── src
│ ├── lib.rs # Main library file, defines the Api trait and Zset struct
│ ├── score_member.rs # Defines the internal ScoreMember struct
│ └── zset_impl.rs # The core implementation of the Zset logic
└── tests
└── main.rs # Integration tests
A Little Story: The Power of Hybrid Data Structures
The concept of the sorted set was popularized by Redis, an in-memory data structure store created by Salvatore Sanfilippo (antirez). The story goes that Salvatore needed a faster way to handle real-time analytics for a web startup. Existing databases were too slow for operations like finding the top N items in a list that was constantly changing. This led him to create Redis and, within it, the versatile zset.
The brilliance of the zset lies in its hybrid nature. In Redis, it's implemented using a hash table and a skip list. The hash table provides O(1) access to scores, while the skip list keeps the members sorted, allowing for fast range queries. This combination solves a fundamental trade-off in computer science: fast lookups vs. maintaining order. Our Rust zset follows this same philosophy, pairing a modern concurrent hash map (DashMap) with a sorted vector structure (SortedVec) to achieve a similar blend of high performance and powerful functionality, making it a potent tool for building responsive, data-intensive applications in Rust.
中文
概览
zset 是一个为 Rust 设计的高性能、线程安全的排序集数据结构,其灵感来源于 Redis 中广受欢迎的 ZSET。它结合了哈希映射和有序列表的特性,将唯一的成员映射到分数,同时根据这些分数维护一个有序的排列。
这使得它成为需要 O(1) 成员快速查找和高效有序遍历的应用的理想选择,例如排行榜、实时分析和优先级队列。此实现是通用的,可用于任何满足所需 trait 的成员和分数数据类型。
特性
- 线程安全:专为高并发场景设计,使用细粒度锁来最小化竞争。
- 高性能:利用
DashMap实现 O(1) 的成员查找,以及SortedVec实现高效的 O(log N) 排名查询。 - 通用性:适用于任何实现了
Ord、Hash、Send、Sync等必要 trait 的成员和分数类型。 - 丰富的 API:提供全面的操作集,包括添加、删除、查询分数和排名、检索范围以及执行并集和交集等集合操作。
何时使用 zset?
zset 在需要维护唯一项的有序集合,并频繁按分数或排名进行查询的场景中特别有用。
- 排行榜:轻松管理玩家分数并检索最高排名。
- 实时分析:按时间戳或优先级跟踪和排序事件,例如显示最近的独立访客。
- 速率限制:在滑动窗口中跟踪请求时间戳以实施速率限制。
- 优先级队列:实现作业调度器,其中作业按分数确定优先级,并且可以动态添加或删除。
设计与技术栈
zset 的设计围绕一种混合方法,使用两个核心数据结构来实现快速查找和高效排序:
-
DashMap<Arc<M>, S>:一个并发哈希映射,为查找成员分数提供 O(1) 的平均时间复杂度。选择DashMap是因为它在高度竞争的多线程环境中表现出色,它在内部分片以允许并发访问。Arc<M>用于在映射和排序列表之间共享成员,而无需复制数据。 -
parking_lot::RwLock<SortedVec<ScoreMember<M, S>>>:一个由parking_lot::RwLock保护的SortedVec。SortedVec是一个基于向量的数据结构,它使其元素保持排序,提供 O(log N) 的搜索和 O(N) 的插入/删除。parking_lot::RwLock因其高性能而被使用,尤其是在有许多读者的场景中,它提供了比std::sync::RwLock更快、更轻量级的锁定。
这种双重结构的方法提供了均衡的性能特征:
- 分数和成员查找:平均 O(1)。
- 排名查找:O(log N)。
- 添加/删除操作:O(N)。该复杂度源于 O(log N) 的搜索加上 O(N) 的元素移动。需要注意的是,这与基于跳表(如 Redis)或平衡树的实现的 O(log N) 复杂度不同。
- 范围查询:O(log N + K),其中 K 是范围的大小。
使用示例
以下是一些基于集成测试的示例:
1. 基本操作(添加、删除、基数)
use ;
let zset = new;
// 添加一个新成员。返回 `false` 因为它是一个新元素。
assert!;
assert_eq!;
// 添加另一个新成员。
assert!;
assert_eq!;
// 更新一个已存在成员的分数。返回 `true`。
assert!;
assert_eq!;
// 删除一个成员。
assert!;
assert_eq!;
2. 查询分数和排名 排名是 0-based 的,分数最低的成员排名为 0。
use ;
let zset = new;
zset.add;
zset.add;
zset.add;
// 获取成员的分数。
assert_eq!;
// 获取成员的 0-based 排名。
assert_eq!;
assert_eq!;
assert_eq!;
3. 处理范围 您可以检索成员的范围,无论是否带有分数。
use ;
let zset = new;
zset.add;
zset.add;
zset.add;
zset.add;
zset.add;
// 获取排名范围 0..3 内的成员
let r: = zset.range.iter.map.collect;
assert_eq!;
// 获取排名范围 1..3 内的成员及其分数
let r_with_scores: = zset
.range_with_scores
.iter
.map
.collect;
assert_eq!;
4. 集合操作(并集与交集)
该库支持 | (并集) 和 & (交集) 操作符。当成员冲突时,分数会相加。
use ;
let zset1 = new;
zset1.add;
zset1.add;
let zset2 = new;
zset2.add;
zset2.add;
// 并集:合并成员,对共同成员的分数求和。
let zset_union = &zset1 | &zset2;
assert_eq!;
assert_eq!; // 2 + 3
// 交集:仅保留共同成员,并对它们的分数求和。
let zset_intersection = &zset1 & &zset2;
assert_eq!;
assert_eq!;
文件结构
.
├── Cargo.toml # 包清单文件
├── README.mdt # Readme 模板文件
├── src
│ ├── lib.rs # 主库文件,定义了 Api trait 和 Zset 结构体
│ ├── score_member.rs # 定义了内部的 ScoreMember 结构体
│ └── zset_impl.rs # Zset 逻辑的核心实现
└── tests
└── main.rs # 集成测试
一个小故事:混合数据结构的力量
排序集(Sorted Set)的概念由内存数据结构存储 Redis 的创始人 Salvatore Sanfilippo (antirez) 推广开来。据说,Salvatore 当时需要一种更快的方法来为一家网络创业公司处理实时分析。他发现现有的数据库对于他需要的操作来说太慢了,比如在一个不断变化的列表中查找前 N 个项目。这促使他创建了 Redis,并在其中设计了功能多样的 zset。
zset 的精妙之处在于其混合性质。在 Redis 中,它由哈希表和跳表(Skip List)结合实现。哈希表提供 O(1) 的分数访问,而跳表则保持成员的有序性,从而实现快速的范围查询。这种组合解决了计算机科学中的一个基本权衡:快速查找与维护顺序。我们的 Rust zset 遵循同样的理念,将现代的并发哈希映射(DashMap)与排序向量结构(SortedVec)配对,以实现高性能和强大功能的类似融合,使其成为在 Rust 中构建响应迅速、数据密集型应用的有力工具。
About
This project is an open-source component of i18n.site ⋅ Internationalization Solution.
-
i18 : MarkDown Command Line Translation Tool
The translation perfectly maintains the Markdown format.
It recognizes file changes and only translates the modified files.
The translated Markdown content is editable; if you modify the original text and translate it again, manually edited translations will not be overwritten (as long as the original text has not been changed).
-
i18n.site : MarkDown Multi-language Static Site Generator
Optimized for a better reading experience
关于
本项目为 i18n.site ⋅ 国际化解决方案 的开源组件。
-
翻译能够完美保持 Markdown 的格式。能识别文件的修改,仅翻译有变动的文件。
Markdown 翻译内容可编辑;如果你修改原文并再次机器翻译,手动修改过的翻译不会被覆盖 ( 如果这段原文没有被修改 )。
-
i18n.site : MarkDown 多语言静态站点生成器 为阅读体验而优化。