bytesbox/lib.rs
1//! # ByteBox
2//!
3//! `ByteBox` is a high-performance, memory-efficient hash table implemented in Rust, designed specifically for scenarios where keys and values are naturally represented as byte arrays (`Vec<u8>`). This crate offers a robust and flexible solution for storing and managing byte-based key-value pairs, making it ideal for applications in networking, data serialization, caching, and more.
4//!
5//! ## Key Features
6//!
7//! - **Efficient Storage:** Utilizes separate chaining with linked lists to handle hash collisions, ensuring quick insertion and retrieval even under high load.
8//! - **Dynamic Resizing:** Automatically resizes the underlying storage when the load factor exceeds a predefined threshold, maintaining optimal performance and preventing excessive collisions.
9//! - **Primitive Type Support:** Provides convenient methods to insert primitive types by converting them into their byte representations, simplifying the process of storing numerical and other basic data types.
10//! - **Iterative Access:** Implements iterator traits, allowing seamless traversal of all key-value pairs within the `ByteBox`, facilitating operations like searching, filtering, and bulk processing.
11//! - **Customizable Hashing:** Leverages Rust’s `DefaultHasher` for hashing keys, ensuring a good distribution of entries across the hash table and minimizing collision rates.
12//! - **User-Friendly Display:** Offers a formatted and colored visualization of the hash table’s structure, aiding in debugging and providing insights into the distribution of entries.
13//! - **Comprehensive Documentation:** Comes with detailed documentation for all public interfaces, making it easy for developers to integrate and utilize `ByteBox` effectively in their projects.
14//!
15//! ## Design and Implementation
16//!
17//! `ByteBox` is built around the concept of storing keys and values as byte vectors, allowing for a wide range of applications where data is naturally in byte form or can be easily converted. The core structure consists of a vector of optional `Entry` boxes, each representing a key-value pair. By using separate chaining, `ByteBox` efficiently manages collisions, ensuring that even with a large number of entries, performance remains consistent.
18//!
19//! The crate emphasizes simplicity and efficiency, providing a straightforward API for common operations such as insertion, retrieval, and removal of entries. Additionally, the support for primitive types through the `BytesPrimitives` trait simplifies the process of working with numerical data, reducing the overhead of manual byte conversions.
20//!
21//! ## Use Cases
22//!
23//! - **Networking:** Ideal for managing protocol headers and payloads where data is inherently byte-oriented.
24//! - **Data Serialization:** Facilitates the storage and retrieval of serialized data structures, enabling efficient caching and quick access.
25//! - **Caching Systems:** Serves as an effective in-memory cache for applications requiring fast lookup and storage of byte-based data.
26//! - **Configuration Management:** Allows for the storage of configuration settings and parameters in a compact byte format, enhancing performance and reducing memory footprint.
27//!
28//! ## Performance Considerations
29//!
30//! `ByteBox` is optimized for speed and memory usage, making it suitable for performance-critical applications. Its dynamic resizing mechanism ensures that the hash table maintains a low load factor, minimizing collisions and ensuring that operations remain efficient as the number of entries grows. By leveraging Rust’s ownership and borrowing principles, `ByteBox` ensures safe and concurrent access patterns without sacrificing performance.
31//!
32//! ## Getting Started
33//!
34//! Integrating `ByteBox` into your Rust project is straightforward. Simply add it as a dependency in your `Cargo.toml` and start utilizing its powerful API to manage your byte-based key-value pairs with ease and efficiency.
35//!
36//! ---
37//!
38//! ## Safety Considerations
39//!
40//!The `remove` method uses `unsafe` code to manipulate pointers for efficient removal of entries. Care has been taken to ensure this is safe, but users should be aware of the risks associated with `unsafe` blocks.
41pub mod iterator;
42pub mod primitives;
43
44use iterator::*;
45use primitives::*;
46
47#[cfg(feature = "color")]
48use bytescolor::ByteColor;
49
50use std::collections::hash_map::DefaultHasher;
51use std::fmt::{self, Display};
52use std::hash::{Hash, Hasher};
53
54/// Represents a key-value pair within the `ByteBox` hash table.
55/// Each `Entry` may point to the next entry in case of hash collisions.
56#[derive(Debug, Clone)]
57struct Entry {
58 key: Vec<u8>,
59 value: Vec<u8>,
60 next: Option<Box<Entry>>,
61}
62
63/// A hash table implementation that stores key-value pairs as byte vectors.
64/// Uses separate chaining to handle hash collisions.
65///
66/// # Examples
67///
68/// ```rust
69/// use bytesbox::ByteBox;
70///
71/// let mut bytebox = ByteBox::new();
72/// bytebox.insert(b"key1", b"value1");
73/// bytebox.insert(b"key2", b"value2");
74///
75/// assert_eq!(bytebox.get(b"key1"), Some(&b"value1"[..]));
76/// assert_eq!(bytebox.len(), 2);
77/// ```
78#[derive(Clone, Debug)]
79pub struct ByteBox {
80 cells: Vec<Option<Box<Entry>>>,
81 alloc: usize,
82 len: usize,
83 load_factor_threshold: f32,
84}
85
86impl Display for ByteBox {
87 /// Formats the `ByteBox` for display purposes.
88 ///
89 /// This implementation displays the contents in a readable key-value format.
90 ///
91 /// # Examples
92 ///
93 /// ```rust
94 /// use bytesbox::ByteBox;
95 ///
96 /// let mut bytebox = ByteBox::new();
97 /// bytebox.insert(b"key", b"value");
98 /// println!("{}", bytebox);
99 /// ```
100 fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
101 write!(f, "{{")?;
102
103 let mut first = true;
104 for (_, cell) in self.cells.iter().enumerate() {
105 let mut current = cell.as_ref();
106 while let Some(entry) = current {
107 if !first {
108 write!(f, ", ")?;
109 }
110 write!(
111 f,
112 "{:?}: {:?}",
113 String::from_utf8_lossy(&entry.key),
114 String::from_utf8_lossy(&entry.value)
115 )?;
116 current = entry.next.as_ref();
117 }
118 first = false;
119 }
120
121 write!(f, "}}")
122 }
123}
124
125impl ByteBox {
126 /// Creates a new `ByteBox` with a default initial capacity of 16 cells.
127 ///
128 /// # Examples
129 ///
130 /// ```rust
131 /// use bytesbox::ByteBox;
132 ///
133 /// let bytebox = ByteBox::new();
134 /// assert_eq!(bytebox.len(), 0);
135 /// ```
136 pub fn new() -> Self {
137 Self::prealloc(16)
138 }
139
140 /// Creates a new `ByteBox` with a specified initial capacity.
141 ///
142 /// # Arguments
143 ///
144 /// * `size` - The initial number of cells to allocate.
145 ///
146 /// # Examples
147 ///
148 /// ```rust
149 /// use bytesbox::ByteBox;
150 ///
151 /// let bytebox = ByteBox::prealloc(32);
152 /// assert_eq!(bytebox.allocation(), 32);
153 /// ```
154 pub fn prealloc(size: usize) -> Self {
155 ByteBox {
156 cells: vec![None; size],
157 alloc: size,
158 len: 0,
159 load_factor_threshold: 0.75,
160 }
161 }
162
163 /// Returns the number of key-value pairs stored in the `ByteBox`.
164 ///
165 /// # Examples
166 ///
167 /// ```rust
168 /// use bytesbox::ByteBox;
169 ///
170 /// let mut bytebox = ByteBox::new();
171 /// assert_eq!(bytebox.len(), 0);
172 /// bytebox.insert(b"key", b"value");
173 /// assert_eq!(bytebox.len(), 1);
174 /// ```
175 pub fn len(&self) -> usize {
176 self.len
177 }
178
179 /// Returns the current allocation size (number of cells) of the `ByteBox`.
180 ///
181 /// # Examples
182 ///
183 /// ```rust
184 /// use bytesbox::ByteBox;
185 ///
186 /// let bytebox = ByteBox::new();
187 /// assert_eq!(bytebox.allocation(), 16);
188 /// ```
189 pub fn allocation(&self) -> usize {
190 self.alloc
191 }
192
193 /// Inserts a key-value pair into the `ByteBox`.
194 ///
195 /// If the key already exists, its value is updated.
196 /// If the load factor exceeds the threshold after insertion, the table is resized.
197 ///
198 /// # Arguments
199 ///
200 /// * `key` - A byte slice representing the key.
201 /// * `value` - A byte slice representing the value.
202 ///
203 /// # Returns
204 ///
205 /// * `true` if a new key-value pair was inserted.
206 /// * `false` if an existing key was updated.
207 ///
208 /// # Examples
209 ///
210 /// ```rust
211 /// use bytesbox::ByteBox;
212 ///
213 /// let mut bytebox = ByteBox::new();
214 /// assert!(bytebox.insert(b"key1", b"value1"));
215 /// assert!(!bytebox.insert(b"key1", b"value2"));
216 /// assert_eq!(bytebox.get(b"key1"), Some(&b"value2"[..]));
217 /// ```
218 pub fn insert(&mut self, key: &[u8], value: &[u8]) -> bool {
219 if (self.len as f32) / (self.alloc as f32) >= self.load_factor_threshold {
220 self.resize();
221 }
222
223 let idx = Self::hash(key, self.alloc);
224 let mut current = &mut self.cells[idx];
225
226 while let Some(entry) = current {
227 if entry.key == key {
228 entry.value = value.to_vec();
229 return false;
230 }
231 current = &mut entry.next;
232 }
233
234 let new_entry = Box::new(Entry {
235 key: key.to_vec(),
236 value: value.to_vec(),
237 next: self.cells[idx].take(),
238 });
239 self.cells[idx] = Some(new_entry);
240 self.len += 1;
241
242 true
243 }
244
245 /// Inserts a key and a primitive value into the `ByteBox`.
246 ///
247 /// The primitive value is converted to its byte representation using the `BytesPrimitives` trait.
248 ///
249 /// # Type Parameters
250 ///
251 /// * `T` - A type that implements the `BytesPrimitives` trait.
252 ///
253 /// # Arguments
254 ///
255 /// * `key` - A byte slice representing the key.
256 /// * `value` - A primitive value that can be converted into bytes.
257 ///
258 /// # Examples
259 ///
260 /// ```rust
261 /// use bytesbox::ByteBox;
262 /// use bytesbox::primitives::BytesPrimitives;
263 ///
264 /// let mut bytebox = ByteBox::new();
265 /// bytebox.insert_primitive(b"number", 42u32);
266 /// assert_eq!(bytebox.get(b"number"), Some(&b"42"[..]));
267 /// ```
268 pub fn insert_primitive<T: BytesPrimitives>(&mut self, key: &[u8], value: T) {
269 self.insert(key, &value.to_bytes());
270 }
271
272 /// Retrieves the value associated with the given key.
273 ///
274 /// # Arguments
275 ///
276 /// * `key` - A byte slice representing the key to look up.
277 ///
278 /// # Returns
279 ///
280 /// * `Some(&[u8])` containing the value if the key exists.
281 /// * `None` if the key does not exist in the `ByteBox`.
282 ///
283 /// # Examples
284 ///
285 /// ```rust
286 /// use bytesbox::ByteBox;
287 ///
288 /// let mut bytebox = ByteBox::new();
289 /// bytebox.insert(b"key", b"value");
290 /// assert_eq!(bytebox.get(b"key"), Some(&b"value"[..]));
291 /// assert_eq!(bytebox.get(b"nonexistent"), None);
292 /// ```
293 pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
294 let idx = Self::hash(key, self.alloc);
295 let mut current = self.cells[idx].as_ref();
296
297 while let Some(entry) = current {
298 if entry.key == key {
299 return Some(&entry.value.as_slice());
300 }
301 current = entry.next.as_ref();
302 }
303
304 None
305 }
306
307 /// Removes the key-value pair associated with the given key from the `ByteBox`.
308 ///
309 /// # Arguments
310 ///
311 /// * `key` - A byte slice representing the key to remove.
312 ///
313 /// # Returns
314 ///
315 /// * `Some(Vec<u8>)` containing the removed value if the key existed.
316 /// * `None` if the key was not found.
317 ///
318 /// # Examples
319 ///
320 /// ```rust
321 /// use bytesbox::ByteBox;
322 ///
323 /// let mut bytebox = ByteBox::new();
324 /// bytebox.insert(b"key", b"value");
325 /// assert_eq!(bytebox.remove(b"key"), Some(b"value".to_vec()));
326 /// assert_eq!(bytebox.remove(b"key"), None);
327 /// ```
328 pub fn remove(&mut self, key: &[u8]) -> Option<Vec<u8>> {
329 let idx = Self::hash(key, self.alloc);
330 let cell = &mut self.cells[idx];
331
332 let mut prev = cell as *mut Option<Box<Entry>>;
333 let mut curr = cell.as_mut();
334
335 while let Some(entry) = curr {
336 if entry.key == key {
337 let removed_val = entry.value.clone();
338 unsafe {
339 *prev = entry.next.take();
340 }
341 self.len -= 1;
342 return Some(removed_val);
343 }
344 prev = &mut entry.next as *mut Option<Box<Entry>>;
345 curr = entry.next.as_mut();
346 }
347
348 None
349 }
350
351 /// Removes all key-value pairs from the `ByteBox`, resetting it to an empty state.
352 ///
353 /// This method retains the current capacity of the hash table, allowing it to be reused
354 /// without the overhead of reallocating memory. All existing entries are removed, and
355 /// the length (`len`) is set to zero.
356 ///
357 /// # Examples
358 ///
359 /// ```rust
360 /// use bytesbox::ByteBox;
361 ///
362 /// let mut bytebox = ByteBox::new();
363 /// bytebox.insert(b"key1", b"value1");
364 /// bytebox.insert(b"key2", b"value2");
365 /// assert_eq!(bytebox.len(), 2);
366 ///
367 /// bytebox.clear();
368 /// assert_eq!(bytebox.len(), 0);
369 /// assert_eq!(bytebox.get(b"key1"), None);
370 /// assert_eq!(bytebox.get(b"key2"), None);
371 /// ```
372 pub fn clear(&mut self) {
373 for cell in &mut self.cells {
374 *cell = None;
375 }
376 self.len = 0;
377 }
378
379 /// Doubles the current capacity of the `ByteBox` and rehashes all existing entries.
380 ///
381 /// This method is called internally when the load factor exceeds the threshold.
382 fn resize(&mut self) {
383 let new_cap = self.alloc * 2;
384 let mut new_cells: Vec<Option<Box<Entry>>> = vec![None; new_cap];
385
386 for (_, cell) in self.cells.iter_mut().enumerate() {
387 let mut current = cell.take();
388 while let Some(mut entry) = current {
389 let idx = Self::hash(&entry.key, new_cap);
390 current = entry.next.take();
391 entry.next = new_cells[idx].take();
392 new_cells[idx] = Some(entry);
393 }
394 }
395
396 self.cells = new_cells;
397 self.alloc = new_cap;
398 }
399
400 /// Computes the hash index for a given key based on the current capacity.
401 ///
402 /// # Arguments
403 ///
404 /// * `key` - A byte slice representing the key to hash.
405 /// * `capacity` - The current or new capacity of the hash table.
406 ///
407 /// # Returns
408 ///
409 /// * `usize` representing the index in the cells vector.
410 fn hash(key: &[u8], capacity: usize) -> usize {
411 let mut hasher = DefaultHasher::new();
412 key.hash(&mut hasher);
413 let hash = hasher.finish();
414 let index = (hash as usize) % capacity;
415 index
416 }
417
418 /// Provides an iterator over the `ByteBox` that allows for iteration using `for` loops.
419 ///
420 /// This enables the use of `ByteBox` in contexts where an iterator is expected.
421 ///
422 /// # Examples
423 ///
424 /// ```rust
425 /// use bytesbox::ByteBox;
426 ///
427 /// let mut bytebox = ByteBox::new();
428 /// bytebox.insert(b"key1", b"value1");
429 /// bytebox.insert(b"key2", b"value2");
430 ///
431 /// for (key, value) in bytebox.iter() {
432 /// println!("{:?}: {:?}", key, value);
433 /// }
434 /// ```
435 pub fn iter(&self) -> ByteBoxIterator {
436 ByteBoxIterator {
437 byte_box: &self,
438 entry: None,
439 index: 0,
440 }
441 }
442 /// Provides a detailed, colored visualization of the hash table.
443 ///
444 /// This function prints the structure of the `ByteBox`, including each cell and its entries.
445 ///
446 /// # Examples
447 ///
448 /// ```rust
449 /// use bytesbox::ByteBox;
450 ///
451 /// let mut bytebox = ByteBox::new();
452 /// bytebox.insert(b"key", b"value");
453 /// bytebox.view_table();
454 /// ```
455 #[cfg(feature = "color")]
456 pub fn view_table(&self) {
457 // Cell Header
458 let bytebox_header = format!(
459 "{}, number of cell ({}), allocation ({})",
460 b"ByteBox".blue().bold().underline(),
461 self.len().red(),
462 self.allocation().red()
463 );
464 // Print separator before each cell
465 println!(
466 "{}",
467 "────────────────────────────────────────────────".blue()
468 );
469 println!("{}", bytebox_header);
470 for (index, cell) in self.cells.iter().enumerate() {
471 let mut current = cell.as_ref();
472 // Cell Header
473 let cell_header = format!(" Cell {}:", index).magenta();
474 // Print separator before each cell
475 println!(
476 "{}",
477 "────────────────────────────────────────────────".red()
478 );
479 println!("{}", cell_header);
480
481 while let Some(entry) = current {
482 let mut max_key_len = 0;
483 let mut max_val_len = 0;
484
485 let k_len = entry.key.len();
486 let v_len = entry.value.len();
487
488 if k_len > max_key_len {
489 max_key_len = k_len;
490 }
491 if v_len > max_val_len {
492 max_val_len = v_len;
493 }
494
495 // Determine the longest length
496 let get_longest_len = std::cmp::max(max_key_len, max_val_len);
497 let k_closing_pipe = get_longest_len - k_len;
498 let v_closing_pipe = get_longest_len - v_len;
499 // Start of the cell box
500 // key val display Start
501 println!(
502 " {}",
503 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
504 );
505 // Key and value with arrows
506 println!(
507 " {}",
508 format!(
509 "| {} |->| {}{} |",
510 "k".red(),
511 format!("{}", String::from_utf8_lossy(&entry.key)).green(),
512 " ".repeat(k_closing_pipe)
513 )
514 );
515 println!(
516 " {}",
517 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
518 );
519 println!(
520 " {}",
521 format!(
522 "| {} |->| {}{} |",
523 "v".red(),
524 format!("{}", String::from_utf8_lossy(&entry.value)).yellow(),
525 " ".repeat(v_closing_pipe)
526 )
527 );
528 println!(
529 " {}",
530 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
531 );
532 // key val display END
533
534 // represantation on the Entry START
535 println!(" | byte_box | contains:");
536 let box_container = format!(
537 " {}{}+",
538 "| +-------------------------------",
539 "-".repeat(get_longest_len)
540 );
541 println!("{}", box_container);
542 let box_container_len = box_container.len() - 36;
543 println!(
544 " {}{}|",
545 "| | Entry: ",
546 " ".repeat(get_longest_len)
547 );
548 println!(
549 " {}{}|",
550 format!(
551 "| | - key: Vec<u8> ({})",
552 format!("{}", String::from_utf8_lossy(&entry.key)).green()
553 ),
554 " ".repeat(box_container_len - k_len)
555 );
556 println!(
557 " {}{}|",
558 format!(
559 "| | - val: Vec<u8> ({})",
560 format!("{}", String::from_utf8_lossy(&entry.value)).yellow()
561 ),
562 " ".repeat(box_container_len - v_len)
563 );
564 println!(
565 " | | - next: None {}|",
566 " ".repeat(get_longest_len)
567 );
568 println!(
569 " {}{}+",
570 "| +-------------------------------",
571 "-".repeat(get_longest_len)
572 );
573 println!(" {}{}+", "+-------", "-".repeat(box_container_len + 24));
574 current = entry.next.as_ref();
575 }
576 // Indicate that the cell is empty in red
577 println!(" {}", b"Empty".red());
578
579 // representation of the Entry END
580 }
581
582 // Separator line
583 println!(
584 "{}",
585 "────────────────────────────────────────────────".red()
586 );
587 println!(
588 "{}",
589 "────────────────────────────────────────────────".blue()
590 );
591 }
592 #[cfg(not(feature = "color"))]
593 pub fn view_table(&self) {
594 // Cell Header
595 let bytebox_header = format!(
596 "{}, number of cell ({}), allocation ({})",
597 "ByteBox",
598 self.len(),
599 self.allocation()
600 );
601 // Print separator before each cell
602 println!("{}", "────────────────────────────────────────────────");
603 println!("{}", bytebox_header);
604 for (index, cell) in self.cells.iter().enumerate() {
605 let mut current = cell.as_ref();
606 // Cell Header
607 let cell_header = format!(" Cell {}:", index);
608 // Print separator before each cell
609 println!("{}", "────────────────────────────────────────────────");
610 println!("{}", cell_header);
611
612 while let Some(entry) = current {
613 let mut max_key_len = 0;
614 let mut max_val_len = 0;
615
616 let k_len = entry.key.len();
617 let v_len = entry.value.len();
618
619 if k_len > max_key_len {
620 max_key_len = k_len;
621 }
622 if v_len > max_val_len {
623 max_val_len = v_len;
624 }
625
626 // Determine the longest length
627 let get_longest_len = std::cmp::max(max_key_len, max_val_len);
628 let k_closing_pipe = get_longest_len - k_len;
629 let v_closing_pipe = get_longest_len - v_len;
630 // Start of the cell box
631 // key val display Start
632 println!(
633 " {}",
634 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
635 );
636 // Key and value with arrows
637 println!(
638 " {}",
639 format!(
640 "| {} |->| {}{} |",
641 "k",
642 format!("{}", String::from_utf8_lossy(&entry.key)),
643 " ".repeat(k_closing_pipe)
644 )
645 );
646 println!(
647 " {}",
648 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
649 );
650 println!(
651 " {}",
652 format!(
653 "| {} |->| {}{} |",
654 "v",
655 format!("{}", String::from_utf8_lossy(&entry.value)),
656 " ".repeat(v_closing_pipe)
657 )
658 );
659 println!(
660 " {}",
661 format!("+---+ +-{}-+", "-".repeat(get_longest_len))
662 );
663 // key val display END
664
665 // represantation on the Entry START
666 println!(" | byte_box | contains:");
667 let box_container = format!(
668 " {}{}+",
669 "| +-------------------------------",
670 "-".repeat(get_longest_len)
671 );
672 println!("{}", box_container);
673 let box_container_len = box_container.len() - 36;
674 println!(
675 " {}{}|",
676 "| | Entry: ",
677 " ".repeat(get_longest_len)
678 );
679 println!(
680 " {}{}|",
681 format!(
682 "| | - key: Vec<u8> ({})",
683 format!("{}", String::from_utf8_lossy(&entry.key))
684 ),
685 " ".repeat(box_container_len - k_len)
686 );
687 println!(
688 " {}{}|",
689 format!(
690 "| | - val: Vec<u8> ({})",
691 format!("{}", String::from_utf8_lossy(&entry.value))
692 ),
693 " ".repeat(box_container_len - v_len)
694 );
695 println!(
696 " | | - next: None {}|",
697 " ".repeat(get_longest_len)
698 );
699 println!(
700 " {}{}+",
701 "| +-------------------------------",
702 "-".repeat(get_longest_len)
703 );
704 println!(" {}{}+", "+-------", "-".repeat(box_container_len + 24));
705 current = entry.next.as_ref();
706 }
707 // Indicate that the cell is empty in red
708 println!(" {}", "Empty");
709
710 // representation of the Entry END
711 }
712
713 // Separator line
714 println!("{}", "────────────────────────────────────────────────");
715 println!("{}", "────────────────────────────────────────────────");
716 }
717}