bytesbox/
lib.rs

1//! # ByteBox
2//!
3//! `ByteBox` is a high-performance, memory-efficient hash table implemented in Rust, designed specifically for scenarios where keys and values are naturally represented as byte arrays (`Vec<u8>`). This crate offers a robust and flexible solution for storing and managing byte-based key-value pairs, making it ideal for applications in networking, data serialization, caching, and more.
4//!
5//! ## Key Features
6//!
7//! - **Efficient Storage:** Utilizes separate chaining with linked lists to handle hash collisions, ensuring quick insertion and retrieval even under high load.
8//! - **Dynamic Resizing:** Automatically resizes the underlying storage when the load factor exceeds a predefined threshold, maintaining optimal performance and preventing excessive collisions.
9//! - **Primitive Type Support:** Provides convenient methods to insert primitive types by converting them into their byte representations, simplifying the process of storing numerical and other basic data types.
10//! - **Iterative Access:** Implements iterator traits, allowing seamless traversal of all key-value pairs within the `ByteBox`, facilitating operations like searching, filtering, and bulk processing.
11//! - **Customizable Hashing:** Leverages Rust’s `DefaultHasher` for hashing keys, ensuring a good distribution of entries across the hash table and minimizing collision rates.
12//! - **User-Friendly Display:** Offers a formatted and colored visualization of the hash table’s structure, aiding in debugging and providing insights into the distribution of entries.
13//! - **Comprehensive Documentation:** Comes with detailed documentation for all public interfaces, making it easy for developers to integrate and utilize `ByteBox` effectively in their projects.
14//!
15//! ## Design and Implementation
16//!
17//! `ByteBox` is built around the concept of storing keys and values as byte vectors, allowing for a wide range of applications where data is naturally in byte form or can be easily converted. The core structure consists of a vector of optional `Entry` boxes, each representing a key-value pair. By using separate chaining, `ByteBox` efficiently manages collisions, ensuring that even with a large number of entries, performance remains consistent.
18//!
19//! The crate emphasizes simplicity and efficiency, providing a straightforward API for common operations such as insertion, retrieval, and removal of entries. Additionally, the support for primitive types through the `BytesPrimitives` trait simplifies the process of working with numerical data, reducing the overhead of manual byte conversions.
20//!
21//! ## Use Cases
22//!
23//! - **Networking:** Ideal for managing protocol headers and payloads where data is inherently byte-oriented.
24//! - **Data Serialization:** Facilitates the storage and retrieval of serialized data structures, enabling efficient caching and quick access.
25//! - **Caching Systems:** Serves as an effective in-memory cache for applications requiring fast lookup and storage of byte-based data.
26//! - **Configuration Management:** Allows for the storage of configuration settings and parameters in a compact byte format, enhancing performance and reducing memory footprint.
27//!
28//! ## Performance Considerations
29//!
30//! `ByteBox` is optimized for speed and memory usage, making it suitable for performance-critical applications. Its dynamic resizing mechanism ensures that the hash table maintains a low load factor, minimizing collisions and ensuring that operations remain efficient as the number of entries grows. By leveraging Rust’s ownership and borrowing principles, `ByteBox` ensures safe and concurrent access patterns without sacrificing performance.
31//!
32//! ## Getting Started
33//!
34//! Integrating `ByteBox` into your Rust project is straightforward. Simply add it as a dependency in your `Cargo.toml` and start utilizing its powerful API to manage your byte-based key-value pairs with ease and efficiency.
35//!
36//! ---
37//!
38//! ## Safety Considerations
39//!
40//!The `remove` method uses `unsafe` code to manipulate pointers for efficient removal of entries. Care has been taken to ensure this is safe, but users should be aware of the risks associated with `unsafe` blocks.
41pub mod iterator;
42pub mod primitives;
43
44use iterator::*;
45use primitives::*;
46
47#[cfg(feature = "color")]
48use bytescolor::ByteColor;
49
50use std::collections::hash_map::DefaultHasher;
51use std::fmt::{self, Display};
52use std::hash::{Hash, Hasher};
53
54/// Represents a key-value pair within the `ByteBox` hash table.
55/// Each `Entry` may point to the next entry in case of hash collisions.
56#[derive(Debug, Clone)]
57struct Entry {
58    key: Vec<u8>,
59    value: Vec<u8>,
60    next: Option<Box<Entry>>,
61}
62
63/// A hash table implementation that stores key-value pairs as byte vectors.
64/// Uses separate chaining to handle hash collisions.
65///
66/// # Examples
67///
68/// ```rust
69/// use bytesbox::ByteBox;
70///
71/// let mut bytebox = ByteBox::new();
72/// bytebox.insert(b"key1", b"value1");
73/// bytebox.insert(b"key2", b"value2");
74///
75/// assert_eq!(bytebox.get(b"key1"), Some(&b"value1"[..]));
76/// assert_eq!(bytebox.len(), 2);
77/// ```
78#[derive(Clone, Debug)]
79pub struct ByteBox {
80    cells: Vec<Option<Box<Entry>>>,
81    alloc: usize,
82    len: usize,
83    load_factor_threshold: f32,
84}
85
86impl Display for ByteBox {
87    /// Formats the `ByteBox` for display purposes.
88    ///
89    /// This implementation displays the contents in a readable key-value format.
90    ///
91    /// # Examples
92    ///
93    /// ```rust
94    /// use bytesbox::ByteBox;
95    ///
96    /// let mut bytebox = ByteBox::new();
97    /// bytebox.insert(b"key", b"value");
98    /// println!("{}", bytebox);
99    /// ```
100    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
101        write!(f, "{{")?;
102
103        let mut first = true;
104        for (_, cell) in self.cells.iter().enumerate() {
105            let mut current = cell.as_ref();
106            while let Some(entry) = current {
107                if !first {
108                    write!(f, ", ")?;
109                }
110                write!(
111                    f,
112                    "{:?}: {:?}",
113                    String::from_utf8_lossy(&entry.key),
114                    String::from_utf8_lossy(&entry.value)
115                )?;
116                current = entry.next.as_ref();
117            }
118            first = false;
119        }
120
121        write!(f, "}}")
122    }
123}
124
125impl ByteBox {
126    /// Creates a new `ByteBox` with a default initial capacity of 16 cells.
127    ///
128    /// # Examples
129    ///
130    /// ```rust
131    /// use bytesbox::ByteBox;
132    ///
133    /// let bytebox = ByteBox::new();
134    /// assert_eq!(bytebox.len(), 0);
135    /// ```
136    pub fn new() -> Self {
137        Self::prealloc(16)
138    }
139
140    /// Creates a new `ByteBox` with a specified initial capacity.
141    ///
142    /// # Arguments
143    ///
144    /// * `size` - The initial number of cells to allocate.
145    ///
146    /// # Examples
147    ///
148    /// ```rust
149    /// use bytesbox::ByteBox;
150    ///
151    /// let bytebox = ByteBox::prealloc(32);
152    /// assert_eq!(bytebox.allocation(), 32);
153    /// ```
154    pub fn prealloc(size: usize) -> Self {
155        ByteBox {
156            cells: vec![None; size],
157            alloc: size,
158            len: 0,
159            load_factor_threshold: 0.75,
160        }
161    }
162
163    /// Returns the number of key-value pairs stored in the `ByteBox`.
164    ///
165    /// # Examples
166    ///
167    /// ```rust
168    /// use bytesbox::ByteBox;
169    ///
170    /// let mut bytebox = ByteBox::new();
171    /// assert_eq!(bytebox.len(), 0);
172    /// bytebox.insert(b"key", b"value");
173    /// assert_eq!(bytebox.len(), 1);
174    /// ```
175    pub fn len(&self) -> usize {
176        self.len
177    }
178
179    /// Returns the current allocation size (number of cells) of the `ByteBox`.
180    ///
181    /// # Examples
182    ///
183    /// ```rust
184    /// use bytesbox::ByteBox;
185    ///
186    /// let bytebox = ByteBox::new();
187    /// assert_eq!(bytebox.allocation(), 16);
188    /// ```
189    pub fn allocation(&self) -> usize {
190        self.alloc
191    }
192
193    /// Inserts a key-value pair into the `ByteBox`.
194    ///
195    /// If the key already exists, its value is updated.
196    /// If the load factor exceeds the threshold after insertion, the table is resized.
197    ///
198    /// # Arguments
199    ///
200    /// * `key` - A byte slice representing the key.
201    /// * `value` - A byte slice representing the value.
202    ///
203    /// # Returns
204    ///
205    /// * `true` if a new key-value pair was inserted.
206    /// * `false` if an existing key was updated.
207    ///
208    /// # Examples
209    ///
210    /// ```rust
211    /// use bytesbox::ByteBox;
212    ///
213    /// let mut bytebox = ByteBox::new();
214    /// assert!(bytebox.insert(b"key1", b"value1"));
215    /// assert!(!bytebox.insert(b"key1", b"value2"));
216    /// assert_eq!(bytebox.get(b"key1"), Some(&b"value2"[..]));
217    /// ```
218    pub fn insert(&mut self, key: &[u8], value: &[u8]) -> bool {
219        if (self.len as f32) / (self.alloc as f32) >= self.load_factor_threshold {
220            self.resize();
221        }
222
223        let idx = Self::hash(key, self.alloc);
224        let mut current = &mut self.cells[idx];
225
226        while let Some(entry) = current {
227            if entry.key == key {
228                entry.value = value.to_vec();
229                return false;
230            }
231            current = &mut entry.next;
232        }
233
234        let new_entry = Box::new(Entry {
235            key: key.to_vec(),
236            value: value.to_vec(),
237            next: self.cells[idx].take(),
238        });
239        self.cells[idx] = Some(new_entry);
240        self.len += 1;
241
242        true
243    }
244
245    /// Inserts a key and a primitive value into the `ByteBox`.
246    ///
247    /// The primitive value is converted to its byte representation using the `BytesPrimitives` trait.
248    ///
249    /// # Type Parameters
250    ///
251    /// * `T` - A type that implements the `BytesPrimitives` trait.
252    ///
253    /// # Arguments
254    ///
255    /// * `key` - A byte slice representing the key.
256    /// * `value` - A primitive value that can be converted into bytes.
257    ///
258    /// # Examples
259    ///
260    /// ```rust
261    /// use bytesbox::ByteBox;
262    /// use bytesbox::primitives::BytesPrimitives;
263    ///
264    /// let mut bytebox = ByteBox::new();
265    /// bytebox.insert_primitive(b"number", 42u32);
266    /// assert_eq!(bytebox.get(b"number"), Some(&b"42"[..]));
267    /// ```
268    pub fn insert_primitive<T: BytesPrimitives>(&mut self, key: &[u8], value: T) {
269        self.insert(key, &value.to_bytes());
270    }
271
272    /// Retrieves the value associated with the given key.
273    ///
274    /// # Arguments
275    ///
276    /// * `key` - A byte slice representing the key to look up.
277    ///
278    /// # Returns
279    ///
280    /// * `Some(&[u8])` containing the value if the key exists.
281    /// * `None` if the key does not exist in the `ByteBox`.
282    ///
283    /// # Examples
284    ///
285    /// ```rust
286    /// use bytesbox::ByteBox;
287    ///
288    /// let mut bytebox = ByteBox::new();
289    /// bytebox.insert(b"key", b"value");
290    /// assert_eq!(bytebox.get(b"key"), Some(&b"value"[..]));
291    /// assert_eq!(bytebox.get(b"nonexistent"), None);
292    /// ```
293    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
294        let idx = Self::hash(key, self.alloc);
295        let mut current = self.cells[idx].as_ref();
296
297        while let Some(entry) = current {
298            if entry.key == key {
299                return Some(&entry.value.as_slice());
300            }
301            current = entry.next.as_ref();
302        }
303
304        None
305    }
306
307    /// Removes the key-value pair associated with the given key from the `ByteBox`.
308    ///
309    /// # Arguments
310    ///
311    /// * `key` - A byte slice representing the key to remove.
312    ///
313    /// # Returns
314    ///
315    /// * `Some(Vec<u8>)` containing the removed value if the key existed.
316    /// * `None` if the key was not found.
317    ///
318    /// # Examples
319    ///
320    /// ```rust
321    /// use bytesbox::ByteBox;
322    ///
323    /// let mut bytebox = ByteBox::new();
324    /// bytebox.insert(b"key", b"value");
325    /// assert_eq!(bytebox.remove(b"key"), Some(b"value".to_vec()));
326    /// assert_eq!(bytebox.remove(b"key"), None);
327    /// ```
328    pub fn remove(&mut self, key: &[u8]) -> Option<Vec<u8>> {
329        let idx = Self::hash(key, self.alloc);
330        let cell = &mut self.cells[idx];
331
332        let mut prev = cell as *mut Option<Box<Entry>>;
333        let mut curr = cell.as_mut();
334
335        while let Some(entry) = curr {
336            if entry.key == key {
337                let removed_val = entry.value.clone();
338                unsafe {
339                    *prev = entry.next.take();
340                }
341                self.len -= 1;
342                return Some(removed_val);
343            }
344            prev = &mut entry.next as *mut Option<Box<Entry>>;
345            curr = entry.next.as_mut();
346        }
347
348        None
349    }
350
351    /// Removes all key-value pairs from the `ByteBox`, resetting it to an empty state.
352    ///
353    /// This method retains the current capacity of the hash table, allowing it to be reused
354    /// without the overhead of reallocating memory. All existing entries are removed, and
355    /// the length (`len`) is set to zero.
356    ///
357    /// # Examples
358    ///
359    /// ```rust
360    /// use bytesbox::ByteBox;
361    ///
362    /// let mut bytebox = ByteBox::new();
363    /// bytebox.insert(b"key1", b"value1");
364    /// bytebox.insert(b"key2", b"value2");
365    /// assert_eq!(bytebox.len(), 2);
366    ///
367    /// bytebox.clear();
368    /// assert_eq!(bytebox.len(), 0);
369    /// assert_eq!(bytebox.get(b"key1"), None);
370    /// assert_eq!(bytebox.get(b"key2"), None);
371    /// ```
372    pub fn clear(&mut self) {
373        for cell in &mut self.cells {
374            *cell = None;
375        }
376        self.len = 0;
377    }
378
379    /// Doubles the current capacity of the `ByteBox` and rehashes all existing entries.
380    ///
381    /// This method is called internally when the load factor exceeds the threshold.
382    fn resize(&mut self) {
383        let new_cap = self.alloc * 2;
384        let mut new_cells: Vec<Option<Box<Entry>>> = vec![None; new_cap];
385
386        for (_, cell) in self.cells.iter_mut().enumerate() {
387            let mut current = cell.take();
388            while let Some(mut entry) = current {
389                let idx = Self::hash(&entry.key, new_cap);
390                current = entry.next.take();
391                entry.next = new_cells[idx].take();
392                new_cells[idx] = Some(entry);
393            }
394        }
395
396        self.cells = new_cells;
397        self.alloc = new_cap;
398    }
399
400    /// Computes the hash index for a given key based on the current capacity.
401    ///
402    /// # Arguments
403    ///
404    /// * `key` - A byte slice representing the key to hash.
405    /// * `capacity` - The current or new capacity of the hash table.
406    ///
407    /// # Returns
408    ///
409    /// * `usize` representing the index in the cells vector.
410    fn hash(key: &[u8], capacity: usize) -> usize {
411        let mut hasher = DefaultHasher::new();
412        key.hash(&mut hasher);
413        let hash = hasher.finish();
414        let index = (hash as usize) % capacity;
415        index
416    }
417
418    /// Provides an iterator over the `ByteBox` that allows for iteration using `for` loops.
419    ///
420    /// This enables the use of `ByteBox` in contexts where an iterator is expected.
421    ///
422    /// # Examples
423    ///
424    /// ```rust
425    /// use bytesbox::ByteBox;
426    ///
427    /// let mut bytebox = ByteBox::new();
428    /// bytebox.insert(b"key1", b"value1");
429    /// bytebox.insert(b"key2", b"value2");
430    ///
431    /// for (key, value) in bytebox.iter() {
432    ///     println!("{:?}: {:?}", key, value);
433    /// }
434    /// ```
435    pub fn iter(&self) -> ByteBoxIterator {
436        ByteBoxIterator {
437            byte_box: &self,
438            entry: None,
439            index: 0,
440        }
441    }
442    /// Provides a detailed, colored visualization of the hash table.
443    ///
444    /// This function prints the structure of the `ByteBox`, including each cell and its entries.
445    ///
446    /// # Examples
447    ///
448    /// ```rust
449    /// use bytesbox::ByteBox;
450    ///
451    /// let mut bytebox = ByteBox::new();
452    /// bytebox.insert(b"key", b"value");
453    /// bytebox.view_table();
454    /// ```
455    #[cfg(feature = "color")]
456    pub fn view_table(&self) {
457        // Cell Header
458        let bytebox_header = format!(
459            "{}, number of cell ({}), allocation ({})",
460            b"ByteBox".blue().bold().underline(),
461            self.len().red(),
462            self.allocation().red()
463        );
464        // Print separator before each cell
465        println!(
466            "{}",
467            "────────────────────────────────────────────────".blue()
468        );
469        println!("{}", bytebox_header);
470        for (index, cell) in self.cells.iter().enumerate() {
471            let mut current = cell.as_ref();
472            // Cell Header
473            let cell_header = format!("  Cell {}:", index).magenta();
474            // Print separator before each cell
475            println!(
476                "{}",
477                "────────────────────────────────────────────────".red()
478            );
479            println!("{}", cell_header);
480
481            while let Some(entry) = current {
482                let mut max_key_len = 0;
483                let mut max_val_len = 0;
484
485                let k_len = entry.key.len();
486                let v_len = entry.value.len();
487
488                if k_len > max_key_len {
489                    max_key_len = k_len;
490                }
491                if v_len > max_val_len {
492                    max_val_len = v_len;
493                }
494
495                // Determine the longest length
496                let get_longest_len = std::cmp::max(max_key_len, max_val_len);
497                let k_closing_pipe = get_longest_len - k_len;
498                let v_closing_pipe = get_longest_len - v_len;
499                // Start of the cell box
500                // key val display Start
501                println!(
502                    "    {}",
503                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
504                );
505                // Key and value with arrows
506                println!(
507                    "    {}",
508                    format!(
509                        "| {} |->| {}{} |",
510                        "k".red(),
511                        format!("{}", String::from_utf8_lossy(&entry.key)).green(),
512                        " ".repeat(k_closing_pipe)
513                    )
514                );
515                println!(
516                    "    {}",
517                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
518                );
519                println!(
520                    "    {}",
521                    format!(
522                        "| {} |->| {}{} |",
523                        "v".red(),
524                        format!("{}", String::from_utf8_lossy(&entry.value)).yellow(),
525                        " ".repeat(v_closing_pipe)
526                    )
527                );
528                println!(
529                    "    {}",
530                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
531                );
532                // key val display END
533
534                // represantation on the Entry START
535                println!("    | byte_box | contains:");
536                let box_container = format!(
537                    "    {}{}+",
538                    "|           +-------------------------------",
539                    "-".repeat(get_longest_len)
540                );
541                println!("{}", box_container);
542                let box_container_len = box_container.len() - 36;
543                println!(
544                    "    {}{}|",
545                    "|           | Entry:                        ",
546                    " ".repeat(get_longest_len)
547                );
548                println!(
549                    "    {}{}|",
550                    format!(
551                        "|           | - key: Vec<u8> ({})",
552                        format!("{}", String::from_utf8_lossy(&entry.key)).green()
553                    ),
554                    " ".repeat(box_container_len - k_len)
555                );
556                println!(
557                    "    {}{}|",
558                    format!(
559                        "|           | - val: Vec<u8> ({})",
560                        format!("{}", String::from_utf8_lossy(&entry.value)).yellow()
561                    ),
562                    " ".repeat(box_container_len - v_len)
563                );
564                println!(
565                    "    |           | - next: None                  {}|",
566                    " ".repeat(get_longest_len)
567                );
568                println!(
569                    "    {}{}+",
570                    "|           +-------------------------------",
571                    "-".repeat(get_longest_len)
572                );
573                println!("    {}{}+", "+-------", "-".repeat(box_container_len + 24));
574                current = entry.next.as_ref();
575            }
576            // Indicate that the cell is empty in red
577            println!("    {}", b"Empty".red());
578
579            // representation of the Entry END
580        }
581
582        // Separator line
583        println!(
584            "{}",
585            "────────────────────────────────────────────────".red()
586        );
587        println!(
588            "{}",
589            "────────────────────────────────────────────────".blue()
590        );
591    }
592    #[cfg(not(feature = "color"))]
593    pub fn view_table(&self) {
594        // Cell Header
595        let bytebox_header = format!(
596            "{}, number of cell ({}), allocation ({})",
597            "ByteBox",
598            self.len(),
599            self.allocation()
600        );
601        // Print separator before each cell
602        println!("{}", "────────────────────────────────────────────────");
603        println!("{}", bytebox_header);
604        for (index, cell) in self.cells.iter().enumerate() {
605            let mut current = cell.as_ref();
606            // Cell Header
607            let cell_header = format!("  Cell {}:", index);
608            // Print separator before each cell
609            println!("{}", "────────────────────────────────────────────────");
610            println!("{}", cell_header);
611
612            while let Some(entry) = current {
613                let mut max_key_len = 0;
614                let mut max_val_len = 0;
615
616                let k_len = entry.key.len();
617                let v_len = entry.value.len();
618
619                if k_len > max_key_len {
620                    max_key_len = k_len;
621                }
622                if v_len > max_val_len {
623                    max_val_len = v_len;
624                }
625
626                // Determine the longest length
627                let get_longest_len = std::cmp::max(max_key_len, max_val_len);
628                let k_closing_pipe = get_longest_len - k_len;
629                let v_closing_pipe = get_longest_len - v_len;
630                // Start of the cell box
631                // key val display Start
632                println!(
633                    "    {}",
634                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
635                );
636                // Key and value with arrows
637                println!(
638                    "    {}",
639                    format!(
640                        "| {} |->| {}{} |",
641                        "k",
642                        format!("{}", String::from_utf8_lossy(&entry.key)),
643                        " ".repeat(k_closing_pipe)
644                    )
645                );
646                println!(
647                    "    {}",
648                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
649                );
650                println!(
651                    "    {}",
652                    format!(
653                        "| {} |->| {}{} |",
654                        "v",
655                        format!("{}", String::from_utf8_lossy(&entry.value)),
656                        " ".repeat(v_closing_pipe)
657                    )
658                );
659                println!(
660                    "    {}",
661                    format!("+---+  +-{}-+", "-".repeat(get_longest_len))
662                );
663                // key val display END
664
665                // represantation on the Entry START
666                println!("    | byte_box | contains:");
667                let box_container = format!(
668                    "    {}{}+",
669                    "|           +-------------------------------",
670                    "-".repeat(get_longest_len)
671                );
672                println!("{}", box_container);
673                let box_container_len = box_container.len() - 36;
674                println!(
675                    "    {}{}|",
676                    "|           | Entry:                        ",
677                    " ".repeat(get_longest_len)
678                );
679                println!(
680                    "    {}{}|",
681                    format!(
682                        "|           | - key: Vec<u8> ({})",
683                        format!("{}", String::from_utf8_lossy(&entry.key))
684                    ),
685                    " ".repeat(box_container_len - k_len)
686                );
687                println!(
688                    "    {}{}|",
689                    format!(
690                        "|           | - val: Vec<u8> ({})",
691                        format!("{}", String::from_utf8_lossy(&entry.value))
692                    ),
693                    " ".repeat(box_container_len - v_len)
694                );
695                println!(
696                    "    |           | - next: None                  {}|",
697                    " ".repeat(get_longest_len)
698                );
699                println!(
700                    "    {}{}+",
701                    "|           +-------------------------------",
702                    "-".repeat(get_longest_len)
703                );
704                println!("    {}{}+", "+-------", "-".repeat(box_container_len + 24));
705                current = entry.next.as_ref();
706            }
707            // Indicate that the cell is empty in red
708            println!("    {}", "Empty");
709
710            // representation of the Entry END
711        }
712
713        // Separator line
714        println!("{}", "────────────────────────────────────────────────");
715        println!("{}", "────────────────────────────────────────────────");
716    }
717}