compressed_intvec/fixed/mod.rs
1//! A generic, compressed, and randomly accessible vector with fixed-width encoding.
2//!
3//! This module provides [`FixedVec`] and its thread-safe counterpart, [`AtomicFixedVec`],
4//! two data structures for storing sequences of integers where each element is encoded
5//! using the same number of bits. This strategy, known as fixed-width encoding,
6//! is suitable for data where values are bounded within a known range, as it
7//! allows for O(1) random access by directly calculating the memory location of any element.
8//!
9//! # Core Concepts
10//!
11//! The design is generic over four parameters: `FixedVec<T, W, E, B>`.
12//! - `T`: The element type as seen by the user (e.g., [`u32`], [`i16`]), constrained by the [`Storable`] trait.
13//! - `W`: The underlying unsigned integer type used for storage (e.g., [`u64`], [`usize`]), constrained by the [`Word`] trait.
14//! - `E`: The [`Endianness`] (e.g., [`LE`] or [`BE`]) for bit-level operations.
15//! - `B`: The backing buffer, which abstracts over ownership. This allows [`FixedVec`] to be an owned container (e.g., `Vec<W>`) or a zero-copy, borrowed view (e.g., `&[W]`).
16//!
17//! ### Immutability and Zero-Copy Views
18//! Immutable access is performed in O(1) time by calculating an element's bit-level offset.
19//! Structures like [`FixedVecSlice`](slice::FixedVecSlice) provide zero-copy views into a vector, representing
20//! a sub-region of the data without requiring data duplication or new allocations.
21//!
22//! ### Mutability via Proxy Objects
23//! Direct mutable references (`&mut T`) to individual bit-packed elements are not possible, as
24//! elements do not align with byte boundaries and may not even exist as discrete entities in memory.
25//! Instead, mutability is managed through the proxy object pattern. Methods like [`at_mut`](FixedVec::at_mut) return a temporary proxy ([`MutProxy`]) that holds a decoded copy of the value.
26//! Modifications are applied to this copy, and the value is automatically encoded and written
27//! back into the vector's bitstream when the proxy object goes out of scope (i.e., is dropped).
28//!
29//! # Core Data Structures
30//!
31//! - [`FixedVec`]: The primary implementation for single-threaded contexts.
32//! - [`AtomicFixedVec`]: A thread-safe variant for concurrent applications. It provides an
33//! API analogous to Rust's standard atomic types ([`load`](AtomicFixedVec::load), [`store`](AtomicFixedVec::store), [`fetch_add`](AtomicFixedVec::fetch_add), etc.).
34//! It uses lock-free atomic instructions for elements contained within a single machine word and
35//! a fine-grained locking strategy (lock striping) for elements that span word boundaries. This
36//! hybrid approach ensures thread safety for any `bit-width` configuration.
37//!
38//! # Main Components
39//!
40//! - [`FixedVec`] and [`AtomicFixedVec`]
41//! - [`BitWidth`]: An enum to control the bit-width selection strategy. Options include:
42//! - [`BitWidth::Minimal`]: Selects the minimal bit-width required.
43//! - [`BitWidth::PowerOfTwo`]: Rounds up to the nearest power of two.
44//! - [`BitWidth::Explicit`]: Allows specifying a fixed bit-width.
45//! - **Builders**: [`FixedVecBuilder`](builder::FixedVecBuilder) and [`FixedVecFromIterBuilder`](builder::FixedVecFromIterBuilder)
46//! - **Slices**: [`FixedVecSlice`](slice::FixedVecSlice) for creating immutable or mutable views.
47//!
48//! # Examples
49//!
50//! Create a [`FixedVec`] from a slice of data. The builder will
51//! automatically determine the minimal number of bits required.
52//!
53//! ```
54//! use compressed_intvec::fixed::{FixedVec, UFixedVec};
55//!
56//! // The numbers 0-7 can all be represented in 3 bits.
57//! let data: Vec<u32> = (0..8).collect();
58//!
59//! // The builder infers that `bit_width` should be 3.
60//! let vec: UFixedVec<u32> = FixedVec::builder()
61//! .build(&data)
62//! .unwrap();
63//!
64//! assert_eq!(vec.len(), 8);
65//! assert_eq!(vec.bit_width(), 3);
66//! assert_eq!(vec.get(5), Some(5));
67//! ```
68//!
69//! ## Storing Signed Integers
70//!
71//! [`FixedVec`] can store signed integers. The underlying storage uses zig-zag encoding,
72//! which maps small negative and positive numbers to small unsigned integers.
73//!
74//! ```
75//! use compressed_intvec::fixed::{FixedVec, SFixedVec};
76//!
77//! // The values range from -2 to 1. Zig-zag encoding maps these to
78//! // unsigned values, so the maximum value is 3, which
79//! // requires 2 bits.
80//! let data: &[i16] = &[-2, -1, 0, 1];
81//! let vec: SFixedVec<i16> = FixedVec::builder().build(data).unwrap();
82//!
83//! assert_eq!(vec.bit_width(), 2);
84//! assert_eq!(vec.get(0), Some(-2));
85//! assert_eq!(vec.get(3), Some(1));
86//! ```
87//!
88//! # Implementation Notes
89//!
90//! To ensure safe and efficient memory access, [`FixedVec`] adds one padding
91//! word at the end of its storage buffer. This padding prevents out-of-bounds
92//! reads in methods like [`get_unchecked`](FixedVec::get_unchecked) and is a
93//! prerequisite for unaligned access with [`get_unaligned_unchecked`](FixedVec::get_unaligned_unchecked).
94//! The builders handle this padding automatically. When creating a [`FixedVec`] from raw parts,
95//! it is the caller's responsibility to ensure this padding is present.
96//!
97//! # Common Type Aliases
98//!
99//! To simplify usage, this crate provides several type aliases for common [`FixedVec`]
100//! configurations. In most cases, you should prefer using these aliases over the
101//! fully generic `FixedVec<T, W, E, B>` struct.
102//!
103//! ### General-Purpose Aliases
104//!
105//! These aliases use [`usize`] for the storage word, which is often the most
106//! efficient choice for the target architecture.
107//!
108//! - [`UFixedVec<T>`]: For unsigned integers (e.g., [`u8`], [`u16`], [`u32`]).
109//! - [`SFixedVec<T>`]: For signed integers (e.g., [`i8`], [`i16`], [`i32`]).
110//!
111//! ### Concrete Aliases for [`u64`]/[`i64`]
112//!
113//! These aliases are specialized for [`u64`]/[`i64`] elements stored in [`u64`] words:
114//!
115//! - [`LEFixedVec`]: [`u64`] elements, Little-Endian.
116//! - [`BEFixedVec`]: [`u64`] elements, Big-Endian.
117//! - [`LESFixedVec`]: [`i64`] elements, Little-Endian.
118//! - [`BESFixedVec`]: [`i64`] elements, Big-Endian.
119//!
120//! The [`atomic`] module provides a similar set of aliases like [`UAtomicFixedVec`]
121//! and [`SAtomicFixedVec`].
122
123// Declare and export submodules.
124#[macro_use]
125pub mod macros;
126pub mod builder;
127pub mod iter;
128pub mod iter_mut;
129pub mod parallel;
130pub mod proxy;
131pub mod slice;
132pub mod traits;
133
134pub mod atomic;
135
136// Conditionally compile the serde module.
137#[cfg(feature = "serde")]
138mod serde;
139
140use dsi_bitstream::{
141 prelude::Endianness,
142 traits::{BE, LE},
143};
144use mem_dbg::{MemDbg, MemSize};
145use num_traits::ToPrimitive;
146use std::{error::Error as StdError, fmt, iter::FromIterator, marker::PhantomData};
147use traits::{Storable, Word};
148
149use crate::fixed::{proxy::MutProxy, traits::DefaultParams};
150
151// Re-export atomic aliases for convenience.
152pub use atomic::{AtomicFixedVec, SAtomicFixedVec, UAtomicFixedVec};
153
154// Type aliases for common `FixedVec` configurations.
155
156/// A [`FixedVec`] for unsigned integers with a `usize` word and Little-Endian layout.
157///
158/// This is a convenient alias for a common configuration. The element type `T`
159/// can be any unsigned integer that implements [`Storable`], such as `u8`, `u16`,
160/// `u32`, [`u64`], `u128`, or `usize`.
161pub type UFixedVec<T, B = Vec<usize>> = FixedVec<T, usize, LE, B>;
162
163/// A [`FixedVec`] for signed integers with a `usize` word and Little-Endian layout.
164///
165/// This alias is suitable for general-purpose use with signed types like `i8`,
166/// `i16`, `i32`, [`i64`], `i128`, or `isize`.
167pub type SFixedVec<T, B = Vec<usize>> = FixedVec<T, usize, LE, B>;
168
169// --- Concrete Aliases for [`u64`]/[`i64`] elements ---
170// These are provided for common use cases and backward compatibility.
171
172/// A [`FixedVec`] for [`u64`] elements with a [`u64`] backend and Little-Endian layout.
173pub type LEFixedVec<B = Vec<u64>> = FixedVec<u64, u64, LE, B>;
174/// A [`FixedVec`] for [`i64`] elements with a [`u64`] backend and Little-Endian layout.
175pub type LESFixedVec<B = Vec<u64>> = FixedVec<i64, u64, LE, B>;
176
177/// A [`FixedVec`] for [`u64`] elements with a [`u64`] backend and Big-Endian layout.
178pub type BEFixedVec<B = Vec<u64>> = FixedVec<u64, u64, BE, B>;
179/// A [`FixedVec`] for [`i64`] elements with a [`u64`] backend and Big-Endian layout.
180pub type BESFixedVec<B = Vec<u64>> = FixedVec<i64, u64, BE, B>;
181
182/// Specifies the strategy for determining the number of bits for each integer.
183///
184/// This enum controls how the bit width of a [`FixedVec`] is determined during
185/// its construction. The choice of strategy involves a trade-off between memory
186/// usage and random access performance.
187///
188/// # Performance Considerations
189///
190/// - **[`Minimal`](BitWidth::Minimal) vs. [`PowerOfTwo`](BitWidth::PowerOfTwo)**: While [`Minimal`](BitWidth::Minimal) provides the most compact
191/// storage, [`PowerOfTwo`](BitWidth::PowerOfTwo) can offer better performance for certain operations.
192/// When the `bit_width` is a power of two (e.g., 8, 16, 32) and aligns with
193/// word boundaries, some in-place operations like [`map_in_place`](FixedVec::map_in_place) can use a
194/// faster, word-at-a-time algorithm.
195///
196/// - **[`Explicit`](BitWidth::Explicit)**: This is the fastest strategy at construction time, as it
197/// avoids the need to iterate through the input data to find the maximum
198/// value. Use this when the required bit width is known in advance.
199#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
200pub enum BitWidth {
201 /// Use the minimum number of bits required to store the largest value.
202 ///
203 /// This strategy analyzes the input data to find the maximum value and sets
204 /// the bit width accordingly. It ensures the most compact memory representation.
205 #[default]
206 Minimal,
207
208 /// Round the bit width up to the next power of two (e.g., 8, 16, 32).
209 ///
210 /// This strategy can improve random access performance for some in-place
211 /// operations, as they can be implemented more efficiently with bit-shift operations on aligned data.
212 PowerOfTwo,
213
214 /// Use a specific number of bits.
215 ///
216 /// This strategy enforces a user-defined bit width. If any value in the
217 /// input data exceeds what can be stored in this many bits, the build
218 /// process will fail.
219 Explicit(usize),
220}
221
222/// Defines errors that can occur during [`FixedVec`] operations.
223#[derive(Debug)]
224pub enum Error {
225 /// A value in the input data is too large to be stored with the configured
226 /// bit width.
227 ValueTooLarge {
228 /// The value that could not be stored.
229 value: u128,
230 /// The index of the value in the input data.
231 index: usize,
232 /// The configured number of bits.
233 bit_width: usize,
234 },
235 /// A parameter is invalid for the requested operation.
236 ///
237 /// This typically occurs if `bit_width` is larger than the storage word size
238 /// or if a provided buffer is too small.
239 InvalidParameters(String),
240}
241
242impl fmt::Display for Error {
243 fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
244 match self {
245 Error::ValueTooLarge {
246 value,
247 index,
248 bit_width,
249 } => write!(
250 f,
251 "value {value} at index {index} does not fit in {bit_width} bits"
252 ),
253 Error::InvalidParameters(s) => write!(f, "Invalid parameters: {s}"),
254 }
255 }
256}
257
258impl StdError for Error {}
259
260/// A compressed vector of integers with fixed-width encoding.
261///
262/// [`FixedVec`] stores a sequence of integers where each element is encoded using
263/// the same number of bits. This allows for O(1) random access by calculating
264/// the memory location of any element. It is suitable for data where values are
265/// bounded within a known range.
266///
267/// # Type Parameters
268///
269/// - `T`: The integer type for the elements (e.g., `u32`, `i16`). It must
270/// implement the [`Storable`] trait.
271/// - `W`: The underlying storage word (e.g., [`u64`], `usize`). It must implement
272/// the [`Word`] trait.
273/// - `E`: The [`Endianness`] (e.g., [`LE`] or [`BE`]) for bit-level operations.
274/// - `B`: The backend storage buffer, such as `Vec<W>` for an owned vector or
275/// `&[W]` for a borrowed view.
276///
277/// For common configurations, several [type aliases] are provided.
278///
279/// [type aliases]: crate::fixed#type-aliases
280#[derive(Debug, Clone, MemDbg, MemSize)]
281pub struct FixedVec<T: Storable<W>, W: Word, E: Endianness, B: AsRef<[W]> = Vec<W>> {
282 /// The underlying storage for the bit-packed data.
283 pub(crate) bits: B,
284 /// The number of bits used to encode each element.
285 pub(crate) bit_width: usize,
286 /// A bitmask with the lowest `bit_width` bits set to one.
287 pub(crate) mask: W,
288 /// The number of elements in the vector.
289 pub(crate) len: usize,
290 /// Zero-sized markers for the generic type parameters `T`, `W`, and `E`.
291 pub(crate) _phantom: PhantomData<(T, W, E)>,
292}
293
294// [`FixedVec`] builder implementation.
295impl<T, W, E> FixedVec<T, W, E, Vec<W>>
296where
297 T: Storable<W>,
298 W: Word,
299 E: Endianness,
300 dsi_bitstream::impls::BufBitWriter<E, dsi_bitstream::impls::MemWordWriterVec<W, Vec<W>>>:
301 dsi_bitstream::prelude::BitWrite<E, Error = std::convert::Infallible>,
302{
303 /// Creates a builder for constructing a [`FixedVec`] from a slice.
304 ///
305 /// The builder provides methods to customize the vector's properties, such
306 /// as the [`BitWidth`] strategy.
307 ///
308 /// # Examples
309 ///
310 /// ```
311 /// use compressed_intvec::fixed::{FixedVec, BitWidth, UFixedVec};
312 ///
313 /// let data: &[u32] = &[10, 20, 30, 40, 50];
314 /// let vec: UFixedVec<u32> = FixedVec::builder()
315 /// .bit_width(BitWidth::Minimal)
316 /// .build(data)
317 /// .unwrap();
318 ///
319 /// assert_eq!(vec.get(1), Some(20));
320 /// ```
321 pub fn builder() -> builder::FixedVecBuilder<T, W, E> {
322 builder::FixedVecBuilder::new()
323 }
324
325 /// Creates a builder for constructing a [`FixedVec`] from an iterator.
326 ///
327 /// This builder is suitable for large datasets provided by an iterator,
328 /// as it processes the data in a streaming fashion.
329 ///
330 /// # Arguments
331 ///
332 /// * `iter`: An iterator that yields the integer values.
333 /// * `bit_width`: The number of bits to use for each element. This must be
334 /// specified, as the builder cannot analyze the data in advance.
335 ///
336 /// # Examples
337 ///
338 /// ```
339 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
340 ///
341 /// let data = 0..100u32;
342 /// let vec: UFixedVec<u32> = FixedVec::from_iter_builder(data, 7)
343 /// .build()
344 /// .unwrap();
345 ///
346 /// assert_eq!(vec.len(), 100);
347 /// assert_eq!(vec.bit_width(), 7);
348 /// assert_eq!(vec.get(99), Some(99));
349 /// ```
350 pub fn from_iter_builder<I: IntoIterator<Item = T>>(
351 iter: I,
352 bit_width: usize,
353 ) -> builder::FixedVecFromIterBuilder<T, W, E, I> {
354 builder::FixedVecFromIterBuilder::new(iter, bit_width)
355 }
356
357 /// Creates an owned [`FixedVec`] from a slice of data.
358 ///
359 /// This method is a convenience wrapper around the builder API, using
360 /// the default bit width strategy ([`BitWidth::Minimal`]).
361 ///
362 /// # Examples
363 ///
364 /// ```rust
365 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
366 ///
367 /// let data: &[u32] = &[10, 20, 30];
368 ///
369 /// let vec: UFixedVec<u32> = FixedVec::from_slice(data).unwrap();
370 ///
371 /// assert_eq!(vec.len(), 3);
372 /// assert_eq!(vec.bit_width(), 5); // 30 fits in 5 bits
373 /// assert_eq!(vec.get(0), Some(10));
374 /// ```
375 pub fn from_slice(slice: &[T]) -> Result<Self, Error> {
376 Self::builder().build(slice)
377 }
378}
379
380// Core immutable API.
381impl<T, W, E, B> FixedVec<T, W, E, B>
382where
383 T: Storable<W>,
384 W: Word,
385 E: Endianness,
386 B: AsRef<[W]>,
387{
388 /// Creates a [`FixedVec`] from its raw components.
389 ///
390 /// This constructor allows for the creation of a zero-copy view over an
391 /// existing buffer.
392 ///
393 /// # Errors
394 ///
395 /// Returns an [`Error::InvalidParameters`] if `bit_width` is larger than
396 /// the word size or if the `bits` buffer is too small to hold the specified
397 /// number of elements and the required padding.
398 ///
399 /// # Implementation Notes
400 ///
401 /// The provided `bits` buffer must contain at least one extra "padding"
402 /// word at the end. This padding is essential to prevent out-of-bounds
403 /// memory access in methods like `get_unchecked`.
404 ///
405 /// # Examples
406 ///
407 /// ```
408 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
409 ///
410 /// // 3 elements * 5 bits = 15 bits. This requires 1 data word.
411 /// // We need 1 extra word for padding.
412 /// let mut buffer = vec![0_usize; 2];
413 ///
414 /// // Manually encode data into the first word.
415 /// // 10 (01010), 20 (10100), 30 (11110)
416 /// buffer[0] = 0b11110_10100_01010;
417 ///
418 /// let vec = UFixedVec::<u32, _>::from_parts(&buffer, 3, 5).unwrap();
419 /// assert_eq!(vec.get(0), Some(10));
420 /// assert_eq!(vec.get(1), Some(20));
421 /// assert_eq!(vec.get(2), Some(30));
422 /// ```
423 pub fn from_parts(bits: B, len: usize, bit_width: usize) -> Result<Self, Error> {
424 if bit_width > <W as traits::Word>::BITS {
425 return Err(Error::InvalidParameters(format!(
426 "bit_width ({}) cannot be greater than the word size ({})",
427 bit_width,
428 <W as traits::Word>::BITS
429 )));
430 }
431
432 let total_bits = len * bit_width;
433 let data_words = total_bits.div_ceil(<W as traits::Word>::BITS);
434
435 // Essential safety check: ensure the buffer is large enough for the data
436 // AND the 1 padding word required.
437 if bits.as_ref().len() < data_words + 1 {
438 return Err(Error::InvalidParameters(format!(
439 "The provided buffer is too small. It has {} words, but {} data words + 1 padding word are required.",
440 bits.as_ref().len(),
441 data_words
442 )));
443 }
444
445 Ok(unsafe { Self::new_unchecked(bits, len, bit_width) })
446 }
447
448 /// Returns the number of elements in the vector.
449 #[inline]
450 pub fn len(&self) -> usize {
451 self.len
452 }
453
454 /// Returns `true` if the vector is empty.
455 #[inline]
456 pub fn is_empty(&self) -> bool {
457 self.len == 0
458 }
459
460 /// Returns the number of bits used to encode each element.
461 #[inline]
462 pub fn bit_width(&self) -> usize {
463 self.bit_width
464 }
465
466 /// Returns a read-only slice of the underlying storage words.
467 #[inline]
468 pub fn as_limbs(&self) -> &[W] {
469 self.bits.as_ref()
470 }
471
472 /// Returns a raw pointer to the start of the underlying buffer and its
473 /// length in words.
474 ///
475 /// # Safety
476 ///
477 /// The caller must ensure that the buffer is not mutated in a way that
478 /// violates the invariants of the [`FixedVec`] while the pointer is active.
479 pub fn as_raw_parts(&self) -> (*const W, usize) {
480 let slice = self.bits.as_ref();
481 (slice.as_ptr(), slice.len())
482 }
483
484 /// Creates a [`FixedVec`] from its raw components without performing checks.
485 ///
486 /// # Safety
487 ///
488 /// The caller must ensure that the following invariants are met:
489 /// 1. The `bits` buffer must be large enough to hold `len * bit_width` bits.
490 /// 2. The `bits` buffer must have at least one extra padding word at the end
491 /// to prevent out-of-bounds reads during access.
492 /// 3. `bit_width` must not be greater than the number of bits in `W`.
493 pub(crate) unsafe fn new_unchecked(bits: B, len: usize, bit_width: usize) -> Self {
494 let mask = if bit_width == <W as traits::Word>::BITS {
495 W::max_value()
496 } else {
497 (W::ONE << bit_width) - W::ONE
498 };
499
500 Self {
501 bits,
502 len,
503 bit_width,
504 mask,
505 _phantom: PhantomData,
506 }
507 }
508
509 /// Returns the element at the specified index, or `None` if the index is
510 /// out of bounds.
511 ///
512 /// This operation is O(1).
513 ///
514 /// # Examples
515 ///
516 /// ```
517 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
518 ///
519 /// let data: &[u32] = &[10, 20, 30];
520 /// let vec: UFixedVec<u32> = FixedVec::builder().build(data).unwrap();
521 ///
522 /// assert_eq!(vec.get(1), Some(20));
523 /// assert_eq!(vec.get(3), None);
524 /// ```
525 #[inline]
526 pub fn get(&self, index: usize) -> Option<T> {
527 if index >= self.len {
528 return None;
529 }
530 Some(unsafe { self.get_unchecked(index) })
531 }
532
533 /// Returns the element at the specified index without bounds checking.
534 ///
535 /// # Safety
536 ///
537 /// Calling this method with an out-of-bounds `index` is undefined behavior.
538 /// The `index` must be less than the vector's `len`.
539 #[inline(always)]
540 pub unsafe fn get_unchecked(&self, index: usize) -> T {
541 debug_assert!(index < self.len);
542
543 let bits_per_word = <W as traits::Word>::BITS;
544 // Optimization: if bit_width matches word size, access is trivial.
545 if self.bit_width == bits_per_word {
546 let val = *self.as_limbs().get_unchecked(index);
547 // Apply endianness correction if needed.
548 let final_val = if E::IS_BIG { val.to_be() } else { val };
549 return <T as Storable<W>>::from_word(final_val);
550 }
551
552 // Calculate the bit position and word index for the element.
553 let bit_pos = index * self.bit_width;
554 let word_index = bit_pos / bits_per_word;
555 let bit_offset = bit_pos % bits_per_word;
556
557 let limbs = self.as_limbs();
558 let final_word: W;
559
560 // The logic is specialized for endianness to maximize performance.
561 if E::IS_LITTLE {
562 // Fast path: the element is fully contained within a single word.
563 if bit_offset + self.bit_width <= bits_per_word {
564 final_word = (*limbs.get_unchecked(word_index) >> bit_offset) & self.mask;
565 } else {
566 // Slow path: the element spans two words.
567 // Read the low part from the first word and the high part from the next.
568 let low = *limbs.get_unchecked(word_index) >> bit_offset;
569 let high = *limbs.get_unchecked(word_index + 1) << (bits_per_word - bit_offset);
570 final_word = (low | high) & self.mask;
571 }
572 } else {
573 // Big-Endian logic requires byte-swapping the words before extraction.
574 let word_hi = (*limbs.get_unchecked(word_index)).to_be();
575 if bit_offset + self.bit_width <= bits_per_word {
576 final_word = (word_hi << bit_offset) >> (bits_per_word - self.bit_width);
577 } else {
578 let word_lo = (*limbs.get_unchecked(word_index + 1)).to_be();
579 let bits_in_first = bits_per_word - bit_offset;
580 let high = word_hi << bit_offset >> (bits_per_word - bits_in_first);
581 let low = word_lo >> (bits_per_word - (self.bit_width - bits_in_first));
582 final_word = (high << (self.bit_width - bits_in_first)) | low;
583 }
584 }
585 <T as Storable<W>>::from_word(final_word)
586 }
587
588 /// Returns the element at `index` using unaligned memory access.
589 ///
590 /// This method can be significantly faster for random access. It performs a
591 /// single, potentially unaligned read of a [`Word`] and extracts the value.
592 ///
593 /// # Performance
594 ///
595 /// This method is generally the fastest way to perform random reads on
596 /// Little-Endian architectures, especially when the `bit_width` is not a
597 /// power of two.
598 ///
599 /// - [`get_unchecked`](Self::get_unchecked): May require reading one or two separate machine words
600 /// and combining them with bit shifts. This is fast if the data is already
601 /// in the CPU cache.
602 /// - [`get_unaligned_unchecked`](Self::get_unaligned_unchecked): Performs a single memory read that may
603 /// cross word boundaries. It often results in fewer instructions and better throughput than the two
604 /// separate reads of [`get_unchecked`](Self::get_unchecked), especially in memory-bound scenarios.
605 ///
606 ///
607 /// # Safety
608 ///
609 /// Calling this method with an out-of-bounds `index` is undefined behavior.
610 /// The [`FixedVec`] must have been constructed with sufficient padding (at
611 /// least one [`Word`]) to guarantee that the unaligned read does not go
612 /// past the allocated buffer. This padding is guaranteed by the default
613 /// builders.
614 ///
615 /// # Implementation Notes
616 ///
617 /// For Big-Endian, this method falls back to the standard [`get_unchecked`](Self::get_unchecked) implementation.
618 #[inline(always)]
619 pub unsafe fn get_unaligned_unchecked(&self, index: usize) -> T {
620 debug_assert!(index < self.len);
621
622 if E::IS_LITTLE {
623 let bits_per_word = <W as Word>::BITS;
624 if self.bit_width == bits_per_word {
625 return self.get_unchecked(index);
626 }
627
628 let bit_pos = index * self.bit_width;
629 let byte_pos = bit_pos / 8;
630 let bit_rem = bit_pos % 8;
631
632 let limbs_ptr = self.as_limbs().as_ptr() as *const u8;
633
634 // Perform an unaligned read from the calculated byte position.
635 let word: W = (limbs_ptr.add(byte_pos) as *const W).read_unaligned();
636 let extracted_word = word >> bit_rem;
637
638 <T as Storable<W>>::from_word(extracted_word & self.mask)
639 } else {
640 // For Big-Endian, the logic for unaligned reads is complex and
641 // architecture-dependent. Fall back to the standard `get_unchecked`.
642 self.get_unchecked(index)
643 }
644 }
645
646 /// Returns an iterator over the elements of the vector.
647 ///
648 /// # Examples
649 ///
650 /// ```
651 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
652 ///
653 /// let data: &[u32] = &[10, 20, 30];
654 /// let vec: UFixedVec<u32> = FixedVec::builder().build(data).unwrap();
655 /// let mut iter = vec.iter();
656 ///
657 /// for (i, value) in iter.enumerate() {
658 /// assert_eq!(Some(value), vec.get(i));
659 /// }
660 /// ```
661 pub fn iter(&self) -> iter::FixedVecIter<'_, T, W, E, B> {
662 iter::FixedVecIter::new(self)
663 }
664
665 /// Returns an unchecked iterator over the elements of the vector.
666 ///
667 /// # Safety
668 ///
669 /// The caller must ensure that the iterator is not advanced beyond the
670 /// vector's length.
671 pub unsafe fn iter_unchecked(&self) -> iter::FixedVecUncheckedIter<'_, T, W, E, B> {
672 iter::FixedVecUncheckedIter::new(self)
673 }
674
675 /// Creates an immutable view (slice) of a sub-region of the vector.
676 ///
677 /// Returns `None` if the specified range is out of bounds.
678 ///
679 /// # Arguments
680 /// * `start`: The starting index of the slice.
681 /// * `len`: The number of elements in the slice.
682 pub fn slice(&self, start: usize, len: usize) -> Option<slice::FixedVecSlice<&Self>> {
683 if start.saturating_add(len) > self.len {
684 return None;
685 }
686 Some(slice::FixedVecSlice::new(self, start..start + len))
687 }
688
689 /// Splits the vector into two immutable views at a given index.
690 ///
691 /// Returns `None` if `mid` is out of bounds.
692 ///
693 /// # Arguments
694 /// * `mid`: The index at which to split the vector.
695 pub fn split_at(
696 &self,
697 mid: usize,
698 ) -> Option<(slice::FixedVecSlice<&Self>, slice::FixedVecSlice<&Self>)> {
699 if mid > self.len {
700 return None;
701 }
702 let left = slice::FixedVecSlice::new(self, 0..mid);
703 let right = slice::FixedVecSlice::new(self, mid..self.len);
704 Some((left, right))
705 }
706
707 /// Returns an iterator over non-overlapping chunks of the vector.
708 ///
709 /// Each chunk is a [`FixedVecSlice`] of length `chunk_size`, except for the
710 /// last chunk, which may be shorter.
711 ///
712 /// # Panics
713 ///
714 /// Panics if `chunk_size` is 0.
715 ///
716 /// [`FixedVecSlice`]: slice::FixedVecSlice
717 pub fn chunks(&self, chunk_size: usize) -> iter::Chunks<'_, T, W, E, B> {
718 iter::Chunks::new(self, chunk_size)
719 }
720
721 /// Returns an iterator over all contiguous windows of length `size`.
722 ///
723 /// The windows overlap. If the vector is shorter than `size`, the iterator
724 /// returns no values.
725 ///
726 /// # Panics
727 ///
728 /// Panics if `size` is 0.
729 ///
730 /// # Examples
731 ///
732 /// ```
733 /// use compressed_intvec::fixed_vec;
734 ///
735 /// let vec = fixed_vec![1u32, 2, 3, 4, 5];
736 /// let mut windows = vec.windows(3);
737 ///
738 /// let slice1 = vec.slice(0, 3).unwrap();
739 /// let slice2 = vec.slice(1, 3).unwrap();
740 /// let slice3 = vec.slice(2, 3).unwrap();
741 ///
742 /// assert_eq!(windows.next().unwrap(), slice1);
743 /// assert_eq!(windows.next().unwrap(), slice2);
744 /// assert_eq!(windows.next().unwrap(), slice3);
745 /// assert!(windows.next().is_none());
746 /// ```
747 pub fn windows(&self, size: usize) -> iter::Windows<'_, T, W, E, B> {
748 assert!(size != 0, "window size cannot be zero");
749 iter::Windows::new(self, size)
750 }
751
752 /// Returns a raw pointer to the storage word containing the start of an element.
753 ///
754 /// This method returns a pointer to the [`Word`] in the backing buffer where
755 /// the data for the element at `index` begins.
756 ///
757 /// Returns `None` if `index` is out of bounds.
758 ///
759 /// # Safety
760 ///
761 /// This method is safe as it only returns a raw pointer. Dereferencing the
762 /// pointer is `unsafe`. The caller must ensure that the pointer is not used
763 /// after the [`FixedVec`] is dropped or modified.
764 pub fn addr_of(&self, index: usize) -> Option<*const W> {
765 if index >= self.len {
766 return None;
767 }
768
769 let bit_pos = index * self.bit_width;
770 let word_idx = bit_pos / <W as Word>::BITS;
771
772 let limbs = self.as_limbs();
773 if word_idx < limbs.len() {
774 Some(limbs.as_ptr().wrapping_add(word_idx))
775 } else {
776 None
777 }
778 }
779
780 /// Hints to the CPU to prefetch the data for the element at `index` into cache.
781 ///
782 /// This method uses an intrinsic to reduce memory latency for predictable
783 /// access patterns. It only has an effect on architectures that support it
784 /// (e.g., x86, x86-64) and compiles to a no-op on other platforms.
785 pub fn prefetch(&self, index: usize) {
786 if index >= self.len {
787 return;
788 }
789
790 #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
791 {
792 use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
793
794 let bit_pos = index * self.bit_width;
795 let byte_pos = bit_pos / 8;
796
797 let limbs_ptr = self.as_limbs().as_ptr() as *const i8;
798
799 // SAFETY: Bounds check on `index` ensures `byte_pos` is within the
800 // allocated buffer (including padding). The pointer is valid.
801 unsafe {
802 // Prefetch into all cache levels (a good general-purpose default).
803 _mm_prefetch(limbs_ptr.add(byte_pos), _MM_HINT_T0);
804 }
805 }
806 }
807
808 /// Binary searches this vector for a given element.
809 ///
810 /// If the value is found, returns `Ok(usize)` with the index of the
811 /// matching element. If the value is not found, returns `Err(usize)` with
812 /// the index where the value could be inserted to maintain order.
813 ///
814 /// # Examples
815 ///
816 /// ```
817 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
818 ///
819 /// let data: &[u32] = &[10, 20, 30, 40, 50];
820 /// let vec: UFixedVec<u32> = FixedVec::builder().build(data).unwrap();
821 ///
822 /// assert_eq!(vec.binary_search(&30), Ok(2));
823 /// assert_eq!(vec.binary_search(&35), Err(3));
824 /// ```
825 pub fn binary_search(&self, value: &T) -> Result<usize, usize>
826 where
827 T: Ord,
828 {
829 let mut low = 0;
830 let mut high = self.len();
831
832 while low < high {
833 let mid = low + (high - low) / 2;
834 let mid_val = unsafe { self.get_unchecked(mid) };
835
836 match mid_val.cmp(value) {
837 std::cmp::Ordering::Less => low = mid + 1,
838 std::cmp::Ordering::Equal => return Ok(mid),
839 std::cmp::Ordering::Greater => high = mid,
840 }
841 }
842 Err(low)
843 }
844
845 /// Binary searches this vector with a key extraction function.
846 ///
847 /// This method is useful when searching for a value of a different type
848 /// than the elements of the slice. The function `f` is used to extract a
849 /// key of type `K` from an element, which is then compared to `key`.
850 pub fn binary_search_by_key<K: Ord, F>(&self, key: &K, mut f: F) -> Result<usize, usize>
851 where
852 F: FnMut(T) -> K,
853 {
854 self.binary_search_by(|probe| f(*probe).cmp(key))
855 }
856
857 /// Binary searches this vector with a custom comparison function.
858 ///
859 /// The comparator function `f` should return an `Ordering` indicating
860 /// the relation of a probe element to the value being searched for.
861 ///
862 /// # Examples
863 ///
864 /// ```
865 /// use compressed_intvec::fixed_vec;
866 /// use std::cmp::Ordering;
867 ///
868 /// let vec = fixed_vec![0u32, 1, 1, 2, 3];
869 ///
870 /// // Search for a value by comparing it to the probe.
871 /// let result = vec.binary_search_by(|probe| probe.cmp(&1));
872 /// assert!(matches!(result, Ok(1) | Ok(2)));
873 ///
874 /// let result_not_found = vec.binary_search_by(|probe| probe.cmp(&4));
875 /// assert_eq!(result_not_found, Err(5));
876 /// ```
877 pub fn binary_search_by<F>(&self, mut f: F) -> Result<usize, usize>
878 where
879 F: FnMut(&T) -> std::cmp::Ordering,
880 {
881 let mut low = 0;
882 let mut high = self.len();
883
884 while low < high {
885 let mid = low + (high - low) / 2;
886 // SAFETY: The loop invariants ensure `mid` is always in bounds.
887 let mid_val = unsafe { self.get_unchecked(mid) };
888
889 match f(&mid_val) {
890 std::cmp::Ordering::Less => low = mid + 1,
891 std::cmp::Ordering::Equal => return Ok(mid),
892 std::cmp::Ordering::Greater => high = mid,
893 }
894 }
895 Err(low)
896 }
897
898 /// Returns the index of the partition point of the vector.
899 ///
900 /// The vector is partitioned according to the predicate `pred`. This means
901 /// all elements for which `pred` returns `true` are on the left of the
902 /// partition point, and all elements for which `pred` returns `false` are
903 /// on the right.
904 ///
905 /// # Examples
906 ///
907 /// ```
908 /// use compressed_intvec::fixed_vec;
909 ///
910 /// let vec = fixed_vec![0u32, 1, 2, 3, 4, 5];
911 ///
912 /// // Find the partition point for elements `< 3`.
913 /// let partition_idx = vec.partition_point(|&x| x < 3);
914 /// assert_eq!(partition_idx, 3);
915 /// ```
916 pub fn partition_point<P>(&self, mut pred: P) -> usize
917 where
918 P: FnMut(&T) -> bool,
919 {
920 let mut len = self.len();
921 let mut left = 0;
922
923 while len > 0 {
924 let half = len / 2;
925 let mid = left + half;
926 // SAFETY: The loop invariants ensure `mid` is always in bounds.
927 let value = unsafe { self.get_unchecked(mid) };
928
929 if pred(&value) {
930 left = mid + 1;
931 len -= half + 1;
932 } else {
933 len = half;
934 }
935 }
936 left
937 }
938}
939
940/// Allows iterating over a borrowed [`FixedVec`] (e.g., `for val in &my_vec`).
941impl<'a, T, W, E, B> IntoIterator for &'a FixedVec<T, W, E, B>
942where
943 T: Storable<W>,
944 W: Word,
945 E: Endianness,
946 B: AsRef<[W]>,
947{
948 type Item = T;
949 type IntoIter = iter::FixedVecIter<'a, T, W, E, B>;
950
951 fn into_iter(self) -> Self::IntoIter {
952 self.iter()
953 }
954}
955
956/// Allows iterating over an owned [`FixedVec`], consuming it.
957impl<T, W, E> IntoIterator for FixedVec<T, W, E, Vec<W>>
958where
959 T: Storable<W> + 'static,
960 W: Word,
961 E: Endianness,
962{
963 type Item = T;
964 type IntoIter = iter::FixedVecIntoIter<'static, T, W, E>;
965
966 /// Consumes the vector and returns an iterator over its elements.
967 fn into_iter(self) -> Self::IntoIter {
968 iter::FixedVecIntoIter::new(self)
969 }
970}
971
972impl<T, W, E> FromIterator<T> for FixedVec<T, W, E, Vec<W>>
973where
974 T: Storable<W>,
975 W: Word,
976 E: Endianness,
977 dsi_bitstream::impls::BufBitWriter<E, dsi_bitstream::impls::MemWordWriterVec<W, Vec<W>>>:
978 dsi_bitstream::prelude::BitWrite<E, Error = std::convert::Infallible>,
979{
980 /// Creates a [`FixedVec`] by collecting elements from an iterator.
981 ///
982 /// The bit width is determined automatically using the [`BitWidth::Minimal`]
983 /// strategy. This requires collecting the iterator into a temporary `Vec<T>`
984 /// to analyze its contents before compression.
985 ///
986 /// # Memory Usage
987 ///
988 /// Because this implementation must first collect all items into a temporary
989 /// `Vec<T>` to determine the optimal `bit_width`, it may lead to a temporary
990 /// peak in memory usage that is roughly double the size of the uncompressed
991 /// data.
992 ///
993 /// For very large datasets where this memory overhead is a concern, it is
994 /// recommended to use [`FixedVec::from_iter_builder`] instead. The builder
995 /// allows for streaming construction but requires the `bit_width` to be
996 /// specified manually.
997 ///
998 /// # Examples
999 ///
1000 /// ```
1001 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1002 ///
1003 /// let data = 0u32..100;
1004 /// let vec: UFixedVec<u32> = data.collect();
1005 ///
1006 /// assert_eq!(vec.len(), 100);
1007 /// assert_eq!(vec.get(50), Some(50));
1008 /// ```
1009 fn from_iter<I: IntoIterator<Item = T>>(iter: I) -> Self {
1010 let data: Vec<T> = iter.into_iter().collect();
1011 Self::builder().build(&data).unwrap()
1012 }
1013}
1014
1015impl<T, W, E> Default for FixedVec<T, W, E, Vec<W>>
1016where
1017 T: Storable<W>,
1018 W: Word,
1019 E: Endianness,
1020{
1021 /// Creates an empty [`FixedVec`] with a default `bit_width` of 1.
1022 fn default() -> Self {
1023 // SAFETY: An empty vector with a valid bit_width is always safe.
1024 unsafe { Self::new_unchecked(Vec::new(), 0, 1) }
1025 }
1026}
1027
1028// Methods for owned vectors (`B = Vec<W>`).
1029impl<T, W, E> FixedVec<T, W, E, Vec<W>>
1030where
1031 T: Storable<W> + ToPrimitive,
1032 W: Word,
1033 E: Endianness,
1034{
1035 /// Creates a new, empty [`FixedVec`] with a specified bit width.
1036 ///
1037 /// # Errors
1038 ///
1039 /// Returns an [`Error::InvalidParameters`] if `bit_width` is greater than
1040 /// the number of bits in the storage word `W`.
1041 ///
1042 /// # Examples
1043 ///
1044 /// ```
1045 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1046 ///
1047 /// let vec: UFixedVec<u32> = FixedVec::new(8).unwrap();
1048 /// assert!(vec.is_empty());
1049 /// assert_eq!(vec.bit_width(), 8);
1050 /// ```
1051 pub fn new(bit_width: usize) -> Result<Self, Error> {
1052 if bit_width > <W as traits::Word>::BITS {
1053 return Err(Error::InvalidParameters(format!(
1054 "bit_width ({}) cannot be greater than the word size ({})",
1055 bit_width,
1056 <W as traits::Word>::BITS
1057 )));
1058 }
1059 Ok(unsafe { Self::new_unchecked(Vec::new(), 0, bit_width) })
1060 }
1061
1062 /// Appends an element to the end of the vector.
1063 ///
1064 /// This operation has amortized O(1) complexity.
1065 ///
1066 /// # Panics
1067 ///
1068 /// Panics if the `value` is too large to be represented by the configured
1069 /// `bit_width`.
1070 ///
1071 /// # Examples
1072 ///
1073 /// ```
1074 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1075 ///
1076 /// let mut vec: UFixedVec<u32> = FixedVec::new(4).unwrap();
1077 /// vec.push(10);
1078 /// vec.push(15);
1079 ///
1080 /// assert_eq!(vec.len(), 2);
1081 /// assert_eq!(vec.get(1), Some(15));
1082 /// ```
1083 #[inline(always)]
1084 pub fn push(&mut self, value: T) {
1085 let value_w = <T as Storable<W>>::into_word(value);
1086
1087 // Check if the value fits within the configured bit width.
1088 if (value_w & !self.mask) != W::ZERO {
1089 panic!(
1090 "Value {:?} does not fit in the configured bit_width of {}",
1091 value_w, self.bit_width
1092 );
1093 }
1094
1095 let bits_per_word = <W as traits::Word>::BITS;
1096 // Grow the underlying buffer if the new element would exceed its bit capacity.
1097 if (self.len + 1) * self.bit_width > self.bits.len() * bits_per_word {
1098 self.bits.push(W::ZERO);
1099 }
1100
1101 // SAFETY: We have ensured the value fits and the buffer has capacity.
1102 unsafe {
1103 self.set_unchecked(self.len, value_w);
1104 }
1105
1106 self.len += 1;
1107 }
1108
1109 /// Removes the last element from the vector and returns it.
1110 ///
1111 /// Returns `None` if the vector is empty.
1112 ///
1113 /// # Examples
1114 ///
1115 /// ```
1116 /// use compressed_intvec::fixed_vec;
1117 ///
1118 /// let mut vec = fixed_vec![1u32, 2, 3];
1119 /// assert_eq!(vec.pop(), Some(3));
1120 /// assert_eq!(vec.len(), 2);
1121 /// ```
1122 pub fn pop(&mut self) -> Option<T> {
1123 if self.is_empty() {
1124 return None;
1125 }
1126 let value = self.get(self.len - 1).unwrap();
1127 self.len -= 1;
1128 Some(value)
1129 }
1130
1131 /// Removes all elements from the vector, leaving the capacity unchanged.
1132 pub fn clear(&mut self) {
1133 self.len = 0;
1134 }
1135
1136 /// Creates a new, empty [`FixedVec`] with a specified bit width and capacity.
1137 ///
1138 /// The vector will be able to hold at least `capacity` elements without
1139 /// reallocating.
1140 ///
1141 /// # Errors
1142 ///
1143 /// Returns an [`Error::InvalidParameters`] if `bit_width` is greater than
1144 /// the number of bits in the storage word `W`.
1145 ///
1146 /// # Examples
1147 ///
1148 /// ```
1149 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1150 ///
1151 /// let vec: UFixedVec<u32> = FixedVec::with_capacity(5, 1000).unwrap();
1152 /// assert!(vec.capacity() >= 1000);
1153 /// ```
1154 pub fn with_capacity(bit_width: usize, capacity: usize) -> Result<Self, Error> {
1155 if bit_width > <W as traits::Word>::BITS {
1156 return Err(Error::InvalidParameters(format!(
1157 "bit_width ({}) cannot be greater than the word size ({})",
1158 bit_width,
1159 <W as traits::Word>::BITS
1160 )));
1161 }
1162 let bits_per_word = <W as traits::Word>::BITS;
1163 let total_bits = capacity.saturating_mul(bit_width);
1164 let num_words = total_bits.div_ceil(bits_per_word);
1165
1166 let buffer = if capacity == 0 {
1167 Vec::new()
1168 } else {
1169 Vec::with_capacity(num_words + 1) // +1 for padding
1170 };
1171
1172 Ok(unsafe { Self::new_unchecked(buffer, 0, bit_width) })
1173 }
1174
1175 /// Returns the number of elements the vector can hold without reallocating.
1176 pub fn capacity(&self) -> usize {
1177 if self.bit_width == 0 {
1178 return usize::MAX;
1179 }
1180 let word_capacity = self.bits.capacity();
1181 if word_capacity <= 1 {
1182 return 0; // Not enough for data + padding.
1183 }
1184 // Subtract padding words before calculating element capacity.
1185 ((word_capacity - 1) * <W as traits::Word>::BITS) / self.bit_width
1186 }
1187
1188 /// Returns the capacity of the underlying storage in words.
1189 pub fn word_capacity(&self) -> usize {
1190 self.bits.capacity()
1191 }
1192
1193 /// Reserves capacity for at least `additional` more elements to be inserted.
1194 ///
1195 /// # Panics
1196 ///
1197 /// Panics if the new capacity overflows [`usize`].
1198 ///
1199 /// # Examples
1200 ///
1201 /// ```
1202 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1203 ///
1204 /// let mut vec: UFixedVec<u32> = FixedVec::new(4).unwrap();
1205 /// vec.reserve(100);
1206 /// assert!(vec.capacity() >= 100);
1207 /// ```
1208 pub fn reserve(&mut self, additional: usize) {
1209 let target_element_capacity = self.len.saturating_add(additional);
1210 if self.capacity() >= target_element_capacity {
1211 return;
1212 }
1213 let bits_per_word = <W as Word>::BITS;
1214 let required_total_bits = target_element_capacity.saturating_mul(self.bit_width);
1215 let required_data_words = required_total_bits.div_ceil(bits_per_word);
1216 let required_word_capacity = required_data_words + 1; // +1 for padding
1217
1218 let current_len = self.bits.len();
1219 if self.bits.capacity() < required_word_capacity {
1220 self.bits.reserve(required_word_capacity - current_len);
1221 }
1222 }
1223
1224 /// Resizes the vector so that its length is equal to `new_len`.
1225 ///
1226 /// If `new_len` is greater than the current length, the vector is extended
1227 /// by appending `value`. If `new_len` is less than the current length, the
1228 /// vector is truncated.
1229 ///
1230 /// # Panics
1231 ///
1232 /// Panics if the `value` used for filling does not fit in the configured
1233 /// `bit_width`.
1234 ///
1235 /// # Examples
1236 ///
1237 /// ```
1238 /// use compressed_intvec::fixed::{FixedVec, UFixedVec};
1239 ///
1240 /// let mut vec = UFixedVec::<u32>::new(4).unwrap();
1241 /// vec.push(1);
1242 /// vec.push(2);
1243 ///
1244 /// vec.resize(4, 10);
1245 /// assert_eq!(vec.len(), 4);
1246 /// assert_eq!(vec.get(3), Some(10));
1247 ///
1248 /// vec.resize(1, 0);
1249 /// assert_eq!(vec.len(), 1);
1250 /// assert_eq!(vec.get(0), Some(1));
1251 /// ```
1252 #[inline(always)]
1253 pub fn resize(&mut self, new_len: usize, value: T) {
1254 if new_len > self.len {
1255 let value_w = <T as Storable<W>>::into_word(value);
1256
1257 if (value_w & !self.mask) != W::ZERO {
1258 panic!(
1259 "Value {:?} does not fit in the configured bit_width of {}",
1260 value_w, self.bit_width
1261 );
1262 }
1263
1264 let bits_per_word = <W as traits::Word>::BITS;
1265 let required_total_bits = new_len * self.bit_width;
1266 let required_data_words = required_total_bits.div_ceil(bits_per_word);
1267 let required_vec_len = required_data_words.saturating_add(1); // Padding
1268
1269 if self.bits.len() < required_vec_len {
1270 self.bits.resize(required_vec_len, W::ZERO);
1271 }
1272
1273 for i in self.len..new_len {
1274 unsafe {
1275 self.set_unchecked(i, value_w);
1276 }
1277 }
1278 self.len = new_len;
1279 } else {
1280 self.len = new_len;
1281 }
1282 }
1283
1284 /// Shrinks the capacity of the vector as much as possible.
1285 pub fn shrink_to_fit(&mut self) {
1286 let min_word_len = if self.len == 0 {
1287 0
1288 } else {
1289 let bits_per_word = <W as traits::Word>::BITS;
1290 let required_total_bits = self.len.saturating_mul(self.bit_width);
1291 let required_words = required_total_bits.div_ceil(bits_per_word);
1292 required_words + 1 // +1 for padding
1293 };
1294
1295 if self.bits.len() > min_word_len {
1296 self.bits.truncate(min_word_len);
1297 }
1298 self.bits.shrink_to_fit();
1299 }
1300
1301 /// Removes and returns the element at `index`, shifting all elements after
1302 /// it to the left.
1303 ///
1304 /// This operation is O(n), where n is the number of elements after `index`.
1305 ///
1306 /// # Panics
1307 ///
1308 /// Panics if `index` is out of bounds.
1309 ///
1310 /// # Complexity
1311 ///
1312 /// This operation has a complexity of O(L), where L is the number of bits
1313 /// that need to be shifted. In the worst case (removing the first element),
1314 /// this is proportional to the total number of bits in the vector
1315 /// (`self.len() * self.bit_width()`).
1316 ///
1317 /// # Panics
1318 ///
1319 /// Panics if `index` is out of bounds.
1320 pub fn remove(&mut self, index: usize) -> T {
1321 assert!(index < self.len, "remove: index out of bounds");
1322
1323 let value_to_return = self.get(index).unwrap();
1324
1325 let start_bit = index * self.bit_width;
1326 let end_bit = self.len * self.bit_width;
1327 let total_bits_to_shift = end_bit - (start_bit + self.bit_width);
1328
1329 if total_bits_to_shift > 0 {
1330 self.shift_bits_left(start_bit, self.bit_width, total_bits_to_shift);
1331 }
1332
1333 self.len -= 1;
1334
1335 value_to_return
1336 }
1337
1338 /// Inserts an element at `index`, shifting all elements after it to the right.
1339 ///
1340 /// This operation is O(n), where n is the number of elements after `index`.
1341 ///
1342 /// # Panics
1343 ///
1344 /// Panics if `index` is out of bounds or if the `element` is too large to
1345 /// be represented by the configured `bit_width`.
1346 ///
1347 /// # Complexity
1348 ///
1349 /// This operation has a complexity of O(L), where L is the number of bits
1350 /// that need to be shifted. In the worst case (inserting at the beginning),
1351 /// this is proportional to the total number of bits in the vector
1352 /// (`self.len() * self.bit_width()`).
1353 ///
1354 /// # Panics
1355 ///
1356 /// Panics if `index` is out of bounds or if the `element` is too large to
1357 /// be represented by the configured `bit_width`.
1358 pub fn insert(&mut self, index: usize, element: T) {
1359 assert!(index <= self.len, "insert: index out of bounds");
1360 let value_w = <T as Storable<W>>::into_word(element);
1361 let bits_per_word = <W as Word>::BITS;
1362 let limit = if self.bit_width < bits_per_word {
1363 W::ONE << self.bit_width
1364 } else {
1365 W::max_value()
1366 };
1367 if self.bit_width < bits_per_word && value_w >= limit {
1368 panic!(
1369 "Value {:?} does not fit in the configured bit_width of {}",
1370 value_w, self.bit_width
1371 );
1372 }
1373 self.reserve(1);
1374 let start_shift_bit = index * self.bit_width;
1375 let num_bits_to_move = (self.len - index) * self.bit_width;
1376 if num_bits_to_move > 0 {
1377 self.shift_bits_right(start_shift_bit, self.bit_width, num_bits_to_move);
1378 }
1379 self.len += 1;
1380 unsafe {
1381 self.set_unchecked(index, value_w);
1382 }
1383 }
1384
1385 /// Shifts a range of bits to the left in-place.
1386 fn shift_bits_left(&mut self, start_bit: usize, shift_amount: usize, num_bits_to_move: usize) {
1387 if num_bits_to_move == 0 {
1388 return;
1389 }
1390
1391 let bits_per_word = <W as Word>::BITS;
1392
1393 // Fast path for word-aligned shifts, using `copy_within`.
1394 if shift_amount % bits_per_word == 0 {
1395 let start_write_word = start_bit / bits_per_word;
1396 let start_read_word = (start_bit + shift_amount) / bits_per_word;
1397 let num_words_to_move = num_bits_to_move.div_ceil(bits_per_word);
1398
1399 if start_read_word < self.bits.len() {
1400 let read_end = (start_read_word + num_words_to_move).min(self.bits.len());
1401 self.bits
1402 .copy_within(start_read_word..read_end, start_write_word);
1403 }
1404 return;
1405 }
1406
1407 // Slow path for unaligned shifts (word-at-a-time).
1408 let shift_rem = shift_amount % bits_per_word;
1409 let inv_shift_rem = bits_per_word - shift_rem;
1410
1411 let start_write_bit = start_bit;
1412 let end_write_bit = start_bit + num_bits_to_move;
1413
1414 let start_write_word = start_write_bit / bits_per_word;
1415 let end_write_word = (end_write_bit - 1) / bits_per_word;
1416
1417 for write_word_idx in start_write_word..=end_write_word {
1418 // Fetch the source data, which may span two source words.
1419 let read_bit = write_word_idx * bits_per_word + shift_rem;
1420 let read_word_idx = read_bit / bits_per_word;
1421
1422 let low_part = self.bits.get(read_word_idx).copied().unwrap_or(W::ZERO) >> shift_rem;
1423 let high_part =
1424 self.bits.get(read_word_idx + 1).copied().unwrap_or(W::ZERO) << inv_shift_rem;
1425
1426 let value_to_write = low_part | high_part;
1427
1428 // Create a mask for the bits we are about to modify.
1429 let mut mask = W::max_value();
1430 if write_word_idx == start_write_word {
1431 mask &= W::max_value() << (start_write_bit % bits_per_word);
1432 }
1433 if write_word_idx == end_write_word {
1434 let end_offset = end_write_bit % bits_per_word;
1435 if end_offset != 0 {
1436 mask &= (W::ONE << end_offset).wrapping_sub(W::ONE);
1437 }
1438 }
1439
1440 self.bits[write_word_idx] =
1441 (self.bits[write_word_idx] & !mask) | (value_to_write & mask);
1442 }
1443 }
1444
1445 /// Shifts a range of bits to the right in-place.
1446 fn shift_bits_right(&mut self, start_bit: usize, shift_amount: usize, num_bits_to_move: usize) {
1447 if num_bits_to_move == 0 {
1448 return;
1449 }
1450
1451 let bits_per_word = <W as Word>::BITS;
1452
1453 // Ensure the vector has enough capacity and is resized to accommodate the shift.
1454 let required_end_bit = start_bit + shift_amount + num_bits_to_move;
1455 let required_words = required_end_bit.div_ceil(bits_per_word);
1456 let required_vec_len = required_words.saturating_add(1); // +1 for padding
1457 if self.bits.len() < required_vec_len {
1458 self.bits.resize(required_vec_len, W::ZERO);
1459 }
1460
1461 // Fast path for word-aligned shifts.
1462 if shift_amount % bits_per_word == 0 {
1463 let start_read_word = start_bit / bits_per_word;
1464 let start_write_word = (start_bit + shift_amount) / bits_per_word;
1465 let num_words_to_move = num_bits_to_move.div_ceil(bits_per_word);
1466
1467 if start_read_word + num_words_to_move <= self.bits.len() {
1468 self.bits.copy_within(
1469 start_read_word..start_read_word + num_words_to_move,
1470 start_write_word,
1471 );
1472 }
1473 } else {
1474 // Slow path for unaligned shifts (iterating from right to left).
1475 let word_shift = shift_amount / bits_per_word;
1476 let shift_rem = shift_amount % bits_per_word;
1477 let inv_shift_rem = bits_per_word - shift_rem;
1478
1479 let start_write_bit = start_bit + shift_amount;
1480 let end_write_bit = start_write_bit + num_bits_to_move;
1481
1482 let start_write_word = start_write_bit / bits_per_word;
1483 let end_write_word = (end_write_bit - 1) / bits_per_word;
1484
1485 for write_word_idx in (start_write_word..=end_write_word).rev() {
1486 let read_word_idx = write_word_idx - word_shift;
1487
1488 // Fetch source data, which may span two words.
1489 let high_part =
1490 self.bits.get(read_word_idx).copied().unwrap_or(W::ZERO) << shift_rem;
1491 let low_part = if read_word_idx > 0 {
1492 self.bits.get(read_word_idx - 1).copied().unwrap_or(W::ZERO) >> inv_shift_rem
1493 } else {
1494 W::ZERO
1495 };
1496 let value_to_write = low_part | high_part;
1497
1498 // Create a mask for the bits we are about to modify.
1499 let mut mask = W::max_value();
1500 if write_word_idx == start_write_word {
1501 mask &= W::max_value() << (start_write_bit % bits_per_word);
1502 }
1503 if write_word_idx == end_write_word {
1504 let end_offset = end_write_bit % bits_per_word;
1505 if end_offset != 0 {
1506 mask &= (W::ONE << end_offset).wrapping_sub(W::ONE);
1507 }
1508 }
1509
1510 self.bits[write_word_idx] =
1511 (self.bits[write_word_idx] & !mask) | (value_to_write & mask);
1512 }
1513 }
1514
1515 // Zero out the vacated bits at the beginning of the shifted region.
1516 let mut clear_bit = start_bit;
1517 let end_clear_bit = start_bit + shift_amount;
1518
1519 while clear_bit < end_clear_bit {
1520 let word_idx = clear_bit / bits_per_word;
1521 let offset = clear_bit % bits_per_word;
1522 let bits_to_clear = (bits_per_word - offset).min(end_clear_bit - clear_bit);
1523
1524 let mask = if bits_to_clear == bits_per_word {
1525 W::max_value()
1526 } else {
1527 ((W::ONE << bits_to_clear).wrapping_sub(W::ONE)) << offset
1528 };
1529
1530 if word_idx < self.bits.len() {
1531 self.bits[word_idx] &= !mask;
1532 }
1533 clear_bit += bits_to_clear;
1534 }
1535 }
1536
1537 /// Removes an element at `index` and returns it, replacing it with the last
1538 /// element of the vector.
1539 ///
1540 /// This operation is O(1) but does not preserve the order of elements.
1541 ///
1542 /// # Panics
1543 ///
1544 /// Panics if `index` is out of bounds.
1545 pub fn swap_remove(&mut self, index: usize) -> T {
1546 assert!(index < self.len, "swap_remove: index out of bounds");
1547
1548 if index == self.len - 1 {
1549 self.pop().unwrap()
1550 } else {
1551 let old_val = unsafe { self.get_unchecked(index) };
1552 let last_val = self.pop().unwrap(); // `pop` already decrements len
1553 self.set(index, last_val);
1554 old_val
1555 }
1556 }
1557
1558 /// Appends an element to the vector, returning an error if the value doesn't fit.
1559 ///
1560 /// # Errors
1561 ///
1562 /// Returns [`Error::ValueTooLarge`] if the `value` cannot be represented
1563 /// by the configured `bit_width`.
1564 pub fn try_push(&mut self, value: T) -> Result<(), Error> {
1565 let value_w = <T as Storable<W>>::into_word(value);
1566 let bits_per_word = <W as traits::Word>::BITS;
1567
1568 let limit = if self.bit_width < bits_per_word {
1569 W::ONE << self.bit_width
1570 } else {
1571 W::max_value()
1572 };
1573
1574 if self.bit_width < bits_per_word && value_w >= limit {
1575 return Err(Error::ValueTooLarge {
1576 value: value_w.to_u128().unwrap(),
1577 index: self.len,
1578 bit_width: self.bit_width,
1579 });
1580 }
1581
1582 self.push(value);
1583 Ok(())
1584 }
1585
1586 /// Extends the vector with the elements from a slice.
1587 ///
1588 /// # Panics
1589 ///
1590 /// Panics if any value in `other` does not fit within the `bit_width`.
1591 pub fn extend_from_slice(&mut self, other: &[T]) {
1592 if other.is_empty() {
1593 return;
1594 }
1595
1596 self.reserve(other.len());
1597
1598 // Pre-validate all values to ensure atomicity.
1599 let bits_per_word = <W as traits::Word>::BITS;
1600 let limit = if self.bit_width < bits_per_word {
1601 W::ONE << self.bit_width
1602 } else {
1603 W::max_value()
1604 };
1605 if self.bit_width < bits_per_word {
1606 for (i, &value) in other.iter().enumerate() {
1607 let value_w = <T as Storable<W>>::into_word(value);
1608 if value_w >= limit {
1609 panic!(
1610 "Value at index {} of slice ({:?}) does not fit in the configured bit_width of {}",
1611 i, value_w, self.bit_width
1612 );
1613 }
1614 }
1615 }
1616
1617 let old_len = self.len;
1618 let new_len = old_len + other.len();
1619
1620 // Ensure the underlying Vec has enough initialized words to write into.
1621 let required_total_bits = new_len * self.bit_width;
1622 let required_data_words = required_total_bits.div_ceil(bits_per_word);
1623 let required_vec_len = required_data_words.saturating_add(1); // Padding
1624 if self.bits.len() < required_vec_len {
1625 self.bits.resize(required_vec_len, W::ZERO);
1626 }
1627
1628 // Write the new values.
1629 for (i, &value) in other.iter().enumerate() {
1630 // SAFETY: We have reserved, resized, and validated the data.
1631 unsafe {
1632 self.set_unchecked(old_len + i, <T as Storable<W>>::into_word(value));
1633 }
1634 }
1635
1636 self.len = new_len;
1637 }
1638}
1639
1640// Mutable in-place operations.
1641impl<T, W, E, B> FixedVec<T, W, E, B>
1642where
1643 T: Storable<W>,
1644 W: Word,
1645 E: Endianness,
1646 B: AsRef<[W]> + AsMut<[W]>,
1647{
1648 /// Returns a mutable proxy for an element at `index`.
1649 ///
1650 /// This allows for syntax like `*vec.at_mut(i).unwrap() = new_value;`.
1651 ///
1652 /// Returns `None` if the index is out of bounds.
1653 pub fn at_mut(&mut self, index: usize) -> Option<MutProxy<'_, T, W, E, B>> {
1654 if index >= self.len {
1655 return None;
1656 }
1657 Some(MutProxy::new(self, index))
1658 }
1659
1660 /// Returns a mutable slice of the underlying storage words.
1661 ///
1662 /// # Safety
1663 ///
1664 /// Modifying the returned slice is logically unsafe. Any change to the bits
1665 /// can violate the invariants of the [`FixedVec`], leading to panics or
1666 /// incorrect results on subsequent method calls.
1667 pub unsafe fn as_mut_limbs(&mut self) -> &mut [W] {
1668 self.bits.as_mut()
1669 }
1670
1671 /// Sets the value of the element at `index`.
1672 ///
1673 /// # Panics
1674 ///
1675 /// Panics if `index` is out of bounds or if `value` is too large to be
1676 /// represented by the configured `bit_width`.
1677 ///
1678 /// # Examples
1679 ///
1680 /// ```
1681 /// use compressed_intvec::fixed::{FixedVec, UFixedVec, BitWidth};
1682 ///
1683 /// let data: &[u32] = &[10, 20, 30];
1684 /// let mut vec: UFixedVec<u32> = FixedVec::builder().bit_width(BitWidth::Explicit(7)).build(data).unwrap();
1685 ///
1686 /// vec.set(1, 99);
1687 /// assert_eq!(vec.get(1), Some(99));
1688 /// ```
1689 pub fn set(&mut self, index: usize, value: T) {
1690 assert!(
1691 index < self.len,
1692 "Index out of bounds: expected index < {}, got {}",
1693 self.len,
1694 index
1695 );
1696
1697 let value_w = <T as Storable<W>>::into_word(value);
1698 let bits_per_word = <W as traits::Word>::BITS;
1699
1700 let limit = if self.bit_width < bits_per_word {
1701 W::ONE << self.bit_width
1702 } else {
1703 W::max_value()
1704 };
1705
1706 if self.bit_width < bits_per_word && value_w >= limit {
1707 panic!(
1708 "Value {:?} does not fit in the configured bit_width of {}",
1709 value_w, self.bit_width
1710 );
1711 }
1712
1713 unsafe { self.set_unchecked(index, value_w) };
1714 }
1715
1716 /// Sets the value of the element at `index` without bounds or value checking.
1717 ///
1718 /// # Safety
1719 ///
1720 /// The caller must ensure that `index` is within bounds and that `value_w`
1721 /// fits within the configured `bit_width`. Failure to do so will result in
1722 /// data corruption or a panic.
1723 pub unsafe fn set_unchecked(&mut self, index: usize, value_w: W) {
1724 let bits_per_word = <W as traits::Word>::BITS;
1725 let bit_pos = index * self.bit_width;
1726 let word_index = bit_pos / bits_per_word;
1727 let bit_offset = bit_pos % bits_per_word;
1728
1729 let limbs = self.bits.as_mut();
1730
1731 if E::IS_LITTLE {
1732 // Fast path: the value fits entirely within a single word.
1733 if bit_offset + self.bit_width <= bits_per_word {
1734 let word = &mut limbs[word_index];
1735 // Clear the target bits and then OR the new value.
1736 *word &= !(self.mask << bit_offset);
1737 *word |= value_w << bit_offset;
1738 } else {
1739 // Slow path: the value spans two words.
1740 let (left, right) = limbs.split_at_mut(word_index + 1);
1741 let low_word = &mut left[word_index];
1742 let high_word = &mut right[0];
1743
1744 // Write the low part of the value to the first word.
1745 *low_word &= !(self.mask << bit_offset);
1746 *low_word |= value_w << bit_offset;
1747
1748 // Write the high part of the value to the next word.
1749 let bits_in_high = (bit_offset + self.bit_width) - bits_per_word;
1750 let high_mask = self.mask >> (self.bit_width - bits_in_high);
1751 *high_word &= !high_mask;
1752 *high_word |= value_w >> (self.bit_width - bits_in_high);
1753 }
1754 } else {
1755 // Big-Endian set logic.
1756 if bit_offset + self.bit_width <= bits_per_word {
1757 let shift = bits_per_word - self.bit_width - bit_offset;
1758 let mask = self.mask << shift;
1759 let word = &mut limbs[word_index];
1760 *word &= !mask.to_be();
1761 *word |= (value_w << shift).to_be();
1762 } else {
1763 let (left, right) = limbs.split_at_mut(word_index + 1);
1764 let high_word = &mut left[word_index];
1765 let low_word = &mut right[0];
1766
1767 let bits_in_first = bits_per_word - bit_offset;
1768 let bits_in_second = self.bit_width - bits_in_first;
1769
1770 let high_mask =
1771 (self.mask >> bits_in_second) << (bits_per_word - bits_in_first - bit_offset);
1772 let high_value = value_w >> bits_in_second;
1773 *high_word &= !high_mask.to_be();
1774 *high_word |= (high_value << (bits_per_word - bits_in_first - bit_offset)).to_be();
1775
1776 let low_mask = self.mask << (bits_per_word - bits_in_second);
1777 let low_value = value_w << (bits_per_word - bits_in_second);
1778 *low_word &= !low_mask.to_be();
1779 *low_word |= low_value.to_be();
1780 }
1781 }
1782 }
1783
1784 /// Sets the value of an element, returning an error if the value doesn't fit.
1785 ///
1786 /// # Errors
1787 ///
1788 /// Returns [`Error::ValueTooLarge`] if the `value` cannot be represented
1789 /// by the configured `bit_width`.
1790 ///
1791 /// # Panics
1792 ///
1793 /// Panics if `index` is out of bounds.
1794 pub fn try_set(&mut self, index: usize, value: T) -> Result<(), Error> {
1795 assert!(index < self.len, "try_set: index out of bounds");
1796
1797 let value_w = <T as Storable<W>>::into_word(value);
1798 let bits_per_word = <W as traits::Word>::BITS;
1799
1800 let limit = if self.bit_width < bits_per_word {
1801 W::ONE << self.bit_width
1802 } else {
1803 W::max_value()
1804 };
1805
1806 if self.bit_width < bits_per_word && value_w >= limit {
1807 return Err(Error::ValueTooLarge {
1808 value: value_w.to_u128().unwrap(),
1809 index,
1810 bit_width: self.bit_width,
1811 });
1812 }
1813
1814 unsafe { self.set_unchecked(index, value_w) };
1815 Ok(())
1816 }
1817
1818 /// Returns an iterator over non-overlapping, mutable chunks of the vector.
1819 ///
1820 /// # Panics
1821 ///
1822 /// Panics if `chunk_size` is 0.
1823 pub fn chunks_mut(&mut self, chunk_size: usize) -> iter_mut::ChunksMut<'_, T, W, E, B> {
1824 iter_mut::ChunksMut::new(self, chunk_size)
1825 }
1826
1827 /// Returns an unchecked mutable iterator over the elements of the vector.
1828 ///
1829 /// # Safety
1830 ///
1831 /// The caller must ensure that the iterator is not advanced beyond the
1832 /// vector's length.
1833 pub unsafe fn iter_mut_unchecked(&mut self) -> iter_mut::IterMutUnchecked<'_, T, W, E, B> {
1834 iter_mut::IterMutUnchecked::new(self)
1835 }
1836
1837 /// Applies a function to all elements in place, without checking if the
1838 /// returned values fit within the `bit_width`.
1839 ///
1840 /// # Safety
1841 ///
1842 /// The caller must ensure that the function `f` always returns a value
1843 /// that can be represented by `self.bit_width()` bits. Returning a value
1844 /// that is too large will result in data corruption.
1845 pub unsafe fn map_in_place_unchecked<F>(&mut self, mut f: F)
1846 where
1847 F: FnMut(T) -> T,
1848 {
1849 if E::IS_LITTLE {
1850 // This path operates on words directly for efficiency.
1851 let mut word_op = |word: W| -> W {
1852 let val_t = <T as Storable<W>>::from_word(word);
1853 let new_val_t = f(val_t);
1854 <T as Storable<W>>::into_word(new_val_t)
1855 };
1856 self.map_in_place_generic_word_op(&mut word_op);
1857 } else {
1858 // Fallback for BE, which is more complex to optimize at the word level.
1859 for i in 0..self.len {
1860 let old_val_t = self.get_unchecked(i);
1861 let new_val_t = f(old_val_t);
1862 self.set_unchecked(i, <T as Storable<W>>::into_word(new_val_t));
1863 }
1864 }
1865 }
1866
1867 /// Internal worker for `map_in_place_unchecked`, operating on `W`.
1868 unsafe fn map_in_place_generic_word_op<FW>(&mut self, f_w: &mut FW)
1869 where
1870 FW: FnMut(W) -> W,
1871 {
1872 let bit_width = self.bit_width;
1873 if self.len == 0 || bit_width == 0 {
1874 return;
1875 }
1876
1877 let bits_per_word = <W as Word>::BITS;
1878
1879 // Path for power-of-two bit widths that align with word boundaries.
1880 if bit_width.is_power_of_two() && bits_per_word % bit_width == 0 {
1881 let elems_per_word = bits_per_word / bit_width;
1882 let mask = self.mask;
1883 let num_full_words = self.len / elems_per_word;
1884
1885 for word_idx in 0..num_full_words {
1886 let old_word = *self.bits.as_ref().get_unchecked(word_idx);
1887 let mut new_word = W::ZERO;
1888 for i in 0..elems_per_word {
1889 let shift = i * bit_width;
1890 let old_val_w = (old_word >> shift) & mask;
1891 new_word |= f_w(old_val_w) << shift;
1892 }
1893 *self.bits.as_mut().get_unchecked_mut(word_idx) = new_word;
1894 }
1895 // Process remaining elements individually.
1896 let start_idx = num_full_words * elems_per_word;
1897 for i in start_idx..self.len {
1898 let old_val_t = self.get(i).unwrap();
1899 let old_val_w = <T as Storable<W>>::into_word(old_val_t);
1900 self.set_unchecked(i, f_w(old_val_w));
1901 }
1902 return;
1903 }
1904
1905 // Generic path for non-power-of-two bit widths.
1906 let limbs = self.bits.as_mut();
1907 let num_words = (self.len * bit_width).div_ceil(bits_per_word);
1908 let last_word_idx = num_words.saturating_sub(1);
1909
1910 let mut write_buffer = W::ZERO;
1911 let mut read_buffer = *limbs.get_unchecked(0);
1912 let mut global_bit_offset = 0;
1913
1914 for word_idx in 0..last_word_idx {
1915 let lower_word_boundary = word_idx * bits_per_word;
1916 let upper_word_boundary = lower_word_boundary + bits_per_word;
1917
1918 while global_bit_offset + bit_width <= upper_word_boundary {
1919 let offset_in_word = global_bit_offset - lower_word_boundary;
1920 let element = (read_buffer >> offset_in_word) & self.mask;
1921 write_buffer |= f_w(element) << offset_in_word;
1922 global_bit_offset += bit_width;
1923 }
1924
1925 let next_word = *limbs.get_unchecked(word_idx + 1);
1926 let mut new_write_buffer = W::ZERO;
1927
1928 if upper_word_boundary != global_bit_offset {
1929 let elem_idx = global_bit_offset / bit_width;
1930 if elem_idx >= self.len {
1931 *limbs.get_unchecked_mut(word_idx) = write_buffer;
1932 return;
1933 }
1934
1935 let remainder_in_word = upper_word_boundary - global_bit_offset;
1936 let offset_in_word = global_bit_offset - lower_word_boundary;
1937
1938 let element = ((read_buffer >> offset_in_word) | (next_word << remainder_in_word))
1939 & self.mask;
1940 let new_element = f_w(element);
1941
1942 write_buffer |= new_element << offset_in_word;
1943 new_write_buffer = new_element >> remainder_in_word;
1944
1945 global_bit_offset += bit_width;
1946 }
1947
1948 *limbs.get_unchecked_mut(word_idx) = write_buffer;
1949
1950 read_buffer = next_word;
1951 write_buffer = new_write_buffer;
1952 }
1953
1954 let lower_word_boundary = last_word_idx * bits_per_word;
1955
1956 while global_bit_offset < self.len * bit_width {
1957 let offset_in_word = global_bit_offset - lower_word_boundary;
1958 let element = (read_buffer >> offset_in_word) & self.mask;
1959 write_buffer |= f_w(element) << offset_in_word;
1960 global_bit_offset += bit_width;
1961 }
1962
1963 *limbs.get_unchecked_mut(last_word_idx) = write_buffer;
1964 }
1965
1966 /// Applies a function to all elements in the vector, modifying them in-place.
1967 ///
1968 /// # Panics
1969 ///
1970 /// Panics if the function `f` returns a value that does not fit within the
1971 /// configured `bit_width`.
1972 ///
1973 /// # Examples
1974 ///
1975 /// ```
1976 /// use compressed_intvec::fixed::{FixedVec, BitWidth, UFixedVec};
1977 ///
1978 /// // Values up to 9*2=18, requires 5 bits. We must build with enough space.
1979 /// let initial_data: Vec<u32> = (0..10).collect();
1980 /// let mut vec: UFixedVec<u32> = FixedVec::builder()
1981 /// .bit_width(BitWidth::Explicit(5))
1982 /// .build(&initial_data)
1983 /// .unwrap();
1984 ///
1985 /// vec.map_in_place(|x| x * 2);
1986 ///
1987 /// for i in 0..vec.len() {
1988 /// assert_eq!(vec.get(i), Some(i as u32 * 2));
1989 /// }
1990 /// ```
1991 pub fn map_in_place<F>(&mut self, mut f: F)
1992 where
1993 F: FnMut(T) -> T,
1994 {
1995 let bit_width = self.bit_width;
1996 let limit = if bit_width < <W as Word>::BITS {
1997 W::ONE << bit_width
1998 } else {
1999 W::max_value()
2000 };
2001
2002 let safe_f = |value: T| {
2003 let new_value = f(value);
2004 let new_value_w = <T as Storable<W>>::into_word(new_value);
2005 if bit_width < <W as Word>::BITS && new_value_w >= limit {
2006 panic!(
2007 "map_in_place: returned value {new_value_w:?} does not fit in bit_width {bit_width}"
2008 );
2009 }
2010 new_value
2011 };
2012
2013 // SAFETY: The `safe_f` wrapper ensures that any value passed to the
2014 // underlying unsafe function is valid for the vector's bit_width.
2015 unsafe {
2016 self.map_in_place_unchecked(safe_f);
2017 }
2018 }
2019
2020 /// Replaces the element at `index` with a new `value`, returning the old value.
2021 ///
2022 /// # Panics
2023 ///
2024 /// Panics if `index` is out of bounds or if `value` does not fit within
2025 /// the configured `bit_width`.
2026 pub fn replace(&mut self, index: usize, value: T) -> T {
2027 assert!(index < self.len, "replace: index out of bounds");
2028
2029 let old_value = unsafe { self.get_unchecked(index) };
2030 self.set(index, value);
2031 old_value
2032 }
2033
2034 /// Swaps the elements at indices `a` and `b`.
2035 ///
2036 /// # Panics
2037 ///
2038 /// Panics if `a` or `b` are out of bounds.
2039 pub fn swap(&mut self, a: usize, b: usize) {
2040 assert!(a < self.len, "swap: index a out of bounds");
2041 assert!(b < self.len, "swap: index b out of bounds");
2042
2043 if a == b {
2044 return;
2045 }
2046
2047 unsafe {
2048 let val_a = self.get_unchecked(a);
2049 let val_b = self.get_unchecked(b);
2050 self.set_unchecked(a, <T as Storable<W>>::into_word(val_b));
2051 self.set_unchecked(b, <T as Storable<W>>::into_word(val_a));
2052 }
2053 }
2054
2055 /// Returns a mutable proxy to the first element, or `None` if empty.
2056 pub fn first_mut(&mut self) -> Option<MutProxy<'_, T, W, E, B>> {
2057 if self.is_empty() {
2058 None
2059 } else {
2060 self.at_mut(0)
2061 }
2062 }
2063
2064 /// Returns a mutable proxy to the last element, or `None` if empty.
2065 pub fn last_mut(&mut self) -> Option<MutProxy<'_, T, W, E, B>> {
2066 if self.is_empty() {
2067 None
2068 } else {
2069 let len = self.len();
2070 self.at_mut(len - 1)
2071 }
2072 }
2073
2074 /// Returns the first element of the vector, or `None` if empty.
2075 pub fn first(&self) -> Option<T> {
2076 if self.is_empty() {
2077 None
2078 } else {
2079 Some(unsafe { self.get_unchecked(0) })
2080 }
2081 }
2082
2083 /// Returns the last element of the vector, or `None` if empty.
2084 pub fn last(&self) -> Option<T> {
2085 if self.is_empty() {
2086 None
2087 } else {
2088 let len = self.len();
2089 Some(unsafe { self.get_unchecked(len - 1) })
2090 }
2091 }
2092
2093 /// Splits the vector into two mutable slices at `mid`.
2094 ///
2095 /// # Panics
2096 ///
2097 /// Panics if `mid > len`.
2098 pub fn split_at_mut(
2099 &mut self,
2100 mid: usize,
2101 ) -> (
2102 slice::FixedVecSlice<&mut Self>,
2103 slice::FixedVecSlice<&mut Self>,
2104 ) {
2105 assert!(mid <= self.len, "mid > len in split_at_mut");
2106 // SAFETY: The two slices are guaranteed not to overlap.
2107 unsafe {
2108 let ptr = self as *mut Self;
2109 let left = slice::FixedVecSlice::new(&mut *ptr, 0..mid);
2110 let right = slice::FixedVecSlice::new(&mut *ptr, mid..self.len());
2111 (left, right)
2112 }
2113 }
2114
2115 /// Returns a mutable iterator over the elements of the vector.
2116 pub fn iter_mut(&mut self) -> iter_mut::IterMut<'_, T, W, E, B> {
2117 iter_mut::IterMut::new(self)
2118 }
2119
2120 /// Rotates the elements of the vector in-place such that the element at `mid`
2121 /// becomes the first element.
2122 ///
2123 /// # Panics
2124 ///
2125 /// Panics if `mid > len`.
2126 pub fn rotate_left(&mut self, mid: usize) {
2127 assert!(mid <= self.len, "mid > len in rotate_left");
2128 if self.is_empty() || mid == 0 || mid == self.len {
2129 return;
2130 }
2131 // A simple, correct implementation.
2132 let mut temp = Vec::with_capacity(mid);
2133 for i in 0..mid {
2134 temp.push(unsafe { self.get_unchecked(i) });
2135 }
2136 for i in mid..self.len {
2137 let val = unsafe { self.get_unchecked(i) };
2138 unsafe { self.set_unchecked(i - mid, <T as Storable<W>>::into_word(val)) };
2139 }
2140 for (i, val) in temp.into_iter().enumerate() {
2141 unsafe { self.set_unchecked(self.len - mid + i, <T as Storable<W>>::into_word(val)) };
2142 }
2143 }
2144
2145 /// Rotates the elements of the vector in-place such that the element at
2146 /// `len - k` becomes the first element.
2147 ///
2148 /// # Panics
2149 ///
2150 /// Panics if `k > len`.
2151 pub fn rotate_right(&mut self, k: usize) {
2152 assert!(k <= self.len, "k > len in rotate_right");
2153 if self.is_empty() || k == 0 || k == self.len {
2154 return;
2155 }
2156 self.rotate_left(self.len - k);
2157 }
2158
2159 /// Fills the vector with the given value.
2160 ///
2161 /// # Panics
2162 ///
2163 /// Panics if `value` does not fit in the configured `bit_width`.
2164 pub fn fill(&mut self, value: T) {
2165 let value_w = <T as Storable<W>>::into_word(value);
2166 let bits_per_word = <W as traits::Word>::BITS;
2167 let limit = if self.bit_width < bits_per_word {
2168 W::ONE << self.bit_width
2169 } else {
2170 W::max_value()
2171 };
2172 if self.bit_width < bits_per_word && value_w >= limit {
2173 panic!(
2174 "Value {:?} does not fit in the configured bit_width of {}",
2175 value_w, self.bit_width
2176 );
2177 }
2178
2179 for i in 0..self.len() {
2180 unsafe { self.set_unchecked(i, value_w) };
2181 }
2182 }
2183
2184 /// Fills the vector with values returned by a closure.
2185 ///
2186 /// # Panics
2187 ///
2188 /// Panics if the closure returns a value that does not fit in the configured
2189 /// `bit_width`.
2190 pub fn fill_with<F>(&mut self, mut f: F)
2191 where
2192 F: FnMut() -> T,
2193 {
2194 let bits_per_word = <W as traits::Word>::BITS;
2195 let limit = if self.bit_width < bits_per_word {
2196 W::ONE << self.bit_width
2197 } else {
2198 W::max_value()
2199 };
2200
2201 for i in 0..self.len() {
2202 let value = f();
2203 let value_w = <T as Storable<W>>::into_word(value);
2204 if self.bit_width < bits_per_word && value_w >= limit {
2205 panic!(
2206 "Value {:?} returned by closure does not fit in bit_width {}",
2207 value_w, self.bit_width
2208 );
2209 }
2210 unsafe { self.set_unchecked(i, value_w) };
2211 }
2212 }
2213
2214 /// Copies a sequence of elements from a source [`FixedVec`] into this one.
2215 ///
2216 /// # Panics
2217 ///
2218 /// Panics if the source and destination vectors do not have the same `bit_width`,
2219 /// or if the source or destination ranges are out of bounds.
2220 pub fn copy_from_slice(
2221 &mut self,
2222 src: &Self,
2223 src_range: std::ops::Range<usize>,
2224 dest_index: usize,
2225 ) {
2226 assert_eq!(
2227 self.bit_width, src.bit_width,
2228 "bit_width mismatch in copy_from_slice"
2229 );
2230 assert!(src_range.start <= src_range.end, "source range start > end");
2231 assert!(src_range.end <= src.len(), "source range out of bounds");
2232 let len = src_range.len();
2233 assert!(
2234 dest_index + len <= self.len(),
2235 "destination range out of bounds"
2236 );
2237
2238 if len == 0 {
2239 return;
2240 }
2241
2242 // Fast path: if bit alignments are the same, we can do word-level copies.
2243 let bit_width = self.bit_width;
2244 let bits_per_word = <W as Word>::BITS;
2245 let src_bit_offset = (src_range.start * bit_width) % bits_per_word;
2246 let dest_bit_offset = (dest_index * bit_width) % bits_per_word;
2247
2248 if src_bit_offset == dest_bit_offset {
2249 let src_word_start = (src_range.start * bit_width) / bits_per_word;
2250 let dest_word_start = (dest_index * bit_width) / bits_per_word;
2251 let total_bits_to_copy = len * bit_width;
2252 let num_words_to_copy = total_bits_to_copy.div_ceil(bits_per_word);
2253
2254 let src_words = &src.bits.as_ref()[src_word_start..src_word_start + num_words_to_copy];
2255 let dest_words =
2256 &mut self.bits.as_mut()[dest_word_start..dest_word_start + num_words_to_copy];
2257
2258 // If the last word is not fully copied, we need to mask it.
2259 let residual_bits = total_bits_to_copy % bits_per_word;
2260 if residual_bits == 0 {
2261 dest_words.copy_from_slice(src_words);
2262 } else {
2263 if num_words_to_copy > 1 {
2264 dest_words[..num_words_to_copy - 1]
2265 .copy_from_slice(&src_words[..num_words_to_copy - 1]);
2266 }
2267 let last_word_mask = (W::ONE << residual_bits).wrapping_sub(W::ONE);
2268 let dest_last_word = &mut dest_words[num_words_to_copy - 1];
2269 let src_last_word = &src_words[num_words_to_copy - 1];
2270 *dest_last_word =
2271 (*dest_last_word & !last_word_mask) | (*src_last_word & last_word_mask);
2272 }
2273 } else {
2274 // Slow path: copy element by element if alignments differ.
2275 if dest_index > src_range.start && dest_index < src_range.end {
2276 // Copy backwards if ranges overlap and dest is after src.
2277 for i in (0..len).rev() {
2278 let val = unsafe { src.get_unchecked(src_range.start + i) };
2279 unsafe {
2280 self.set_unchecked(dest_index + i, <T as Storable<W>>::into_word(val))
2281 };
2282 }
2283 } else {
2284 for i in 0..len {
2285 let val = unsafe { src.get_unchecked(src_range.start + i) };
2286 unsafe {
2287 self.set_unchecked(dest_index + i, <T as Storable<W>>::into_word(val))
2288 };
2289 }
2290 }
2291 }
2292 }
2293}
2294
2295impl<T, W, E, B, B2> PartialEq<FixedVec<T, W, E, B2>> for FixedVec<T, W, E, B>
2296where
2297 T: Storable<W> + PartialEq,
2298 W: Word,
2299 E: Endianness,
2300 B: AsRef<[W]>,
2301 B2: AsRef<[W]>,
2302{
2303 /// Checks for equality between two [`FixedVec`] instances.
2304 ///
2305 /// It first checks `len` and `bit_width`, then the underlying storage.
2306 fn eq(&self, other: &FixedVec<T, W, E, B2>) -> bool {
2307 if self.len() != other.len() || self.bit_width() != other.bit_width() {
2308 return false;
2309 }
2310 self.as_limbs() == other.as_limbs()
2311 }
2312}
2313
2314/// Implements `PartialEq` for comparing a [`FixedVec`] with a standard slice.
2315impl<T, W, E, B, T2> PartialEq<&[T2]> for FixedVec<T, W, E, B>
2316where
2317 T: Storable<W> + PartialEq<T2>,
2318 W: Word,
2319 E: Endianness,
2320 B: AsRef<[W]>,
2321 T2: Clone,
2322{
2323 fn eq(&self, other: &&[T2]) -> bool {
2324 if self.len() != other.len() {
2325 return false;
2326 }
2327 self.iter().zip(other.iter()).all(|(a, b)| a == *b)
2328 }
2329}
2330
2331impl<T, W, E> Extend<T> for FixedVec<T, W, E, Vec<W>>
2332where
2333 T: Storable<W> + ToPrimitive,
2334 W: Word,
2335 E: Endianness,
2336{
2337 /// Extends the vector with the contents of an iterator.
2338 ///
2339 /// # Panics
2340 ///
2341 /// Panics if any value from the iterator does not fit within the
2342 /// configured `bit_width`.
2343 fn extend<I: IntoIterator<Item = T>>(&mut self, iter: I) {
2344 let iter = iter.into_iter();
2345 let (lower_bound, _) = iter.size_hint();
2346 self.reserve(lower_bound);
2347 for item in iter {
2348 self.push(item);
2349 }
2350 }
2351}
2352
2353impl<T, W, E> From<FixedVec<T, W, E, Vec<W>>> for FixedVec<T, W, E, Box<[W]>>
2354where
2355 T: Storable<W>,
2356 W: Word,
2357 E: Endianness,
2358{
2359 /// Converts a `Vec`-backed [`FixedVec`] into a `Box<[]>`-backed [`FixedVec`].
2360 fn from(vec: FixedVec<T, W, E, Vec<W>>) -> Self {
2361 unsafe { Self::new_unchecked(vec.bits.into_boxed_slice(), vec.len, vec.bit_width) }
2362 }
2363}
2364
2365impl<T, W, E> From<FixedVec<T, W, E, Box<[W]>>> for FixedVec<T, W, E, Vec<W>>
2366where
2367 T: Storable<W>,
2368 W: Word,
2369 E: Endianness,
2370{
2371 /// Converts a `Box<[]>`-backed [`FixedVec`] into a `Vec`-backed [`FixedVec`].
2372 fn from(vec: FixedVec<T, W, E, Box<[W]>>) -> Self {
2373 unsafe { Self::new_unchecked(vec.bits.into_vec(), vec.len, vec.bit_width) }
2374 }
2375}
2376
2377impl<'a, T> TryFrom<&'a [T]> for FixedVec<T, <T as DefaultParams>::W, <T as DefaultParams>::E>
2378where
2379 T: Storable<<T as DefaultParams>::W> + DefaultParams,
2380 <T as DefaultParams>::W: Word,
2381 <T as DefaultParams>::E: Endianness,
2382 dsi_bitstream::impls::BufBitWriter<
2383 <T as DefaultParams>::E,
2384 dsi_bitstream::impls::MemWordWriterVec<
2385 <T as DefaultParams>::W,
2386 Vec<<T as DefaultParams>::W>,
2387 >,
2388 >: dsi_bitstream::prelude::BitWrite<<T as DefaultParams>::E, Error = std::convert::Infallible>,
2389{
2390 type Error = Error;
2391
2392 /// Creates a [`FixedVec`] from a slice using [`BitWidth::Minimal`] and default parameters.
2393 ///
2394 /// This is a convenience method equivalent to `FixedVec::builder().build(slice)`.
2395 /// It uses the default [`Word`] ([`usize`]) and [`Endianness`] ([`LE`]) associated with the element type `T`.
2396 ///
2397 /// # Examples
2398 ///
2399 /// ```
2400 /// use compressed_intvec::fixed::{UFixedVec, SFixedVec};
2401 /// use std::convert::TryFrom;
2402 ///
2403 /// // For unsigned types
2404 /// let data_u: &[u32] = &[10, 20, 30];
2405 /// let vec_u = UFixedVec::<u32>::try_from(data_u).unwrap();
2406 /// assert_eq!(vec_u.bit_width(), 5);
2407 ///
2408 /// // For signed types
2409 /// let data_s: &[i16] = &[-10, 0, 10];
2410 /// let vec_s = SFixedVec::<i16>::try_from(data_s).unwrap();
2411 /// assert_eq!(vec_s.bit_width(), 5);
2412 /// ```
2413 fn try_from(slice: &'a [T]) -> Result<Self, Self::Error> {
2414 Self::builder().bit_width(BitWidth::Minimal).build(slice)
2415 }
2416}