compressed-intvec 0.6.0

# Change Log

All notable changes to this project will be documented in this file.

## [0.6.0] - 2026-03-15

### BREAKING

*   **`dsi-bitstream` upgraded from 0.5 to 0.9**: This is a transitive
    breaking change for users who interact with the `Codes` enum directly.
    Variants with parameters changed from struct syntax to tuple syntax
    (e.g., `Codes::Zeta { k: 3 }` is now `Codes::Zeta(3)`).

*   **Serde format for `VarVec` and `SeqVec` changed**: The `encoding`
    field is now serialized via `Display`/`FromStr` (e.g., `"Zeta(3)"`)
    instead of the previous struct-based JSON format. Data serialized with
    0.5.x cannot be deserialized with 0.6.0.

### New

*   **`seq` module**: Introduced `SeqVec`, a compressed vector of
    variable-length sequences with indexed access. Designed for adjacency
    lists, document terms, and similar data organized as many
    variable-length sequences. Includes `SeqVecBuilder`,
    `SeqVecFromIterBuilder`, `SeqIter`, `SeqVecIter`, `SeqVecReader`,
    `SeqVecSlice`, parallel iteration, and serde support.

### Changed

*   Upgraded `dsi-bitstream` from 0.5.0 to 0.9.0. Key upstream changes:
    compile-time table validation, `Codes` with native serde support,
    `Codes::canonicalize()`, `DispatchError` replacing `anyhow::Error`,
    `MemWordReader` with `INF` const generic for infallible reads.

*   Upgraded `rand` from 0.9 to 0.10 and `rand_distr` from 0.5 to 0.6
    for compatibility with `dsi-bitstream`'s `implied` feature.

*   Removed the internal `CodesSerde` proxy enum (`src/common/serde.rs`).
    `Codes` now implements `Serialize`/`Deserialize` natively in
    dsi-bitstream 0.9.

*   Removed `UnsignedInt` from the `Word` trait bounds to resolve trait
    method ambiguity between `common_traits` and `num-primitive`.

### Improved

*   Codec auto-selection now applies `Codes::canonicalize()` to ensure
    canonical codec forms (e.g., `Golomb(2^n)` becomes `Rice(n)`,
    `Zeta(1)` becomes `Gamma`).

## [0.5.1] - 2025-09-16

### New

*   Added a new `arch-dependent-storable` feature flag to enable `Storable` implementations for `usize` and `isize` in `variable::IntVec`. This allows for creating vectors of architecture-dependent types, but sacrifices data portability guarantees across platforms with different pointer widths.
*   Added a new benchmark suite (`bench_random_write`) and corresponding plotting scripts to measure and compare random write performance for `fixed::FixedVec` against other libraries.

### Improved

*   Optimized `FixedVec::set_unchecked` for significantly improved write performance. The implementation now includes a fast path for elements with a bit width equal to the storage word size and uses `split_at_mut_unchecked` to reduce bounds checking overhead for writes that span word boundaries.
*   Updated crate description and keywords in `Cargo.toml` for better discoverability and clarity on crates.io.
*   Replaced the `num_cpus` dependency with `std::thread::available_parallelism`.

### Fixed

*   Corrected minor type errors in the `variable` module test suite.

## [0.5.0] - 11-08-2025

This release introduces a fundamental architectural restructuring of the library, splitting the implementation into two distinct data structures: `FixedVec` for fixed-width encoding and `IntVec` for variable-width encoding. This separation enables significant new features, including full mutability, atomic operations, and improved performance, but constitutes a major breaking change to the public API.

### BREAKING

*   **Architectural Separation of Encodings**: The core `IntVec` structure has been split into two distinct implementations based on the encoding strategy.
    *   **`fixed::FixedVec`**: A new data structure exclusively for fixed-width integer encoding. It provides a mutable, `Vec`-like API, O(1) random access, and a thread-safe atomic variant (`AtomicFixedVec`). This component replaces the previous `CodecSpec::FixedLength` functionality.
    *   **`variable::IntVec`**: The refactored successor to the original `IntVec`, now located in the `variable` module. It is dedicated exclusively to variable-length instantaneous codes (e.g., Gamma, Delta, Zeta). It remains immutable after creation.

*   **Module and API Restructuring**: The project's module structure has been fundamentally changed.
    *   The top-level `intvec` and `sintvec` source modules have been removed. All functionality is now organized under the `fixed` and `variable` modules.
    *   The `CodecSpec` enum has been renamed to `VariableCodecSpec` and is now located in `src/variable/codec.rs`. It is used only for `variable::IntVec`.
    *   `fixed::FixedVec` is now configured using a new `BitWidth` enum, which is distinct from `VariableCodecSpec`.

*   **Removal of `SIntVec` and Introduction of Generic Types**: The standalone `SIntVec` struct has been removed.
    *   Both `FixedVec` and `IntVec` are now fully generic over all primitive integer types (e.g., `u8`, `i16`, `u32`, `i64`).
    *   Signed integer support is now provided directly through the `Storable` trait, which transparently handles ZigZag encoding.
    *   New type aliases such as `SFixedVec<T>` (for signed `FixedVec`) and `SIntVec<T>` (for signed `IntVec`) are provided.

*   **Builder API Modification**: The builder pattern has been changed for all vector types. The input data slice is now passed to the final `.build()` method instead of the constructor.
    *   **Old**: `LEIntVec::builder(&data).codec(...).build()`
    *   **New**: `LEIntVec::builder().codec(...).build(&data)`

*   **Codec Analysis Behavior Change**: The analysis strategy for `VariableCodecSpec::Auto` has been changed.
    *   It now analyzes the **entire** input dataset to determine the optimal codec, whereas the previous implementation used a 10,000-element sample for large datasets.
    *   This change improves compression ratio accuracy at the cost of increased construction time for large inputs.

### New

*   **`fixed::FixedVec` Data Structure**: Introduced a new vector implementation for fixed-width integer encoding, located in the `fixed` module.
    *   **Mutable API**: Provides a mutable interface including methods such as `push`, `pop`, `set`, `insert`, `remove`, `resize`, `map_in_place`, and `fill`.
    *   **Generic Implementation**: The structure is generic over the element type `T` (e.g., `u16`, `i32`), the storage word `W` (e.g., `u64`, `usize`), the endianness `E`, and the backing buffer `B` (`Vec<W>` or `&[W]`).
    *   **Zero-Copy Slicing**: Added an extensive API for creating immutable and mutable zero-copy views (`slice`, `split_at`, `chunks`, `windows`).
    *   **Unaligned Access**: A new method, `get_unaligned_unchecked`, was added for random access via unaligned memory reads.
    *   **Convenience Constructors**: Implements `FromIterator`, `TryFrom<&[T]>`, and is supported by a new `fixed_vec!` macro.

*   **`fixed::atomic::AtomicFixedVec` Data Structure**: Added a thread-safe variant of `FixedVec` for concurrent applications.
    *   **Atomic Operations**: Provides an API analogous to standard atomic types, including `load`, `store`, `swap`, `compare_exchange`.
    *   **Atomic RMW Operations**: Implements atomic Read-Modify-Write (RMW) methods such as `fetch_add`, `fetch_sub`, `fetch_and`, `fetch_or`, `fetch_xor`, `fetch_max`, and `fetch_min`.
    *   **Hybrid Atomicity Model**: Utilizes lock-free atomic instructions for elements fully contained within a single `u64` storage word. For elements that span two words, it uses a striped locking mechanism to ensure atomicity without a global lock.
    *   **Parallel Mutation**: Supports parallel in-place modification via a `par_iter_mut` method.
    *   **Construction**: Supported by its own builder and the `atomic_fixed_vec!` macro.

*   **Generic Integer Type Support**: Both `FixedVec` and `IntVec` now support all primitive integer types (e.g., `u8`, `i16`, `u32`, `i64`). This is managed by the `Storable` trait, which abstracts the conversion to and from the underlying storage words.

*   **`variable::IntVecSeqReader`**: Introduced a new stateful reader for `variable::IntVec`, optimized for access patterns with high locality.
    *   **Stateful Cursor**: The reader maintains an internal cursor of its current decoding position.
    *   **Optimized Forward Decoding**: If a requested index is at or after the cursor's position and within the same sample block, the reader decodes forward from its last position, avoiding a seek operation. For non-sequential access, it falls back to a seek-based approach.

*   **Convenience Macros**:
    *   **`fixed_vec!`**: A new macro for `vec!`-like initialization of `fixed::FixedVec`.
    *   **`atomic_fixed_vec!`**: A new macro for `vec!`-like initialization of `fixed::atomic::AtomicFixedVec`.
    *   The `int_vec!` and `sint_vec!` macros have been updated to construct `variable::IntVec` instances.

### Improved

*   **`variable::IntVec` Reader Robustness**: The internal codec dispatcher for `variable::IntVec` has been re-implemented with a hybrid strategy. It now uses a fast, function-pointer-based path for common codecs and parameters, but includes a `match`-based fallback path for less common parameterizations. This change guarantees that any validly constructed `IntVec` can be read without panicking due to an unsupported codec configuration.

*   **API Unification and Ergonomics**:
    *   The `Storable` trait now provides a unified mechanism for handling both signed and unsigned integer types across `FixedVec` and `IntVec`, removing the need for a separate `SIntVec` struct and resulting in a more consistent API.
    *   The `prelude` module has been expanded to export all common types, traits, and aliases from both the `fixed` and `variable` modules, simplifying imports.

*   **Benchmarking Suite**: The benchmarking infrastructure has been significantly expanded and restructured.
    *   Benchmarks are now organized into `fixed` and `variable` directories, with new suites covering atomic operations, mutable operations, and various access patterns (random, sorted, clustered).
    *   Performance is now compared against external libraries (`sux`, `succinct`) to provide a clearer context for the library's performance characteristics.

*   **`serde` Implementation**: The `serde` implementations have been updated to support the new `FixedVec`, `AtomicFixedVec`, and restructured `IntVec` types, ensuring correct serialization and deserialization for the new architecture.

## [0.4.0] - 2025-07-27

This is a release representing a complete architectural overhaul of the
library. It introduces a more ergonomic and powerful builder-based API, adds
support for signed integers and parallel processing, and provides
intelligent, automatic codec selection. These changes result in significant
usability and performance gains but include foundational breaking changes.

### New

*   **Automatic Codec Selection (`CodecSpec::Auto`)**: The builder can now
    intelligently select the most space-efficient compression codec for a given
    dataset. When `CodecSpec::Auto` is used, the builder performs a statistical
    analysis on a sample of the input data to choose the optimal variable-length
    code (e.g., Gamma, Delta, Zeta). This eliminates the need for manual tuning
    and ensures excellent compression ratios out-of-the-box.

*   **Signed Integer Vector (`SIntVec`)**: Introduced `SIntVec`, a new data
    structure specifically designed for compressing vectors of `i64`. It
    transparently uses ZigZag encoding to map signed integers to unsigned
    integers, allowing standard compression codes to work efficiently on data
    distributions centered around zero.

*   **High-Performance Batch Access Methods**: Introduced two new methods for
    efficiently retrieving multiple elements at once:
    *   `get_many()`: An optimized sequential method for batch lookups.
      For variable-length codes, it sorts the requested indices to perform a
      single, monotonic forward scan over the data, which minimizes expensive
      seek operations and avoids redundant decoding.
    *   `par_get_many()`: A parallel version of `get_many` (enabled by the
      `parallel` feature) that distributes lookups across multiple CPU cores,
      offering significant speedups for large batches of indices.

*   **Parallel Processing (`parallel` feature)**: Added a `parallel` feature,
    enabled by default, which leverages the Rayon library to accelerate
    operations on multi-core systems. In addition to `par_get_many`, this
    feature also provides:
    *   `par_iter()`: A parallel iterator for high-throughput full-vector
      decompression, which is particularly effective with computationally
      intensive codecs.

*   **Iterator-Based Builder**: Introduced `IntVec::from_iter_builder` to
    construct an `IntVec` directly from a streaming iterator. This approach is
    highly memory-efficient, making it suitable for datasets that are too large
    to fit into memory. It requires manual codec parameter specification, as
    data cannot be pre-analyzed.

*   **Stateful Reader (`IntVecReader`)**: Added `IntVec::reader()`, which returns
    a reusable `IntVecReader`. This stateful reader is designed for efficient
    dynamic random access, as it amortizes the setup cost of the bitstream
    reader across multiple `get()` calls, making it ideal for access patterns
    where lookup indices are not known in advance.

*   **Prelude Module**: Added a new `prelude` module (`compressed_intvec::prelude::*`)
    to simplify imports of the most commonly used types and traits, such as
    `LEIntVec`, `SIntVec`, and `CodecSpec`.

### Changed

*   **BREAKING: Complete API Redesign with Builder Pattern**: The core API has
    been fundamentally redesigned around a builder pattern for ergonomics and flexibility.
    *   The previous `from()` and `from_with_param()` methods have been entirely
        replaced by `IntVec::builder()`.
    *   Codec selection is no longer performed with generic type parameters
        (e.g., `LEIntVec<GammaCodec>`). Instead, the compression strategy is now
        specified at runtime via the `builder.codec(CodecSpec)` method. This
        change is what enables dynamic and automatic codec selection.

*   **BREAKING: Project Structure Refactoring**: The `src` directory has been
    restructured.
    *   Core logic is now split into dedicated modules: `src/intvec/`,
        `src/sintvec/`, and `src/codec_spec.rs`.
    *   The internal implementation of `IntVec` is further organized into
        `builder.rs`, `iter.rs`, `reader.rs`, `parallel.rs`, and `serde.rs`.

*   **Dependencies**: Updated key dependencies, including `dsi-bitstream` to
    version `0.5.0` and `mem_dbg` to `0.3.0`. Added `rayon` as a core dependency
    for the new, default-enabled `parallel` feature.

### Improved

*   **`FixedLength` Encoding Strategy**: The new `CodecSpec::FixedLength` replaces
    the old `MinimalBinaryCodec`. This new implementation is more powerful and
    explicit:
    *   It can now automatically detect the minimum required bit width for a
      dataset when `num_bits: None` is specified.
    *   It is now a distinct encoding strategy, separate from the DSI-based
      codes, with a highly optimized O(1) access path that does not require a
      sampling table.

*   **Memory-Efficient Sample Storage**: For bit-level encodings, the `IntVec`
    now stores sample offsets in a `Vec<u32>` if the total bit-length of the
    data is less than `u32::MAX`, falling back to `Vec<u64>` only when strictly
    necessary. This reduces memory overhead for small to medium-sized vectors.

*   **Robust Serde Implementation**: The `serde` implementation is now handled
    manually using serializable proxy ("shadow") structs. This makes
    serialization more robust, removes previous limitations, and decouples the
    library from the `dsi-bitstream` dependency, which does not provide `serde`
    traits for its `Codes` enum.

*   **Comprehensive Benchmarking Suite**: The benchmarking infrastructure has been
    completely revamped to provide more accurate performance
    data.
    *   Benchmarks are now run against multiple data distributions (Uniform,
        Geometric, Power-Law) to provide a cleaner view of codec performance.
    *   A new benchmark, `bench_parallel`, specifically measures the performance
        and scalability of the new parallel access methods.
    *   Plotting scripts have been consolidated into a single `python/plot.py`
        that generates interactive and static plots directly from Criterion's
        JSON output, removing the need for intermediate CSV files for most
        benchmarks (except for `bench_size`)

### Removed

*   **BREAKING: Generic Codec API**: The old API, which relied on generic codec
    structs (e.g., `GammaCodec`, `DeltaCodec`, `RiceCodec`, `MinimalBinaryCodec`)
    and the associated `Codec` trait, has been completely removed in favor of the
    new builder pattern and the `CodecSpec` enum.

### Fixed

*   **Fixed `serde` Serialization**: The `serde` implementation now correctly
    serializes and deserializes the internal state of `IntVec` and `SIntVec`,
    ensuring that all codec parameters and data are preserved accurately.