# Copilot Instructions for sux-rs
## Project Overview
`sux` is a pure Rust implementation of succinct and compressed data structures:
bit vectors, rank/select structures, Elias–Fano, rear-coded lists, static
functions/filters, and partial arrays. It requires a 64-bit pointer width. The
MSRV is 1.85 (Rust edition 2024).
## Build, Test, and Lint
```bash
cargo build # standard build
cargo build --release # optimized (includes debug info)
cargo test # run tests
cargo test --all-features # run tests with all features
cargo test --release --features slow_tests # slow integration tests
cargo test test_name # single test by name
cargo test test_name -- --nocapture # single test with output
cargo test --test test_bit_vec # all tests in a file
cargo fmt -- --check # check formatting
cargo clippy --all-features --all-targets # lint (must be warning-free)
```
## Key Feature Flags
- `rayon` (default), `flate2` (default), `zstd` (default)
- `epserde`: ε-serde serialization; `mmap`: memory mapping (implies `epserde`)
- `serde`: standard serde support
- `cli`: build binary utilities (implies `clap`, `epserde`, `deko`)
- `slow_tests`: enable expensive tests (use with `--release`)
- `mwhc`: MWHC data structures (benchmarking)
- `aarch64_prefetch`, `iter_advance_by`: require nightly
## Architecture
### Composable Layering
Structures compose in layers, inner to outer: **BitVec → Rank → Select →
SelectZero**. Example:
```
SelectAdapt<Rank9<BitVec<Box<[usize]>>>>
```
Each layer forwards traits it doesn't implement to the inner layer via the
`ambassador` crate's `#[delegate(...)]` pattern. Forwarded traits include
`AsRef<[usize]>`, `Index<usize>`, `BitLength`, and all unimplemented
rank/select traits.
- `into_inner()` extracts the underlying structure.
- `map(|x| ...)` replaces an underlying structure; it is `unsafe` because the
caller must keep the replacement consistent with the outer layers.
- `AddNumBits` wraps a structure to cache bit counts for constant-time
`NumBits`.
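The layering idea can be sketched with the standard library alone. Everything below is hypothetical and for illustration only: `ToyBitLength`, `ToyBits`, and `ToyRank` are not part of `sux`, and the forwarding impls are written by hand, whereas the crate generates them with `ambassador`.

```rust
/// Hypothetical stand-in for a length trait such as `BitLength`.
trait ToyBitLength {
    fn len(&self) -> usize;
}

/// Innermost layer: plain bit storage (not the crate's `BitVec`).
struct ToyBits {
    words: Vec<u64>,
    len: usize,
}

impl ToyBits {
    fn from_bools(bits: &[bool]) -> Self {
        let mut words = vec![0u64; bits.len().div_ceil(64)];
        for (i, &b) in bits.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: bits.len() }
    }
}

impl ToyBitLength for ToyBits {
    fn len(&self) -> usize {
        self.len
    }
}

impl AsRef<[u64]> for ToyBits {
    fn as_ref(&self) -> &[u64] {
        &self.words
    }
}

/// Outer layer: adds per-word cumulative counts on top of any inner layer.
struct ToyRank<B> {
    inner: B,
    /// `counts[i]` is the number of ones in `words[..i]`.
    counts: Vec<usize>,
}

impl<B: AsRef<[u64]> + ToyBitLength> ToyRank<B> {
    fn new(inner: B) -> Self {
        let mut counts = vec![0];
        let mut total = 0;
        for w in inner.as_ref() {
            total += w.count_ones() as usize;
            counts.push(total);
        }
        Self { inner, counts }
    }

    /// Number of ones in positions `[0..pos)`; panics if `pos > len`.
    fn rank(&self, pos: usize) -> usize {
        assert!(pos <= self.inner.len(), "position out of bounds");
        let (word, bit) = (pos / 64, pos % 64);
        let mut ones = self.counts[word];
        if bit != 0 {
            let mask = (1u64 << bit) - 1;
            ones += (self.inner.as_ref()[word] & mask).count_ones() as usize;
        }
        ones
    }
}

impl<B> ToyRank<B> {
    /// Extracts the underlying structure, dropping the rank index.
    fn into_inner(self) -> B {
        self.inner
    }
}

// Hand-written forwarding: the outer layer re-exposes the inner layer's traits.
impl<B: ToyBitLength> ToyBitLength for ToyRank<B> {
    fn len(&self) -> usize {
        self.inner.len()
    }
}

impl<B: AsRef<[u64]>> AsRef<[u64]> for ToyRank<B> {
    fn as_ref(&self) -> &[u64] {
        self.inner.as_ref()
    }
}
```

With these definitions, `ToyRank::new(ToyBits::from_bools(&bits))` still answers `len()` and `as_ref()` as if it were the inner storage, and `into_inner()` gives the storage back.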
### Checked vs. Unchecked Methods
Almost all structures provide both variants of each operation:
- **Checked** (`rank`, `select`): perform bounds checks and either return an
`Option` or panic on out-of-range input.
- **Unchecked** (`rank_unchecked`, `select_unchecked`): `unsafe`, no bounds
checks, maximum performance.
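A minimal sketch of the checked/unchecked pairing, using only the standard library. `ToyBitVec` is a hypothetical type, not the crate's `BitVec`, and its linear-time rank is purely illustrative; the sketch shows the panicking flavor of the checked method, which validates its argument and then delegates to the unchecked one.

```rust
/// Hypothetical toy bit storage, for illustration only.
struct ToyBitVec {
    words: Vec<u64>,
    len: usize,
}

impl ToyBitVec {
    fn new(words: Vec<u64>, len: usize) -> Self {
        assert!(len <= words.len() * 64, "len exceeds storage");
        Self { words, len }
    }

    /// Checked rank: number of ones in `[0..pos)`.
    ///
    /// Panics if `pos` is greater than the length in bits.
    fn rank(&self, pos: usize) -> usize {
        assert!(pos <= self.len, "rank position {pos} out of bounds (len {})", self.len);
        // SAFETY: `pos <= self.len` was checked just above.
        unsafe { self.rank_unchecked(pos) }
    }

    /// Unchecked rank: number of ones in `[0..pos)`.
    ///
    /// # Safety
    ///
    /// `pos` must be at most the length in bits.
    unsafe fn rank_unchecked(&self, pos: usize) -> usize {
        let (full, rem) = (pos / 64, pos % 64);
        // SAFETY: `pos <= len <= words.len() * 64`, so `full <= words.len()`.
        let prefix = unsafe { self.words.get_unchecked(..full) };
        let mut ones: usize = prefix.iter().map(|w| w.count_ones() as usize).sum();
        if rem != 0 {
            // SAFETY: `rem != 0` and `pos <= words.len() * 64` imply
            // `full < words.len()`.
            let word = unsafe { *self.words.get_unchecked(full) };
            ones += (word & ((1u64 << rem) - 1)).count_ones() as usize;
        }
        ones
    }
}
```

Callers that have already validated `pos` (for example inside a loop with a known bound) can call `rank_unchecked` inside an `unsafe` block with a `SAFETY` comment; everyone else uses `rank`.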
### Module Layout
- **`src/bits/`**: `BitVec`, `BitFieldVec` — fundamental bit storage.
- **`src/traits/`**: Core traits (`Rank`, `Select`, `BitLength`, `NumBits`,
`BitCount`, `Backend`, `BitFieldSlice`, `IndexedDict`).
- **`src/rank_sel/`**: Rank and select structures (`Rank9`, `RankSmall`,
`SelectAdapt`, `SelectSmall`, `Select9`, and zero variants).
- **`src/dict/`**: Indexed dictionaries (`EliasFano`, `RearCodedList`,
`VFilter`, `SignedVFunc`).
- **`src/func/`**: Static functions (`VFunc`, `VBuilder`).
- **`src/array/`**: `PartialArray` — arrays with holes.
- **`src/utils/`**: Lending iterators, signature storage, GF(2) linear systems.
### Serialization and Memory
- All structures implement `MemDbg` and `MemSize` (`mem_dbg` is a required
dependency).
- ε-serde (`epserde` feature) for efficient serialization and memory mapping.
- Standard `serde` support via the `serde` feature.
## Development Guidelines
Follow the [Rust development guidelines](https://github.com/vigna/rust-dev-guidelines/blob/main/README.md).
Key points:
### Code Style
- Use `rustfmt` with edition 2024 style (see `rustfmt.toml`).
- Code must be `cargo clippy --all-features --all-targets`-clean.
- Prefer `impl Trait` over type parameters for function arguments.
- Prefer bounds in the `impl` clause over a `where` clause.
- Minimize trait bounds on `impl` blocks and function parameters.
- Prefer receiving `IntoIterator` over `Iterator`, `AsRef<str>` over
`String`/`&str`, `AsRef<Path>` over `Path`/`&Path`.
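As a small illustration of these argument and bound conventions, here is a sketch with made-up helpers; `count_ones`, `with_extension`, and `Wrapper` are hypothetical names and do not exist in `sux`.

```rust
use std::path::{Path, PathBuf};

/// Takes `impl IntoIterator` rather than an `Iterator`, so callers can pass
/// a `Vec`, an array, or an iterator adaptor directly.
fn count_ones(bits: impl IntoIterator<Item = bool>) -> usize {
    bits.into_iter().filter(|&b| b).count()
}

/// Takes `AsRef<Path>` and `AsRef<str>`, so `&str`, `String`, `&Path`, and
/// `PathBuf` all work without conversions at the call site.
fn with_extension(path: impl AsRef<Path>, ext: impl AsRef<str>) -> PathBuf {
    path.as_ref().with_extension(ext.as_ref())
}

/// Hypothetical wrapper used to show where bounds go.
struct Wrapper<T>(T);

// The bound lives in the `impl` clause rather than in a `where` clause, and
// it is the only bound this method needs.
impl<T: AsRef<[u64]>> Wrapper<T> {
    fn num_words(&self) -> usize {
        self.0.as_ref().len()
    }
}
```

For example, `count_ones(vec![true, false, true])` and `count_ones((0..10).map(|i| i % 2 == 0))` both compile without the caller calling `.into_iter()` first.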
### Source File Organization
Within a source file, items should appear in this order:
1. Declaration
2. Implementations of derivable traits
3. Macros (constructors, etc.)
4. Inherent implementations
5. Implementations of crate traits
6. Implementations of external crate traits
7. Implementations of std traits
Order structure fields with immutable fields first, then mutable ones, and
`PhantomData` last. Fields should also go from most general/important to least.
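The ordering above might look as follows in a small file. This is only a sketch: `ToyLen` and `ToyCounter` are hypothetical names, and `ToyLen` stands in for a trait that would normally be defined elsewhere in the crate.

```rust
use std::fmt;
use std::marker::PhantomData;

/// Hypothetical stand-in for a trait defined elsewhere in the crate.
pub trait ToyLen {
    fn len(&self) -> usize;
}

// 1. Declaration. Fields: immutable first, then mutable, `PhantomData` last.
pub struct ToyCounter<T> {
    /// Fixed at construction time.
    capacity: usize,
    /// Updated by `increment`.
    count: usize,
    _marker: PhantomData<T>,
}

// 2. Implementations of derivable traits (written by hand here to avoid the
//    `T: Clone` bound that `#[derive(Clone)]` would add).
impl<T> Clone for ToyCounter<T> {
    fn clone(&self) -> Self {
        Self { capacity: self.capacity, count: self.count, _marker: PhantomData }
    }
}

// 3. Macros would go here (none in this sketch).

// 4. Inherent implementation.
impl<T> ToyCounter<T> {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, count: 0, _marker: PhantomData }
    }

    /// Increments the counter; returns `false` if it is already full.
    pub fn increment(&mut self) -> bool {
        if self.count < self.capacity {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

// 5. Implementations of crate traits.
impl<T> ToyLen for ToyCounter<T> {
    fn len(&self) -> usize {
        self.count
    }
}

// 6. Implementations of external crate traits would go here (none).

// 7. Implementations of std traits.
impl<T> fmt::Display for ToyCounter<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}/{}", self.count, self.capacity)
    }
}
```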
### Safety
- All `unsafe` methods must have clear safety documentation.
- Use `panic_if_out_of_bounds!`, `panic_if_value!`, `debug_assert_bounds!`
macros for bounds checking.
### Testing
- Test functions: `test_` prefix + brief feature description (e.g.,
`test_long_input`).
- Tests must return `anyhow::Result<()>` and use `?`; avoid `unwrap`/`expect`.
- Assertions: actual value first, expected second:
`assert_eq!(result, expected)`.
- Unit tests go in `#[cfg(test)] mod tests` at the end of the source file.
- Slow tests are gated with `#[cfg(feature = "slow_tests")]`.
### Documentation
- Crate docs live in `README.md`, included via
`#![doc = include_str!("../README.md")]`.
- Use reference-style links with absolute `docs.rs` URLs.
- An `Implementation Notes` section can document internal details.
### Logging
Binaries and tests using logging must configure `env_logger`:
```rust
env_logger::builder()
    .filter_level(log::LevelFilter::Info)
    .try_init()?;
```
### Parallelism
When parallelizing fast bulk operations with `rayon`, pass `RAYON_MIN_LEN`
(100,000) to `.with_min_len()` so that each task receives enough items to
amortize scheduling overhead.
### Trait Forwarding Pattern
When adding a new structure that wraps another, follow the `ambassador`
delegation pattern used throughout `src/rank_sel/`. Import the
`ambassador_impl_*` items for each trait to forward, and annotate the struct
with `#[derive(Delegate)]` and `#[delegate(...)]` attributes.