bitvec 0.15.1

A crate for manipulating memory, bit by bit
docs.rs failed to build bitvec-0.15.1
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Visit the last successful build: bitvec-1.0.1

bitvec – Managing Memory Bit by Bit

Crate Documentation License Continuous Integration Code Coverage Crate Downloads Crate Size

Summary

bitvec enables refining memory manipulation from single-byte precision to single-bit precision. The bit-precision pointers in this crate allow creation of more powerful bit-masks, set arithmetic, and I/O packet processing.

The core export of this crate is the type BitSlice. This type is a region of memory with individually-addressable bits. It is accessed by standard Rust references: &BitSlice and &mut BitSlice. These references are able to describe and operate on regions that start and end on any arbitrary bit address, regardless of alignment to a byte or processor word.

Rust provides three types to manipulate a sequence of memory: &[T]/&mut [T] to borrow a region, Box<[T]> to statically own a region, and Vec<T> to dynamically own a region. bitvec provides parallel types for each: &BitSlice and &mut BitSlice borrow, BitBox statically owns, and BitVec dynamically owns. These types mirror the relationships and APIs, including inherent methods and trait implementations, that are found in the standard library.

What Makes bitvec Different Than All The Other Bit Vector Crates

The most significant differences are that bitvec provides arbitrary bit ordering through the Cursor trait, and provides a full-featured slice type. Other crates have fixed orderings, and are often unable to produce slices that begin at any arbitrary bit.

Additionally, the bitvec types implement the full extent of the standard library APIs possible, in both inherent methods and trait implementations.

The bitvec types’ handles are exactly the size of their standard library counterparts, while the other crates carry bit index information in separate fields, making their handles wider. Depending on your needs, this may sway your opinion in either direction. bitvec is more compact, but mangles the internal pointer representation and requires more complex logic to use the bit region, while other crates’ types are larger, but have more straightforward internal logic.

Why Would You Use This

  • You need to directly control a bitstream’s representation in memory.
  • You need to do unpleasant things with I/O communications protocols.
  • You need a list of bools that doesn’t waste 7 bits for every bit used.
  • You need to do set arithmetic, or numeric arithmetic, on those lists.
  • You are running a find/replace command on your repository from &[bool] to &BitSlice, or Vec<bool> to BitVec, and expect minimal damage as a result.

Why Wouldn’t You Use This

  • Your concern with the memory representation of bitsets includes sequence compression. BitSlice performs absolutely no compression, and maps bits directly into memory. Compressed bit sets can be found in other crates, such as the compacts crate, which uses the Roaring BitSet format.
  • You want discontiguous data structures, such as a hash table, a tree, or any other hallmark of computer science beyond the flat array. bitvec does not, and will not, accomplish this. You may be able to use bitvec’s flat structures beneath a type wrapper which handles index processing, but bitvec types are incapable of accomplishing this task themselves. Also, I don’t know how any of those data structures work.

Usage

Minimum Rust Version: 1.36.0

The 1.36 release of Rust stabilized the alloc crate, allowing allocating features (such as the BitVec type) to be used in #![no_std] environments with the stable compiler series.

Symbol Import

# Cargo.toml
[dependencies]
bitvec = "0.15"

bitvec is highly modular, and requires several items to function correctly. The simplest way to use it is via prelude glob import:

use bitvec::prelude::*;

This imports the following symbols:

  • BigEndian
  • BitBox (only when an allocator is present)
  • BitSlice
  • BitStore
  • BitVec (only when an allocator is present)
  • Bits
  • BitsMut
  • Cursor
  • LittleEndian
  • bitbox! (only when an allocator is present)
  • bitvec! (only when an allocator is present)

If you do not want these names imported directly into the local scope – Cursor, BigEndian, and LittleEndian are likely culprits for name collision – then you can import the prelude with a scope guard:

use bitvec::prelude as bv;

and those symbols will all be available only with a bv:: prefix.

Cargo Features

bitvec uses Cargo features to conditionally control some behavior.

The most prominent such behavior is one that cannot be controlled by Cargo configuration: u64 is only usable with this library when targeting a 64-bit system. 32-bit system targets are only permitted to use u8, u16, and u32.

Atomic Behavior

bitvec uses atomic read/modify/write instructions by default. This is necessary to avoid data races in &mut BitSlice operations without using heavier synchronization mechanisms. If your target does not support Rust’s AtomicU* types, or you do not want to use atomic RMW instructions, you may disable the atomic feature:

# Cargo.toml

[dependencies.bitvec]
default-features = false
features = [
  # "atomic",
  "std",
]

Allocator Support

The two owning structures, BitBox and BitVec, require the presence of an allocator. As bitvec is written specifically for use cases where an allocator may not exist, this dependence can be disabled. bitvec is #![no_std]-compatible once the std feature is disabled. It is not a design goal to be #![no_core]-compatible.

# Cargo.toml

[dependencies.bitvec]
default-features = false
features = [
  "atomic",
  # "std",
]

If you are working in a #![no_std] environment that does have an allocator available, you can reënable allocator support with the alloc feature:

# Cargo.toml

[dependencies.bitvec]
default-features = false # disables "std"
features = ["alloc"] # enables the allocator

This uses #![feature(alloc)], which requires the nightly compiler.

Serde Support

De/serialization of bit slices is implemented through the serde crate. This functionality is governed by both the serde feature and the std feature.

By default, when serde is enabled, BitSlice, BitBox, and BitVec all gain the Serialize trait, and BitBox and BitVec gain the Deserialize trait.

When std is disabled, the BitBox and BitVec types are removed, leaving only BitSlice with Serialize.

# Cargo.toml

[dependencies.bitvec]
features = ["serde"]

Data Structures

bitvec’s three data structures are &BitSlice, BitBox, and BitVec. Each of these types takes two type parameters, which I have elided previously.

The first type parameter is the Cursor trait. This trait governs how a bit index maps to a bit position in the underlying memory. This parameter defaults to the BigEndian type, which counts from the most significant bit first to the least significant bit last. The LittleEndian type counts in the opposite direction.

The second type parameter is the BitStore trait. This trait abstracts over the Rust fundamental types u8, u16, and u32. On 64-bit targets, u64 is also available. This parameter defaults to u8, which acts on individual bytes.

These traits are both explained in the next section.

&BitSlice<C: Cursor, T: BitStore> is an immutable region of memory, addressable at bit precision. This has all the inherent methods of Rust’s slice primitive, &[bool], and all the trait implementations. It has additional methods which are specialized to its status as a slice of individual bits.

&mut BitSlice<C: Cursor, T: BitStore> is a mutable region of memory. This functions identically to &mut [bool], with the exception that IndexMut is impossible: you cannot write bitslice[index] = bit;. This restriction is sidestepped with the C++-style method at: *bitslice.at(index) = bit; is the shim for write indexing.

The slice references have no restrictions on the alignment of their start or end bits.

The owning references, described below, will always begin their slice aligned to the edge of their T: BitStore type parameter. While this is not strictly required by the implementation, it is convenient for ensuring that the allocation pointer is preserved.

BitBox<C: Cursor, T: BitStore> is a &mut BitSlice<C: Cursor, T: BitStore> in owned memory. It has few useful methods and no trait implementations of its own. It is only capable of taking a bit slice into owned memory.

BitVec<C: Cursor, T: BitStore> is a BitBox that can adjust its allocation size. It follows the inherent and trait API of the standard library’s Vec type.

The API for these types is deliberately uninteresting. They are written to be as close to drop-in replacements for the standard library types as possible. The end goal of bitvec is that you should be able to adopt it by running three sed find/replace commands on your repository. This is not literally possible, but the work required for replacement is intended to be minimal.

Traits

bitvec generalizes its behavior through the use of traits. In order to optimize performance, these traits are not object-safe, and may not be used as dyn Trait patterns for type erasure. Refactoring bitvec to support type erasure would require significantly rewriting core infrastructure, and this is not a design goal. I am willing to consider it if demand is shown, but I am not going to proactively pursue it.

Cursor

The Cursor trait is an open-ended trait, that you are free to implement yourself. It has one required function: fn at<T: BitStore>(BitIdx) -> BitPos.

This function translates a semantic index to an electrical position. bitvec provides two implementations for you: BigEndian and LittleEndian, described above. The invariants this function must uphold are listed in its documentation.

BitStore

The BitStore trait is sealed, and may only be implemented by this library. It is used to abstract over the Rust fundamentals u8, u16, u32, and (on 64-bit systems) u64.

Your choice in fundamental types governs how the Cursor type translates indices, and how the memory underneath your slice is written. The document doc/Bit Patterns.md enumerates the effect of the Cursor and BitStore combinations on raw memory.

If you are using bitvec to perform set arithmetic, and you expect that your sets will have more full elements in the interior than partially-used elements on the front and back edge, it is advantageous to use the local CPU word. The BitSlice operations which traverse the slice are required to perform bit-by-bit crawls on partial-use elements, but are able to use whole-word instructions on full elements. The latter is a marked acceleration.

If you are using bitvec to perform I/O packet manipulation, you should use the fundamental best suited for your protocols. This is likely u8, which is why it is the default type.

Bits and BitsMut

The Bits and BitsMut traits are entry interfaces to the BitSlice types. These are equivalent to the AsRef and AsMut reference conversion traits in the standard library, and should be used as such.

These traits are implemented on the Rust fundamentals that implement BitStore (uN), on slices of those fundamentals ([uN]), and the first thirty-two arrays of them ([uN; 0] to [uN; 32]). Each implementation of these traits causes a linear expansion of compile time, and going beyond thirty-two both surpasses the standard library’s manual implementation limits, and is a denial-of-service attack on each rebuild.

These traits are left open so that if you need to implement them on wider arrays, you are able to do so.

You can use these traits to attach .as_bitslice::<C: Cursor>() and .as_mut_bitslice::<C: Cursor>() conversion methods to any implementor, and gain access to a BitSlice over that type, or to bound a generic function similar to how the standard library uses AsRef<Path>:

let mut base = [0u8; 8];
let bits = base.as_mut_bitslice::<LittleEndian>();
//  bits is now an `&mut BitSlice<LittleEndian, u8>`
println!("{}", bits.len()); // 64

fn operate_on_bits(mut data: impl BitsMut) {
  let bits = data.as_mut_bitslice::<BigEndian>();
  //  `bits` is now an `&mut BitSlice<BigEndian, _>`
}

Macros

The bitbox! and bitvec! macros allow convenient production of their eponymous types, equivalent to the vec! macro in the standard library.

These macros accept an optional cursor token, an optional type token, and either a list of bits or a single bit and a repetition counter.

Because these are standard macros, not proc-macros, they do not yet produce well-optimized expanded code.

These macros are more thoroughly explained, including a list of all available use syntaxes, in their documentation.

Example Usage

This snippet runs through a selection of library functionality to demonstrate behavior. It is deliberately not representative of likely usage.

extern crate bitvec;

use bitvec::prelude::*;

use std::iter::repeat;

fn main() {
    let mut bv = bitvec![BigEndian, u8; 0, 1, 0, 1];
    bv.reserve(8);
    bv.extend(repeat(false).take(4).chain(repeat(true).take(4)));

    //  Memory access
    assert_eq!(bv.as_slice(), &[0b0101_0000, 0b1111_0000]);
    //                   index 0 -^               ^- index 11
    assert_eq!(bv.len(), 12);
    assert!(bv.capacity() >= 16);

    //  Stack operations
    bv.push(true);
    bv.push(false);
    bv.push(true);

    assert!(bv[12]);
    assert!(!bv[13]);
    assert!(bv[14]);
    assert!(bv.get(15).is_none());

    bv.pop();
    bv.pop();
    bv.pop();

    //  Set operations. These deliberately have no effect.
    bv &= repeat(true);
    bv = bv | repeat(false);
    bv ^= repeat(true);
    bv = !bv;

    //  Arithmetic operations
    let one = bitvec![1];
    bv += one.clone();
    assert_eq!(bv.as_slice(), &[0b0101_0001, 0b0000_0000]);
    bv -= one.clone();
    assert_eq!(bv.as_slice(), &[0b0101_0000, 0b1111_0000]);

    //  Borrowing iteration
    let mut iter = bv.iter();
    //  index 0
    assert_eq!(iter.next().unwrap(), false);
    //  index 11
    assert_eq!(iter.next_back().unwrap(), true);
    assert_eq!(iter.len(), 10);
}

Immutable and mutable access to the underlying memory is provided by the AsRef and AsMut implementations, so the BitVec can be readily passed to transport functions.

BitVec implements Borrow down to BitSlice, and BitSlice implements ToOwned up to BitVec, so they can be used in a Cow or wherever this API is desired. Any case where a Vec/[T] pair cannot be replaced with a BitVec/BitSlice pair is a bug in this library, and a bug report is appropriate.

BitVec can relinquish its owned memory with .into_vec() or .into_boxed_slice(), and BitSlice can relinquish its borrow by going out of scope.

Warnings

The BitSlice type is able to cause memory aliasing, as multiple independent &mut BitSlice instances may use the same underlying memory. This crate takes care to ensure that all observed behavior is exactly as expected, without any side effects.

The BitSlice methods only use whole-element instructions when the slice spans the full width of the element; when the slice has only partial use, the methods crawl each bit individually. This is slower on most architectures, but guarantees safety.

Race conditions are avoided through use of the atomic read/modify/write instructions stabilized in 1.34.0, as described above.

Planned Features

Contributions of items in this list are absolutely welcome! Contributions of other features are also welcome, but I’ll have to be sold on them.

  • Creation of specialized pointers Rc<BitSlice> and Arc<BitSlice>.
  • Procedural macros for bitvec! and bitbox!
  • An FFI module, and bindings from other languages.