`std::io::Cursor`:
```rust
struct Cursor<T> {
inner: T,
pos: u64 // 'byte' level position
}
```
`BitCursor`:
```rust
struct BitCursor<T> {
inner: T,
pos: u64 // 'bit' level position
}
```
The APIs will be exactly the same, just with `BitCursor`'s applying at a bit-level where it makes sense:
`get_mut`, `get_ref`, `into_inner`, `new` all work exactly the same
`position` and `set_position` will use bit-indices as opposed to byte for `BitCursor`
`split` and `split_mut` will return `(&BitSlice<u8, Msb0>, &BitSlice<u8, Msb0>)` for `BitCursor` instead of `(&[u8], &[u8])`.
Implementing split/split_mut:
At first I did blanket implementations of `BitCursor<T>` for any T that was
`AsRef<BitSlice<u8, Msb0>>`. For the most part this worked well, but it meant
that `BitCursor<&[u8]>` no longer worked since it didn't meet that requirement.
Then I discovered `bitvec` defines an `AsBits` trait that it claims is the
equivalent for `AsRef`, so I tried that instead, but that results in another
problem: for some reason `BitVec`` doesn't implement`AsBits`, it just provides
its own`as_bitslice` method. From what I can tell, this means there's no
single blanket trait boundary I can use to provide an implementation that works
for everything.
For now I think I'll stick with AsRef, but hopefully can figure something
better out long term.
--> Actually, it looks like there isn't an impl for AsRef<BitSlice<..>> for
Vec<u8>, so maybe AsBits is better for now and I can manually do some
implementations for BitCursor<BitVec<..>>? Will give that a try.
--> **Actually** this is flawed: it looks like AsRef _is_ what I want, and, in
order to make things like Vec<u8> work I'd just need to do some extra stuff
with a helper trait. But, maybe this makes sense since Vec<u8> and &[u8] are
explicitly byte-level abstractions and need to be 'adapted'?
----> The way forward here ended up being adding a separate helper trait that
can bridge all the gaps and using that
## BitBuf::advance vs BitBufMut::advance_mut
'advance' on a Buf/BitBuf is "advancing" the start of the view. 'advance_mut' on BitBufMut/BufMut is advancing the _write position_ of the view.
## Managing BitStore compatibility
Originally the read_uXX and write_uXX methods all copied the data over
bit-by-bit. This felt a bit wasteful and inefficient and it made that code
very "special" compared to the other APIs. `bytes` does this by converting all
types to slices and then putting a slice, so I wanted to do something similar
but with BitSlices. This means we'd have a trait like this:
```rust
pub trait ByteOrder {
fn write_u32_to_bits<O: BitStore>(bits: &mut BitSlice<O>, value: u32);
fn read_u32_from_bits<O: BitStore>(bits: &BitSlice<O>) -> u32;
}
```
Which would then get implemented for `BigEndian` and `LittleEndian`. Ideally we'd have something like:
```rust
impl ByteOrder for BigEndian {
fn write_u32_to_bits<O: BitStore>(bits: &mut BitSlice<O>, value: u32) {
assert!(bits.len() <= 32, "cannot write more than 32 bits");
let mut value_be_bytes = value.to_be_bytes();
let value_slice = &mut value_be_bytes.view_bits_mut::<Msb0>()[(32 - bits.len())..];
bits.copy_from_slice(value_slice);
}
fn read_u32_from_bits<O: BitStore>(bits: &BitSlice<O>) -> u32 {
assert!(bits.len() <= 32, "cannot read more than 32 bits into a u32");
let mut buf = [0u8; 4];
let dest = &mut BitSlice::from_slice_mut(&mut buf)[..bits.len()];
dest.copy_from_slice(bits);
u32::from_be_bytes(buf)
}
}
```
The problem is that `BitSlice::copy_from_slice` requires that both slices have
the exact same `BitStore` type. Even though bits-io _always_ uses `u8`, bitvec
defines 'safe aliases' for each storage type that can be used when two mutable
slices overlap the same bytes in their back storage. In `u8`'s case that alias
is `BitSafeU8`. So that means if one of `src` or `dest` is `u8` and the other
is `BitSafeU8` then we can't copy directly between them. In each of the
read/write scenarios, we have flexibility over one of the slices, but not the
other:
When writing, we control the source's bitslice: we're converting a u32 into
bytes and then can always create a `BitSlice<u8>` from that, but we don't
control the slice we're writing _to_: that comes from the caller and could be
either `BitSlice<u8>` or `BitSlice<BitSafeU8`>.
When reading, we control the destination's bitslice: we create a buffer to read
the value into and can always create a `BitSlice<u8>` from that, but we don't
control the source we're reading _from_: that comes from the caller and could
be either `BitSlice<u8>` or `BitSlice<BitSafeU8>`.
If, in both cases, the source and destination are `BitSlice<u8>`, then we can
use `BitSlice::copy_from_slice`. But, if they differ, we need some special
logic. This means we need to know one case from the other, so we define some
traits to enable this:
For writing:
```rust
pub trait CopyFromBitSlice {
fn copy_from_slice(dest: &mut BitSlice<Self>, src: &mut BitSlice<u8>)
where
Self: BitStore + Sized;
}
```
For reading:
```rust
pub trait CopyToBitSlice {
fn copy_to_slice(src: &BitSlice<Self>, dest: &mut BitSlice<u8>)
where
Self: BitStore + Sized;
}
```
Then we can have specific implementations for the `BitSafeU8` and `u8` cases.
When writing from `BitSlice<u8>` to `BitSlice<BitSafeU8>`, we can use
`BitSlice::split_at_unchecked_mut` to transform the source from a
`BitSlice<u8>` into a `BitSlice<BitSafeU8>`. When reading from a
`BitSlice<BitSafeU8>` to a `BitSlice<u8>`, we can use `split_at_unchecked_mut`
again to transform the _dest_ from a `BitSlice<u8>` to a `BitSlice<BitSafeU8>`.
This subtle difference in the source vs dest transformation is why we need two
traits.
Another required piece for this to work is to seal our wrapper of the
`BitStore` trait so that the compiler knows there are only two possible
implementations (`u8` and `BitSafeU8`), that way we can take a generic bounded
by the `BitStore` trait (`<O: BitStore>`) and call `O::copy_to_slice` and
`O::copy_from_slice` after providing implementations of `CopyFromBitSlice` and
`CopyToBitSlice` for both `u8` and `BitSafeU8` because that then covers all
possible values of `O`.
operations don't work well when the slice isn't aligned with the underlying
storage. I _think_ this might be a bug and filed [this
issue](https://github.com/ferrilab/bitvec/issues/294). For now will deal with
it as I don't think it'll be a common case for stuff I'm working on, but would
be nice if it worked as expected (though for all I know it could be by-design).
I ran into another case where I was going to need to handle this again and I
finally stumbled across `clone_from_bitslice` which works when BitStore's
differ: when they're the same it uses `copy_from_bitslice` and when it can't it
goes through and copies on its on (what i was doing manually).
Yet another case this came up was in the BitWrite impl for &mut BitSlice. I
noticed that the std::io::Write impl for &mut [T] had the reference update
itself to reflect the written bytes, but my impl for BitWrite didn't. I
looked at the std::io::Write impl and it leveraged split_at_mut so I tried to
do the same, but then I ran into the aliasing problem again:
when the type was &mut BitSlice<u8>, calling split_a_mut gave me a &mut
BitSlice<BitSafeU8>, so I couldn't re-assign that to self. bitvec has quite a
few ways to unalias a type's storage when you know it's safe (there's
BitSlice::unalias_mut and BitSlice::split_at_unchecked_mut_noalias which really
would've been ideal) but none are public. I also tried a more manual casting
approach:
```rust
/// SAFETY:
/// - `bits` must have originated from a `BitSlice<u8, Msb0>`
/// - `bits` must be disjoint from any other active reference
// Note: this almost worked, but BitSpan isn't public at all.
pub unsafe fn strip_aliasing_u8<'a>(bits: &mut BitSlice<BitSafeU8>) -> &'a mut BitSlice<u8> {
let bitptr = bits.as_mut_bitptr(); // BitPtr<Mut, BitSafeU8, Msb0>
let len = bits.len();
// Get the underlying storage pointer and bit offset
let (raw, bit_offset) = bitptr.raw_parts();
// Cast the raw element pointer from BitSafeU8 to u8
let unaliased_raw = raw.cast::<u8>();
// SAFETY: caller guarantees this was originally BitSlice<u8, Msb0>
let ptr_u8 = BitPtr::<_, u8, Msb0>::new_unchecked(unaliased_raw, bit_offset);
// SAFETY: BitSpan::new_unchecked is equivalent to reconstructing a bitslice view
let span = BitSpan::new_unchecked(ptr_u8, len);
BitSlice::from_bitptr_mut(ptr_u8, len)
}
```
but `BitSpan` isn't public at all. Finally I just tried doing the impl without
`split_at_mut` and just writing to self directly and that seems to work, so
will go with that for now.
## Byte slice operations
There are a couple byte slice operations that it would be nice to enable, but
they'd only make sense if the current position is on a byte boundary. Things
like chunk_bytes, chunk_mut_bytes, or even ways to hander out 'reader' or
'writer' instances like Buf/BufMut do. I'm thinking we could add a
'byte_aligned' method that would allow the caller to confirm they're on a
byte-aligned boundary. Then the methods that had things out can assert that
it's the case before returning the value.
Note that I think for at least some operations we'd need to require both the
start _and_ end are byte-aligned. Maybe if there's something that we knew
wasn't going all the way to the end it'd be ok if the end wasn't...but might be
best to ensure it anyway.
## Tracking capacity in BitsMut
At first I thought we'd need to track this separately, but now I think I'm
realizing that we don't: it looks like it's only relevant from an allocation
perspective and since we're deferring all of that to the underlying BytesMut
structure, I think we can always leverage self.inner.capacity() for capacity.
--> Turns out this is wrong: we need to manage a bits-level capacity because
the bits-level 'split' operations mean that the capacity limit can occur at a
non-byte boundary.
## Making Buf/BufMut supertraits
I wanted to re-orient the types here a bit, especially the `bytes` equivalents,
to be more about "bytes extensions" than "bits types" since the goal is really
to have them do what bytes does _plus_ some bits-related stuff. One idea there
was to make the BitBuf/BitBufMut traits have Buf/BufMut as super-traits. The
problem there is that then I'd have to be able to implement Buf/BufMut for
types like BitSlice, which are foreign, so that doesn't work.
One option, if the goal is to allow this to be passed somewhere that expect
`<T: Buf>`, might be to add a method which hands out an `impl Buf` via some
'buf_adaptor' method/type.
## impl BitBuf for &BitSlice<BitSafeU8>
The `BitBuf` trait currently defines methods:
```rust
/// Returns a [`BitSlice`] starting at the current position and of length between 0 and
/// `BitBuf::remaining`. Note that this _can_ return a shorter slice.
fn chunk(&self) -> &BitSlice;
/// Returns a slice of bytes starting at the current position and of length between 0 and
/// `BitBuf::remaining_bytes`. Note that this _can_ return a shorter slice.
fn chunk_bytes(&self) -> &[u8];
```
And we have `impl BitBuf for &BitSlice`, but this means there's no impl of
`BitBuf` for `&BitSlice<BitSafeU8>`. In order to do this we'd need to make the
`BitBuf` trait generic around the `BitStore` type, such that we'd end up with
something like:
```rust
pub trait BitBuf<S: BitStore> {
...
fn chunk(&self) -> &BitSlice<S>;
fn chunk_bytes(&self) -> &[O::Unalias];
}
```
but this would ruin the ability to use `BitBuf` as a generic bound, so I'm not
sure we want that. I haven't run into a case where I needed this yet, but it
feels like a bit of a timebomb to have it lurking in there. Maybe can
eventually find some way to work around it.