Struct Buffer

pub struct Buffer<'a> { /* private fields */ }

AV1 bitstream reader.

This type implements the primitive bit-level operations used by the AV1 syntax functions in spec Section 4.10:

f(n): fixed-width unsigned bits
uvlc(): unsigned Exp-Golomb-like code used by AV1
le(n): little-endian byte-aligned integer
leb128(): little-endian base-128 variable-length integer
su(n): fixed-width signed integer
ns(n): non-symmetric unsigned integer in the range [0, n-1]

The reader is intentionally simple: it borrows an input slice, maintains a byte cursor plus a bit offset inside the current byte, and exposes methods that match the syntax names from the specification as closely as possible.

Bit ordering:

AV1 reads bits MSB-first inside each byte. If the current byte is 0b1011_0010, the read order is 1, 0, 1, 1, 0, 0, 1, 0.
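
The read order can be checked with a small sketch. The helper name `bits_msb_first` is hypothetical and is not part of this type's API; it only lists the bits of one byte in the order a reader would return them.

```rust
/// Hypothetical helper (not part of `Buffer`): list the bits of one byte
/// in AV1 read order, most-significant bit first.
fn bits_msb_first(byte: u8) -> [u8; 8] {
    let mut out = [0u8; 8];
    for (i, slot) in out.iter_mut().enumerate() {
        // The i-th bit in read order is bit (7 - i) of the byte.
        *slot = (byte >> (7 - i)) & 1;
    }
    out
}
```

For the byte 0b1011_0010 this yields the sequence given above.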

References:

AV1 specification, Section 4.10 “Descriptors”
AV1 specification, Section 5 “Syntax structures”
LEB128 background: DWARF Appendix C and https://en.wikipedia.org/wiki/LEB128

Implementations

impl<'a> Buffer<'a>

pub fn from_slice(buf: &'a [u8]) -> Self

Construct a reader over a borrowed byte slice.

The initial cursor points at the first bit of the first byte: index = 0, bit_pos = 0.

pub fn seek_bits(&mut self, cut: usize)

Skip cut bits without returning a value.

This is conceptually identical to calling get_bit cut times and discarding the results, but it avoids producing a throwaway bool per bit and keeps the intent explicit when the syntax says to “ignore” or “skip” reserved bits.
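
A minimal sketch of the cursor arithmetic a skip implies, assuming the cursor is a byte index plus a bit offset as described for this type. The `Cursor` struct and `skip_bits` function are hypothetical stand-ins, and whether the real method checks bounds is not shown here.

```rust
/// Hypothetical cursor: a byte index plus a bit offset (0..=7) inside that byte.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Cursor {
    index: usize,
    bit_pos: usize,
}

/// Advance the cursor by `n` bits without looking at the underlying data.
fn skip_bits(cur: Cursor, n: usize) -> Cursor {
    // Work in absolute bits, then split back into (byte, bit-in-byte).
    let total = cur.index * 8 + cur.bit_pos + n;
    Cursor { index: total / 8, bit_pos: total % 8 }
}
```

Skipping 7 bits from index = 0, bit_pos = 3 lands on index = 1, bit_pos = 2, the same position that ten single-bit reads from the start would reach.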

pub fn get_bytes(&mut self, count: usize) -> &[u8]

Read count bytes as a slice. Requires byte alignment.

This method does not copy data. It advances the byte cursor and returns a borrowed subslice into the original buffer.

Byte alignment is required because AV1 syntax only permits raw byte reads at whole-byte boundaries. If bit_pos != 0, the caller would be asking for a slice that starts in the middle of a byte, which cannot be represented as &[u8] without additional packing logic.
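
The alignment requirement and the zero-copy behaviour can be sketched as follows. `take_bytes` is a hypothetical free function, and returning `Option` is an assumption made for the sketch; the real method returns `&[u8]` directly, so how it reports a misaligned or out-of-range read is not shown here.

```rust
/// Hypothetical stand-in for a byte-aligned slice read.
/// Returns None when the cursor is mid-byte or fewer than `count` bytes remain.
fn take_bytes<'a>(
    data: &'a [u8],
    index: &mut usize,
    bit_pos: usize,
    count: usize,
) -> Option<&'a [u8]> {
    if bit_pos != 0 {
        // A slice cannot start in the middle of a byte.
        return None;
    }
    let end = (*index).checked_add(count)?;
    // Borrow directly from the input; no bytes are copied.
    let out = data.get(*index..end)?;
    *index = end;
    Some(out)
}
```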

pub fn get_bit(&mut self) -> bool

Read one bit and return it as a boolean.

Internally this extracts bit (7 - bit_pos) from the current byte, then advances the cursor by one bit.
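
A minimal sketch of that step, assuming fields named `index` and `bit_pos` as in the description of from_slice. The `BitCursor` type is a hypothetical stand-in written for illustration, not the crate's source, and it simply panics when reading past the end of the input.

```rust
/// Hypothetical minimal reader with a byte index and a bit offset.
struct BitCursor<'a> {
    data: &'a [u8],
    index: usize,
    bit_pos: usize,
}

impl<'a> BitCursor<'a> {
    fn new(data: &'a [u8]) -> Self {
        BitCursor { data, index: 0, bit_pos: 0 }
    }

    /// Read one bit MSB-first. Panics if the cursor is past the end of `data`.
    fn get_bit(&mut self) -> bool {
        // Extract bit (7 - bit_pos) of the current byte.
        let bit = (self.data[self.index] >> (7 - self.bit_pos)) & 1;
        // Advance by one bit, rolling over to the next byte after bit 0.
        self.bit_pos += 1;
        if self.bit_pos == 8 {
            self.bit_pos = 0;
            self.index += 1;
        }
        bit == 1
    }
}
```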

pub fn get_bits(&mut self, count: usize) -> u32

f(n): read count bits MSB-first as an unsigned integer.

AV1 spec Section 4.10.2 - f(n).

Algorithm:

Read one bit at a time in stream order.
Shift each bit into its numeric position in the result.
The first bit read becomes the highest-order bit of the returned value, and the last bit read becomes the lowest-order bit.

For example, if the next four bits are 1 0 1 1, the result is:

1<<3 | 0<<2 | 1<<1 | 1<<0 = 0b1011 = 11

Cross-byte example:

Suppose the unread stream is:

byte 0 = 1010_1011
byte 1 = 1100_1101

Calling get_bits(12) reads:

first 8 bits from byte 0: 1010_1011
next 4 bits from byte 1: 1100

Concatenating them in read order yields:

1010_1011_1100 = 0xABC

This is why the implementation ORs bit i (the i-th bit read, counting from 0) into bit position (count - i - 1) of the result: it reconstructs the integer exactly as the bitstring appears in the specification.
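
The cross-byte example can be reproduced with a short sketch. `read_bits` is a hypothetical free function that tracks an absolute bit offset instead of the reader's internal cursor; it assumes count is at most 32 and panics on out-of-range reads.

```rust
/// Hypothetical free-function form of f(n). `pos` is an absolute bit offset
/// into `data`. Panics if the read runs past the end of `data`.
fn read_bits(data: &[u8], pos: &mut usize, count: u32) -> u32 {
    assert!(count <= 32, "result must fit in a u32");
    let mut value = 0u32;
    for i in 0..count {
        let byte = data[*pos / 8];
        let bit = (byte >> (7 - (*pos % 8))) & 1;
        // The i-th bit read lands at bit position (count - i - 1).
        value |= u32::from(bit) << (count - i - 1);
        *pos += 1;
    }
    value
}
```

Reading 12 bits from the bytes 1010_1011, 1100_1101 gives 0xABC and leaves the offset four bits into the second byte.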

pub fn get_uvlc(&mut self) -> u32

uvlc(): variable-length unsigned integer.

AV1 spec Section 4.10.3 - uvlc().

AV1 uvlc() uses a prefix code closely related to Exp-Golomb coding:

count the number of leading zero bits, lz
consume the terminating 1
read lz payload bits
return payload + 2^lz - 1

Example:

Bit pattern 1 -> lz=0, payload bits="", value=0
Bit pattern 010 -> lz=1, payload bits=0, value=1
Bit pattern 011 -> lz=1, payload bits=1, value=2
Bit pattern 00110 -> lz=2, payload bits=10, value=5

Worked example for 00110:

leading zeros: 00 -> lz = 2
stop bit: 1
payload: 10 -> decimal 2
value: 2 + 2^2 - 1 = 5

The 2^lz - 1 offset makes codes of different prefix lengths map to contiguous integer ranges.

Per the spec, if lz >= 32, the decoder returns 0xFFFF_FFFF.

Related background: this is closely related to unsigned Exp-Golomb coding, but AV1 defines the exact mapping normatively in spec Section 4.10.3.
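
The decoding steps can be sketched as below. `read_bit` and `read_uvlc` are hypothetical free functions over an absolute bit offset, written for illustration only; the sketch panics if the input ends before the terminating 1 bit.

```rust
/// Hypothetical single-bit read; `pos` is an absolute bit offset into `data`.
fn read_bit(data: &[u8], pos: &mut usize) -> u32 {
    let bit = (data[*pos / 8] >> (7 - (*pos % 8))) & 1;
    *pos += 1;
    u32::from(bit)
}

/// Sketch of uvlc() decoding following the steps listed above.
fn read_uvlc(data: &[u8], pos: &mut usize) -> u32 {
    let mut lz = 0u32;
    // Count leading zeros; the loop also consumes the terminating 1.
    while read_bit(data, pos) == 0 {
        lz += 1;
    }
    if lz >= 32 {
        return u32::MAX; // 0xFFFF_FFFF
    }
    // Read lz payload bits, MSB first.
    let mut payload = 0u32;
    for _ in 0..lz {
        payload = (payload << 1) | read_bit(data, pos);
    }
    payload + (1u32 << lz) - 1
}
```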

pub fn get_le(&mut self, count: usize) -> u32

le(n): unsigned little-endian count-byte integer.

AV1 spec Section 4.10.4 - le(n).

Requires byte alignment because the syntax is defined over complete bytes, not arbitrary bit positions.

The implementation reads bytes in stream order and places byte i into bit range [8*i, 8*i+7] of the result:

value = b0 + (b1 << 8) + (b2 << 16) + ...

So bytes [0x34, 0x12] decode to 0x1234.

Worked example:

first byte read: 0x78
second byte read: 0x56
third byte read: 0x34
fourth byte read: 0x12

Then:

0x78 + (0x56 << 8) + (0x34 << 16) + (0x12 << 24) = 0x12345678
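
The same arithmetic as a sketch. `read_le` is a hypothetical helper that operates on a slice already positioned at a byte boundary; limiting n to 4 is an assumption made so that the result fits in a u32.

```rust
/// Hypothetical le(n) over a byte-aligned slice. `n` must be at most 4 and
/// at most `bytes.len()`; otherwise the function panics.
fn read_le(bytes: &[u8], n: usize) -> u32 {
    assert!(n <= 4, "result must fit in a u32");
    let mut value = 0u32;
    for (i, &b) in bytes[..n].iter().enumerate() {
        // Byte i occupies bits [8*i, 8*i + 7] of the result.
        value |= u32::from(b) << (8 * i);
    }
    value
}
```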

pub fn get_leb128(&mut self) -> u64

leb128(): variable-length LEB128 unsigned integer. Requires byte alignment.

AV1 spec Section 4.10.5 - leb128().

LEB128 stores an integer in 7-bit groups:

bit 7 of each byte is the continuation flag
bits 0 through 6 carry the payload
the first byte contains the least-significant 7 payload bits

Numerically this means:

value = group0 << 0 | group1 << 7 | group2 << 14 | ...

Example:

[0x05] -> 5
[0x80, 0x01] -> 128
[0xAC, 0x02] -> 300

Worked example for [0xAC, 0x02]:

0xAC = 1010_1100
- continuation = 1
- payload = 0x2C = 44
0x02 = 0000_0010
- continuation = 0
- payload = 0x02 = 2

Reassemble in little-endian 7-bit groups:

44 << 0 | 2 << 7 = 44 + 256 = 300

The implementation stops when it encounters a byte whose continuation flag is 0, or after 8 bytes, matching the AV1 spec limit.
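
A minimal sketch of this loop. `read_leb128` is a hypothetical helper over a byte-aligned slice that returns the value together with the number of bytes used; as a simplification it also stops quietly if the slice ends early, which may differ from the real method.

```rust
/// Hypothetical leb128() decoder. Returns (value, bytes consumed).
/// Reads at most 8 bytes, and stops early if the slice runs out.
fn read_leb128(bytes: &[u8]) -> (u64, usize) {
    let mut value = 0u64;
    let mut used = 0usize;
    for (i, &b) in bytes.iter().take(8).enumerate() {
        // Low 7 bits are payload; group i is shifted left by 7 * i.
        value |= u64::from(b & 0x7F) << (7 * i);
        used = i + 1;
        // A clear top bit marks the last byte.
        if (b & 0x80) == 0 {
            break;
        }
    }
    (value, used)
}
```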

pub fn get_su(&mut self, count: usize) -> i32

su(n): n-bit signed integer.

AV1 spec Section 4.10.6 - su(n).

AV1 defines su(n) as a fixed-width signed integer encoded in two’s complement over exactly n bits.

Decoding strategy:

Read the n bits as an unsigned integer.
Inspect the top bit (1 << (n - 1)), which is the sign bit.
If the sign bit is clear, the value is already non-negative.
If the sign bit is set, subtract 2^n to sign-extend into i32.

Example for n = 4:

0011 -> 3
1100 -> 12 - 16 = -4

Another way to see the negative case:

n = 4 means the representable range is [-8, 7]
raw unsigned 1100 is 12
because the sign bit is set, interpret it modulo 2^4 = 16
12 - 16 = -4
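
The sign-extension step can be sketched in isolation. `sign_extend` is a hypothetical helper that takes the raw n-bit value already read as an unsigned integer; it assumes 1 <= n <= 32 and that raw has no bits set above bit n - 1.

```rust
/// Hypothetical helper: interpret the low `n` bits of `raw` as a
/// two's-complement signed integer. Assumes 1 <= n <= 32 and raw < 2^n.
fn sign_extend(raw: u32, n: u32) -> i32 {
    assert!((1..=32).contains(&n));
    // Use i64 so that 2^n is representable even when n == 32.
    let value = i64::from(raw);
    let sign_bit = 1i64 << (n - 1);
    if (value & sign_bit) != 0 {
        // Sign bit set: subtract 2^n.
        (value - (1i64 << n)) as i32
    } else {
        value as i32
    }
}
```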

pub fn get_ns(&mut self, n: u32) -> u32

ns(n): non-symmetric unsigned coded integer in the range [0, n-1].

AV1 spec Section 4.10.7 - ns(n).

Motivation:

When n is not a power of two, a fixed-width code wastes states. For example, values in [0, 4] need 5 states, but 3 bits represent 8 states. AV1’s ns(n) removes that waste by using:

a short code for the first m values
a long code for the remaining n - m values

where:

w = floor(log2(n)) + 1 (equal to ceil(log2(n)) whenever n is not a power of two)
m = 2^w - n

Decoding algorithm:

Read w - 1 bits to get v.
If v < m, return v.
Otherwise read one extra bit b and return (v << 1) - m + b.

This partitions the code space so exactly n output values are generated, while keeping the code as close as possible to fixed-width.

Example for n = 5:

w = 3, m = 8 - 5 = 3
values 0,1,2 use 2 bits: 00, 01, 10
values 3,4 use 3 bits: 110, 111

Worked decode examples for n = 5:

input 01
- read w - 1 = 2 bits -> v = 1
- v < m (1 < 3) -> return 1
input 110
- read first 2 bits -> v = 3
- v >= m (3 >= 3) -> read one extra bit 0
- return (3 << 1) - 3 + 0 = 3
input 111
- read first 2 bits -> v = 3
- extra bit = 1
- return (3 << 1) - 3 + 1 = 4
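
The worked examples can be reproduced with the sketch below. `read_bits` and `read_ns` are hypothetical free functions over an absolute bit offset. The sketch computes w as the bit length of n, which is how the specification's FloorLog2(n) + 1 evaluates and which matches ceil(log2(n)) whenever n is not a power of two.

```rust
/// Hypothetical free-function form of f(n); `pos` is an absolute bit offset.
/// Panics if the read runs past the end of `data`.
fn read_bits(data: &[u8], pos: &mut usize, count: u32) -> u32 {
    let mut value = 0u32;
    for _ in 0..count {
        let bit = (data[*pos / 8] >> (7 - (*pos % 8))) & 1;
        value = (value << 1) | u32::from(bit);
        *pos += 1;
    }
    value
}

/// Sketch of ns(n) decoding; returns a value in [0, n - 1]. Requires n > 0.
fn read_ns(data: &[u8], pos: &mut usize, n: u32) -> u32 {
    assert!(n > 0);
    // Bit length of n, i.e. floor(log2(n)) + 1.
    let w = 32 - n.leading_zeros();
    // Number of values that get the short (w - 1)-bit code.
    let m = (1u64 << w) - u64::from(n);
    let v = u64::from(read_bits(data, pos, w - 1));
    if v < m {
        return v as u32;
    }
    // Long code: one extra bit selects between two adjacent values.
    let extra = u64::from(read_bits(data, pos, 1));
    ((v << 1) - m + extra) as u32
}
```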

Reference: the AV1 spec defines this directly in Section 4.10.7. The same idea is known as truncated binary coding in information theory; see https://en.wikipedia.org/wiki/Truncated_binary_encoding.

pub fn is_byte_aligned(&self) -> bool

Returns true if the cursor is at a byte boundary.

This simply means no partial bits of the current byte have been consumed, i.e. bit_pos == 0.

pub fn byte_align(&mut self)

Advance to the next byte boundary, discarding any remaining bits in the current byte (trailing_bits padding).

This is commonly used after parsing AV1 payloads that end in trailing_bits(): a single 1 bit followed by enough 0 bits to complete the byte.

Example:

If 3 bits of the current byte have already been consumed, then bit_pos = 3 and byte_align() skips 8 - 3 = 5 bits so that the next read starts at the next byte.
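
A minimal sketch of the alignment step in terms of a (byte index, bit offset) pair. `align_cursor` is a hypothetical helper, and treating an already aligned cursor as a no-op is an assumption; the behaviour of the real method in that case is not stated here.

```rust
/// Hypothetical helper: move an (index, bit_pos) cursor to the next byte
/// boundary. Returns the new cursor and the number of bits skipped.
/// Assumes an already aligned cursor is left unchanged.
fn align_cursor(index: usize, bit_pos: usize) -> ((usize, usize), usize) {
    if bit_pos == 0 {
        ((index, 0), 0)
    } else {
        // Discard the remaining 8 - bit_pos bits of the current byte.
        ((index + 1, 0), 8 - bit_pos)
    }
}
```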

pub fn bytes_remaining(&self) -> usize

Returns the number of bytes remaining from the current byte index.

This is intentionally byte-granular. If the cursor is mid-byte, the partially consumed current byte still counts as remaining because future bit reads can continue from it.

pub fn bytes_consumed(&self) -> usize

Returns the number of bytes consumed so far, rounded up.

Rounding up is useful when enforcing AV1 OBU boundaries, because having consumed even one bit from a byte means that byte is no longer available to subsequent syntax elements.
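
The two byte counts can be sketched from a (byte index, bit offset) cursor. Both functions below are hypothetical free-function versions written from the descriptions of bytes_remaining and bytes_consumed, not the crate's source.

```rust
/// Hypothetical: bytes left from the current byte index, counting a
/// partially consumed current byte as remaining.
fn bytes_remaining(len: usize, index: usize) -> usize {
    len - index
}

/// Hypothetical: bytes consumed so far, rounded up so that a partially
/// consumed byte counts as consumed.
fn bytes_consumed(index: usize, bit_pos: usize) -> usize {
    index + usize::from(bit_pos != 0)
}
```

With a 4-byte input and the cursor at index = 1, bit_pos = 3, these give 3 remaining and 2 consumed. The partially read byte is counted by both, so under these definitions the two values sum to the buffer length only when the cursor is byte-aligned.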