bitcraft 0.9.2

A zero-cost, hardware-aligned bitfield and enumeration generator.
# 🛠️ `bitcraft` Implementation Deep Dive

This document details the engineering principles, memory layouts, and internal mechanisms that allow `bitcraft` to achieve zero-cost bitfield manipulation with strict type safety.

---

## 🚀 1. The Efficiency Gap

In high-performance domains (vector engines, network stacks, or high-frequency trading), standard Rust structs introduce **implicit padding** to satisfy memory alignment. A 1-bit boolean might consume 8 bits, and a 24-bit ID might occupy 32 bits. At billion-record scale, this waste thrashes CPU caches and increases memory pressure.

`bitcraft` solves this by giving you:

- **Absolute Bit Control**: Define exactly which bits map to which logical fields.
- **Unique `bytestruct!` Support**: Native support for **flexible 1-16 byte arrays** (`[u8; N]`) that are treated as primitive-like registers. Most libraries restrict you to the standard `u8` through `u128` types.
- **Unique `byteval!` IDs**: Instant "Packed IDs" for 24-bit, 40-bit, or 56-bit values that behave like first-class integers, solving the "Odd-Width Integer" problem in one line.
- **Zero-Cost Abstractions**: Generated code compiles down to the exact bitwise shifts and masks you would write by hand, as verified by LLVM IR inspection.
- **Hardware Alignment**: LSB-first mapping ensures your software layout matches the physical little-endian storage in modern hardware.
- **Boilerplate-Free Ergonomics**: Automatic `Default` (zero-init) and a fluid `with_*` builder pattern come standard.

---

## 📐 2. Memory Layout: `std` vs. `bitcraft`

Standard Rust structs are optimized for **Alignment Padding** to satisfy CPU word boundaries. `bitcraft` is optimized for **Mechanical Sympathy**—aligning data to the exact bit.

### Standard Rust (`struct`)

Rust inserts "dead space" (padding) to ensure fields align with power-of-two boundaries.

```rust
struct Standard {
    active: bool, // 1 byte + 3 bytes padding
    id: u32,      // 4 bytes
} // Total: 8 bytes
```

### `bitstruct!` Layout (LSB-First)

`bitcraft` uses a **Least Significant Bit (LSB) first** mapping. The first field occupies the lowest bits of the underlying primitive.

```rust
bitstruct! {
    struct Packed(u32) {
        active: bool = 1, // Bit 0
        id: u32 = 31,     // Bits 1-31
    }
} // Total: 4 bytes (100% density)
```

**Binary Representation (u32):**

```text
MSB                                          LSB
 [ ID (31 bits)                           ] [A]
 Bit 31 ............................ Bit 1   Bit 0
```
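
As a rough illustration of the layout above, the following is a hand-written sketch of what an LSB-first `Packed(u32)` amounts to. It does not use `bitcraft` itself; `PackedByHand`, `ID_MASK`, and the method names are hypothetical stand-ins, not the macro's actual output.

```rust
use std::mem::size_of;

/// Hand-written stand-in (hypothetical) for the `Packed(u32)` layout:
/// bit 0 holds `active`, bits 1..=31 hold `id`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct PackedByHand(pub u32);

impl PackedByHand {
    /// Mask covering the 31 bits available to `id`.
    const ID_MASK: u32 = (1 << 31) - 1;

    pub const fn new(active: bool, id: u32) -> Self {
        // First field in the lowest bit, second field shifted above it.
        Self((active as u32) | ((id & Self::ID_MASK) << 1))
    }

    pub const fn active(self) -> bool {
        self.0 & 1 != 0
    }

    pub const fn id(self) -> u32 {
        self.0 >> 1
    }
}

/// Unpacked counterpart, kept only for the size comparison.
pub struct Standard {
    pub active: bool,
    pub id: u32,
}

/// Returns (unpacked size, packed size) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Standard>(), size_of::<PackedByHand>())
}
```

On common targets `sizes()` is expected to report a larger footprint for `Standard` than the 4 bytes of `PackedByHand`.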

### Big-Endian vs. Little-Endian Naturalism

While generic bitfield libraries often struggle with endianness, `bitcraft` embraces the **Natural Hardware Layout** of modern CPUs (x86_64, ARM64). By using **LSB-first** (Least Significant Bit) mapping, the logical "Bit 0" in your code matches the physical "Bit 0" in a little-endian memory dump. This makes side-by-side debugging with a hex editor intuitive and predictable.
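
To make the hex-dump claim concrete, here is a small std-only sketch. The helper names `le_image` and `bit_from_image` are made up for this illustration and are not part of `bitcraft`.

```rust
/// Byte image of a packed `u32` as it would appear in a little-endian memory dump.
pub fn le_image(raw: u32) -> [u8; 4] {
    raw.to_le_bytes()
}

/// Reads logical bit `n` (0 = least significant) directly from the byte image.
/// With LSB-first mapping, bit `n` lives in byte `n / 8` at position `n % 8`.
pub fn bit_from_image(image: &[u8; 4], n: u32) -> bool {
    let byte = image[(n / 8) as usize];
    (byte >> (n % 8)) & 1 == 1
}
```

For the raw value `0b1011`, the dump starts with byte `0x0B`, so logical bit 0 is the lowest bit of the first byte in the dump.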

---

## 🏎️ 3. Zero-Copy FFI & `bytemuck` Integration

One of the most powerful features of `bitcraft` is its ability to interpret raw memory buffers without copying. Every generated struct is decorated with `#[repr(transparent)]` and implements the `bytemuck::Pod` and `Zeroable` traits.

### The "No-Alloc" Pattern

In network programming or high-speed data processing, you can "overlay" a `bitcraft` struct onto a raw byte slice from a socket or disk:

```rust
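// Pseudo-code: `socket`, `PacketHeader`, and `process` are placeholders.
let buffer: &[u8] = socket.recv();
// Zero-cost cast: no memory is moved, no allocation occurs.
// `from_bytes` panics if the slice has the wrong length or alignment;
// `bytemuck::try_from_bytes` returns a `Result` instead.
let header: &PacketHeader = bytemuck::from_bytes(&buffer[0..4]);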

if header.is_valid() {
    process(header.payload_id());
}
```

This effectively turns your structs into **Schema-on-Read** overlays, making them ideal for high-performance firmware and protocol parsers.
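
When the buffer's length or alignment cannot be trusted, a copy-based decode is a reasonable fallback. The sketch below uses only the standard library and is not the zero-copy overlay described above. `PacketHeader` here is a hypothetical hand-written type, and its field positions (`is_valid` in bit 0, `payload_id` in bits 1-31) are assumptions made for the example.

```rust
use std::convert::TryInto;

/// Hypothetical 4-byte header; the bit positions are assumed for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PacketHeader(u32);

impl PacketHeader {
    /// Decodes the first four bytes of `buffer` as a little-endian word.
    /// Returns `None` when the buffer is too short; alignment does not matter
    /// because the bytes are copied into a register-sized value.
    pub fn parse(buffer: &[u8]) -> Option<Self> {
        let bytes: [u8; 4] = buffer.get(0..4)?.try_into().ok()?;
        Some(Self(u32::from_le_bytes(bytes)))
    }

    pub const fn is_valid(self) -> bool {
        self.0 & 1 != 0
    }

    pub const fn payload_id(self) -> u32 {
        self.0 >> 1
    }
}
```

The four-byte copy is usually lowered to a single load, so the cost relative to the overlay is small in practice.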

---

## ⚙️ 4. The "Literal Guard" Pattern

A major performance bottleneck in bitfield libraries is using dynamic loops or `memcpy` to handle fields that span multiple bytes. `bitcraft` avoids this using **Const-Generic Helper Functions** (`read_le_bits` and `write_le_bits`) that implement the **Literal Guard** pattern.

### The "Acting Primitive" Pattern

When you define a field in `bytestruct!`, the macro doesn't generate a raw loop. Instead, it routes the operation to a specialized `const fn` that uses the widest available registers.

**1. Byte-Aligned Fast Paths**:
If a field is byte-aligned (e.g., a `u16` starting at byte 2), the engine bypasses bit-shifting entirely and uses direct hardware instructions (like `MOV` or `LDR`) via `u16::from_le_bytes`.

**2. Bit-Level Literal Guards**:
For unaligned fields, the helper functions use a sequence of constant-folded index checks (e.g., `if len <= 8 { ... }`). Because the field widths and positions are passed as **Const Generics**, the compiler (rustc with LLVM) can resolve every check at compile time and discard the untaken branches, leaving a flat, branchless sequence of bitwise shifts and ORs, the same code you would write by hand.
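
The idea can be sketched with a simplified, hypothetical helper. `read_bits` below is not `bitcraft`'s actual `read_le_bits`; it assumes an 8-byte backing array and only shows how guards on const-generic parameters select one code path per instantiation.

```rust
/// Hypothetical stand-in for a const-generic read helper.
/// `START` and `LEN` are compile-time constants, so each `if` below has a
/// known outcome for every instantiation and the optimizer can drop the
/// arms that are not taken.
pub fn read_bits<const START: u32, const LEN: u32>(bytes: [u8; 8]) -> u64 {
    assert!(LEN >= 1 && START + LEN <= 64, "field must fit in 64 bits");
    let word = u64::from_le_bytes(bytes);
    if START % 8 == 0 && LEN == 8 {
        // Byte-aligned fast path: a plain byte read, no shifting.
        bytes[(START / 8) as usize] as u64
    } else if LEN == 64 {
        // Full-width field: `1 << 64` would overflow, so skip the mask.
        word
    } else {
        // General path: shift the field down, then mask off higher bits.
        (word >> START) & ((1u64 << LEN) - 1)
    }
}
```

A call such as `read_bits::<4, 12>(bytes)` should reduce to one shift and one mask in an optimized build, although the exact instructions depend on the target.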

---

## 🛡️ 5. Checked vs. Unchecked Mutators

`bitcraft` provides two distinct safety layers for modifying data, allowing developers to balance performance and validation.

### `set_*` (Optimistic/High-Performance)

Standard setters are designed for "Hot Paths" where you trust the data source (or have already validated it).

- **Behavior**: Uses `debug_assert!` to check for bit overflows.
- **Production**: In `--release` mode, the check is removed, and values are simply masked. This ensures **zero branch overhead** in tight loops.

### `try_set_*` (Validated/Schema-Entry)

Try-setters are designed for boundaries where data is untrusted (e.g., input from a REST API or a file).

- **Behavior**: Always returns a `Result<(), BitstructError>`.
- **Production**: Performs a runtime bounds check and returns an explicit `Overflow` error if the value exceeds the allocated bit-width, preventing data corruption.
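
The contrast between the two setter families can be sketched by hand for a single 3-bit field. The `Flags` type, the `FieldError` enum, and the field position are hypothetical; they mimic the behaviour described above rather than reproduce the generated code.

```rust
/// Hypothetical error type standing in for the library's overflow error.
#[derive(Debug, PartialEq, Eq)]
pub enum FieldError {
    Overflow,
}

/// Hypothetical 8-bit register with a 3-bit `mode` field in bits 1..=3.
#[derive(Clone, Copy, Default, Debug, PartialEq, Eq)]
pub struct Flags(pub u8);

impl Flags {
    const MODE_SHIFT: u8 = 1;
    const MODE_MASK: u8 = 0b111;

    /// Optimistic setter: checked only in debug builds, masked in release.
    pub fn set_mode(&mut self, value: u8) {
        debug_assert!(value <= Self::MODE_MASK, "mode overflows 3 bits");
        let cleared = self.0 & !(Self::MODE_MASK << Self::MODE_SHIFT);
        self.0 = cleared | ((value & Self::MODE_MASK) << Self::MODE_SHIFT);
    }

    /// Validated setter: rejects out-of-range input and leaves `self` untouched.
    pub fn try_set_mode(&mut self, value: u8) -> Result<(), FieldError> {
        if value > Self::MODE_MASK {
            return Err(FieldError::Overflow);
        }
        self.set_mode(value);
        Ok(())
    }

    pub fn mode(&self) -> u8 {
        (self.0 >> Self::MODE_SHIFT) & Self::MODE_MASK
    }
}
```

In this sketch the validated path reuses the optimistic one after the bounds check, so the masking logic exists in one place only.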

---

## ⛓️ 6. The "Fluid Builder" Pattern

Generated structs include `with_*` and `try_with_*` methods that enable a clear, functional-style API while preserving register-level efficiency.

### Register Renaming Optimization

Because these methods operate on `self` by value on the stack, modern CPU **Register Renaming** units can often execute multiple "updates" in parallel within the same register pipeline.

```rust
// Functional style is as fast as manual mutation
let config = Config::default()
    .with_enabled(true)
    .with_mode(7);
```
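
A by-value builder of this kind can be sketched by hand as follows. This `Config` is a hypothetical 8-bit type with `enabled` in bit 0 and a 3-bit `mode` in bits 1-3; the layout and the `bits` accessor are assumptions for the example.

```rust
/// Hypothetical hand-written config register (not generated by `bitcraft`).
#[derive(Clone, Copy, Default, Debug, PartialEq, Eq)]
pub struct Config(u8);

impl Config {
    const MODE_SHIFT: u8 = 1;
    const MODE_MASK: u8 = 0b111;

    /// Takes `self` by value and returns the updated copy.
    pub const fn with_enabled(self, on: bool) -> Self {
        Self((self.0 & !1) | on as u8)
    }

    /// Clears the old mode bits, then ORs in the new (masked) value.
    pub const fn with_mode(self, mode: u8) -> Self {
        let cleared = self.0 & !(Self::MODE_MASK << Self::MODE_SHIFT);
        Self(cleared | ((mode & Self::MODE_MASK) << Self::MODE_SHIFT))
    }

    pub const fn bits(self) -> u8 {
        self.0
    }
}
```

Because every step consumes and returns a plain `u8` wrapper, the whole chain can stay in a register and later calls simply overwrite the bits set by earlier ones.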

---

## 🛡️ 7. Strict Type Safety Enforcement

The library prevents signed integers from being used as storage or fields to avoid the "Two's Complement Ambiguity" during bit-packing.

### Internal Traits

We use **Marker Traits with Associated Constants** to trigger compile-time errors.

#### `IsUnsignedInt`: Restricting Base Storage

```rust
pub trait IsUnsignedInt { const ASSERT_UNSIGNED: () = (); }
impl IsUnsignedInt for u8 { const ASSERT_UNSIGNED: () = (); }
// ... implemented only for u8-u128
```

#### `ValidField`: Restricting Field Types

```rust
pub trait ValidField { const ASSERT_VALID: (); }
impl ValidField for bool { const ASSERT_VALID: () = (); }
impl ValidField for u8 { const ASSERT_VALID: () = (); }
// ... implemented for u8-u128 and bitenum! generated types
```

### Enforcement Mechanism

The macro inserts a check into a `const _: () = { ... };` block:

```rust
const _: () = {
    // This will FAIL TO COMPILE if $base_type is i32
    let _ = <$base_type as $crate::IsUnsignedInt>::ASSERT_UNSIGNED;

    // This will FAIL TO COMPILE if $field_type is i32
    let _ = <$field_type as $crate::ValidField>::ASSERT_VALID;
};
```

---

## 🔌 8. Acting Primitives (Register Routing)

`bytestruct!` supports arrays up to 16 bytes, but it doesn't always use 128-bit operations. It uses **Dynamic Register Routing** to choose the smallest register that can safely hold the field.

| Array Size | Acting Primitive | CPU Requirement |
| :--- | :--- | :--- |
| 1-4 Bytes | `u32` | 32-bit registers |
| 5-8 Bytes | `u64` | 64-bit registers |
| 9-16 Bytes | `u128` | Software/Hardware 128-bit |

### Implementation Logic (Pseudo-code)

```rust
macro_rules! route_fields {
    ($size <= 4) => { use u32_operations; }
    ($size <= 8) => { use u64_operations; }
    ($size <= 16) => { use u128_operations; }
}
```

This ensures that "Hot Path" metadata (usually < 8 bytes) never suffers the overhead of 128-bit register emulation on older hardware.
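
The routing table can also be written as an ordinary `const fn`, which is how the following sketch models it. `acting_bits` is a hypothetical helper, not a `bitcraft` API; the real selection happens inside the macro.

```rust
/// Width in bits of the acting primitive for a byte array of `array_len` bytes,
/// following the table above. Hypothetical helper for illustration only.
pub const fn acting_bits(array_len: usize) -> u32 {
    match array_len {
        1..=4 => 32,   // fits a `u32`
        5..=8 => 64,   // fits a `u64`
        9..=16 => 128, // needs a `u128`
        _ => panic!("bytestruct arrays are limited to 1-16 bytes"),
    }
}

/// Evaluated at compile time: a 3-byte array routes to 32-bit operations.
pub const THREE_BYTE_WIDTH: u32 = acting_bits(3);
```

Since the function is `const`, the choice costs nothing at runtime, and an unsupported size used in a constant is rejected during compilation.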

---

## 💎 9. `bitenum!` and "Total Types"

Standard Rust enums are **Algebraic Data Types**. Reading an invalid bit pattern into a standard `enum` is **Undefined Behavior (UB)**.

`bitenum!` solves this by creating a **Total Type wrapper** around a primitive.

### Standard Enum Danger

```rust
#[repr(u8)]
enum Std { A = 0, B = 1 }
// let x: Std = unsafe { transmute(3u8) }; // UB / CRASH
```

### `bitenum!` Safety

```rust
bitenum! { enum Safe(2) { A = 0, B = 1 } }
// Implementation:
struct Safe(u8);
impl Safe {
    fn is_defined(&self) -> bool {
        match self.0 { 0 | 1 => true, _ => false }
    }
}
```

- **From Bits**: In debug mode, `from_bits` panics if the input is > mask. In release, it truncates.
- **Try From Bits**: Returns `Result<Self, BitstructError>` (with the `InvalidVariant` error for undefined patterns), ensuring your system **never encounters UB** even when parsing malicious network packets.
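
The two constructors can be sketched by hand for the 2-bit `Safe` example. The `VariantError` enum and the exact method signatures are assumptions for this illustration; only the behaviour described in the bullets above is taken from the text.

```rust
/// Hypothetical error type standing in for the library's invalid-variant error.
#[derive(Debug, PartialEq, Eq)]
pub enum VariantError {
    InvalidVariant,
}

/// Total wrapper: every 2-bit pattern is a valid `Safe`, defined or not.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Safe(u8);

impl Safe {
    pub const A: Safe = Safe(0);
    pub const B: Safe = Safe(1);
    const MASK: u8 = 0b11;

    /// Debug builds panic on input wider than the mask; release builds truncate.
    pub fn from_bits(bits: u8) -> Self {
        debug_assert!(bits <= Self::MASK, "input exceeds 2 bits");
        Safe(bits & Self::MASK)
    }

    pub const fn is_defined(self) -> bool {
        matches!(self.0, 0 | 1)
    }

    /// Accepts only bit patterns that name a declared variant.
    pub fn try_from_bits(bits: u8) -> Result<Self, VariantError> {
        let candidate = Safe(bits & Self::MASK);
        if bits <= Self::MASK && candidate.is_defined() {
            Ok(candidate)
        } else {
            Err(VariantError::InvalidVariant)
        }
    }
}
```

An undefined pattern such as `3` is still a valid `Safe` value in memory, which is what keeps the lenient path free of undefined behaviour.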

---

## ❄️ 10. `const` Initialization & Zero-Init Guarantee

`bitcraft` is designed for static and embedded environments where boot-time initialization must be zero-cost.

### Zero-Cost Statics

Every generated accessor and constructor is a `const fn`. This allows you to define global configurations that are baked into the binary's static data (typically `.rodata` for immutable statics).

```rust
// Baked into the binary at compile-time
static DEFAULT_CFG: Config = Config::from_bits(0x01);
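// `from_bits(0)` is used here because the `Default::default` trait method
// cannot be called in const contexts on stable Rust.
const TEMPLATE: Config = Config::from_bits(0).with_enabled(true);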
```

Because the base storage is always an unsigned integer or primitive array, `bitcraft` types are inherently **Zero-Init compatible**. Running `Default::default()` on a 16-byte `bitcraft` type amounts to zeroing 16 bytes, which compilers typically lower to a single 128-bit store.

---

## ⚡ 11. Instruction Efficiency: Atomic Register Manipulation

While standard Rust types are optimized for simplicity, `bitcraft` allows you to trade a negligible amount of instruction latency for massive gains in memory density.

### Register Specialization (`u64` vs. `u128`)

While `bytestruct!` supports fields up to 16 bytes (128 bits), using `u128` registers for 2-bit flags on 64-bit hardware introduces unnecessary register pressure and instruction complexity.

The engine implements **Dynamic Register Routing**:

- **Fields spanning ≤ 8 bytes**: Operations are performed using native `u64` registers. This allows the CPU to retire instructions immediately without the "software-emulated" overhead often associated with `u128` on modern 64-bit architectures.
- **Fields spanning > 8 bytes**: The macro gracefully promotes the operation to `u128`, ensuring correctness for massive fields while preserving specialized speed for hot-path metadata.

### Instruction Fusion & Stack Traffic

When you manually manipulate byte arrays (e.g., `[u8; 3]`), you often introduce "Stack Traffic." Creating temporary fixed-size arrays to satisfy library signatures (like `u32::from_le_bytes([b0, b1, b2, 0])`) forces the compiler to move data from registers to the stack and back.

`bitcraft` avoids this by generating a single unrolled "Shift-and-OR" expression (e.g., `(b0 as u32) | ((b1 as u32) << 8) | ...`). Optimizing compilers recognize this pattern and perform **Instruction Fusion** (load combining). Instead of one load and shift per byte, the backend can often emit a single **Unaligned Load** instruction (like `MOV` or `LDR`), moving your "packed" data into a CPU register in one step.
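
Both styles are shown side by side in the sketch below for a 3-byte value. The function names are hypothetical, and whether the optimizer really emits a single load depends on the target and optimization level, so treat the performance claim as something to confirm in the generated assembly.

```rust
/// Unrolled shift-and-OR load of a 3-byte little-endian value.
pub const fn load_u24(b: &[u8; 3]) -> u32 {
    (b[0] as u32) | ((b[1] as u32) << 8) | ((b[2] as u32) << 16)
}

/// Padded-array variant: builds a temporary `[u8; 4]` before converting.
pub const fn load_u24_padded(b: &[u8; 3]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], 0])
}
```

The two functions always return the same value; the difference, if any, is only in the code the compiler generates for them.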

---

## ⚖️ 12. Summary Feature Matrix

| Feature | Standard Rust (`struct`/`enum`) | `bitcraft` Library |
| :--- | :--- | :--- |
| **Granularity** | Byte-level (minimum 8 bits) | **Bit-level** (minimum 1 bit) |
| **Byte-Array Basis** | Not available | **Unique `bytestruct!` Support** |
| **Odd-Width Ints** | Not available | **Unique `byteval!` Support** |
| **Padding** | Implicit (inserted by rustc) | **None** (Explicit control) |
| **Instruction Count** | Multiple loads/stores | **Atomic** (Register-wide) |
| **Alignment** | Compiler-enforced | **Hardware-aligned** (LSB-First) |
| **Safety** | UB-risk on invalid patterns | **UB-Free** (Total Types) & Bounds Checked |
| **FFI / C-ABI** | Manual `#[repr(C)]` | **Transparent** (Automatic) |
| **Const Eval** | Limited in enums | **Full `const fn`** support |

---

## ⚖️ 13. Instruction Count Comparison

| Operation | `std` Struct | `bitcraft` (Optimal) |
| :--- | :--- | :--- |
| **Read Field** | `MOV (offset)` | `MOV + SHR + AND` |
| **Write Field** | `MOV (offset)` | `MOV + AND (mask) + OR + MOV` |
| **Bulk Load** | Multiple instructions | **Single 64/128-bit Load** |

While bitfields require slightly more instructions per individual access, they reduce **Stack Traffic** and **Cache Misses**. Doubling your memory density means each cache line holds twice as many records, which effectively doubles the cache capacity available to that data structure.

---

## 🧩 14. Recursive Macro Architecture (TT Munching)

`bitcraft` uses the **Token-Tree (TT) Muncher** pattern to process field definitions. This is a recursive macro technique that allows for sequential processing of an arbitrary list of inputs.

### 1. Sequential Traversal

Because Rust macros cannot use standard "loops" for code generation, we use recursion. Each macro call processes the *first* field in the list and then calls itself with the *remaining* fields.

### 2. State Accumulation ($shift)

With each recursive call, the macro passes along an accumulated `$shift` value. This ensures that the second field starts exactly where the first one ended, maintaining LSB-first packing without the user having to calculate raw bit offsets manually.

### Recursive Logic (Pseudo-code)

```rust
macro_rules! bitstruct {
    // 1. Entrance Point: Start the recursion at bit offset 0.
    (struct $name:ident ($base:ty) { $($fields:tt)* }) => {
        bitstruct!(@impl_getters_setters $base, 0, $($fields)*);
    };

    // 2. Termination Case: No more fields left to process.
    (@impl_getters_setters $base:ty, $shift:expr, ) => {};

    // 3. Recursive Step: Process the HEAD field, then recurse with the TAIL.
    (@impl_getters_setters $base:ty, $shift:expr, $vis:vis $name:ident : $type:tt = $bits:tt , $($rest:tt)*) => {
        // Generate Getter/Setter for this field using the current $shift.
        impl_accessor!($name, $type, $bits, $shift);

        // Call self again with the NEW shift: (current_shift + current_bits)
        bitstruct!(@impl_getters_setters $base, $shift + $bits, $($rest)*);
    };
}
```
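
For readers who want to run the pattern, here is a deliberately reduced muncher. `mini_bitstruct!` is a hypothetical macro written for this illustration: it supports only `name: bits,` fields, generates getters only, and assumes each field is narrower than the base type.

```rust
macro_rules! mini_bitstruct {
    // Entry point: declare the wrapper and start munching at bit offset 0.
    (struct $name:ident($base:ty) { $($fields:tt)* }) => {
        #[derive(Clone, Copy, Default, Debug, PartialEq, Eq)]
        pub struct $name(pub $base);

        impl $name {
            mini_bitstruct!(@munch $base, 0, $($fields)*);
        }
    };

    // Termination: no fields left.
    (@munch $base:ty, $shift:expr, ) => {};

    // Recursive step: emit a getter for the head field, then recurse on the
    // tail with the shift advanced by the head's width.
    (@munch $base:ty, $shift:expr, $field:ident : $bits:literal, $($rest:tt)*) => {
        pub const fn $field(self) -> $base {
            (self.0 >> ($shift)) & (((1 as $base) << $bits) - 1)
        }
        mini_bitstruct!(@munch $base, $shift + $bits, $($rest)*);
    };
}

mini_bitstruct! {
    struct Header(u32) {
        kind: 4,  // bits 0..=3
        len: 12,  // bits 4..=15
        id: 16,   // bits 16..=31
    }
}
```

The accumulated `$shift` is never computed by the macro itself. It is passed along as an expression such as `0 + 4 + 12` and folded to a constant by the compiler.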

### 3. Type-Based Specialization (The Router)

The recursion also acts as a "Router". By pattern-matching on the field type during the recursive step, `bitcraft` can generate specialized code:

- **`bool`**: Generated getters return native `true`/`false` by checking if the bit is non-zero.
- **`bitenum!`**: Generated getters wrap the raw bits in the custom enum type using its `from_bits` method.
- **`u8 - u128`**: Generic numeric getters perform a standard shift-and-mask.
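
The routing step can be sketched with a standalone macro whose arms match literal type tokens before falling back to a generic arm. Everything here (`field_getter!`, `Mode`, `Reg`, and the field positions) is hypothetical, and only one primitive arm (`u8`) is spelled out to keep the example short.

```rust
macro_rules! field_getter {
    // Literal `bool` token: return a native boolean.
    ($name:ident, bool, $shift:expr, $bits:expr) => {
        pub const fn $name(self) -> bool {
            (self.0 >> $shift) & 1 != 0
        }
    };
    // Literal `u8` token: plain shift-and-mask.
    ($name:ident, u8, $shift:expr, $bits:expr) => {
        pub const fn $name(self) -> u8 {
            ((self.0 >> $shift) & ((1 << $bits) - 1)) as u8
        }
    };
    // Fallback: assume a wrapper type that offers `from_bits`.
    ($name:ident, $ty:tt, $shift:expr, $bits:expr) => {
        pub fn $name(self) -> $ty {
            <$ty>::from_bits((self.0 >> $shift) & ((1 << $bits) - 1))
        }
    };
}

/// Hypothetical stand-in for a `bitenum!`-style wrapper.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mode(u32);

impl Mode {
    pub fn from_bits(bits: u32) -> Self {
        Mode(bits)
    }
}

#[derive(Clone, Copy)]
pub struct Reg(pub u32);

impl Reg {
    field_getter!(ready, bool, 0, 1);
    field_getter!(level, u8, 1, 7);
    field_getter!(mode, Mode, 8, 2);
}
```

Arm order matters in this sketch: the literal arms must come before the `$ty:tt` fallback, otherwise the fallback would capture `bool` and `u8` as well.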

---

## 🏗️ 15. The "Parsing" Pipeline: TT Munching in Detail

Since `bitcraft` uses declarative `macro_rules!`, it does not have access to a formal AST (Abstract Syntax Tree) in the compiler's sense. Instead, it operates on a stream of **Token Trees**. The "parsing" logic is distributed across three distinct phases.

### Phase 1: The Collective Entry Point

When the macro is invoked, it first performs **batch operations** on the entire list of fields using standard macro repetitions (`$()*`).

- **Memory Density Validation**: It sums the `$bits` of all fields (`0 $(+ $bits)*`) and compares the total against the storage width in bits (`size_of::<BaseType>() * 8`).
- **Debug Implementation**: It iterates through every `$field_name` to build a clean `fmt::Debug` output.
- **Verification Trigger**: It passes all field types to `@check_fields` to trigger trait-based validation.
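
The density check from the first bullet can be reproduced with a few lines of `macro_rules!`. `assert_bits_fit!` is a hypothetical macro for this illustration, and it assumes the rule is "total bits must not exceed the storage width"; the library's actual rule may be stricter.

```rust
/// Fails compilation when the listed field widths do not fit the base type.
macro_rules! assert_bits_fit {
    ($base:ty; $($bits:literal),* $(,)?) => {
        const _: () = assert!(
            // Same summation trick as described above: `0 + b1 + b2 + ...`.
            (0usize $(+ $bits)*) <= ::core::mem::size_of::<$base>() * 8,
            "declared fields exceed the storage width"
        );
    };
}

// 1 + 31 = 32 bits fits a `u32`, so this compiles.
assert_bits_fit!(u32; 1, 31);

// Uncommenting the next line is expected to stop the build, because
// 1 + 32 = 33 bits does not fit a `u32`:
// assert_bits_fit!(u32; 1, 32);
```

Because the assertion lives in a `const` item, it is evaluated during compilation and leaves nothing behind in the binary.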

### Phase 2: The Recursive "Muncher"

After the initial setup, the macro hands off the stream of tokens to the **TT Muncher** (`@impl_getters_setters`).
A typical muncher arm captures the "Head" (current field) and "Tail" (remaining fields). It calculates the next offset as `$shift + $bits` and passes it to the next recursive call.

### Phase 3: Type Specialization (Pattern Matching)

The muncher doesn't just treat everything as text. It uses **Literal Token Matching** to decide which code to generate.

- **If the type is exactly `bool`**: Specialized logic for boolean conversion.
- **If the type is a primitive (e.g., `u16`)**: Routes to generic integer getters.
- **Fallback (`$type:tt`)**: Assumes a custom `bitenum!` and generates glue code for variant conversion.

This "Pipeline" ensures that even though the logic is defined within a macro, the resulting code is as specialized and typed as if you had written individual implementations for every field by hand.

---

## 🏛️ 16. Architectural Pillars: The Design Philosophy

The `bitcraft` architecture is built on four core pillars that distinguish it from general-purpose serialization libraries.

### 1. Hardware Naturalism (LSB-First)

Unlike libraries that offer "adjustable" bit-ordering, `bitcraft` enforces a strict **Least Significant Bit (LSB) first** convention. This decision is architectural: by matching the physical storage of little-endian CPUs (x86_64, ARM64), we ensure that a `bitcraft` struct in memory looks exactly like it does in the source code. This eliminates the "Endianness Tax" and allows for direct MMIO (Memory Mapped I/O) compatibility.

### 2. Register Specialization (Acting Primitives)

The "Acting Primitive" pattern is central to our speed. We don't treat byte-arrays as arrays; we treat them as **Dynamic Registers**. By promoting an 11-byte array to a `u128` register for bit manipulation, we move data from the slow RAM stack into high-speed CPU registers as early as possible. This minimizes "Stack Traffic" and maximizes the instruction retirement rate.

### 3. Declarative-Only Integrity

By intentionally avoiding procedural macros (`syn`/`quote`), `bitcraft` remains **highly portable** and **compiles in milliseconds**. This architecture ensures that the crate has zero heavy dependencies, making it suitable for even the most minimal `no_std` embedded environments where compile-time resources are constrained.

### 4. Zero-Cost Verification (Trait-Based Safety)

We use a **Two-Tier Verification** strategy:

- **Macro-Level**: Sum checks and bit-width assertions happen during expansion.
- **Trait-Level**: We use marker traits (`ValidField`, `IsUnsignedInt`) to inject compile-time errors if a user attempts to use a signed integer or an invalid storage type. This architecture shifts the burden of safety from the **runtime** to the **compiler**, ensuring that a compiled `bitcraft` binary is as lean as manual C code.

---

## 🛠️ Roadmap & Future Implementation

- [ ] **Signed Field Interpretation**: Support for `i8`, `i16`, etc., via automatic Sign Extension on the N-bit fields.
- [ ] **C-Header Generation**: Integration with `cbindgen` to automatically generate FFI-compatible C headers for C/C++ firmware.
- [ ] **`serde` Integration**: Optional feature to derive `Serialize` and `Deserialize` for all packed types.
- [x] **Property-Based Testing**: Use `proptest` to fuzz the bit-packing logic for millions of random inputs.