Bitcraft ⚙️
The zero-cost, hardware-aligned bitfield and enumeration engine for Rust.
bitcraft is a high-performance declarative macro library designed for systems where every bit counts. Engineered for Mechanical Sympathy, it allows developers to define data structures that align perfectly with CPU cache lines and memory bus widths, eliminating the silent performance tax of implicit padding.
[!NOTE] Roadmap:
bitcraft base storage will remain unsigned for maximum register efficiency and deterministic bit-packing. Support for interpreting fields as signed integers (two's complement) within these unsigned containers is currently on the roadmap.
[!TIP] New to Bitfields? See our Ecosystem Comparison to understand how
bitcraft differs from modular-bitfield, packed_struct, and standard Rust enums.
[!TIP] Technical Deep Dive: Curious about how it works? See our Internal Implementation Guide for a breakdown of TT-munching, register specialization, and hardware alignment.
[!IMPORTANT] Type Safety:
bitcraft base storage is always unsigned (u8 through u128) to ensure hardware alignment and register efficiency. Currently, fields are also restricted to unsigned types at compile-time. Support for interpreting bits as signed values (two's complement fields) within these unsigned structs is on the future roadmap.
🚀 The Efficiency Gap
In high-performance domains (vector engines, network stacks, or high-frequency trading), standard Rust structs introduce implicit padding to satisfy memory alignment. A 1-bit boolean might consume 8 bits, and a 24-bit ID might occupy 32 bits. At billion-scale, this waste thrashes CPU caches and increases memory pressure.
bitcraft solves this by giving you:
- Absolute Bit Control: Define exactly which bits map to which logical fields.
- Unique bytestruct! Support: Native support for flexible 1-16 byte arrays ([u8; N]) that are treated as primitive-like registers. Most libraries restrict you to standard u8-u128.
- Unique byteval! IDs: Instant "Packed IDs" for 24-bit, 40-bit, or 56-bit values that behave like first-class integers.
- Zero-Cost Abstractions: Generated code compiles down to the exact bitwise shifts and masks you would write by hand, verified by LLVM IR inspection.
- Hardware Alignment: LSB-first mapping ensures your software layout matches the physical little-endian storage in modern hardware.
- Boilerplate-Free Ergonomics: Automatic Default (zero-init) and a fluent with_* builder pattern come standard.
⚖️ The Deep Dive: std vs. bitcraft
Choosing between standard Rust types and bitcraft depends on whether you are optimizing for Developer Velocity (std) or Mechanical Sympathy (bitcraft).
1. bitstruct! vs. Standard Structs: Memory Density
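Standard Rust structs satisfy alignment requirements by inserting padding ("dead space"). Each field is placed at a multiple of its own alignment (for example, 4 bytes for a u32 and 8 bytes for a u64 on typical 64-bit targets), and the struct's total size is rounded up to a multiple of its largest field alignment.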
Standard Rust Layout:
// Total: 8 bytes
bitstruct! Core:
bitstruct! // Total: 4 bytes (Zero wasted bits)
Performance Impact: By cutting memory usage by 50%, you effectively double your L1/L2 cache capacity for this data type. In high-frequency loops, this reduces cache misses and memory bus contention.
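The 8-byte versus 4-byte difference can be reproduced with plain Rust. The sketch below is a hand-written illustration, not the code that bitstruct! generates; the type names (Padded, Packed) and the field widths (24-bit id, 3-bit priority, 1-bit active) are assumptions chosen for the example.

```rust
use std::mem::size_of;

// Hypothetical record with three logical fields.
// repr(C) keeps declaration order so the padding is predictable.
#[repr(C)]
pub struct Padded {
    pub id: u32,      // only 24 bits are meaningful
    pub priority: u8, // only 3 bits are meaningful
    pub active: bool, // only 1 bit is meaningful
    // 2 bytes of tail padding round the size up to a multiple of 4
}

// Hand-packed equivalent: bit 0 = active, bits 1..=3 = priority, bits 4..=27 = id.
#[repr(transparent)]
pub struct Packed(pub u32);

impl Packed {
    pub const fn new(id: u32, priority: u8, active: bool) -> Self {
        Packed((active as u32) | (((priority & 0x7) as u32) << 1) | ((id & 0x00FF_FFFF) << 4))
    }

    pub const fn active(&self) -> bool {
        self.0 & 1 == 1
    }

    pub const fn priority(&self) -> u8 {
        ((self.0 >> 1) & 0x7) as u8
    }

    pub const fn id(&self) -> u32 {
        (self.0 >> 4) & 0x00FF_FFFF
    }
}

/// Returns (size of the padded struct, size of the packed struct) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
```

On targets where u32 has 4-byte alignment, sizes() reports 8 bytes for Padded and 4 bytes for Packed, which is the 50% reduction described above.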
2. bitenum! vs. Standard Enums: Safety & "Total Types"
Standard Rust enums are Algebraic Data Types. If a memory location contains a bit pattern that doesn't match a valid variant, reading it is Undefined Behavior (UB).
Standard Rust (enum):
// let s: State = unsafe { std::mem::transmute(3u8) }; // CRASH / UB!
bitenum! Core:
bitenum!
// let s = State::from_bits(3); // Panic in debug mode!
let s = State::try_from_bits(3); // Returns Err(BitstructError::InvalidVariant)
- The Variant Gap: A repr(u8) enum with 2 variants only recognizes values 0 and 1. Value 2 is illegal and causes UB if transmuted.
- Safety: bitenum! is Strictly Validated. It wraps a primitive but ensures that only defined variants can be instantiated through try_from_bits. All bit patterns are contained, but only valid ones are "Active." This is critical for defensive programming when parsing untrusted network packets or unstable hardware registers.
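To make the "checked conversion instead of transmute" idea concrete, here is a minimal hand-written sketch in plain Rust. It is not the output of bitenum!; the variant names and the InvalidVariant error type are hypothetical stand-ins (the library's own error is described above as BitstructError::InvalidVariant).

```rust
// Hypothetical error type for this sketch only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct InvalidVariant(pub u8);

// Three defined variants: raw values 0, 1 and 2 are valid, everything else is not.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum State {
    Idle = 0,
    Running = 1,
    Stopped = 2,
}

impl State {
    /// Checked conversion: every possible raw value is handled, none is UB.
    pub const fn try_from_bits(raw: u8) -> Result<Self, InvalidVariant> {
        match raw {
            0 => Ok(State::Idle),
            1 => Ok(State::Running),
            2 => Ok(State::Stopped),
            other => Err(InvalidVariant(other)),
        }
    }

    /// Converting back to the raw value is always safe.
    pub const fn to_bits(self) -> u8 {
        self as u8
    }
}
```

With this shape, State::try_from_bits(3) yields an Err value that the caller must handle, whereas transmuting 3u8 into the enum would be undefined behavior.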
3. bytestruct! vs. Manual Byte Traversal: Instruction Efficiency
Manually manipulating byte arrays usually involves individual byte-level access, which is slow and prevents CPU vectorization.
Manual Access: Requires multiple load instructions and manual byte-by-byte reconstruction.
let mut arr = [0u8; 5];
let x = u16::from_le_bytes([arr[0], arr[1]]); // Load arr[0], Load arr[1], Shift, Or
let y = u16::from_le_bytes([arr[2], arr[3]]);
arr[4] = 0x01; // Manual index management
bytestruct! Acting Primitives:
bytestruct!
let mut l = default;
l.set_x; // High-level, zero-cost API
The macro chooses the widest possible register (u32, u64, or u128) as an Acting Primitive.
- Accessing a field in a 5-byte array becomes: 1 Load (u64) → 1 Shift → 1 Mask.
- This reduces the Instruction Count and allows the CPU's out-of-order execution engine to retire the result significantly faster.
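The "1 Load, 1 Shift, 1 Mask" access pattern can be sketched by hand as follows. This is an illustration of the technique, not the macro's generated code; the layout (bytes 0-1 = x, bytes 2-3 = y, byte 4 = flags) and the function names are assumptions made for the example.

```rust
// Assumed layout (illustrative): bytes 0-1 = x, bytes 2-3 = y, byte 4 = flags.

/// Widens the 5-byte array into one u64 with a flat shift-and-OR expression.
pub const fn load(bytes: &[u8; 5]) -> u64 {
    (bytes[0] as u64)
        | ((bytes[1] as u64) << 8)
        | ((bytes[2] as u64) << 16)
        | ((bytes[3] as u64) << 24)
        | ((bytes[4] as u64) << 32)
}

/// Writes the low 40 bits of `value` back as little-endian bytes.
pub const fn store(value: u64) -> [u8; 5] {
    [
        value as u8,
        (value >> 8) as u8,
        (value >> 16) as u8,
        (value >> 24) as u8,
        (value >> 32) as u8,
    ]
}

/// Reads field y: widen once, shift to bit 0, mask to 16 bits.
pub const fn get_y(bytes: &[u8; 5]) -> u16 {
    ((load(bytes) >> 16) & 0xFFFF) as u16
}

/// Writes field y with a read-modify-write on the widened value.
pub fn set_y(bytes: &mut [u8; 5], y: u16) {
    let cleared = load(bytes) & !(0xFFFF_u64 << 16);
    *bytes = store(cleared | ((y as u64) << 16));
}
```

Whether the compiler merges the byte accesses into a single wide load depends on the target and optimization level, so the generated assembly is worth inspecting for hot paths.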
4. byteval! vs. NewType Wrappers: Ergonomics & Bandwidth
For "Odd-sized" types like 24-bit or 40-bit IDs, Rust developers often use a u64 (wasting 24 bits of bandwidth) or a 100-line custom wrapper.
Standard Wrapper:
; // Consumes 4 bytes in memory
// OR
; // Hard to use for arithmetic
byteval! Core:
byteval!
let id = from_u32; // Behaves like a 3-byte u32
- byteval! provides a zero-cost wrapper that implements all numeric boilerplate automatically.
- It ensures your 24-bit ID actually only consumes 3 bytes on disk/wire while behaving like a first-class u32 in your code.
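For comparison, a hand-rolled 24-bit wrapper looks roughly like the sketch below. The type name Id24 is hypothetical, and truncating to the low 24 bits in from_u32 is an assumption of this example; the text above does not specify how byteval! treats out-of-range inputs.

```rust
/// A 24-bit value stored in exactly 3 bytes, least significant byte first.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
#[repr(transparent)]
pub struct Id24([u8; 3]);

impl Id24 {
    pub const MAX: u32 = 0x00FF_FFFF;

    /// Keeps the low 24 bits of `value` (assumption: silent truncation).
    pub const fn from_u32(value: u32) -> Self {
        Id24([value as u8, (value >> 8) as u8, (value >> 16) as u8])
    }

    /// Rebuilds the integer with one shift-and-OR expression.
    pub const fn to_u32(self) -> u32 {
        (self.0[0] as u32) | ((self.0[1] as u32) << 8) | ((self.0[2] as u32) << 16)
    }
}
```

Because the wrapper has size 3 and alignment 1, an array of one thousand Id24 values occupies 3,000 bytes rather than the 4,000 bytes a u32 array would need.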
5. Advanced Mechanism: Compile-Time & Runtime Verification
Standard Rust doesn't prevent you from defining a struct that is "too big" for a specific serialization format. bitstruct applies several layers of safety:
- Compile-Time Sum Verification: bitstruct ensures the sum of field bits does not exceed the width of the base type.
- Debug Bounds Verification: Builders (with_xyz) and setters (set_xyz) automatically assert bounds using debug_assert! to catch overflow early in development. Release builds silently mask (truncate) out-of-range values.
- Explicit Error Handling: For untrusted data (like network packets), use the generated try_set_xyz and try_with_xyz methods, which perform strict bounds validation and return a Result<_, BitstructError>.
- Enum Validation: bitenum! provides try_from_bits, which returns BitstructError::InvalidVariant if the raw value does not match a defined enumeration variant.
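The difference between the debug-asserting setter and the strict try_ setter can be sketched in plain Rust. Everything here is hypothetical and hand-written (the Header type, the 3-bit prio field at bit 1, and the FieldError enum); it mirrors the behavior described in the list above rather than the macro's real output.

```rust
// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq, Eq)]
pub enum FieldError {
    ValueOutOfRange { value: u16, max: u16 },
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct Header(u16);

impl Header {
    const PRIO_SHIFT: u16 = 1;
    const PRIO_MASK: u16 = 0b111; // 3-bit field, valid values 0..=7

    pub fn prio(&self) -> u16 {
        (self.0 >> Self::PRIO_SHIFT) & Self::PRIO_MASK
    }

    /// Asserts in debug builds; masks silently in release builds.
    pub fn set_prio(&mut self, value: u16) {
        debug_assert!(value <= Self::PRIO_MASK, "prio out of range");
        self.write_prio(value & Self::PRIO_MASK);
    }

    /// Strict variant for untrusted input: never panics, never truncates.
    pub fn try_set_prio(&mut self, value: u16) -> Result<(), FieldError> {
        if value > Self::PRIO_MASK {
            return Err(FieldError::ValueOutOfRange { value, max: Self::PRIO_MASK });
        }
        self.write_prio(value);
        Ok(())
    }

    // Clears the field, then ORs in the new (already validated) value.
    fn write_prio(&mut self, value: u16) {
        self.0 = (self.0 & !(Self::PRIO_MASK << Self::PRIO_SHIFT)) | (value << Self::PRIO_SHIFT);
    }
}
```

A rejected try_set_prio call leaves the stored value untouched, which is the property that matters when the input comes from a network packet.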
6. Summary Feature Matrix
| Feature | Standard Rust (struct/enum) | bitcraft Library |
|---|---|---|
| Granularity | Byte-level (minimum 8 bits) | Bit-level (minimum 1 bit) |
| Signed Bitfields | ✅ | ❌ (Strictly Checked) |
| Padding | Implicit (inserted by rustc) | None (Explicit control) |
| Instruction Count | Multiple loads/stores | Atomic (Register-wide) |
| Alignment | Compiler-enforced | Hardware-aligned (LSB-First) |
| Safety | UB-risk on invalid patterns | UB-Free (Total Types) & Bounds Checked |
| FFI / C-ABI | Manual #[repr(C)] | Transparent (Automatic) |
| Const Eval | Limited in enums | Full const fn support |
🚦 When To Use Which Macro?
The bitcraft crate provides four specialized tools. Choosing the right one determines your memory density and instruction efficiency:
- Use bitstruct! (Base: u8 - u128)
  - When: You need to pack multiple small fields (booleans, 3-bit ints, 4-bit enums) into a single, standard CPU register (up to 128 bits).
  - Why: Fastest execution. The CPU loads the entire struct in a single instruction, manipulates the bits in registers, and writes them back. Perfect for protocol headers or status registers.
- Use bytestruct! (Base: [u8; N])
  - When: Your data has a non-standard size of up to 16 bytes (128 bits), such as 5, 7, or 13 bytes, but must still remain perfectly dense without padding, or when the data is intrinsically an array (like a generic payload buffer with flags at the end).
  - Why: Allows dense packing up to 128 bits (16 bytes) while still utilizing the widest available CPU registers (like u64) behind the scenes to modify localized chunks of the array efficiently.
- Use byteval! (Base: [u8; N])
  - When: You need a single integer value that has an "awkward" byte width (e.g., a 24-bit ([u8; 3]) audio sample, or a 40-bit ([u8; 5]) network ID).
  - Why: It generates a zero-cost NewType wrapper around the byte array but gives you native to_u32(), to_u64(), and from_u32() methods so it behaves like a normal number in code, without wasting the 8 or 24 padding bits a true u32/u64 would consume in an array of thousands.
- Use bitenum! (Base: N Bits mapped to u8-u128)
  - When: You need a strongly-typed, memory-safe enumeration to represent a variant parameter inside one of the above structs.
  - Why: Pure, safe "Total Types". An illegal value arriving in a network packet is rejected with an error by the generated bounds checks, and the enum adds 0 bits of overhead inside the struct.
🧩 Showcasing Interoperability
bitcraft is engineered for high-performance systems where data must move seamlessly between the CPU, the network, and other languages.
1. Network Protocol Buffers (Zero-Copy)
Using the bytemuck crate, you can cast raw network buffers directly into typed structures with zero overhead.
bitstruct!
2. Foreign Function Interface (C-ABI)
Every bitstruct!, bytestruct!, and bitenum! is marked #[repr(transparent)]. This guarantees binary compatibility with the underlying primitive, making them safe to pass directly to C or C++ interfaces as standard integers or byte arrays.
bitstruct!
// Transparently passes as 'uint8_t' in C
unsafe extern "C"
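A minimal sketch of the #[repr(transparent)] guarantee, written by hand with a hypothetical Flags type. The extern "C" function is defined in Rust here as a stand-in for a C function declared as uint8_t toggle_low_bit(uint8_t flags), so the example runs without a C toolchain.

```rust
// repr(transparent) gives the wrapper the same layout and ABI as its single field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct Flags(pub u8);

// Stand-in for a C function taking and returning `uint8_t`.
// Because Flags is transparent over u8, it crosses the C ABI as a plain byte.
pub extern "C" fn toggle_low_bit(flags: Flags) -> Flags {
    Flags(flags.0 ^ 1)
}
```

In a real project the function body would live on the C side and the Rust side would only declare it inside an extern "C" block.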
3. Hardware Register Mapping (MMIO)
Because bitcraft uses LSB-first mapping, your logical definitions perfectly match the physical bit-offsets used in hardware datasheets for little-endian architectures (x86_64, ARM64).
bitstruct!
// Simulating a memory-mapped register access
let mut reg = from_bits;
reg.set_enable;
reg.set_mode;
// Resulting value is perfectly aligned for an MMIO write.
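The read-modify-write cycle behind a register update can be sketched with volatile accesses from the standard library. The register layout (bit 0 = enable, bits 1-2 = mode) and the function name configure are assumptions for illustration; a real driver would take the address and layout from the device datasheet.

```rust
use std::ptr::{read_volatile, write_volatile};

// Hypothetical control register: bit 0 = enable, bits 1..=2 = mode.
const ENABLE_BIT: u32 = 1 << 0;
const MODE_SHIFT: u32 = 1;
const MODE_MASK: u32 = 0b11 << MODE_SHIFT;

/// Read-modify-write on a register address, leaving all other bits untouched.
///
/// # Safety
/// `reg` must be valid for volatile reads and writes of a `u32`.
pub unsafe fn configure(reg: *mut u32, enable: bool, mode: u32) {
    // Volatile access stops the compiler from eliding or reordering the I/O.
    let mut value = unsafe { read_volatile(reg) };
    value = (value & !ENABLE_BIT) | (enable as u32);
    value = (value & !MODE_MASK) | ((mode << MODE_SHIFT) & MODE_MASK);
    unsafe { write_volatile(reg, value) };
}
```

For testing, the pointer can refer to an ordinary local variable, which simulates the register without touching hardware.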
4. Database Storage Density
Store billion-scale metadata with the absolute minimum footprint. Packing a 24-bit ID and 8-bit status into a single u32 saves significant storage compared to standard Rust padding.
bitstruct!
// 1 Billion records = 4.0GB with bitcraft vs 8.0GB+ with standard structs.
🧩 The Macro Suite
bitcraft provides four specialized macros, each targeting a specific layer of the storage-performance spectrum.
| Macro | Storage Basis | Range | Primary Use Case |
|---|---|---|---|
| bitenum! | u8 .. u128 | 1 - 128 Bits | Type-safe variants inside packed fields |
| bitstruct! | Primitives | 1 - 128 Bits | Word-aligned "Hot Path" CPU optimization |
| bytestruct! | [u8; N] | 2 - 16 Bytes | Unique: Array-backed dense buffers with register-speed |
| byteval! | [u8; N] | 3 - 16 Bytes | Unique: Packed IDs (24-bit, 40-bit) as first-class numbers |
⚡ Performance Benchmarks
bitcraft is engineered for Mechanical Sympathy. While standard Rust types are optimized for simplicity, bitcraft allows you to trade a negligible amount of instruction latency for massive gains in memory density (e.g., 2x - 8x).
We evaluated 1,000,000,000 (1B) iterations of complex read/write operations on an optimized release build (cargo test --release --test performance):
💻 Benchmark Environment
- OS: "Manjaro Linux"
- CPU: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
- RAM: 31Gi
- Rust Version: 1.93.1
| Metric | Macro Type | Base Storage | Overhead vs. std | Physical Density |
|---|---|---|---|---|
| Execution Latency | bitenum! | u8 (3 bits) | 0.95x (Faster!) | 1.00x (Safe) |
| Execution Latency | byteval! | [u8; 3] | 0.91x (Faster!) | 2.67x Higher |
| Execution Latency | bitstruct! | u16 | 0.96x (Faster!) | 2.00x Higher |
| Execution Latency | bytestruct! | [u8; 2] | 2.45x | 3.20x Higher |
Zero-Copy Casting (bytemuck)
All structs generated by bitstruct!, bytestruct!, and bitenum! automatically derive bytemuck::Pod and bytemuck::Zeroable. This allows for zero-cost casting between raw byte buffers and your typed structs:
let raw_bytes = ;
let my_struct: MyBytestruct = cast;
🛠️ Roadmap & Future Implementation
- Signed Field Interpretation: Support for i8, i16, etc., via automatic Sign Extension on the N-bit fields.
- C-Header Generation: Integration with cbindgen to automatically generate FFI-compatible C headers for C/C++ firmware.
- serde Integration: Optional feature to derive Serialize and Deserialize for all packed types.
- Property-Based Testing: Use proptest to fuzz the bit-packing logic for millions of random inputs.
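Sign extension of an N-bit field, mentioned in the first roadmap item, is a standard two-shift technique. The helper below is a generic illustration with a hypothetical name (sign_extend); it is not a planned bitcraft API.

```rust
/// Interprets the low `bits` bits of `raw` as a two's complement value.
/// `bits` must be in 1..=32.
pub const fn sign_extend(raw: u32, bits: u32) -> i32 {
    assert!(bits >= 1 && bits <= 32);
    let shift = 32 - bits;
    // Move the field's top bit into bit 31, then arithmetic-shift back down
    // so that the sign bit is replicated into the upper bits.
    ((raw << shift) as i32) >> shift
}
```

For a 3-bit field, the raw pattern 0b111 reads back as -1 and 0b100 as -4, while 0b011 stays 3.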
🔬 Technical Deep Dive: The Engineering Behind the Speed
bitcraft isn't just a set of macros; it's a compiler-aware optimization engine. Below is an analysis of the specific patterns we use to ensure that high abstraction doesn't lead to high overhead.
1. The "Literal Guard" Pattern
Standard bit-manipulation libraries often use dynamic loops or copy_nonoverlapping to read fragmented fields. In our benchmarks, this was consistently slower than our Literal Guard approach.
Instead of a loop, bytestruct! generates a sequence of literal index checks (e.g., if len > 0 { ... }). Because the field widths and positions are known at compile-time, LLVM perfectly constant-folds these branches. This transforms a potential memory-loop into a flat, branchless sequence of bitwise shifts and ORs—the absolute fastest execution path possible.
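A hand-written approximation of the literal-guard idea is shown below, using a const generic length in place of macro expansion. The function name load_le and the 8-byte limit are assumptions of this sketch; the macro's real output may be shaped differently.

```rust
/// Loads up to 8 little-endian bytes into a u64 without a loop.
/// Each guard compares the compile-time length N against a literal,
/// so the optimizer can drop the branches that do not apply.
pub const fn load_le<const N: usize>(bytes: &[u8; N]) -> u64 {
    assert!(N <= 8);
    let b: &[u8] = bytes; // view as a slice; the guards keep every index in bounds
    let mut v = 0u64;
    if N > 0 { v |= b[0] as u64; }
    if N > 1 { v |= (b[1] as u64) << 8; }
    if N > 2 { v |= (b[2] as u64) << 16; }
    if N > 3 { v |= (b[3] as u64) << 24; }
    if N > 4 { v |= (b[4] as u64) << 32; }
    if N > 5 { v |= (b[5] as u64) << 40; }
    if N > 6 { v |= (b[6] as u64) << 48; }
    if N > 7 { v |= (b[7] as u64) << 56; }
    v
}
```

Since N is fixed per call site, each instantiation reduces to a straight-line sequence of shifts and ORs once the constant conditions are folded.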
2. Register Specialization (u64 vs. u128)
While bytestruct! supports fields up to 16 bytes (128 bits), using u128 registers for 2-bit flags on 64-bit hardware introduces unnecessary register pressure and instruction complexity.
The engine implements Dynamic Register Routing:
- Fields spanning ≤ 8 bytes: Operations are performed using native u64 registers. This allows the CPU to retire instructions immediately without the "software-emulated" overhead often associated with u128 on modern 64-bit architectures.
- Fields spanning > 8 bytes: The macro gracefully promotes the operation to u128, ensuring correctness for massive fields while preserving specialized speed for hot-path metadata.
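The routing rule can be illustrated with a small runtime function. This is only a sketch under assumptions: the name read_field and the fixed 16-byte buffer are hypothetical, and the branch is taken at run time here, whereas the text above describes the choice as being made by the macro.

```rust
/// Reads `width` bits starting at bit `offset` (LSB-first) from a 16-byte buffer.
pub fn read_field(buf: &[u8; 16], offset: usize, width: usize) -> u128 {
    assert!(width >= 1 && offset + width <= 128);
    let first = offset / 8; // first byte the field touches
    let last = (offset + width - 1) / 8; // last byte the field touches
    if last - first < 8 {
        // The field spans at most 8 bytes: stay in a u64.
        let mut v = 0u64;
        for (i, b) in buf[first..=last].iter().enumerate() {
            v |= (*b as u64) << (8 * i);
        }
        let mask = if width == 64 { u64::MAX } else { (1u64 << width) - 1 };
        ((v >> (offset % 8)) & mask) as u128
    } else {
        // The field spans more than 8 bytes: promote to u128.
        let v = u128::from_le_bytes(*buf);
        let mask = if width == 128 { u128::MAX } else { (1u128 << width) - 1 };
        (v >> offset) & mask
    }
}
```

Both paths return the same bits for any field that fits in 8 bytes; the narrow path simply avoids 128-bit arithmetic for the common case of small fields.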
3. Instruction Fusion & Stack Traffic
When you manually manipulate byte arrays (e.g., [u8; 3]), you often introduce "Stack Traffic." Creating temporary fixed-size arrays to satisfy library signatures (like u32::from_le_bytes([b0, b1, b2, 0])) forces the compiler to move data from registers to the stack and back.
bitstruct avoids this by generating a single unrolled "Shift-and-OR" expression (e.g., (b0 as u32) | ((b1 as u32) << 8) | ...). Modern compilers recognize this pattern and perform Instruction Fusion. Instead of multiple individual shifts, the backend generates a single Unaligned Load instruction (like MOV or LDR), effectively loading your "packed" data directly into a high-speed CPU register in one cycle.
4. The Latency-Density Paradox
In micro-benchmarks, bitstruct! and bytestruct! may show a negligible ~1.2x overhead compared to standard structs. This is expected: standard structs use memory offsets, while bitfields use Shifts, Masks, and Read-Modify-Write cycles.
However, in real-world applications, this latency is an illusion. The 10x - 100x performance penalty of a CPU Cache Miss dominates all other metrics. By doubling or quadrupling your physical data density, bitcraft ensures your data stays in the L1/L2 Cache, providing a massive net gain in system throughput that standard "offset-based" types cannot match.
📖 Engineering Guide
1. bitenum!
Define variants that map directly to raw bits. Unlike standard enums, bitenum provides a #[repr(transparent)] wrapper that consumes zero extra bits and automatically maps to the most efficient CPU primitive (u8-u128) based exclusively on the number of required bits.
use bitenum;
bitenum!
Memory Layout:
- Automatically maps to the narrowest CPU primitive (u8-u128) capable of holding the specified bits.
- Example (2) bits: Consumes u8 in physical memory. Natively utilizes bits [ 1 0 ], while bits [ 7 .. 2 ] remain zeroed.
- Strict Validation: Providing raw values like 4 or 5 to try_from_bits will return an error, preventing invalid state propagation.
2. bitstruct!
The primary tool for packing data into standard CPU words. All getters and builders are const fn.
use ;
bitstruct!
// Fluent builder pattern
let desc = from_bits
.with_is_active
.with_state;
assert_eq!;
Memory Layout (LSB-First):
MSB LSB
[ Unused ] [ State ] [ Payload ] [ Prio ] [ Act ]
2 bits 2 bits 8 bits 3 bits 1 bit
- Field 0 (Act): Occupies the lowest possible bit (Index 0).
- Field N: Continues immediately after Field N-1.
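The layout diagram above translates into shifts as follows. This is a hand-written equivalent for illustration, not the macro expansion; the type name Descriptor and the with_prio, with_payload, and with_field helpers are assumptions, while from_bits, with_is_active, and with_state follow the names used in the snippet above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
#[repr(transparent)]
pub struct Descriptor(u16);

impl Descriptor {
    // Offsets accumulate from bit 0 in declaration order (LSB-first).
    const ACT_SHIFT: u16 = 0; // 1 bit
    const PRIO_SHIFT: u16 = 1; // 3 bits
    const PAYLOAD_SHIFT: u16 = 4; // 8 bits
    const STATE_SHIFT: u16 = 12; // 2 bits; bits 14..=15 stay unused

    pub const fn from_bits(raw: u16) -> Self {
        Descriptor(raw)
    }

    pub const fn to_bits(self) -> u16 {
        self.0
    }

    // Clears the target field, then ORs in the masked new value.
    const fn with_field(self, shift: u16, width: u16, value: u16) -> Self {
        let mask = ((1u16 << width) - 1) << shift;
        Descriptor((self.0 & !mask) | ((value << shift) & mask))
    }

    pub const fn with_is_active(self, on: bool) -> Self {
        self.with_field(Self::ACT_SHIFT, 1, on as u16)
    }

    pub const fn with_prio(self, prio: u16) -> Self {
        self.with_field(Self::PRIO_SHIFT, 3, prio)
    }

    pub const fn with_payload(self, payload: u16) -> Self {
        self.with_field(Self::PAYLOAD_SHIFT, 8, payload)
    }

    pub const fn with_state(self, state: u16) -> Self {
        self.with_field(Self::STATE_SHIFT, 2, state)
    }

    pub const fn state(self) -> u16 {
        (self.0 >> Self::STATE_SHIFT) & 0b11
    }
}
```

Chaining all four builders with is_active = true, prio = 5, payload = 0xAB, and state = 2 produces the raw value 0x2ABB, with the two top bits left at zero.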
3. bytestruct!
For when you need non-standard sizes (e.g., 5, 7, or 13 bytes) that must be array-backed but treated as a single unit.
use bytestruct;
bytestruct!
let loc = from_u64;
assert_eq!;
Memory Layout (Little-Endian Array):
Byte: [ 0 ] [ 1 ] [ 2 ] [ 3 ] [ 4 ]
Field: [ x ] [ x ] [ y ] [ y ] [flags]
Content: [Low 8] [High 8] [Low 8] [High 8] [ u8 ]
- Mapping: Directly maps bits to a [u8; 5] buffer.
- Acting Primitive: Operations use a u64 register for 1-cycle execution.
4. byteval!
A specialization for "NewType" byte-array wrappers (e.g., 24-bit IDs).
use byteval;
byteval!
Memory Layout:
Byte: [ 0 ] [ 1 ] [ 2 ]
Content: [Low 8] [Mid 8] [High 8]
- Total Size: 3 bytes (LSB-First value).
🛡️ Technical Safety & Internals
Acting Primitive Selection
bytestruct! doesn't just manipulate byte arrays. It uses an internal "Acting Primitive" routing system. Based on the array size, the macro selects the widest possible CPU register (u32, u64, or u128) to perform operations. This means reading a field from a 13-byte array compiles to a single 128-bit register load and shift, the fastest possible execution path.
LSB-First Ordering
All macros follow a Least Significant Bit (LSB) first convention. The first field defined occupies the lowest bits (starting from bit 0).
[!CAUTION] Persistence Warning: Reordering fields in a macro definition will change the binary layout. If your data is saved to disk or sent over a network, changing the order breaks backward compatibility.
Strict Bound Verification
Macros execute compile-time assertions to ensure the sum of all field widths does not exceed the total storage capacity. This prevents "silent overlaps" and masking bugs.
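A compile-time width check of this kind can be expressed in plain Rust with a const assertion. The constants below are hypothetical and reuse the field widths from the bitstruct! layout diagram (1 + 3 + 8 + 2 bits in a u16); the macro's actual mechanism may differ.

```rust
// Hypothetical field widths for a u16-backed struct.
const ACT_BITS: u32 = 1;
const PRIO_BITS: u32 = 3;
const PAYLOAD_BITS: u32 = 8;
const STATE_BITS: u32 = 2;

pub const TOTAL_BITS: u32 = ACT_BITS + PRIO_BITS + PAYLOAD_BITS + STATE_BITS;

// Evaluated during compilation: the build fails if the fields do not fit.
const _: () = assert!(TOTAL_BITS <= u16::BITS, "fields exceed the storage width");
```

Raising any width so that the total passes 16 turns the assertion into a compile error, so an oversized definition never reaches a test run.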
⚖️ License
Licensed under the MIT License.