simple-endian
simple_endian is a toolkit for describing binary formats in Rust: network protocols, file formats, and on-disk/on-wire data structures that must be stored consistently regardless of host CPU endianness.
It’s not only about swapping bytes. The goal is to let you design and implement binary layouts ergonomically, with the compiler helping enforce that:
- numeric fields are always read/written with the correct byte order
- your in-memory logic can stay close to “ordinary” Rust
- conversions happen explicitly at boundaries (wire ↔ native) so mistakes become type errors
At the core, you declare endianness in the data definition (BigEndian<T>, LittleEndian<T>, and shorthand aliases like u32be, u16le) and then:
- operate on endian-aware values using normal operators and traits
- convert at boundaries via .to_native()/.into()
Optional features expand this into a full wire-format toolkit:
- derive: generate *Wire helper types from logical structs/enums (stable layout, endian-correct fields)
- io-std/io-core: read_specific/write_specific for safe endian-aware IO with binary structures.
- text_*: fixed-size UTF-16/UTF-32 helpers for formats that standardize on those encodings (e.g. UTF-16LE)
The crate is designed to be lightweight and supports #![no_std] (derive/IO/text are feature-gated).
If you’re using LLM-powered tooling, there’s a concise, repository-specific usage guide in LLMs.txt.
No-std Support
simple_endian works in no_std environments. The library automatically switches to no_std mode when:
- You're not running tests
- The io-std or io features are not enabled
To use simple_endian in a no_std project, disable default features and enable only what you need:
[dependencies]
simple_endian = { version = "0.4", default-features = false, features = ["integer_impls", "both_endian", "byte_impls"] }
Common feature combinations for no_std:
- Minimal: integer_impls, both_endian, byte_impls (basic endian types)
- With derive: add derive for #[derive(Endianize)] support
- With core IO: add io-core for slice-based read/write helpers (no std::io dependency)
The library compiles successfully for embedded targets like thumbv7m-none-eabi. See tests/no_std_compatibility.rs for examples.
New Text Handling
New in the 0.4 release is a set of feature-gated types and conversions for handling on-disk/on-wire Unicode encodings other than UTF-8.
This matters because a lot of real-world binary formats standardize their text fields as:
- UTF-16LE/BE code units (common in file formats and OS metadata)
- UTF-32LE/BE code units (sometimes used for fixed-width fields)
- fixed-size strings with explicit padding rules
Trying to model those with Rust String/&str directly usually leads to ad-hoc byte slicing and accidental host-endian assumptions. The text helpers here are meant to keep your code:
- explicit about encoding
- explicit about endianness of the code units
- safe (bounded, validated conversions)
- consistent with the rest of the crate’s “wire types are types” approach
Features
Text support is opt-in:
- text_utf8 – fixed-size UTF-8 byte field helpers
- text_utf16 – UTF-16 code unit types and conversions
- text_utf32 – UTF-32 code unit types and conversions
- text_fixed – fixed-size (const-generic) text field wrappers
- text_all – convenience alias enabling the above
These are designed to work with:
- derive (via #[text(...)] on struct fields)
- io-std/io-core (via read_specific/write_specific for fixed UTF fields)
Fixed-size UTF-16/UTF-32 fields (padding semantics)
The fixed types represent exactly $N$ code units on the wire. They are not growable strings. They come in a few common padding styles, for example:
- FixedUtf16LeSpacePadded<N> – UTF-16LE, right-padded with the space code unit (0x0020); decoding trims trailing spaces
- FixedUtf16LeNullPadded<N> – UTF-16LE, right-padded with NUL (0x0000); decoding trims trailing NULs
Similar wrappers exist for BE and for UTF-32.
There are also fixed-size UTF-8 byte field wrappers:
- FixedUtf8NullPadded<N> – right-padded with 0x00; decoding trims trailing NULs
- FixedUtf8SpacePadded<N> – right-padded with 0x20 (space); decoding trims trailing spaces
Note: the text APIs are entirely feature-gated. If you don't enable any text_* features, these types and conversions won't be part of your build.
The important endianness point: the endianness applies to the UTF code units, not to the host.
So FixedUtf16Le... is always little-endian on the wire, even on a big-endian CPU.
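To make the padding and byte-order rules concrete, here is a minimal sketch that uses only the standard library. The two helper functions are hypothetical stand-ins written for illustration; they are not the crate's FixedUtf16LeSpacePadded<N> API, and the crate's own error handling may differ.

```rust
/// Hypothetical helper (not part of simple_endian): encode `s` as exactly
/// `N` UTF-16 code units, little-endian, right-padded with U+0020.
fn encode_utf16le_space_padded<const N: usize>(s: &str) -> Option<Vec<u8>> {
    let units: Vec<u16> = s.encode_utf16().collect();
    if units.len() > N {
        return None; // the text does not fit in the fixed field
    }
    let mut out = Vec::with_capacity(N * 2);
    for i in 0..N {
        let unit = units.get(i).copied().unwrap_or(0x0020);
        // to_le_bytes yields the same byte sequence on every host CPU.
        out.extend_from_slice(&unit.to_le_bytes());
    }
    Some(out)
}

/// Hypothetical helper: decode a little-endian UTF-16 field and trim the
/// trailing space padding.
fn decode_utf16le_space_padded(bytes: &[u8]) -> Option<String> {
    if bytes.len() % 2 != 0 {
        return None;
    }
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    let text = String::from_utf16(&units).ok()?;
    Some(text.trim_end_matches(' ').to_string())
}

fn main() {
    let wire = encode_utf16le_space_padded::<4>("Hi").unwrap();
    println!("wire bytes: {:02x?}", wire);
    println!("decoded: {:?}", decode_utf16le_space_padded(&wire));
}
```

The field is always 2 × N bytes long, and the low byte of each code unit comes first regardless of the CPU the code runs on.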
Example: a fixed UTF-16LE field in a wire struct
This mirrors formats like FAT long file names (UTF-16LE) or other metadata blocks that store fixed-width UTF-16.
use ;
use ;
Notes:
- For formats that are actually ASCII (e.g. FAT16 short names, many protocol tokens), keep them as [u8; N] bytes and validate/trim explicitly.
- Use UTF-16/UTF-32 helpers when the spec calls for them; that’s where they shine.
- The fixed types are great for avoiding variable-length parsing and for guaranteeing layout.
Example: a fixed UTF-8 field in a wire struct
This is useful for formats that store fixed-width, right-padded UTF-8 bytes.
use Endianize;
use ;
Isn’t there already a library for this?
Yes, there are several that cover at least a part of this functionality. Most focus on functions for byte swapping / reading numbers from byte slices. A few well-known ones:
byteorder is the prevailing approach (and a great crate), but it tends to push endianness decisions into parsing logic. For some codebases, it’s nicer if the endianness is part of the type and your code can stay closer to “ordinary” Rust.
So, why create another one?
Because the existing crates for handling endianness in Rust require a lot of manual byte mashing, especially if you're trying to write safe code.
This crate aims to make binary formats feel like ordinary Rust types:
- endianness lives in your struct/enum definitions
- the compiler prevents “oops, I wrote native-endian to the wire” mistakes
- optional derive + IO helpers make it practical to build complete protocols and storage formats
That makes it a good fit for packet formats, RPC framing, binary logs, file formats, and any place where a stable representation matters.
Highlights
This repo includes several runnable examples in ./examples/.
Notable ones:
- derive_protocol/enum_protocol: derive-based wire types + read_specific/write_specific
- fat16_driver: a small FAT16 boot-sector / directory walkthrough
- ethernet_inspector: an Ethernet II frame inspector that recognizes VLAN/ARP/IPv4/IPv6/TCP/UDP/ICMP
Performance notes
simple_endian is designed to be low-overhead: endian-aware wrappers are #[repr(transparent)] and the read/write helpers are optimized to avoid per-value allocations.
There’s a short benchmark-driven writeup (including BE vs LE comparisons and “pure” conversion vs std::io overhead) in PERFORMANCE.md.
Note: if you’re formatting these values in hot paths, consider converting to native first (e.g. via .to_native()), since formatting overhead can dominate.
Binary Size Notes
This crate has a lot of functionality. The easiest thing to do is enable all features... and then watch your binary size bloat. Fortunately, it's also designed to scale up and down quite a lot, so it suits both codebases that need all the features and small, lightweight embedded projects.
For that reason, consider it a strong recommendation to use the granular feature flags for what you actually need. If you care about binary size, compile times, or a smaller API surface (especially for no_std/embedded), this crate is designed to be “pick what you need”.
Some commonly useful opt-ins:
- nonzero: enable core::num::NonZero* support (including shorthand aliases like nzu32be, nzi64le).
- wrapping: enable core::num::Wrapping<T> support.
Other feature families you may care about:
- derive: proc-macro derives for generating wire types.
- io-core/io-std: endian-aware read/write helpers.
- text_utf16/text_utf32/text_fixed (or text_all): fixed-width UTF helpers.
Goals of this project
The goals of this crate are as follows:
- Safely provide specific-endian types with low or no runtime overhead. There should be no runtime penalty when the host architecture matches the specified endianness, and a very low penalty for loads and stores otherwise.
- Straightforward, architecture-independent declarative syntax which ensures that load and store operations are correct.
- Ergonomic use patterns that maximize clarity and convenience without sacrificing correctness or safety.
- Because of the provided classes of operations, many logical, bitwise, and mathematical operations can be performed on the specific-endian types within the crate without explicitly converting to native host endian.
- Incorrect handling of data should generate clear type errors at compile time.
- Determination of correct endianness should be at declaration, and should not need to be repeated unless converting to a different endianness.
- Support for all of Rust's built-in types where endianness is relevant.
- The only dependency needed is the core crate. The std crate is used, however, for tests and benchmarks, and for some optional features.
Quick start
Define endian-aware values
use simple_endian::*;
let foo: u64be = 4.into(); // Stores 4 in foo in big endian.
println!("raw: {:x}, value: {:x}", foo.to_bits(), foo);
The output will depend on what sort of computer you're using. If you're running a little-endian system, such as x86 (PCs, Macs, etc.), you will see the big endian representation interpreted as if little-endian, as it's stored in memory. Note that the .to_bits() method is mostly there for debugging purposes, and should not be used often.
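The same effect can be reproduced with the standard library alone. This is a sketch, not the crate's implementation: stored_bits is a hypothetical name, and u64::to_be stands in for what a big-endian wrapper keeps in memory.

```rust
/// Hypothetical helper: the integer whose in-memory bytes are the
/// big-endian encoding of `value`.
fn stored_bits(value: u64) -> u64 {
    value.to_be()
}

fn main() {
    let stored = stored_bits(4);
    // The first number printed depends on the host; the second is always 4.
    println!("raw: {:x}, value: {:x}", stored, u64::from_be(stored));
    // The bytes in memory are identical on every host.
    println!("bytes in memory: {:02x?}", stored.to_ne_bytes());
}
```

On a little-endian host the raw value has the 4 in its most significant byte; on a big-endian host raw and value are equal.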
This works in reverse as well:
use simple_endian::*;
let foo: u64be = 4.into();
let bar = u64::from(foo);
println!("value: {:x}", bar);
If you prefer, there's a convenience method so that you don't need to explicitly convert back to the basic native type.
use simple_endian::*;
let foo: u64be = 4.into();
let bar = foo.to_native();
println!("value: {:x}", bar);
And the type system ensures that native-endian values are never written without being converted into the proper endian.
let mut foo: u64be = 4.into();
foo = 7; // Will not compile without .into().
Optional pattern: compute in native, store as endian
While it is possible, and often just as fast, to do many operations directly and safely on the endian-typed values of a structure, for arithmetic-heavy code it may be clearer (and sometimes faster) to do math in native types and store the result back.
Either way, the type system ensures that accesses are safe and correct, so this pattern is purely an optimization:
use ;
// Same idea with the generic wrapper:
How it works
At its core, this crate centers around one trait, SpecificEndian<T>, plus the generic wrappers BigEndian<T> and LittleEndian<T>.
SpecificEndian<T> marks a type as safe to store in an endian-tagged wrapper. For primitives, that’s a given; for custom types, you can implement it yourself.
There is no extra memory footprint added by BigEndian<T>/LittleEndian<T> (they’re #[repr(transparent)]); they exist to make endianness explicit and enforce correct reads/writes via the type system.
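As a rough illustration of the idea, the sketch below builds a tiny transparent wrapper with the standard library only. BeU32 is a hypothetical stand-in, not the crate's BigEndian<T>; the real type is generic and implements many more traits.

```rust
use std::mem::size_of;

/// Illustrative stand-in (NOT the crate's BigEndian<T>): a transparent
/// wrapper whose inner u32 is always stored in big-endian byte order.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
struct BeU32(u32);

impl From<u32> for BeU32 {
    fn from(native: u32) -> Self {
        BeU32(native.to_be()) // convert once, at the boundary
    }
}

impl From<BeU32> for u32 {
    fn from(wire: BeU32) -> Self {
        u32::from_be(wire.0)
    }
}

fn main() {
    let port: BeU32 = 8080u32.into();
    // let bad: BeU32 = 8080u32; // would be a type error: no implicit conversion
    println!("wrapper: {} bytes, u32: {} bytes", size_of::<BeU32>(), size_of::<u32>());
    println!("native value: {}", u32::from(port));
}
```

Because the wrapper is #[repr(transparent)], it has the same size and layout as the value it wraps; the only thing it adds is a type that refuses native-endian assignments.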
This crate provides SpecificEndian implementations for most of the built-in types in Rust, including:
- Single-byte values (i8, u8, bool), although this really doesn't do much but provide completeness.
- The multi-byte integers: u16, u32, u64, u128, usize, i16, i32, i64, i128, isize
- The floats: f32, f64.
Additionally, char is supported (behind the simple_char_impls feature, enabled by default via the simple_all/default feature set).
Note: even though char is supported as an in-memory value, many on-disk / on-wire formats don’t store Rust char values directly. When you need a stable binary representation, prefer explicit encodings (e.g. UTF-8 bytes, UTF-16 code units, UTF-32 code points) using the text helpers.
Derive macro (optional): derive Feature
If you enable the derive feature, you can generate wire-format helper types from a “logical” struct definition. This comes in the form of the Endianize macro.
The macro is intentionally conservative:
- It generates a companion *Wire struct where fields are endian-wrapped.
- It can generate inline fixed UTF-16/UTF-32 padded fields from String/&str fields.
It’s designed for “I want a stable on-wire layout, and I want the compiler to help keep it correct.”
use Endianize;
// Optional: control the generated wire layout. Defaults to repr(C).
// #[wire_repr(packed)]
// Optional: pass additional derives through to the generated *Wire types.
// #[wire_derive(Clone, Copy, Debug, PartialEq, Eq)]
// Generated by the derive:
//
// #[repr(C)]
// struct HeaderWire {
// a: BigEndian<u32>,
// b: BigEndian<u16>,
// title: FixedUtf16BeSpacePadded<8>,
// }
let _wire = HeaderWire ;
Notes:
- This currently supports structs, enums, and unions.
- The derive auto-generates common conversions between logical types and their *Wire counterparts:
  - From<Header> for HeaderWire (for structs without #[text(..)] fields)
  - From<HeaderWire> for Header (when the struct has no #[text(..)] fields)
  - TryFrom<HeaderWire> for Header (when the struct contains #[text(..)] fields; error type is FixedTextError)
- Arrays:
  - Raw byte arrays like [u8; 8] are treated as already wire-safe and are passed through unchanged (endianness does not apply to bytes).
  - For other fixed-size arrays, endianness is applied per element. For example, under #[endian(le)], a field words: [u16; 3] becomes words: [LittleEndian<u16>; 3] in the generated *Wire type.
- HeaderWire is a #[repr(C)] “on-wire” type by default that you can read/write as bytes (often via the IO helpers below).
Wire layout control: #[wire_repr(...)]
By default, Endianize generates wire types using #[repr(C)], which can introduce padding due to alignment.
If you’re modeling a packed binary format, you can override the representation used for generated wire types:
use Endianize;
// PackedHeaderWire will be #[repr(packed)] and have no padding.
Safety note: #[repr(packed)] makes fields potentially unaligned. The generated code avoids taking references to packed fields (so it compiles safely), but you should still avoid taking references to packed fields in your own code. Prefer using the IO helpers (read_specific/write_specific) or copying values out.
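The difference between the two layouts, and the "copy values out" advice, can be seen with plain structs and no derive at all. This is a sketch with hypothetical struct names; it uses #[repr(C, packed)] so that field order stays defined, which is an assumption about how you would write such a struct by hand rather than a statement about what the derive emits.

```rust
use std::mem::size_of;

/// Default C layout: padding is inserted so `length` is aligned.
#[repr(C)]
struct PaddedHeader {
    kind: u8,
    length: u32,
}

/// Packed layout: no padding, so `length` may sit at an unaligned offset.
#[repr(C, packed)]
struct PackedHeader {
    kind: u8,
    length: u32,
}

/// Read a packed field by copying it out. Writing `&h.length` instead
/// would try to create an unaligned reference, which the compiler rejects.
fn packed_length(h: &PackedHeader) -> u32 {
    let length = h.length;
    length
}

fn main() {
    let padded = PaddedHeader { kind: 1, length: 512 };
    let packed = PackedHeader { kind: 1, length: 512 };
    println!("padded: {} bytes (kind {}, length {})", size_of::<PaddedHeader>(), padded.kind, padded.length);
    let kind = packed.kind; // copy out before formatting
    println!("packed: {} bytes (kind {}, length {})", size_of::<PackedHeader>(), kind, packed_length(&packed));
}
```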
Wire derive pass-through: #[wire_derive(...)]
Sometimes you want the generated *Wire types to implement extra traits (for example Debug, Copy, or PartialEq).
You can pass these through to the generated wire containers:
use Endianize;
// MsgWire derives Clone/Copy/Debug/PartialEq/Eq.
Enum support (tag + payload)
Enums are generated as a stable tag + payload wire type:
- The enum must declare #[repr(u8|u16|u32|u64)] to select the tag width.
- Unit variants and named-field variants like Variant { a: T, b: U } are supported.
- Tuple variants are not supported yet.
- If the enum has any data-carrying variants, then all variants must have explicit discriminants (e.g. Ping = 1, Data { .. } = 2).
When the io-std feature is enabled, the generated *Wire enum container also gets EndianRead/EndianWrite impls.
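For readers unfamiliar with the tag + payload shape, here is a hand-written sketch of the idea using only the standard library. It does not show the code that Endianize generates; the Message enum, the tag constants, and the encode/decode functions are all hypothetical, and a big-endian u16 tag is assumed.

```rust
#[derive(Debug, PartialEq)]
enum Message {
    Ping,
    Data { len: u32 },
}

// Hypothetical tag values, stored on the wire as big-endian u16.
const TAG_PING: u16 = 1;
const TAG_DATA: u16 = 2;

/// Write the tag first, then the payload fields, all big-endian.
fn encode(msg: &Message) -> Vec<u8> {
    let mut out = Vec::new();
    match msg {
        Message::Ping => out.extend_from_slice(&TAG_PING.to_be_bytes()),
        Message::Data { len } => {
            out.extend_from_slice(&TAG_DATA.to_be_bytes());
            out.extend_from_slice(&len.to_be_bytes());
        }
    }
    out
}

/// Read the tag, then the payload that the tag selects.
fn decode(bytes: &[u8]) -> Option<Message> {
    if bytes.len() < 2 {
        return None;
    }
    let tag = u16::from_be_bytes([bytes[0], bytes[1]]);
    match tag {
        TAG_PING => Some(Message::Ping),
        TAG_DATA if bytes.len() >= 6 => {
            let len = u32::from_be_bytes([bytes[2], bytes[3], bytes[4], bytes[5]]);
            Some(Message::Data { len })
        }
        _ => None,
    }
}

fn main() {
    let wire = encode(&Message::Data { len: 5 });
    println!("wire: {:02x?}", wire);
    println!("decoded: {:?}", decode(&wire));
    // Reading the same two tag bytes with the wrong byte order gives a different tag.
    println!("tag read as LE: {}", u16::from_le_bytes([wire[0], wire[1]]));
}
```

With a multi-byte tag, the byte order is part of the format: the bytes 00 02 mean tag 2 in big-endian but 512 in little-endian.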
Union support (safe default)
Unions are generated in a safe default mode:
- The derive generates a *Wire union (#[repr(C)]) where each field is endian-wrapped.
- No IO impls are generated for unions. A union needs external context (typically a separate tag) to know which field is valid; auto-serializing it would be ambiguous/unsafe.
- #[text(...)] is not supported on union fields.
If you need IO for a union-like format, model it as an enum instead (tag + payload), which Endianize supports.
Examples
There are runnable examples in examples/ (each example is in its own subdirectory as examples/<name>/main.rs):
- endian_values: store values in BigEndian<T>/LittleEndian<T>, convert to native, and use arithmetic/bitwise operations.
- explicit_struct_endian: define a #[repr(C)] struct with explicit endian fields (e.g. u32be, u16le) and inspect raw bytes.
- derive_protocol: a small binary-protocol demo using #[derive(Endianize)], fixed padded text fields, enums (tag + payload), and io-std read/write.
  - Requires features: derive, io-std, text_all.
- enum_protocol: a framed binary protocol example focusing specifically on #[derive(Endianize)] enums (tag + payload).
  - Demonstrates a multi-byte (u16) discriminator stored on the wire in a specified endian, and shows why interpreting it with the wrong endian produces the wrong value.
  - Requires features: derive, io-std, text_all.
- cpu_emulator: a tiny toy CPU emulator that stores its registers as BigEndian<u16> and reads/writes 16-bit words in big-endian byte order.
  - Run with: cargo run --example cpu_emulator --features "io-std"
- fat16_driver: a tiny FAT16 “driver” that parses a synthetic disk image and prints boot sector + root directory info.
  - Requires features: derive, io-std, text_all.
- messaging_client/messaging_server (under examples/messaging_app/): a more end-to-end demo of designing a small wire protocol and using derive + IO helpers across a client/server boundary.
  - Requires features: derive, io-std, text_all.
This crate also provides implementations of a variety of useful traits for the types that it wraps, including bitwise logic for the integer types and bool. This allows most boolean logic operations to be performed without any endian conversions using ordinary operators. Both operands must have the same endianness, however, like this:
use simple_endian::*;
let ip: BigEndian<u32> = 0x0a00000a.into();
let subnet_mask = BigEndian::from(0xff000000u32);
let network = ip & subnet_mask;
println!("value: {:x}", network);
As you see, the network is calculated by masking the IP address with the subnet mask in a way that the programmer barely has to think about the conversion operations.
Alternatively, you might want to define a structure with the elements typed so that it can be moved around as a unit.
use *;
let config = NetworkConfig
println!;
Note that the println! will convert the values to native endian.
And finally, this crate implements a number of traits that allow most of the basic arithmetic operators to be used on the BigEndian and LittleEndian variants of all of the types, where appropriate, including the floats. There is a certain amount of overhead to this, since each operation requires at least one and often two or more endian conversions. However, since this crate aims to minimize the cost of writing portable code, they are provided to reduce friction to adoption. If you are writing code that is extremely sensitive to such overhead, it might make sense to convert to native endian, do your operations, and then store back in the specified endian using .into() or similar. That said, the overhead is often very small, and Rust's optimizer is very good, so I would encourage you to do some actual benchmarking before taking an unergonomic approach to your code. There are too many traits implemented to list them here, so I recommend consulting the documentation. Alternatively, you could just try what you want to do, and see if it compiles. It shouldn't ever allow you to compile something that doesn't handle endianness correctly unless you work pretty hard at it.
Representations and ABI
You might notice that we used #[repr(C)] in the data struct above, and you might be wondering why. When you are writing structures that will be directly read from and written to some medium, you often want a struct with a very specific layout. Rust's default representation does not guarantee this. For that reason, all of the structs defined in this crate are #[repr(transparent)], and if you plan to write these structures directly to disk or the network, it is strongly recommended that you use #[repr(C)] or something similar to ensure a consistent layout, or otherwise guarantee the order in which the fields are stored.
Operations on Types
In addition to offering support for ensuring that correct endianness is used by leveraging the Rust type system, this crate also provides implementations of a number of traits from the core library that allow you to work with values directly without converting them to native endian types first. In many cases, this is literally a zero-cost capability, because bitwise operations are endian-agnostic, and as long as you are using other SpecificEndian types, there is no overhead to doing operations on them directly. In cases where a conversion to native endian is necessary, the crate will perform the conversion, and return a value in the same type as the input.
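The "bitwise operations are endian-agnostic" claim can be checked with plain integers. The sketch below uses only std; the function names are hypothetical, and u32::to_be stands in for the stored form of a big-endian value.

```rust
/// AND two values that are both already stored in big-endian form.
/// No conversion is needed: AND acts on each byte independently.
fn mask_stored_be(ip_be: u32, mask_be: u32) -> u32 {
    ip_be & mask_be
}

/// Addition is not byte-wise (carries cross byte boundaries), so it has to
/// go through native order and back.
fn add_stored_be(a_be: u32, b_be: u32) -> u32 {
    u32::from_be(a_be).wrapping_add(u32::from_be(b_be)).to_be()
}

fn main() {
    let ip = 0x0a00000au32;
    let mask = 0xff000000u32;
    let network_be = mask_stored_be(ip.to_be(), mask.to_be());
    println!("network: {:x}", u32::from_be(network_be));
    let sum_be = add_stored_be(0xffu32.to_be(), 1u32.to_be());
    println!("sum: {:x}", u32::from_be(sum_be));
}
```

This is why masking and similar operations can be free on endian-typed values, while arithmetic pays for a conversion in each direction on hosts whose byte order differs.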
Feature flags
Although this crate includes a lot of useful functionality up front, including it all can increase your compiled size significantly. For size-conscious applications, I recommend not including everything.
By default, this crate enables a broad set of convenience features. If you care about binary size or compile time, consider turning defaults off and enabling only what you need.
[dependencies.simple_endian]
version = "0.4"
default-features = false
features = ["both_endian", "integer_impls"]
The two most useful features are probably the ones that control support for big- and little- endians:
- big_endian
- little_endian
Others are broken into categories:
- Operations types: these can make the use of SpecificEndian<T> types more ergonomic, and allow for some amount of optimization by avoiding unnecessary conversions to and from native endian.
  - bitwise
  - comparisons
  - math_ops
  - neg_ops
  - shift_ops
- Support for formatting in the format feature.
- Support for different types
  - float_impls
  - integer_impls
  - byte_impls
The “simple” feature family enables endianness-invariant built-ins via the SimpleEndian trait:
- simple_bool – bool
- simple_byte_impls – u8, i8
- simple_char_impls – char
- simple_string_impls – &str, String
- simple_all – enables all of the above
Text helpers (features: text_utf16, text_utf32, text_fixed)
This crate also includes optional, feature-gated helpers for working with Unicode text in binary formats and foreign-function interfaces.
These features are designed for cases where a format/API specifies a fixed encoding and/or endianness (for example: “UTF-16LE code units”, “UTF-32BE code units”, or “exactly 8 UTF-16 code units stored inline in the struct”).
Enable them like this:
[dependencies.simple_endian]
version = "0.4"
features = ["text_all"]
Or pick only what you need:
- text_utf16 – UTF-16 helper types (Utf16String*, Utf16Str*) and conversions.
- text_utf32 – UTF-32 helper types (Utf32String*, Utf32Str*) and conversions.
- text_fixed – fixed-size, inline string helpers.
Endianness-aware text buffers
For UTF-16 and UTF-32, there are explicit endianness types and host-endian aliases:
- Explicit: Utf16StringBE/Utf16StringLE, Utf32StringBE/Utf32StringLE
- Host-endian aliases: Utf16String, Utf32String (pick BE/LE based on target_endian)
This lets you keep code portable while still being able to target a stable on-the-wire encoding when you need it (protocols, file formats, hashing over bytes, etc.).
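The difference between a host-endian buffer and an explicit one shows up as soon as the code units are turned into bytes. The following std-only sketch uses hypothetical function names and is not the crate's Utf16String API.

```rust
/// Bytes as a host-endian buffer would lay them out in memory.
fn encode_host(units: &[u16]) -> Vec<u8> {
    units.iter().flat_map(|u| u.to_ne_bytes()).collect()
}

/// Bytes in explicit little-endian order, identical on every target.
fn encode_le(units: &[u16]) -> Vec<u8> {
    units.iter().flat_map(|u| u.to_le_bytes()).collect()
}

fn main() {
    let units: Vec<u16> = "A".encode_utf16().collect();
    println!("host order:  {:02x?}", encode_host(&units));
    println!("explicit LE: {:02x?}", encode_le(&units));
    // The two agree only when the target itself is little-endian.
    println!("target is little-endian: {}", cfg!(target_endian = "little"));
}
```

Host-endian output is fine for in-process use, but anything that leaves the process (files, sockets, hashes over bytes) should go through the explicit form.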
Fixed-size, inline UTF-16 fields in binary structs
Many binary formats (and some ABIs) store strings inline using a fixed number of UTF-16 code
units. For that scenario, enable text_fixed + text_utf16.
If you know the wire format is UTF-16LE (very common on Windows), the most direct way to
model it is to wrap the host-endian fixed buffer in LittleEndian<...>:
- FixedUtf16CodeUnits<K> – stores exactly K UTF-16 code units inline (host-endian)
- LittleEndian<FixedUtf16CodeUnits<K>> – stores exactly K UTF-16LE code units inline
The older explicit-endian names still exist too:
- FixedUtf16LeCodeUnits<K> – stores exactly K UTF-16LE code units inline
- FixedUtf16BeCodeUnits<K> – stores exactly K UTF-16BE code units inline
There are also three convention-specific wrappers for common layouts:
- FixedUtf16LePacked<K>/FixedUtf16BePacked<K>
- FixedUtf16LeNullPadded<K>/FixedUtf16BeNullPadded<K>
- FixedUtf16LeSpacePadded<K>/FixedUtf16BeSpacePadded<K>
Example: a C-layout struct containing a fixed-size UTF-16LE name field:
use ;
const NAME_UNITS: usize = 16;
Example: a C-layout struct containing a fixed-size UTF-16BE, space-padded field:
use ;
const TITLE_UNITS: usize = 12;
Fixed number of Unicode codepoints (for inline tags/labels)
If you need “exactly N Unicode scalar values” inline in a struct (useful for tags, short labels,
or fixed-width identifiers), enable text_fixed and use FixedCodepointString<N>.
Cross-language / FFI notes (JavaScript, Windows, etc.)
- JavaScript strings are specified in terms of UTF-16 code units (historically, “UCS-2”, but modern JS uses UTF-16 semantics). If you’re bridging to JS via FFI or a binary protocol, UTF-16 helpers can be used to make the encoding/decoding explicit.
- Many native APIs and ABIs (notably Windows wide strings) use UTF-16LE code units. The Utf16*LE and FixedUtf16Le* types are intended to make those representations easy to model.
- If you need a stable, platform-independent wire format, prefer explicit *BE/*LE types over host-endian aliases.
Performance
This section has moved up to Performance notes near the top of the README and is backed by the results in PERFORMANCE.md.
See Also
The core types in this crate manipulate specific-endian structures in memory, and the optional io-core/io-std features cover reading and writing them. See the following other crates for related functionality:
- Rust's standard std::mem::[transmute](https://doc.rust-lang.org/std/mem/fn.transmute.html)
- safe-transmute
- endian-trait
- byteordered
- persistance - A library that wraps structs in mmap, and can be used well with this to make those structs portable.
- endian-type - Essentially the same “typed endianness” idea, but with a different feature set.
- endian-types - Another similar “typed” approach.
Similar tools and when to use them
This crate is a good fit when you want endianness to be part of the type and enforced throughout your code.
Depending on your problem, these alternatives may be a better fit:
- byteorder: great for parsing from &[u8]/Read with explicit read/write calls. Ideal when you don’t want to introduce endian-tagged types into your domain model.
- zerocopy and bytemuck: good when you want safe-ish “view structs as bytes” patterns with strict layout guarantees. zerocopy in particular has built-in endian-aware integer wrappers.
- binrw, scroll, nom: higher-level parsing frameworks. Often great for file formats where you want declarative parsing and offsets.
If you just need “read a u32 from bytes” and you don’t need the typed-wrappers approach, start with byteorder. If you’re building reusable wire types and want the compiler to keep you honest, simple_endian shines.
IO helpers (features: io-core, io-std)
This crate provides optional, feature-gated IO helpers for reading and writing endian-aware values directly from Read/Write streams.
- io-core enables the generic machinery (works for no_std environments with custom IO traits)
- io-std enables std::io::{Read, Write} integration
- io is a convenience alias for io-std
Enable them by adding the io-std (or io) feature in your Cargo.toml:
[dependencies.simple_endian]
version = "0.4"
features = ["io-std"]
With io-std, you can use the helper functions:
- read_specific<R, E>(reader: &mut R) -> io::Result<E> reads an endian-wrapped value of type E (for example BigEndian<u32>) from reader.
- write_specific<W, E>(writer: &mut W, v: &E) -> io::Result<()> writes the endian-wrapped value to writer.
Additionally, helper traits are provided so types can implement custom read/write behavior:
- EndianRead: types implementing this expose read_from<R: Read>(reader: &mut R) -> io::Result<Self>.
- EndianWrite: types implementing this expose write_to<W: Write>(&self, writer: &mut W) -> io::Result<()>.
Big- and Little-endian wrappers implement those traits for the built-in types, so you can use the generic functions like this:
use ;
use *;
use Cursor;
Notes
- The current implementation supports types with sizes 1, 2, 4, 8 and 16 bytes (integers and floats). Attempts to read/write unsupported sizes return an io::Error.
- Extensive unit tests for the IO helpers are included and run when you enable the IO features.
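For comparison, this is roughly what reading and writing one big-endian value looks like with std::io alone. It is a sketch of the underlying technique, not the crate's read_specific/write_specific implementation; read_u32_be and write_u32_be are hypothetical names.

```rust
use std::io::{self, Cursor, Read, Write};

/// Read exactly four bytes and interpret them as a big-endian u32.
fn read_u32_be<R: Read>(reader: &mut R) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    reader.read_exact(&mut buf)?; // fails with UnexpectedEof on short input
    Ok(u32::from_be_bytes(buf))
}

/// Write a u32 as four big-endian bytes.
fn write_u32_be<W: Write>(writer: &mut W, value: u32) -> io::Result<()> {
    writer.write_all(&value.to_be_bytes())
}

fn main() -> io::Result<()> {
    let mut wire = Vec::new();
    write_u32_be(&mut wire, 0x1234_5678)?;
    println!("wire: {:02x?}", wire);

    let mut cursor = Cursor::new(wire);
    let value = read_u32_be(&mut cursor)?;
    println!("value: {:#x}", value);
    Ok(())
}
```

In this hand-written form the byte order is chosen at every call site; with endian-tagged types it is fixed once, in the type being read or written.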
If you want more realistic demos, check out derive_protocol, enum_protocol (multi-byte tags), examples/messaging_app/, and fat16_driver.