Expand description
CRC functions
CRC functions and macros
A cyclic redundancy check (CRC), is a common checksum algorithm that is simple to implement in circuitry, and effective at detecting bit-level errors.
Looking at CRCs mathematically, they are nothing more than the remainder after polynomial division by a constant, allowing for efficient implementations that leverage our polynomial types and hardware-accelerated carry-less multiplication.
use gf256::crc::crc32c;
assert_eq!(crc32c(b"Hello World!", 0), 0xfe6cf1dc);
Note this module requires feature crc
.
A fully featured implementation of CRCs can be found in
examples/crc.rs
:
$ RUSTFLAGS="-Ctarget-cpu=native" cargo run --features thread-rng,lfsr,crc,shamir,raid,rs --example crc
testing crc("Hello World!")
naive_crc => 0x1c291ca3
less_naive_crc => 0x1c291ca3
word_less_naive_crc => 0x1c291ca3
table_crc => 0x1c291ca3
small_table_crc => 0x1c291ca3
barret_crc => 0x1c291ca3
word_barret_crc => 0x1c291ca3
reversed_barret_crc => 0x1c291ca3
word_reversed_barret_crc => 0x1c291ca3
How do CRCs work?
CRCs mathematically are fascinating as they are nothing more than the binary remainder after polynomial division by some constant.
Take some data for example:
input = "hi"
= 0110100001101001
Choose a size for our CRC, say, 8-bits. Pad our data by that number of
zeros, and divide by some polynomial constant with that number of
bits + 1, in this case I’m using 0b100000111
:
input = "hi"
= 01101000 01101001
pad with 8 zeros:
= 01101000 01101001 00000000
divide by 0b100000111:
= 01101000 01101001 00000000
^ 1000001 11
= 00101001 10101001 00000000
^ 100000 111
= 00001001 01001001 00000000
^ 1000 00111
= 00000001 01110001 00000000
^ 1 00000111
= 00000000 01110110 00000000
^ 1000001 11
= 00000000 00110111 11000000
^ 100000 111
= 00000000 00010111 00100000
^ 10000 0111
= 00000000 00000111 01010000
^ 100 000111
= 00000000 00000011 01001100
^ 10 0000111
= 00000000 00000001 01000010
^ 1 00000111
---------------------------
= 00000000 00000000 01000101
|
remainder = 01000101 <-------'
And this is our CRC!
Note we are performing polynomial division! So instead of shifts and subtractions, we are doing shifts and xors. See the p module’s documentation for more info on viewing data as binary polynomials.
There’s some interesting things to note:
-
The original message bits will always end up as zero if the size of our polynomial matches the number of appended zeros + 1
-
The message bits seem to have a rather large impact on every bit in the resulting CRC, a property of a good checksum.
We can of course compute this directly with our polynomial types:
let data = p32(0b0110100001101001) << 8;
let polynomial = p32(0b100000111);
assert_eq!(data % polynomial, p32(0b01000101));
One fun feature of CRCs is that, since the CRC is the remainder after division, if we replace our padding zeros with the CRC (note this is the same as appending the CRC to our original message), computing the CRC again will give us a value of zero:
let data_with_crc = p32(0b011010000110100101000101);
let polynomial = p32(0b100000111);
assert_eq!(data_with_crc % polynomial, p32(0));
This is because we’ve effectively calculated:
m' = m - (m % p)
Where m
is our original message with padded zeros and p
is our polynomial.
Since the remainder is defined as:
a % b = a - b*floor(a/b)
We can substitute m
and p
in:
m' = (m - (m % p))
m' = m - (m - p*floor(m/p))
m' = p*floor(m/p)
Which shows that m'
is now some integer-polynomial multiple of p
. And since
the remainder of a multiple is always zero, the result of a second remainder,
our second CRC, should also be zero.
m' % p = p*floor(m/p) % p
m' % p = p*floor(m/p) - (p*floor(p*floor(m/p)/p))
m' % p = p*floor(m/p) - (p*floor(floor(m/p)))
m' % p = p*floor(m/p) - p*floor(m/p)
m' % p = 0
This trick can sometimes be useful for simplifying the CRC validation.
And we can create this exact CRC using gf256:
use ::gf256::crc;
#[crc::crc(polynomial=0b100000111, reflected=false, xor=0)]
fn crc8() {}
assert_eq!(crc8(&[0b01101000, 0b01101001], 0), 0b01000101);
The reflected
and xor
options are extra tweaks to the CRC algorithm that are
commonly found in standard CRCs. More info on these in the crc macro
documentation.
Optimizations
CRCs are simple and fast in circuitry, but not so much in software due to relying on bit-level operations. Fortunately there are several optimizations we can do to speed up CRC calculation in software.
A straightforward implementation of polynomial remainder is available in naive
mode, though it has been tweaked to work well with byte-level slices. Additional modes:
-
In
table
, CRCs use a precomputed remainder table to compute the remainder a byte at a time.The trick for computing the remainder table is to use the xor of the argument with the current remainder as the index into the table. This allows you to compute the remainder of any arbitrarily sized inputs using only a single byte index lookup at a time.
This is the most common implementation of CRCs you will find, due to its speed and portability.
-
In
small_table
mode, the same strategy astable
mode is used, but with a 16 element remainder table computer the remainder a nibble at a time. -
In
barret
mode, CRCs use Barret-reduction to efficiently compute the remainder using only multiplication by precomputed constants.This mode is especially effective when hardware carry-less multiplication instructions are available.
If hardware carry-less multiplication is available, barret
mode is the fastest
option for CRCs, so CRC implementations will use barret
by default.
If hardware carry-less multiplication is not available, table
mode will be
used, unless the feature small-tables
is enabled, in which case small_table
mode will be used. If the feature no-tables
is enabled, barret
mode will be
used as it outperforms a naive implementation even when hardware carry-less
multiplication is not available.
Though note the default mode is susceptible to change.
Choosing a polynomial
Choosing a good CRC polynomial is rather complicated. It depends on the length of the data you are protecting and the type of errors you are protecting against.
The best resource I’ve found for choosing good CRC polynomials is the research done in Philip Koopman’s Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks.
Generally you want to choose a CRC that has the largest “Hamming distance ” for your message length, or at least a good Hamming distance over your range of message lengths. Hamming distance is the number of bit-flips required to get to another message with a valid CRC, so a larger Hamming distance means more bit errors before you fail to detect that something is wrong.
Philip Koopman also has a list of good CRC polynomials and their effective Hamming distances at various message lengths here.
Note you may see several different formats for CRC polynomials! Where the
mathematically correct polynomial may be 0x104c11db7
, you may see a truncated
0x04c11db7
or 0x82608edb
representation to fit into 32-bits, or a
bit-reflected 0x1db710641
, 0xedb88320
, or 0x1db71064
representation
(what a mess!). Make sure you understand the correct bit-width and endianness
of a given polynomial before using it.
A note on CRC32 vs CRC32C
Did I mention choosing a good CRC polynomial is rather complicated? What if I told you that the most popular 32-bit CRC polynomial since 1975 was actually a sub-optimal polynomial for general-purpose 32-bits CRCs?
Well that is the situation the world finds itself in today. The most popular
32-bit CRC polynomial 0x104c11db7
, called just CRC32 (though sometimes referred
to as CRC32B, perhaps only because it predates CRC32C?), performs poorly the
moment there is more than 2 bit errors.
The more recent polynomial 0x11edc6f41
, called CRC32C, provides much better
error detection for a wider range of bit errors without any change to the
underlying algorithm.
But CRC32 is still in heavy use today, so gf256 provides both
crc32
and crc32c
.
It’s suggested to use crc32c
for new applications.
Functions
Calculate the CRC for a piece of data.
Calculate the CRC for a piece of data.
Calculate the CRC for a piece of data.
Calculate the CRC for a piece of data.
Calculate the CRC for a piece of data.
Attribute Macros
A macro for generating custom CRC functions.