Crate labrador_ldpc [−] [src]
Labrador-LDPC implements a selection of LDPC error correcting codes, including encoders and decoders.
It is designed for use with other Labrador components but does not have any dependencies
on anything (including std
) and thus may be used totally standalone. It is reasonably
efficient on both serious computers and on small embedded systems. Considerations have
been made to accommodate both use cases.
No memory allocations are made inside this crate so most methods require you to pass in an allocated block of memory for them to initialise, and then later require you to pass it back when it must be read. Check individual method documentation for further details.
Please note this library is still in version 0 and so the API is likely to change.
In particular the current interface for passing initialised values (g
, cs
, etc)
into encoders and decoders is not ergonomic and is likely to change. On the other
hand the codes themselves will not change (although new ones may be added) and so newer
versions of the library will still be able to communicate with older versions indefinitely.
Example
extern crate labrador_ldpc; use labrador_ldpc::LDPCCode; fn main() { // Pick the TC128 code, n=128 k=64 // (that's 8 bytes of user data encoded into 16 bytes) let code = LDPCCode::TC128; // Generate some data to encode let txdata: Vec<u8> = (0..8).collect(); // Allocate memory for the encoded data let mut txcode = vec![0u8; code.n()/8]; // Encode code.encode_small(&txdata, &mut txcode); // Copy the transmitted data and corrupt a few bits let mut rxcode = txcode.clone(); rxcode[0] ^= 0x55; // Allocate and initialise the data needed to run a decoder // (we only need ci and cs for this code and decoder, but // normally you'd need vi and vs as well). let mut ci = vec![0; code.sparse_paritycheck_ci_len()]; let mut cs = vec![0; code.sparse_paritycheck_cs_len()]; code.init_sparse_paritycheck_checks(&mut ci, &mut cs); // Allocate some memory for the decoder's working area and output let mut working = vec![0u8; code.decode_bf_working_len()]; let mut rxdata = vec![0u8; code.output_len()]; // Decode code.decode_bf(&ci, &cs, None, None, &rxcode, &mut rxdata, &mut working); // Check the errors got corrected assert_eq!(&rxdata[..8], &txdata[..8]); }
Codes
Nomenclature: we use n to represent the code length (number of bits you have to transmit per codeword), k to represent the code dimension (number of useful information bits per codeword), and r to represent the rate k/n, the number of useful information bits per bit transmitted.
Several codes are available in a range of lengths and rates. Current codes come from two sets of CCSDS recommendations, their TC (telecommand) short-length low-rate codes, and their TM (telemetry) higher-length various-rates codes.
The TC codes are available in rate r=1/2 and dimensions k=128, k=256, and k=512. They are the same codes defined in CCSDS document 231.1-O-1 and subsequent revisions (although the n=256 code is eventually removed, it lives on here as it's quite useful).
The TM codes are available in r=1/2, r=2/3, and r=4/5, for dimensions k=1024 and k=4096. They are the same codes defined in CCSDS document 131.0-B-2 and subsequent revisions.
For more information on the codes themselves please see the CCSDS publications: https://public.ccsds.org/
The available codes are the variants of the LDPCCode
enum, and pretty much everything
else (encoders, decoders, utility methods) are implemented as methods on this enum.
Which code should I pick?: for short and highly-reliable messages, the TC codes make sense, especially if they need to be decoded on a constrained system such as an embedded platform. For most other data transfer, the TM codes are more flexible and generally better suited.
The very large k=16384 TM codes have not been included due to the complexity in generating their generator matrices and the very long constants involved, but it would be theoretically possible to include them. The relevant parity check constants are already included.
Generator Matrices
To encode a codeword, we need a generator matrix, which is a large binary matrix of shape k rows by n columns. For each bit set in the data to encode, we sum the corresponding row of the generator matrix to find the output codeword. Because all our codes are systematic, the first k bits of our codewords are exactly the input data, which means we only need to encode the final n-k parity bits at the end of the codeword.
These final n-k columns of the generator are stored in a compact form, where only a small number of the final rows are stored, and the rest can be generated from those at runtime. This is compact to avoid wasting space in flash memory, but makes encoding data somewhat slow, so there is a choice to expand to the full generator matrix instead. See the encoding methods below for more details.
The relevant constants are in the codes.compact_generators
module, with names like TC128_G
.
To expand these into a full generator matrix, you use the LDPCCode.init_generator
method.
It would be possible to write an encoder that did not use the generator matrix at all, not even the constants, and instead used the parity check matrix to encode data. This doesn't exist yet.
Parity Check Matrices
These are the counterpart to the generator matrices of the previous section. They are used by the decoders to work out which bits are wrong and need to be changed. When fully expanded, they are a large matrix with n-k rows (one per parity check) of n columns (one per input data bit, or variable). We can store them in an extremely compact form due to the way these specific codes have been constructed, but they must be expanded before use.
The constants are in codes.compact_parity_checks
and reflect the construction defined
in the CCSDS documents. They can be expanded into the full parity matrix using the
LDPCCode.init_paritycheck()
method, but this isn't actually needed by the decoders.
Because the parity check matrices are sparse, it is much more efficient to store them as arrays
of the indices of non-zero entries. For each check (row), we store the column indices that are
non-zero in the array ci
(*c*heck *i*ndex), and we store indices into ci
in the array cs
(*c*heck *s*tarts). In other words, for check 100, the relevant variables are stored in
ci[cs[100]..cs[101]]
. We then do the same from the other direction, storing the non-zero rows
for each variable in the vi
and vs
arrays. These arrays are initialised using the
LDPCCode.init_sparse_paritycheck()
method, or you can initialise just the ci
and cs
arrays using LDPCCode.init_sparse_paritycheck_checks()
(which is useful for the bf
decoder
with the TC
codes, where you don't need vi
or vs
).
It would be possible to write a decoder that directly accessed the parity constants, removing the need for any expansion into RAM, but this doesn't exist yet.
Encoders
There are two encoders available. Both of them take a &[u8]
of input data and encode it
to a &mut [u8]
.
The low-memory encoder,
encode_small
, which is as much as a hundred times slower but does not need to have the generator matrix in RAM at all, and so has a low memory overhead. The only memory required is that needed for the input and output codewords in a compact binary representation (i.e. one u8 per eight bits of data).The fast encoder,
encode_fast
, is much quicker, but requires the generator matrixg
to have been initialised at runtime by theLDPCCode.init_generator()
method. The precise overhead depends on the code in use, and is available as a constantCodeParams.generator_len
or at runtime asLDPCCode.generator_len()
(it's the length of a u32 array, rather than the number of bytes), in addition to the memory required for the input and output data.
The required memory (in bytes) to encode with each code is:
Code | Input | Output | Generator const | Expanded generator |
---|---|---|---|---|
(Data, RAM) | (Codeword, RAM) | (text) | (encode_fast only, RAM) |
|
=k/8 | =n/8 | =k*(n-k)/8 | ||
TC128 | 8 | 16 | 32 | 512 |
TC256 | 16 | 32 | 64 | 2048 |
TC512 | 32 | 64 | 128 | 8192 |
TM1280 | 128 | 160 | 1024 | 32768 |
TM1536 | 128 | 192 | 1024 | 65536 |
TM2048 | 128 | 256 | 1024 | 131072 |
TM5120 | 512 | 640 | 4096 | 524288 |
TM6144 | 512 | 768 | 4096 | 1048576 |
TM8192 | 512 | 1024 | 4096 | 2097152 |
Both encoders require the same input data and output codeword storage, and the same generator
constants. Only the encode_fast
encoder requires the additional overhead of the expanded
generator in RAM. These constants (expressed as lengths of Vecs of the appropriate type)
are available both at compile-time in the CodeParams
consts, and at runtime with methods
on LDPCCode
such as generator_len()
. You can therefore allocate the required memory either
statically or dynamically at runtime.
Decoders
There are two decoders available:
- The low-memory decoder,
decode_bf
, uses a bit flipping algorithm with hard information. This is maybe 1 or 2dB from optimal for decoding, but requires much less RAM and does still correct errors. It's only really useful on something without a hardware floating point unit or with very little memory available. - The high-performance decoder,
decode_mp
, uses a modified min-sum decoding algorithm to perform near-optimal decoding albeit with much higher memory overhead.
The required memory (in bytes) to decode with each code is:
Code | Hard input | Soft input | Output | Parity const | bf overhead |
mp overhead |
---|---|---|---|---|---|---|
(bf , RAM) |
(mp , RAM) |
(RAM) | (text) | (RAM) | (RAM) | |
=n/8 | =n*4 | =(n+p)/8 | ||||
TC128 | 16 | 512 | 16 | 32 | 1282 | 6532 |
TC256 | 32 | 1024 | 32 | 32 | 2562 | 13060 |
TC512 | 64 | 2048 | 64 | 32 | 5122 | 26116 |
TM1280 | 160 | 5120 | 176 | 369 | 24964 | 63492 |
TM1536 | 192 | 6144 | 224 | 324 | 30468 | 75780 |
TM2048 | 256 | 8192 | 320 | 279 | 41476 | 100356 |
TM5120 | 640 | 20480 | 704 | 369 | 99844 | 253956 |
TM6144 | 768 | 24576 | 896 | 324 | 121860 | 303108 |
TM8192 | 1024 | 32768 | 1280 | 279 | 165892 | 401412 |
The overhead column gives the combination of the required expanded parity constants (ci
,
cs
, vi
, and vs
), and the required working area for that decoder. The bf
decoder with
the TC
codes only requires the ci
and cs
parity data, giving the smallest memory
overhead.
Both decoders require the same output storage and parity constants. The bf
decoder takes
smaller hard inputs and has a much smaller working area, while the mp
decoder requires
soft inputs (one f32 per input bit) and uses soft information internally, requiring a larger
working area.
The required sizes are available both at compile-time in the CodeParams
consts, and at
runtime with methods on LDPCCode
such as sparse_paritycheck_cs_len()
. You can therefore
allocate the required memory either statically or dynamically at runtime.
Please see the individual decoder methods for more details on their requirements.
Bit Flipping Decoder
This decoder is based on the original Gallager decoder. It is not very optimal but is very fast. The idea is to see which bits are connected to the highest number of parity checks that are not currently satisfied, and flip those bits, and iterate until things get better. However, this routine cannot correct erasures (it only knows about bit flips). All of the TM codes are punctured, which means some parity bits are not transmitted and so are unknown at the receiver. We use a separate algorithm to decode the erasures first, based on a paper by Archonta, Kanistras and Paliouras, doi:10.1109/MOCAST.2016.7495161.
Message Passing Decoder
This is a modified min-sum decoder that computes the probability of each bit being set given
the other bits connected to it via the parity check matrix. It takes soft information in,
so naturally covers the punctured codes as well. This implementation is based on one described
by Savin, arXiv:0803.1090. It is both reasonably efficient (no atahn
required), and
performs very close to optimal sum-product decoding. Error performance could possibly be
improved using a min-sum correction factor for our codes, and decoding speed could be improved
by constructing a set of inverse lookup tables for the parity checks, at the expense of
significantly increased memory use.
A fixed point version of this decoder would be useful, as it could potentially be optimised for embedded DSP operation. 8 bits per symbol seems plenty of resolution for the soft information, so error performance could be unaffected while gaining very substantial speed and memory improvements.
Reexports
pub use codes::LDPCCode; |
Modules
codes |
This module contains the available LDPC codes, and the associated constants and methods to load their generator and parity check matrices. |
decoder |
This module provides decoding functions for turning codewords into data. |
encoder |
This module provides encoding functions for turning data into codewords. |