commonware_coding/zoda.rs

//! This module implements the [ZODA](https://eprint.iacr.org/2025/034) coding scheme.
//!
//! At a high level, the scheme works like any other coding scheme: you start with
//! a piece of data and split it into shards, along with a commitment. Each shard
//! can be checked to belong to the commitment, and, given enough shards, the data
//! can be reconstructed.
//!
//! What makes ZODA interesting is that upon receiving and checking one shard,
//! you become convinced that there exists an original piece of data that will
//! be reconstructable given enough shards. This fails in the case of, e.g.,
//! plain Reed-Solomon coding. For example, if you give people random shards,
//! instead of actually encoding data, then when they attempt to reconstruct the
//! data, they can come to different results depending on which shards they use.
//!
//! Ultimately, this stems from the fact that you can't know if your shard comes
//! from a valid encoding of the data until you have enough shards to reconstruct
//! the data. With ZODA, you know that the shard comes from a valid encoding as
//! soon as you've checked it.
//!
//! # Variant
//!
//! ZODA supports different configurations based on the coding scheme you use
//! for sharding data, and for checking it.
//!
//! We use the Reed-Solomon and Hadamard variant of ZODA: in essence, this means
//! that the shards are Reed-Solomon encoded, and we include additional checksum
//! data which does not help reconstruct the data.
//!
//! ## Deviations
//!
//! In the paper, a sample consists of rows chosen at random from the encoding of
//! the data. With multiple participants receiving samples, they might receive
//! overlapping samples, which we don't want. Instead, we shuffle the rows of
//! the encoded data, and each participant receives a different segment.
//! From that participant's perspective, they've received a completely random
//! choice of rows. The other participants' rows are less random, since they're
//! guaranteed to not overlap. However, no guarantee on the randomness of the other
//! rows is required: each sample is large enough to guarantee that the data
//! has been validly encoded.
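The disjoint-segment idea above can be sketched with a toy deterministic shuffle. This is an illustrative simplification only: the LCG-driven Fisher-Yates here is a hypothetical stand-in for the transcript-derived randomness the real code uses (`transcript.noise(..)` with `SliceRandom::shuffle`).

```rust
// Toy deterministic shuffle: a stand-in for the transcript-derived RNG.
fn toy_shuffle(total: u32, mut seed: u64) -> Vec<u32> {
    let mut out: Vec<u32> = (0..total).collect();
    // Fisher-Yates, driven by a simple LCG (illustration only).
    for i in (1..out.len()).rev() {
        seed = seed
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let j = (seed % (i as u64 + 1)) as usize;
        out.swap(i, j);
    }
    out
}

fn main() {
    let shuffled = toy_shuffle(16, 42);
    // Each participant takes a disjoint contiguous segment of 4 rows:
    let segments: Vec<&[u32]> = shuffled.chunks(4).collect();
    // No row appears in two segments, by construction: the segments
    // partition a permutation of 0..16.
    let mut all = shuffled.clone();
    all.sort();
    assert_eq!(all, (0..16).collect::<Vec<u32>>());
    println!("{segments:?}");
}
```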
//!
//! We also use a Fiat-Shamir transform so that all randomness is sampled
//! non-interactively, based on the commitment to the encoded data.
//!
//! # Protocol
//!
//! Let n denote the minimum number of shards needed to recover the data.
//! Let k denote the number of extra shards to generate.
//!
//! We consider the data as being an array of elements in a 64-bit field F.
//!
//! Given n and k, we have a certain number of required samples R.
//! We can split these into row samples S, and column samples S',
//! such that S * S' = R.
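How R is derived can be sketched in floating point. This mirrors the exact rational-arithmetic computation in `Topology::required_samples` below, but using `f64` for illustration; the real code avoids floats precisely because the log term can be tiny.

```rust
// Rough f64 mirror of `Topology::required_samples` (illustration only;
// the real code uses exact rational arithmetic).
fn approx_required_samples(data_rows: usize, encoded_rows: usize) -> usize {
    const SECURITY_BITS: f64 = 126.0;
    let g = (encoded_rows - data_rows) as f64;
    let m = encoded_rows as f64;
    // Each sample fails to catch a bad encoding with probability
    // at most (g + 1) / (2 m).
    let fraction = (g + 1.0) / (2.0 * m);
    let log_term = (1.0 - fraction).log2();
    if log_term >= 0.0 {
        return usize::MAX;
    }
    // Enough samples so the failure probability drops below 2^-126.
    (SECURITY_BITS / -log_term).ceil() as usize
}

fn main() {
    // With a rate-1/2 style expansion, each sample contributes somewhat
    // less than one bit, so R lands above SECURITY_BITS.
    let r = approx_required_samples(128, 256);
    assert!(r > 126);
    println!("required samples: {r}");
}
```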
//!
//! Given a choice of S, our data will need to be arranged into a matrix of size
//!
//!   n S x c
//!
//! with c being >= 1.
//!
//! We choose S as close to R as possible without padding the data. We then
//! choose S' so that S * S' >= R.
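A minimal sketch of this split, under simplifying assumptions (the hypothetical `split_samples` helper below is not the real selection logic, which lives in `Topology::reckon` and also searches over column counts): S must cover the data rows across n shards, and S' then tops up the sample budget.

```rust
// Hypothetical helper illustrating the S / S' split described above.
fn split_samples(required: usize, data_rows: usize, n: usize) -> (usize, usize) {
    // S must at least cover the data: S * n >= data_rows.
    let s = data_rows.div_ceil(n).max(1);
    // Then pick S' so that S * S' >= R.
    let s_prime = required.div_ceil(s);
    (s, s_prime)
}

fn main() {
    let (s, s_prime) = split_samples(300, 1000, 10);
    assert!(s * 10 >= 1000); // every shard's rows cover the data
    assert!(s * s_prime >= 300); // total samples meet the requirement
    println!("S = {s}, S' = {s_prime}");
}
```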
//!
//! We also then double S', because the field over which we compute checksums
//! only has 64 bits. This effectively makes the checksum be calculated over the
//! extension field F^2. Because we don't actually need to multiply elements
//! in F^2 together, but only ever take linear combinations with elements in F,
//! we can effectively compute over the larger field simply by using 2 "virtual"
//! checksum columns per required column.
//!
//! For technical reasons, the encoded data will not have (n + k) S rows,
//! but pad((n + k) S) rows, where pad returns the next power of two.
//! This works to our advantage: given n shards, we will be able to reconstruct
//! the data, but these shards consist of rows sampled at random from
//! pad((n + k) S) rows, thus requiring fewer samples.
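The `pad` function is simply the next power of two, which Rust provides directly on integers (this is exactly what `Topology::with_cols` below computes for `encoded_rows`):

```rust
fn main() {
    let n = 4usize; // minimum shards
    let k = 2usize; // extra shards
    let s = 3usize; // samples (rows) per shard
    // pad((n + k) S): round the row count up to the next power of two.
    let encoded_rows = ((n + k) * s).next_power_of_two();
    assert_eq!(encoded_rows, 32); // pad(18) = 32
    // Every shard's S rows still fit without overlap:
    assert!((n + k) * s <= encoded_rows);
    println!("encoded rows: {encoded_rows}");
}
```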
//!
//! ## Encoding
//!
//! 1. The data is arranged as a matrix X of size n S x c.
//! 2. The data is Reed-Solomon encoded, turning it into a matrix X' of size pad((n + k) S) x c.
//! 3. The rows of X' are committed to using a vector commitment V (concretely, a Merkle Tree).
//! 4. V, along with the size of the data, in bytes, is committed to, producing Com.
//! 5. Com is hashed to create randomness, first to generate a matrix H of size c x S',
//!    and then to shuffle the rows of X'.
//! 6. Z := X H, a matrix of size n S x S', is computed.
//! 7. The ith shard (starting from 0) then consists of:
//!    - the size of the data, in bytes,
//!    - the vector commitment, V,
//!    - the checksum Z,
//!    - rows i * S..(i + 1) * S of X', along with a proof of inclusion in V, at the original index.
//!
//! ## Weakening
//!
//! When transmitting a weak shard to other people, only the following are transmitted:
//! - rows i * S..(i + 1) * S of X', along with the inclusion proofs.
//!
//! ## Checking
//!
//! Let A_{S} denote the matrix formed by taking the rows of A in a given subset S.
//!
//! 1. Check that Com is the hash of V and the size of the data, in bytes.
//! 2. Use Com to compute H of size c x S', and recompute the ith row sample S_i.
//! 3. Check that Z is of size n S x S'.
//! 4. Encode Z to get Z', a matrix of size pad((n + k) S) x S'.
//!
//! These steps now depend on the particular shard.
//!
//! 5. Check that X'_{S_i} (the shard's data) is a matrix of size S x c.
//! 6. Use the inclusion proofs to check that each row of X'_{S_i} is included in V,
//!    at the correct index.
//! 7. Check that X'_{S_i} H = Z'_{S_i}.
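Step 7 works because Reed-Solomon encoding is linear in the rows: encoding X and then right-multiplying by H gives the same matrix as right-multiplying first and then encoding, i.e. (E X) H = E (X H). A toy demonstration, with small integer matrices and a parity-extension "code" standing in for Reed-Solomon:

```rust
// Plain dense matrix multiply over i64 (illustration only).
fn matmul(a: &[Vec<i64>], b: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let (rows, inner, cols) = (a.len(), b.len(), b[0].len());
    let mut out = vec![vec![0i64; cols]; rows];
    for i in 0..rows {
        for j in 0..cols {
            for k in 0..inner {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

// Toy *linear* row encoding: keep the rows, append their column sums
// as one parity row (a stand-in for Reed-Solomon row extension).
fn encode(x: &[Vec<i64>]) -> Vec<Vec<i64>> {
    let mut out = x.to_vec();
    let cols = x[0].len();
    let parity: Vec<i64> = (0..cols)
        .map(|j| x.iter().map(|r| r[j]).sum::<i64>())
        .collect();
    out.push(parity);
    out
}

fn main() {
    let x = vec![vec![1, 2], vec![3, 4]]; // data matrix X
    let h = vec![vec![5, 6], vec![7, 8]]; // random checking matrix H
    let lhs = matmul(&encode(&x), &h); // encode, then checksum
    let rhs = encode(&matmul(&x, &h)); // checksum, then encode
    assert_eq!(lhs, rhs); // linearity: (E X) H = E (X H)
    println!("{lhs:?}");
}
```

So a verifier holding only sampled rows of X' and the encoding Z' of Z can check consistency row by row, which is exactly what `CheckingData::check` does below.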
//!
//! ## Decoding
//!
//! 1. Given n checked shards, you have n S encoded rows, which can be Reed-Solomon decoded.

use crate::{Config, Scheme, ValidatingScheme};
use bytes::BufMut;
use commonware_codec::{Encode, EncodeSize, FixedSize, RangeCfg, Read, ReadExt, Write};
use commonware_cryptography::{
    transcript::{Summary, Transcript},
    Digest, Hasher,
};
use commonware_math::{
    fields::goldilocks::F,
    ntt::{EvaluationVector, Matrix},
};
use commonware_parallel::Strategy;
use commonware_storage::bmt::{Builder as BmtBuilder, Error as BmtError, Proof};
use rand::seq::SliceRandom as _;
use std::{marker::PhantomData, sync::Arc};
use thiserror::Error;

/// Create an iterator over the data of a buffer, interpreted as little-endian u64s.
fn iter_u64_le(data: impl bytes::Buf) -> impl Iterator<Item = u64> {
    struct Iter<B> {
        remaining_u64s: usize,
        tail: usize,
        inner: B,
    }

    impl<B: bytes::Buf> Iter<B> {
        fn new(inner: B) -> Self {
            let remaining_u64s = inner.remaining() / 8;
            let tail = inner.remaining() % 8;
            Self {
                remaining_u64s,
                tail,
                inner,
            }
        }
    }

    impl<B: bytes::Buf> Iterator for Iter<B> {
        type Item = u64;

        fn next(&mut self) -> Option<Self::Item> {
            if self.remaining_u64s > 0 {
                self.remaining_u64s -= 1;
                return Some(self.inner.get_u64_le());
            }
            if self.tail > 0 {
                let mut chunk = [0u8; 8];
                self.inner.copy_to_slice(&mut chunk[..self.tail]);
                self.tail = 0;
                return Some(u64::from_le_bytes(chunk));
            }
            None
        }
    }
    Iter::new(data)
}

fn collect_u64_le(max_length: usize, data: impl Iterator<Item = u64>) -> Vec<u8> {
    let mut out = Vec::with_capacity(max_length);
    for d in data {
        out.extend_from_slice(&d.to_le_bytes());
    }
    out.truncate(max_length);
    out
}
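The round trip implemented by `iter_u64_le` and `collect_u64_le` above can be shown with a std-only sketch (hypothetical `to_u64s` / `from_u64s` helpers, not the crate's API): bytes are packed into little-endian u64s with a zero-padded tail, then truncated back to the original length.

```rust
// Pack bytes into little-endian u64s, zero-padding the final partial chunk.
fn to_u64s(data: &[u8]) -> Vec<u64> {
    data.chunks(8)
        .map(|c| {
            let mut buf = [0u8; 8]; // zero-pad the tail, as iter_u64_le does
            buf[..c.len()].copy_from_slice(c);
            u64::from_le_bytes(buf)
        })
        .collect()
}

// Unpack back to bytes, truncating away the padding (as collect_u64_le does).
fn from_u64s(max_length: usize, words: &[u64]) -> Vec<u8> {
    let mut out: Vec<u8> = words.iter().flat_map(|w| w.to_le_bytes()).collect();
    out.truncate(max_length);
    out
}

fn main() {
    let data = b"ten bytes!"; // 10 bytes: one full u64 plus a 2-byte tail
    let words = to_u64s(data);
    assert_eq!(words.len(), 2);
    assert_eq!(from_u64s(data.len(), &words), data.to_vec());
    println!("{words:?}");
}
```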

fn row_digest<H: Hasher>(row: &[F]) -> H::Digest {
    let mut h = H::new();
    for x in row {
        h.update(&x.to_le_bytes());
    }
    h.finalize()
}

mod topology {
    use super::Error;
    use crate::Config;
    use commonware_math::fields::goldilocks::F;
    use commonware_utils::BigRationalExt as _;
    use num_rational::BigRational;

    const SECURITY_BITS: usize = 126;
    // Fractional precision for log2 calculations when computing required samples.
    // We use the next power of 2 above SECURITY_BITS (128 = 2^7), which provides
    // 1/128 fractional precision, sufficient for these security calculations.
    const LOG2_PRECISION: usize = SECURITY_BITS.next_power_of_two().trailing_zeros() as usize;

    /// Contains the sizes of various objects in the protocol.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Topology {
        /// How many bytes the data has.
        pub data_bytes: usize,
        /// How many columns the data has.
        pub data_cols: usize,
        /// How many rows the data has.
        pub data_rows: usize,
        /// How many rows the encoded data has.
        pub encoded_rows: usize,
        /// How many samples each shard has.
        pub samples: usize,
        /// How many column samples we need.
        pub column_samples: usize,
        /// How many shards we need to recover.
        pub min_shards: usize,
        /// How many shards there are in total (each shard containing multiple rows).
        pub total_shards: usize,
    }

    impl Topology {
        const fn with_cols(data_bytes: usize, n: usize, k: usize, cols: usize) -> Self {
            let data_els = F::bits_to_elements(8 * data_bytes);
            let data_rows = data_els.div_ceil(cols);
            let samples = data_rows.div_ceil(n);
            Self {
                data_bytes,
                data_cols: cols,
                data_rows,
                encoded_rows: ((n + k) * samples).next_power_of_two(),
                samples,
                column_samples: 0,
                min_shards: n,
                total_shards: n + k,
            }
        }

        pub(crate) fn required_samples(&self) -> usize {
            let k = BigRational::from_usize(self.encoded_rows - self.data_rows);
            let m = BigRational::from_usize(self.encoded_rows);
            let fraction = (&k + BigRational::from_u64(1)) / (BigRational::from_usize(2) * &m);

            // Compute log2(one_minus). When m is close to n, one_minus is close to 1, making log2(one_minus)
            // a small negative value that requires sufficient precision to correctly capture the sign.
            let one_minus = BigRational::from_usize(1) - &fraction;
            let log_term = one_minus.log2_ceil(LOG2_PRECISION);
            if log_term >= BigRational::from_u64(0) {
                return usize::MAX;
            }

            let required = BigRational::from_usize(SECURITY_BITS) / -log_term;
            required.ceil_to_u128().unwrap_or(u128::MAX) as usize
        }

        fn correct_column_samples(&mut self) {
            // We make sure we have enough column samples to get 126 bits of security.
            //
            // This effectively does two elements per column. To get strictly greater
            // than 128 bits, we would need to add another column per column_sample.
            // We also have less than 128 bits in other places because of the bounds
            // on the message's encoded size.
            self.column_samples =
                F::bits_to_elements(SECURITY_BITS) * self.required_samples().div_ceil(self.samples);
        }

        /// Figure out what size different values will have, based on the config and the data.
        pub fn reckon(config: &Config, data_bytes: usize) -> Self {
            let n = config.minimum_shards.get() as usize;
            let k = config.extra_shards.get() as usize;
            // The following calculations don't tolerate data_bytes = 0, so we
            // temporarily correct that to be at least 1, then make sure to adjust
            // it back again to 0.
            let corrected_data_bytes = data_bytes.max(1);
            // The goal here is to try to maximize the number of columns in the
            // data. ZODA is more efficient the more columns there are. However,
            // we need to make sure that every shard has enough samples to guarantee
            // correct encoding, and that the number of encoded rows can contain
            // all of the samples in each shard, without overlap.
            //
            // To determine if a column configuration is good, we need to choose
            // the number of encoded rows. To do this, we pick a number of samples
            // `S` such that `S * n >= data_rows`. Then, our encoded rows will
            // equal `((n + k) * S).next_power_of_two()`. If the number of required
            // samples `R` for this configuration satisfies `(n + k) * R <= encoded_rows`,
            // then this configuration is valid, using `R` as the necessary number
            // of samples.
            //
            // We try increasing column counts, picking the configuration that's good.
            // It's possible that the first configuration, with one column, is not good.
            // To correct for that, we need to add extra checksum columns to guarantee
            // security.
            let mut out = Self::with_cols(corrected_data_bytes, n, k, 1);
            loop {
                let attempt = Self::with_cols(corrected_data_bytes, n, k, out.data_cols + 1);
                let required_samples = attempt.required_samples();
                if required_samples.saturating_mul(n + k) <= attempt.encoded_rows {
                    out = Self {
                        samples: required_samples.max(attempt.samples),
                        ..attempt
                    };
                } else {
                    break;
                }
            }
            out.correct_column_samples();
            out.data_bytes = data_bytes;
            out
        }

        pub fn check_index(&self, i: u16) -> Result<(), Error> {
            if (0..self.total_shards).contains(&(i as usize)) {
                return Ok(());
            }
            Err(Error::InvalidIndex(i))
        }
    }
}
use topology::Topology;

/// A shard of data produced by the encoding scheme.
#[derive(Clone, Debug)]
pub struct StrongShard<D: Digest> {
    data_bytes: usize,
    root: D,
    inclusion_proof: Proof<D>,
    rows: Matrix<F>,
    checksum: Arc<Matrix<F>>,
}

impl<D: Digest> PartialEq for StrongShard<D> {
    fn eq(&self, other: &Self) -> bool {
        self.data_bytes == other.data_bytes
            && self.root == other.root
            && self.inclusion_proof == other.inclusion_proof
            && self.rows == other.rows
            && self.checksum == other.checksum
    }
}

impl<D: Digest> Eq for StrongShard<D> {}

impl<D: Digest> EncodeSize for StrongShard<D> {
    fn encode_size(&self) -> usize {
        self.data_bytes.encode_size()
            + self.root.encode_size()
            + self.inclusion_proof.encode_size()
            + self.rows.encode_size()
            + self.checksum.encode_size()
    }
}

impl<D: Digest> Write for StrongShard<D> {
    fn write(&self, buf: &mut impl BufMut) {
        self.data_bytes.write(buf);
        self.root.write(buf);
        self.inclusion_proof.write(buf);
        self.rows.write(buf);
        self.checksum.write(buf);
    }
}

impl<D: Digest> Read for StrongShard<D> {
    type Cfg = crate::CodecConfig;

    fn read_cfg(
        buf: &mut impl bytes::Buf,
        cfg: &Self::Cfg,
    ) -> Result<Self, commonware_codec::Error> {
        let data_bytes = usize::read_cfg(buf, &RangeCfg::from(..=cfg.maximum_shard_size))?;
        let max_els = cfg.maximum_shard_size / F::SIZE;
        Ok(Self {
            data_bytes,
            root: ReadExt::read(buf)?,
            inclusion_proof: Read::read_cfg(buf, &max_els)?,
            rows: Read::read_cfg(buf, &(max_els, ()))?,
            checksum: Arc::new(Read::read_cfg(buf, &(max_els, ()))?),
        })
    }
}

#[cfg(feature = "arbitrary")]
impl<D: Digest> arbitrary::Arbitrary<'_> for StrongShard<D>
where
    D: for<'a> arbitrary::Arbitrary<'a>,
{
    fn arbitrary(u: &mut arbitrary::Unstructured<'_>) -> arbitrary::Result<Self> {
        Ok(Self {
            data_bytes: u.arbitrary::<u32>()? as usize,
            root: u.arbitrary()?,
            inclusion_proof: u.arbitrary()?,
            rows: u.arbitrary()?,
            checksum: Arc::new(u.arbitrary()?),
        })
    }
}

#[derive(Clone, Debug)]
pub struct WeakShard<D: Digest> {
    inclusion_proof: Proof<D>,
    shard: Matrix<F>,
}

impl<D: Digest> PartialEq for WeakShard<D> {
    fn eq(&self, other: &Self) -> bool {
        self.inclusion_proof == other.inclusion_proof && self.shard == other.shard
    }
}

impl<D: Digest> Eq for WeakShard<D> {}

impl<D: Digest> EncodeSize for WeakShard<D> {
    fn encode_size(&self) -> usize {
        self.inclusion_proof.encode_size() + self.shard.encode_size()
    }
}

impl<D: Digest> Write for WeakShard<D> {
    fn write(&self, buf: &mut impl BufMut) {
        self.inclusion_proof.write(buf);
        self.shard.write(buf);
    }
}

impl<D: Digest> Read for WeakShard<D> {
    type Cfg = crate::CodecConfig;

    fn read_cfg(
        buf: &mut impl bytes::Buf,
        cfg: &Self::Cfg,
    ) -> Result<Self, commonware_codec::Error> {
        let max_data_bits = cfg.maximum_shard_size.saturating_mul(8);
        let max_data_els = F::bits_to_elements(max_data_bits).max(1);
        Ok(Self {
            // Worst case: every row is one data element, and the sample size is all rows.
            inclusion_proof: Read::read_cfg(buf, &max_data_els)?,
            shard: Read::read_cfg(buf, &(max_data_els, ()))?,
        })
    }
}

#[cfg(feature = "arbitrary")]
impl<D: Digest> arbitrary::Arbitrary<'_> for WeakShard<D>
where
    D: for<'a> arbitrary::Arbitrary<'a>,
{
    fn arbitrary(u: &mut arbitrary::Unstructured<'_>) -> arbitrary::Result<Self> {
        Ok(Self {
            inclusion_proof: u.arbitrary()?,
            shard: u.arbitrary()?,
        })
    }
}

/// A ZODA shard that has been checked for integrity already.
#[derive(Clone)]
pub struct CheckedShard {
    index: usize,
    shard: Matrix<F>,
}

/// Take indices up to `total`, and shuffle them.
///
/// The shuffle depends, deterministically, on the transcript.
///
/// # Panics
///
/// Panics if `total` exceeds `u32::MAX`.
fn shuffle_indices(transcript: &Transcript, total: usize) -> Vec<u32> {
    let total: u32 = total
        .try_into()
        .expect("encoded_rows exceeds u32::MAX; data too large for ZODA");
    let mut out = (0..total).collect::<Vec<_>>();
    out.shuffle(&mut transcript.noise(b"shuffle"));
    out
}

/// Create a checking matrix of the right shape.
///
/// This matrix is random, using the transcript as a deterministic source of randomness.
fn checking_matrix(transcript: &Transcript, topology: &Topology) -> Matrix<F> {
    Matrix::rand(
        &mut transcript.noise(b"checking matrix"),
        topology.data_cols,
        topology.column_samples,
    )
}

/// Data used to check [WeakShard]s.
#[derive(Clone)]
pub struct CheckingData<D: Digest> {
    topology: Topology,
    root: D,
    checking_matrix: Matrix<F>,
    encoded_checksum: Matrix<F>,
    shuffled_indices: Vec<u32>,
}

impl<D: Digest> CheckingData<D> {
    /// Calculate the values of this struct, based on information received.
    ///
    /// We control `config`.
    ///
    /// We're provided with `commitment`, which should hash over `root`,
    /// and `data_bytes`.
    ///
    /// We're also given a `checksum` matrix used to check the shards we receive.
    fn reckon(
        config: &Config,
        commitment: &Summary,
        data_bytes: usize,
        root: D,
        checksum: &Matrix<F>,
    ) -> Result<Self, Error> {
        let topology = Topology::reckon(config, data_bytes);
        let mut transcript = Transcript::new(NAMESPACE);
        transcript.commit((topology.data_bytes as u64).encode());
        transcript.commit(root.encode());
        let expected_commitment = transcript.summarize();
        if *commitment != expected_commitment {
            return Err(Error::InvalidShard);
        }
        let mut transcript = Transcript::resume(expected_commitment);
        let checking_matrix = checking_matrix(&transcript, &topology);
        if checksum.rows() != topology.data_rows || checksum.cols() != topology.column_samples {
            return Err(Error::InvalidShard);
        }
        // Commit to the checksum before generating the indices to check.
        //
        // Nota bene: `checksum.encode()` is *serializing* the checksum, not
        // Reed-Solomon encoding it.
        //
        // cf. the implementation of `Scheme::encode` for ZODA for why it's important
        // that we do Reed-Solomon encoding of the checksum ourselves.
        transcript.commit(checksum.encode());
        let encoded_checksum = checksum
            .as_polynomials(topology.encoded_rows)
            .expect("checksum has too many rows")
            .evaluate()
            .data();
        let shuffled_indices = shuffle_indices(&transcript, topology.encoded_rows);

        Ok(Self {
            topology,
            root,
            checking_matrix,
            encoded_checksum,
            shuffled_indices,
        })
    }

    fn check<H: Hasher<Digest = D>>(
        &self,
        index: u16,
        weak_shard: &WeakShard<D>,
    ) -> Result<CheckedShard, Error> {
        self.topology.check_index(index)?;
        if weak_shard.shard.rows() != self.topology.samples
            || weak_shard.shard.cols() != self.topology.data_cols
        {
            return Err(Error::InvalidWeakShard);
        }
        let index = index as usize;
        let these_shuffled_indices = &self.shuffled_indices
            [index * self.topology.samples..(index + 1) * self.topology.samples];

        // Build elements for BMT multi-proof verification using the deterministically
        // computed indices for this shard.
        let proof_elements: Vec<(H::Digest, u32)> = these_shuffled_indices
            .iter()
            .zip(weak_shard.shard.iter())
            .map(|(&i, row)| (row_digest::<H>(row), i))
            .collect();

        // Verify the multi-proof.
        let mut hasher = H::new();
        if weak_shard
            .inclusion_proof
            .verify_multi_inclusion(&mut hasher, &proof_elements, &self.root)
            .is_err()
        {
            return Err(Error::InvalidWeakShard);
        }

        let shard_checksum = weak_shard.shard.mul(&self.checking_matrix);
        // Check that the shard checksum rows match the encoded checksums.
        for (row, &i) in shard_checksum.iter().zip(these_shuffled_indices) {
            if row != &self.encoded_checksum[i as usize] {
                return Err(Error::InvalidWeakShard);
            }
        }
        Ok(CheckedShard {
            index,
            shard: weak_shard.shard.clone(),
        })
    }
}

#[derive(Debug, Error)]
pub enum Error {
    #[error("invalid shard")]
    InvalidShard,
    #[error("invalid weak shard")]
    InvalidWeakShard,
    #[error("invalid index {0}")]
    InvalidIndex(u16),
    #[error("insufficient shards {0} < {1}")]
    InsufficientShards(usize, usize),
    #[error("insufficient unique rows {0} < {1}")]
    InsufficientUniqueRows(usize, usize),
    #[error("failed to create inclusion proof: {0}")]
    FailedToCreateInclusionProof(BmtError),
}

const NAMESPACE: &[u8] = b"_COMMONWARE_CODING_ZODA";

#[derive(Clone, Copy)]
pub struct Zoda<H> {
    _marker: PhantomData<H>,
}

impl<H> std::fmt::Debug for Zoda<H> {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "Zoda")
    }
}

impl<H: Hasher> Scheme for Zoda<H> {
    type Commitment = Summary;

    type StrongShard = StrongShard<H::Digest>;

    type WeakShard = WeakShard<H::Digest>;

    type CheckingData = CheckingData<H::Digest>;

    type CheckedShard = CheckedShard;

    type Error = Error;

    fn encode(
        config: &Config,
        data: impl bytes::Buf,
        strategy: &impl Strategy,
    ) -> Result<(Self::Commitment, Vec<Self::StrongShard>), Self::Error> {
        // Step 1: arrange the data as a matrix.
        let data_bytes = data.remaining();
        let topology = Topology::reckon(config, data_bytes);
        let data = Matrix::init(
            topology.data_rows,
            topology.data_cols,
            F::stream_from_u64s(iter_u64_le(data)),
        );

        // Step 2: Encode the data.
        let encoded_data = data
            .as_polynomials(topology.encoded_rows)
            .expect("data has too many rows")
            .evaluate()
            .data();

        // Step 3: Commit to the rows of the data using a Binary Merkle Tree.
        let row_hashes: Vec<H::Digest> = strategy.map_collect_vec(0..encoded_data.rows(), |i| {
            row_digest::<H>(&encoded_data[i])
        });
        let mut bmt_builder = BmtBuilder::<H>::new(row_hashes.len());
        for hash in &row_hashes {
            bmt_builder.add(hash);
        }
        let bmt = bmt_builder.build();
        let root = bmt.root();

        // Step 4: Commit to the root, and the size of the data.
        let mut transcript = Transcript::new(NAMESPACE);
        transcript.commit((topology.data_bytes as u64).encode());
        transcript.commit(root.encode());
        let commitment = transcript.summarize();

        // Step 5: Generate a checking matrix and checksum with the commitment.
        let mut transcript = Transcript::resume(commitment);
        let checking_matrix = checking_matrix(&transcript, &topology);
        let checksum = Arc::new(data.mul(&checking_matrix));
        // Bind index sampling to this checksum to prevent follower-specific malleability.
        // It's important to commit to the checksum itself, rather than its Reed-Solomon
        // encoding, because followers have to encode the checksum themselves to prevent
        // the leader from cheating.
        transcript.commit(checksum.encode());
        let shuffled_indices = shuffle_indices(&transcript, encoded_data.rows());

        // Step 6: Produce the shards in parallel.
        let shard_results: Vec<Result<StrongShard<H::Digest>, Error>> =
            strategy.map_collect_vec(0..topology.total_shards, |shard_idx| {
                let indices = &shuffled_indices
                    [shard_idx * topology.samples..(shard_idx + 1) * topology.samples];
                let rows = Matrix::init(
                    indices.len(),
                    topology.data_cols,
                    indices
                        .iter()
                        .flat_map(|&i| encoded_data[i as usize].iter().copied()),
                );
                let inclusion_proof = bmt
                    .multi_proof(indices)
                    .map_err(Error::FailedToCreateInclusionProof)?;
                Ok(StrongShard {
                    data_bytes,
                    root,
                    inclusion_proof,
                    rows,
                    checksum: checksum.clone(),
                })
            });
        let shards = shard_results
            .into_iter()
            .collect::<Result<Vec<_>, Error>>()?;
        Ok((commitment, shards))
    }

    fn weaken(
        config: &Config,
        commitment: &Self::Commitment,
        index: u16,
        shard: Self::StrongShard,
    ) -> Result<(Self::CheckingData, Self::CheckedShard, Self::WeakShard), Self::Error> {
        let weak_shard = WeakShard {
            inclusion_proof: shard.inclusion_proof,
            shard: shard.rows,
        };
        let checking_data = CheckingData::reckon(
            config,
            commitment,
            shard.data_bytes,
            shard.root,
            shard.checksum.as_ref(),
        )?;
        let checked_shard = checking_data.check::<H>(index, &weak_shard)?;
        Ok((checking_data, checked_shard, weak_shard))
    }

    fn check(
        _config: &Config,
        _commitment: &Self::Commitment,
        checking_data: &Self::CheckingData,
        index: u16,
        weak_shard: Self::WeakShard,
    ) -> Result<Self::CheckedShard, Self::Error> {
        checking_data.check::<H>(index, &weak_shard)
    }

    fn decode(
        _config: &Config,
        _commitment: &Self::Commitment,
        checking_data: Self::CheckingData,
        shards: &[Self::CheckedShard],
        _strategy: &impl Strategy,
    ) -> Result<Vec<u8>, Self::Error> {
        let Topology {
            encoded_rows,
            data_cols,
            samples,
            data_rows,
            data_bytes,
            min_shards,
            ..
        } = checking_data.topology;
        if shards.len() < min_shards {
            return Err(Error::InsufficientShards(shards.len(), min_shards));
        }
        let mut evaluation = EvaluationVector::<F>::empty(encoded_rows.ilog2() as usize, data_cols);
        for shard in shards {
            let indices =
                &checking_data.shuffled_indices[shard.index * samples..(shard.index + 1) * samples];
            for (&i, row) in indices.iter().zip(shard.shard.iter()) {
                evaluation.fill_row(u64::from(i) as usize, row);
            }
        }
        // This should never happen, because we check each shard, and the shards
        // should have distinct rows. But, as a sanity check, this doesn't hurt.
        let filled_rows = evaluation.filled_rows();
        if filled_rows < data_rows {
            return Err(Error::InsufficientUniqueRows(filled_rows, data_rows));
        }
        Ok(collect_u64_le(
            data_bytes,
            F::stream_to_u64s(
                evaluation
                    .recover()
                    .coefficients_up_to(data_rows)
                    .flatten()
                    .copied(),
            ),
        ))
    }
}

impl<H: Hasher> ValidatingScheme for Zoda<H> {}

#[cfg(test)]
mod tests {
    use super::*;
    use crate::Config;
    use commonware_cryptography::Sha256;
    use commonware_math::{
        algebra::{FieldNTT as _, Ring as _},
        ntt::PolynomialVector,
    };
    use commonware_parallel::Sequential;
    use commonware_utils::NZU16;

    const STRATEGY: Sequential = Sequential;

    #[test]
    fn topology_reckon_handles_small_extra_shards() {
        let config = Config {
            minimum_shards: NZU16!(3),
            extra_shards: NZU16!(1),
        };
        let topology = Topology::reckon(&config, 16);
        assert_eq!(topology.min_shards, 3);
        assert_eq!(topology.total_shards, 4);

        // Verify we hit the 1-column fallback and the security invariant holds.
        // When the loop in reckon() exits without finding a multi-column config,
        // correct_column_samples() must compensate by adding column samples.
        assert_eq!(topology.data_cols, 1);
        let required = topology.required_samples();
        let provided = topology.samples * (topology.column_samples / 2);
        assert!(
            provided >= required,
            "security invariant violated: provided {provided} < required {required}"
        );
    }

    #[test]
    fn decode_rejects_duplicate_indices() {
        let config = Config {
            minimum_shards: NZU16!(2),
            extra_shards: NZU16!(1),
        };
        let data = b"duplicate shard coverage";
        let (commitment, shards) = Zoda::<Sha256>::encode(&config, &data[..], &STRATEGY).unwrap();
        let shard0 = shards[0].clone();
        let (checking_data, checked_shard0, _weak_shard0) =
            Zoda::<Sha256>::weaken(&config, &commitment, 0, shard0).unwrap();
        let duplicate = CheckedShard {
            index: checked_shard0.index,
            shard: checked_shard0.shard.clone(),
        };
        let shards = vec![checked_shard0, duplicate];
        let result =
854            Zoda::<Sha256>::decode(&config, &commitment, checking_data, &shards, &STRATEGY);
855        match result {
856            Err(Error::InsufficientUniqueRows(actual, expected)) => {
857                assert!(actual < expected);
858            }
859            other => panic!("expected insufficient unique rows error, got {other:?}"),
860        }
861    }
862
863    #[test]
864    fn checksum_malleability() {
865        /// Construct the vanishing polynomial over specific indices.
866        ///
867        /// When encoded, this will be 0 at those indices, and non-zero elsewhere.
868        fn vanishing(lg_domain: u8, vanish_indices: &[u32]) -> PolynomialVector<F> {
869            let w = F::root_of_unity(lg_domain).expect("domain too large for Goldilocks");
870            let mut domain = Vec::with_capacity(1usize << lg_domain);
871            let mut x = F::one();
872            for _ in 0..(1usize << lg_domain) {
873                domain.push(x);
874                x *= &w;
875            }
876            let roots: Vec<F> = vanish_indices.iter().map(|&i| domain[i as usize]).collect();
877            let mut out = EvaluationVector::empty(lg_domain as usize, 1);
878            domain.into_iter().enumerate().for_each(|(i, x)| {
879                let mut acc = F::one();
880                for root in &roots {
881                    acc *= &(x - root);
882                }
883                out.fill_row(i, &[acc]);
884            });
885            out.recover()
886        }
887
888        let config = Config {
889            minimum_shards: NZU16!(2),
890            extra_shards: NZU16!(1),
891        };
892        let data = vec![0x5Au8; 256 * 1024];
893        let (commitment, mut shards) =
894            Zoda::<Sha256>::encode(&config, &data[..], &STRATEGY).unwrap();
895
896        let leader_i = 0usize;
897        let a_i = 1usize;
898        let b_i = 2usize;
899
        // Shift the checksums by a polynomial that vanishes on shard a's rows,
        // so shard a's rows are unchanged while every other shard's check fails.
901        {
902            let (checking_data, _, _) = Zoda::<Sha256>::weaken(
903                &config,
904                &commitment,
905                leader_i as u16,
906                shards[leader_i].clone(),
907            )
908            .unwrap();
909
910            let samples = checking_data.topology.samples;
911            let a_indices =
912                checking_data.shuffled_indices[a_i * samples..(a_i + 1) * samples].to_vec();
913            let lg_rows = checking_data.topology.encoded_rows.ilog2() as usize;
914            let shift = vanishing(lg_rows as u8, &a_indices);
            let mut checksum = (*shards[a_i].checksum).clone();
916            for (i, shift_i) in shift.coefficients_up_to(checksum.rows()).enumerate() {
917                for j in 0..checksum.cols() {
918                    checksum[(i, j)] += &shift_i[0];
919                }
920            }
            shards[a_i].checksum = Arc::new(checksum);
            shards[b_i].checksum = shards[a_i].checksum.clone();
923        }
924
925        assert!(matches!(
926            Zoda::<Sha256>::weaken(&config, &commitment, b_i as u16, shards[b_i].clone()),
927            Err(Error::InvalidWeakShard)
928        ));
929
        // With follower-specific challenge binding, the shifted checksum is
        // rejected even for the targeted shard. Without robust Fiat-Shamir,
        // this weaken would have succeeded, since the shift vanishes on
        // shard a's rows.
932        assert!(matches!(
933            Zoda::<Sha256>::weaken(&config, &commitment, a_i as u16, shards[a_i].clone()),
934            Err(Error::InvalidWeakShard)
935        ));
936    }
937
938    #[cfg(feature = "arbitrary")]
939    mod conformance {
940        use super::*;
941        use commonware_codec::conformance::CodecConformance;
942        use commonware_cryptography::sha256::Digest as Sha256Digest;
943
944        commonware_conformance::conformance_tests! {
945            CodecConformance<StrongShard<Sha256Digest>>,
946            CodecConformance<WeakShard<Sha256Digest>>,
947        }
948    }
949}