he-ring 0.6.0

A library that provides fast implementations of rings commonly used in homomorphic encryption, built on feanor-math.
Documentation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
# Implementing BFV using HE-Ring, version 2

In this example, we will again implement the BFV scheme, but this time aiming for good performance.
Perhaps unsurprisingly, the result will look quite similar to [`crate::bfv`] - the main difference is that [`crate::bfv`] is defined more generically, and allows more configuration and different parameters.

If you are unsure about the details of BFV, refer to the version 1 of our BFV implementation [`crate::examples::bfv_impl_v1`] first.
At the end of [`crate::examples::bfv_impl_v1`], we list some points that have to be considered when aiming for state-of-the-art performance - these will be the main points for this example.
Mostly, this boils down to two points: Use a Residue Number System, and a Number-Theoretic Transform-based multiplication.

## The computational and mathematical foundations

### The Residue Number System (RNS)

In BFV, only approximate size of the ciphertext modulus `q` matters - usually its bitlength.
Hence, we have a lot of freedom when actually choosing `q`.
It has become standard to choose `q = p1 ... pr` as the product of distinct primes `p1, ..., pr`, which each fit into a machine integer, e.g. `i64`.
In this case, the Chinese Remainder theorem gives an isomorphism
```text
  Z/(q)  ->  Z/(p1) x ... x Z/(pr),  x  ->  (x mod p1, ..., x mod pr)
```
In other words, instead of storing an element of `Z/(q)` as an integer `0 <= x < q`, we can store `x mod p1, ..., x mod pr`.
Each of those `x mod pi` now fit into a `i64`, which means we don't need slow arbitrary-precision integers.

In particular, since this map is an isomorphism, we can perform addition and multiplication component-wise modulo each `pi`.
In other words, we have
```text
  (x + y) mod pi = (x mod pi) + (x mod pi) mod pi
  (x * y) mod pi = (x mod pi) * (x mod pi) mod pi
```
for each `pi`.
Now each `pi` fits into a `i64`, so we actually only perform additions, multiplications and modular reductions on `i64`s, and completely avoid expensive multiplications of very large numbers.

When it comes to implementation, `feanor-math` already provides the ring [`feanor_math::rings::zn::zn_rns::Zn`] for `Z/qZ` when `q` can be represented in RNS basis. We will heavily use this from now on.

### The Number-Theoretic Transform

On a mathematical level, the Number-Theoretic Transform (NTT) is quite similar to what we did before - it just refers to the Chinese Remainder theorem, with prime numbers replaced by prime ideals in a cyclotomic number ring modulo a prime that splits completely.
In more down-to-the-earth terms, the NTT is the evaluation of a polynomial at all `n`-th roots of unity.
In other words, we say it is the map
```text
  NTT: Fp^n  ->  Fp^n,  (a[0], ..., a[n - 1])  ->  (sum_i a[i] ξ_n^i)
```
where `ξ` is a `n`-th root of unity in the finite field `Fp` - note that for `ξ` to exist, we require `p = 1 mod n`.

The NTT is important, because of two well-known facts:
 - The NTT can be computed very fast using a "Fast Fourier Transform" (FFT), in time `O(n log n)`; The same holds for the inverse map, which we call `InvNTT()`.
 - Addition and multiplication are component-wise for elements in NTT form. With NTT form, we refer to the values `NTT(x[0], ..., x[n - 1])` of an element `x = sum_i x[i] X^n` in the ring `R_p = Z[X]/(Phi_n(X), p)`.

In other words, if we store each element `x` in `R/(p)` using the values `(x'[0], ..., x'[n - 1]) = NTT(x[0], ..., x[n - 1])`, we can compute the multiplication of elements `x, y` by
```text
  z'[0] = x'[0] * y'[0],  ...,  z'[n - 1] = x'[n - 1] * y'[n - 1]
```
The individual multiplications here are very fast, since they are performed on values of `Fp`.

Note here a little mismatch: We can actually represent an element `x` in `R/(p)` by a sum `x[0] + x[1] X + ... + x[phi(n) - 1] X^(phi(n) - 1)` of only `phi(n)` summands. However, padding this to `n` when necessary does not pose a significant problem.

### The Double-RNS representation

So how do we combine these two techniques?

Very simply, for an element `x in R_q` we first compute `x mod pi` in `R/(pi)` for each `pi`, and then store `NTT(x mod pi)` for each `pi`.
These values are called the Double-RNS representation of `x`, because of the mathematical similarities between NTT and RNS, which means we view the NTT as "a second RNS".
This representation allows for very fast arithmetic, and is implemented by [`crate::ciphertext_ring::double_rns_ring::DoubleRNSRing`].

## Implementing Encryption and Decryption

Ok, so let's use [`crate::ciphertext_ring::double_rns_ring::DoubleRNSRing`] for the ciphertext ring.
We cannot use it for the plaintext ring, since to use the double-RNS representation, we need to have `p = 1 mod n` for each prime divisor `p | q` of `q`.
We also remark that due to technical details, we sometimes additionally require `p = 1 mod 2^k n` for some `k > log2(n) + 1`.

Anyway, we can now define our ring types
```rust
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::integer::*;
# use he_ring::number_ring::*;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
type NumberRing = Pow2CyclotomicNumberRing;
type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
type CiphertextRing = DoubleRNSRing<NumberRing>;
```
Creating this ring is not completely trivial, since we need to find a suitable `q`.
```rust
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::integer::*;
# use feanor_math::primitive_int::StaticRing;
# use he_ring::number_ring::*;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
fn create_ciphertext_ring(ring_degree: usize, mut number_of_rns_factors: usize) -> CiphertextRing {
    let n = ring_degree * 2;
    let mut rns_factors = Vec::new();
    // `current` should always be `= 1 mod n`
    let mut current = (1 << 57) - ((1 << 57) % n) + 1;
    while number_of_rns_factors > 0 {
        if is_prime(StaticRing::<i64>::RING, &(current as i64), 10) {
            rns_factors.push(current);
            number_of_rns_factors -= 1;
        }
        current -= n;
    }
    return <CiphertextRing as RingStore>::Type::new(
        Pow2CyclotomicNumberRing::new(n),
        zn_rns::Zn::new(rns_factors.iter().map(|p| zn_64::Zn::new(*p as u64)).collect(), BigIntRing::RING)
    );
}

fn create_plaintext_ring(ring_degree: usize, t: u64) -> PlaintextRing {
    // as before
#     return <PlaintextRing as RingStore>::Type::new(
#         Pow2CyclotomicNumberRing::new(ring_degree * 2), 
#         zn_64::Zn::new(t)
#     ); 
}
```
Note that HE-Ring provides the convenience function [`crate::number_ring::sample_primes()`] and [`crate::number_ring::extend_sampled_primes()`] to simplify this:
```rust
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::integer::*;
# use feanor_math::primitive_int::StaticRing;
# use feanor_math::ordered::OrderedRingStore;
# use he_ring::number_ring::*;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
fn create_ciphertext_ring(ring_degree: usize, bitlength_of_q: usize) -> CiphertextRing {
    let number_ring = Pow2CyclotomicNumberRing::new(ring_degree * 2);
    let mut rns_factors = sample_primes(
        bitlength_of_q - 10, 
        bitlength_of_q, 
        57, 
        |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring. mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
    ).unwrap();
    return <CiphertextRing as RingStore>::Type::new(
        number_ring,
        zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING) as u64)). collect(), BigIntRing::RING)
    );
}

fn create_plaintext_ring(ring_degree: usize, t: u64) -> PlaintextRing {
    // as before
#     return <PlaintextRing as RingStore>::Type::new(
#         Pow2CyclotomicNumberRing::new(ring_degree * 2), 
#         zn_64::Zn::new(t)
#     ); 
}
```
Once we have this, key-generation, encryption and decryption can actually be implemented exactly as before.
It might be possible to optimize decryption somewhat (leveraging the RNS representation), but we skip that here.
Instead, we will use similar (but more impactful) techniques when implementing homomorphic multiplication.

```rust
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::homomorphism::*;
# use feanor_math::rings::extension::FreeAlgebraStore;
# use feanor_math::integer::*;
# use feanor_math::seq::VectorFn;
# use feanor_math::rings::finite::FiniteRingStore;
# use feanor_math::ordered::OrderedRingStore;
# use feanor_math::primitive_int::StaticRing;
# use rand::{Rng, RngCore, thread_rng};
# use rand_distr::StandardNormal;
# use he_ring::number_ring::*;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
# fn create_ciphertext_ring(ring_degree: usize, bitlength_of_q: usize) -> CiphertextRing {
#     let number_ring = Pow2CyclotomicNumberRing::new(ring_degree * 2);
#     let mut rns_factors = sample_primes(
#         bitlength_of_q - 10, 
#         bitlength_of_q, 
#         56, 
#         |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring. mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
#     ).unwrap();
#     rns_factors.sort_unstable_by(|l, r| BigIntRing::RING.cmp(l, r));
#     return <CiphertextRing as RingStore>::Type::new(
#         number_ring,
#         zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING) as u64)). collect(), BigIntRing::RING)
#     );
# }
# fn create_plaintext_ring(ring_degree: usize, t: u64) -> PlaintextRing {
#     return <PlaintextRing as RingStore>::Type::new(
#         Pow2CyclotomicNumberRing::new(ring_degree * 2), 
#         zn_64::Zn::new(t)
#     ); 
# }
fn key_gen(ciphertext_ring: &CiphertextRing) -> El<CiphertextRing> {
    let mut rng = thread_rng();
    let sk = ciphertext_ring.from_canonical_basis(
        (0..ciphertext_ring.rank()).map(|_| ciphertext_ring.base_ring().int_hom().map(
            (rng.next_u32() as i32 % 3) - 1
        ))
    );
    return sk;
}

fn rlwe_sample(ciphertext_ring: &CiphertextRing, sk: &El<CiphertextRing>) -> (El<CiphertextRing>, El<CiphertextRing>) {
    let mut rng = thread_rng();
    let a = ciphertext_ring.random_element(|| rng.next_u64());
    let e = ciphertext_ring.from_canonical_basis(
        (0..ciphertext_ring.rank()).map(|_| ciphertext_ring.base_ring().int_hom().map(
            (rng.sample::<f64, _>(StandardNormal) * 3.2).round() as i32
        ))
    );
    return (ciphertext_ring.add(e, ciphertext_ring.mul_ref(&a, sk)), ciphertext_ring.negate(a));
}

fn enc_sym(
    plaintext_ring: &PlaintextRing, 
    ciphertext_ring: &CiphertextRing, 
    x: &El<PlaintextRing>, 
    sk: &El<CiphertextRing>
) -> (El<CiphertextRing>, El<CiphertextRing>) {
    let (b, a) = rlwe_sample(ciphertext_ring, sk);

    let q = ciphertext_ring.base_ring().modulus();
    let t = int_cast(*plaintext_ring.base_ring().modulus(), BigIntRing::RING, StaticRing::<i64>::RING);
    let Δ = ciphertext_ring.base_ring().can_hom(&BigIntRing::RING).unwrap().map(BigIntRing::RING.rounded_div(BigIntRing::RING.clone_el(q), &t));

    let x_mod_q = ciphertext_ring.from_canonical_basis(
        plaintext_ring.wrt_canonical_basis(x).iter().map(|c| ciphertext_ring.base_ring().int_hom().map(
            plaintext_ring.base_ring().smallest_lift(c) as i32
        ))
    );
    return (ciphertext_ring.add(ciphertext_ring.inclusion().mul_map(x_mod_q, Δ), b), a);
}

fn dec(
    plaintext_ring: &PlaintextRing, 
    ciphertext_ring: &CiphertextRing, 
    ct: &(El<CiphertextRing>, El<CiphertextRing>), 
    sk: &El<CiphertextRing>
) -> El<PlaintextRing> {
    let decryption_with_noise = ciphertext_ring.add_ref_fst(&ct.0, ciphertext_ring.mul_ref(&ct.1, sk));

    let t = int_cast(*plaintext_ring.base_ring().modulus(), BigIntRing::RING, StaticRing::<i64>::RING);
    let modulo_t = plaintext_ring.base_ring().can_hom(&BigIntRing::RING).unwrap();
    // next, we have to compute the scaling-by-`t/q`, which includes a rounding at the end
    return plaintext_ring.from_canonical_basis(
        ciphertext_ring.wrt_canonical_basis(&decryption_with_noise).iter().map(|c| modulo_t.map(BigIntRing::RING.rounded_div(
            BigIntRing::RING.mul_ref_fst(&t, ciphertext_ring.base_ring().smallest_lift(c)),
            ciphertext_ring.base_ring().modulus()
        )))
    );
}

fn hom_add(
    ciphertext_ring: &CiphertextRing, 
    lhs: &(El<CiphertextRing>, El<CiphertextRing>), 
    rhs: &(El<CiphertextRing>, El<CiphertextRing>)
) -> (El<CiphertextRing>, El<CiphertextRing>) {
    return (ciphertext_ring.add_ref(&lhs.0, &rhs.0), ciphertext_ring.add_ref(&lhs.1, &rhs.1));
}
```

## Homomorphic multiplication

As in [`crate::examples::bfv_impl_v1`], homomorphic multiplication is the most interesting operation.
The first question is, which ring do we use to evaluate `c0 * c0'` etc. without wrapping around `q`?
Previously, we worked without any modulus in `Z[X]/(Phi_n(X))`, which certainly works, but means we cannot use the double-RNS representation.
Instead, it makes sense to choose a larger modulus - say `qq'` - and perform the multiplication in `R_(qq')`.
If `qq'` is larger than the coefficients of `c0 * c0'` in `R`, this means we still get the correct result.
Since creating a double-RNS ring is somewhat expensive, we do this once and reuse the ring.
```rust
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::homomorphism::*;
# use feanor_math::seq::VectorView;
# use feanor_math::rings::extension::FreeAlgebraStore;
# use feanor_math::integer::*;
# use feanor_math::seq::VectorFn;
# use feanor_math::rings::finite::FiniteRingStore;
# use feanor_math::ordered::OrderedRingStore;
# use feanor_math::primitive_int::StaticRing;
# use rand::{Rng, RngCore, thread_rng};
# use rand_distr::StandardNormal;
# use he_ring::number_ring::*;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
fn create_multiplication_ring(ciphertext_ring: &CiphertextRing) -> CiphertextRing {
    let number_ring = ciphertext_ring.get_ring().number_ring().clone();
    let rns_factors = extend_sampled_primes(
        &ciphertext_ring.base_ring().as_iter().map(|RNS_factor| int_cast(*RNS_factor.modulus(), BigIntRing::RING, StaticRing::<i64>::RING)).collect::<Vec<_>>(),
        BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 10, 
        BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 67, 
        57, 
        |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring.mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
    ).unwrap().into_iter().map(|p| 
        int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING)
    ).collect::<Vec<_>>();
    return <CiphertextRing as RingStore>::Type::new(
        number_ring,
        zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(p as u64)).collect(), BigIntRing::RING)
    );
}
```
Next, we turn to the question on how to map the elements of `R_q` to `R_(qq')`.
While we could do it in the same way as before, i.e. take the shortest lift of each coefficient and map it into `Z/(qq')`, this would again mean using arbitrary-precision integers.

Instead we use a technique that was propsed by <https://ia.cr/2016/510>.
More concretely, they show how to implement the necessary operations as "RNS conversions", which directly operate on the RNS values, and don't involve big integers.
The details are not overly complicated, but would go beyond the scope of this introduction, so we just describe how to use the RNS conversion implementations that are implemented in HE-Ring.
In particular, we are interested in [`crate::rnsconv::lift::AlmostExactBaseConversion`] and [`crate::rnsconv::bfv_rescale::AlmostExactRescalingConvert`], since they implement the two operations that we require for BFV multiplication.
The first one takes care of the conversion `Z/(q) -> Z/(qq')`, and the second one does the downscaling at the end, which scales elements of `Z/(qq')` by `t/q` and maps them back to `Z/(q)`.
All implemented RNS conversions take the input in the form of the following matrix:
```text
   /  x[1] mod p1   x[2] mod p1   ...   x[m] mod p1   \
  |   x[1] mod p2   x[2] mod p2   ...   x[m] mod p2    |
  |         ⋮              ⋮                   ⋮        |
   \  x[1] mod pr   x[2] mod pr   ...   x[m] mod pr   /
```
where `x[1], ..., x[m]` are an arbitrary number of elements of `Z/(q)`.
The output is returned in the same format.

If we now want to apply such an RNS conversion to an element of `R_q`, we need to first convert this element back from double-RNS basis, to "single-RNS" representation.
This requires a relatively costly NTT, but no way of avoiding this is known.
Note however that it actually is not required to get the actual coefficients `x[i]` in the representatation `x = x[0] + x[1] X + ... + x[phi(n) - 1] X^(phi(n) - 1)`, but the coefficients w.r.t. any basis that consists of "short" elements suffice.
The reason for this is that the noise growth during multiplication depends on the size of `c0, c1, c0', c1'` w.r.t. the *canonical norm* of the number ring `R`.
More details on the mathematical background can be found in the original BFV paper.

We can access the coefficients w.r.t. some "short-element" basis of an element of `R_q` using the functions [`crate::ciphertext_ring::double_rns_ring::DoubleRNSRingBase::undo_fft()`] and [`crate::ciphertext_ring::double_rns_ring::DoubleRNSRingBase::as_matrix_wrt_small_basis()`].
Fortunately for us, the returned data is exactly in the right format.
This leaves us to implement BFV multiplication as follows.
```rust
#![feature(allocator_api)]
# use std::alloc::Global;
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::homomorphism::*;
# use feanor_math::rings::extension::FreeAlgebraStore;
# use feanor_math::integer::*;
# use feanor_math::seq::VectorFn;
# use feanor_math::seq::VectorView;
# use feanor_math::rings::finite::FiniteRingStore;
# use feanor_math::ordered::OrderedRingStore;
# use feanor_math::primitive_int::StaticRing;
# use rand::{Rng, RngCore, thread_rng};
# use rand_distr::StandardNormal;
# use he_ring::number_ring::*;
# use he_ring::rnsconv::bfv_rescale::AlmostExactRescalingConvert;
# use he_ring::rnsconv::RNSOperation;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# use he_ring::rnsconv::lift::AlmostExactBaseConversion;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
# fn create_multiplication_ring(ciphertext_ring: &CiphertextRing) -> CiphertextRing {
#     let number_ring = ciphertext_ring.get_ring().number_ring().clone();
#     let rns_factors = extend_sampled_primes(
#         &ciphertext_ring.base_ring().as_iter().map(|RNS_factor| int_cast(*RNS_factor.modulus(), BigIntRing::RING, StaticRing::<i64>::RING)).collect::<Vec<_>>(),
#         BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 10, 
#         BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 67, 
#         57, 
#         |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring.mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
#     ).unwrap().into_iter().map(|p| 
#         int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING)
#     ).collect::<Vec<_>>();
#     return <CiphertextRing as RingStore>::Type::new(
#         number_ring,
#         zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(p as u64)).collect(), BigIntRing::RING)
#     );
# }
fn hom_mul_three_component(
    plaintext_ring: &PlaintextRing, 
    ciphertext_ring: &CiphertextRing, 
    multiplication_ring: &CiphertextRing,
    lhs: &(SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>), 
    rhs: &(SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>)
) -> (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>) {
    let (c0, c1) = (&lhs.0, &lhs.1);
    let (c0_prime, c1_prime) = (&rhs.0, &rhs.1);
    
    let lift_to_multiplication_ring_rnsconv = AlmostExactBaseConversion::new_with(
        ciphertext_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(), 
        multiplication_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(),
        Global
    );
    debug_assert!(lift_to_multiplication_ring_rnsconv.input_rings().iter().zip(ciphertext_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
    debug_assert!(lift_to_multiplication_ring_rnsconv.output_rings().iter().zip(multiplication_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
    let lift_to_multiplication_ring = |x: &SmallBasisEl<_, _>| {
        let mut result = multiplication_ring.get_ring().zero_non_fft();
        lift_to_multiplication_ring_rnsconv.apply(ciphertext_ring.get_ring().as_matrix_wrt_small_basis(&x), multiplication_ring.get_ring().as_matrix_wrt_small_basis_mut(&mut result));
        return multiplication_ring.get_ring().do_fft(result);
    };

    let unscaled_result = (
        multiplication_ring.mul(lift_to_multiplication_ring(&c0), lift_to_multiplication_ring(&c0_prime)),
        multiplication_ring.add(
            multiplication_ring.mul(lift_to_multiplication_ring(&c0), lift_to_multiplication_ring(&c1_prime)),
            multiplication_ring.mul(lift_to_multiplication_ring(&c1), lift_to_multiplication_ring(&c0_prime))
        ),
        multiplication_ring.mul(lift_to_multiplication_ring(&c1), lift_to_multiplication_ring(&c1_prime))
    );

    let scale_down_rnsconv = AlmostExactRescalingConvert::new_with(
        multiplication_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(), 
        vec![ zn_64::Zn::new(*plaintext_ring.base_ring().modulus() as u64) ], 
        ciphertext_ring.base_ring().as_iter().map(|Zp| multiplication_ring.base_ring().as_iter().position(|Zp2| Zp2.modulus() == Zp.modulus()).unwrap()).collect::<Vec<_>>(),
        Global
    );
    debug_assert!(scale_down_rnsconv.input_rings().iter().zip(multiplication_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
    debug_assert!(scale_down_rnsconv.output_rings().iter().zip(ciphertext_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
    let scale_down = |x: El<CiphertextRing>| {
        let mut result = ciphertext_ring.get_ring().zero_non_fft();
        scale_down_rnsconv.apply(multiplication_ring.get_ring().as_matrix_wrt_small_basis(&multiplication_ring.get_ring().undo_fft(x)), ciphertext_ring.get_ring().as_matrix_wrt_small_basis_mut(&mut result));
        return result;
    };

    return (
        scale_down(unscaled_result.0),
        scale_down(unscaled_result.1),
        scale_down(unscaled_result.2)
    );
}
```
Note that we now take the inputs and return the outputs as `SmallBasisEl<Pow2CyclotomicNumberRing>`, instead of `El<CiphertextRing>`.
The difference is that `El<CiphertextRing>` will store elements in double-RNS representation, while [`crate::ciphertext_ring::double_rns_ring::SmallBasisEl`] uses the single-RNS small-basis representation.
Hence, by taking inputs as `SmallBasisEl`, we can avoid the costly NTT - assuming that the caller already has elements available in small-basis representation.

To conclude this example, we have to implement relinearization.
Note that previously, we did so by using a gadget product, which relies on the basis-`B` decomposition of elements.
We still use a gadget product, but this time, we use an RNS-compatible gadget vector.
This again allows us to avoid arbitrary-precision integers.
In fact, this means we can use the available implementation of gadget product in [`crate::gadget_product`], and thus this part will actually be much shorter than relinearization in [`crate::examples::bfv_impl_v1`].
Without going into the details, we just create two [`crate::gadget_product::GadgetProductRhsOperand`] instead of the `Vec<(El<CiphertextRing>, El<CiphertextRing>)>` as the relinearization key, and a single [`crate::gadget_product::GadgetProductLhsOperand`] for `c2` during relinearization.
We arrive at:
```rust
#![feature(allocator_api)]
# use std::alloc::Global;
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::homomorphism::*;
# use feanor_math::rings::extension::FreeAlgebraStore;
# use feanor_math::integer::*;
# use feanor_math::seq::VectorFn;
# use feanor_math::seq::VectorView;
# use feanor_math::rings::finite::FiniteRingStore;
# use feanor_math::primitive_int::StaticRing;
# use rand::{Rng, RngCore, thread_rng};
# use rand_distr::StandardNormal;
# use he_ring::number_ring::*;
# use he_ring::gadget_product::*;
# use he_ring::rnsconv::RNSOperation;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
# fn rlwe_sample(ciphertext_ring: &CiphertextRing, sk: &El<CiphertextRing>) -> (El<CiphertextRing>, El<CiphertextRing>) {
#     let mut rng = thread_rng();
#     let a = ciphertext_ring.random_element(|| rng.next_u64());
#     let e = ciphertext_ring.from_canonical_basis(
#         (0..ciphertext_ring.rank()).map(|_| ciphertext_ring.base_ring().int_hom().map(
#             (rng.sample::<f64, _>(StandardNormal) * 3.2).round() as i32
#         ))
#     );
#     return (ciphertext_ring.add(e, ciphertext_ring.mul_ref(&a, sk)), ciphertext_ring.negate(a));
# }
type RelinKey = (GadgetProductRhsOperand<<CiphertextRing as RingStore>::Type>, GadgetProductRhsOperand<<CiphertextRing as RingStore>::Type>);

fn gen_relin_key(
    ciphertext_ring: &CiphertextRing, 
    sk: &El<CiphertextRing>, 
    digits: usize
) -> RelinKey {
    let sk_sqr = ciphertext_ring.pow(ciphertext_ring.clone_el(sk), 2);
    let mut result0 = GadgetProductRhsOperand::new(ciphertext_ring.get_ring(), digits);
    let mut result1 = GadgetProductRhsOperand::new(ciphertext_ring.get_ring(), digits);
    
    let gadget_vector_len = result0.gadget_vector(ciphertext_ring.get_ring()).len();
    for i in 0..gadget_vector_len {
        let (b, a) = rlwe_sample(ciphertext_ring, sk);
        let factor = result0.gadget_vector(ciphertext_ring.get_ring()).at(i);
        let (key0, key1) = (ciphertext_ring.add(b, ciphertext_ring.inclusion().mul_ref_map(&sk_sqr, &factor)), a);
        result0.set_rns_factor(ciphertext_ring.get_ring(), i, key0);
        result1.set_rns_factor(ciphertext_ring.get_ring(), i, key1);
    }
    return (result0, result1);
}

fn relinearize(
    ciphertext_ring: &CiphertextRing, 
    three_component_ciphertext: (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>), 
    relin_key: &RelinKey
) -> (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>) {
    let c2_decomposition = GadgetProductLhsOperand::from_double_rns_ring_with(ciphertext_ring.get_ring(), &three_component_ciphertext.2, relin_key.0.gadget_vector_digits());
    let mut result0 = three_component_ciphertext.0;
    ciphertext_ring.get_ring().add_assign_non_fft(&mut result0, &ciphertext_ring.get_ring().undo_fft(c2_decomposition.gadget_product(&relin_key.0, ciphertext_ring.get_ring())));
    let mut result1 = three_component_ciphertext.1;
    ciphertext_ring.get_ring().add_assign_non_fft(&mut result1, &ciphertext_ring.get_ring().undo_fft(c2_decomposition.gadget_product(&relin_key.1, ciphertext_ring.get_ring())));
    return (result0, result1);
}
```
Finally, let's test this implementation again!
```rust
#![feature(allocator_api)]
# use std::alloc::Global;
# use feanor_math::algorithms::miller_rabin::*;
# use feanor_math::ring::*;
# use feanor_math::rings::zn::*;
# use feanor_math::assert_el_eq;
# use feanor_math::homomorphism::*;
# use feanor_math::rings::extension::FreeAlgebraStore;
# use feanor_math::integer::*;
# use feanor_math::seq::VectorFn;
# use feanor_math::seq::VectorView;
# use feanor_math::rings::finite::FiniteRingStore;
# use feanor_math::primitive_int::StaticRing;
# use feanor_math::ordered::OrderedRingStore;
# use rand::{Rng, RngCore, thread_rng};
# use rand_distr::StandardNormal;
# use he_ring::number_ring::*;
# use he_ring::rnsconv::bfv_rescale::AlmostExactRescalingConvert;
# use he_ring::gadget_product::*;
# use he_ring::rnsconv::RNSOperation;
# use he_ring::number_ring::pow2_cyclotomic::*;
# use he_ring::ciphertext_ring::double_rns_ring::*;
# use he_ring::number_ring::quotient::*;
# use he_ring::rnsconv::lift::AlmostExactBaseConversion;
# type NumberRing = Pow2CyclotomicNumberRing;
# type PlaintextRing = NumberRingQuotient<NumberRing, zn_64::Zn>;
# type CiphertextRing = DoubleRNSRing<NumberRing>;
# type RelinKey = (GadgetProductRhsOperand<<CiphertextRing as RingStore>::Type>, GadgetProductRhsOperand<<CiphertextRing as RingStore>::Type>);
# fn create_ciphertext_ring(ring_degree: usize, bitlength_of_q: usize) -> CiphertextRing {
#     let number_ring = Pow2CyclotomicNumberRing::new(ring_degree * 2);
#     let mut rns_factors = sample_primes(
#         bitlength_of_q - 10, 
#         bitlength_of_q, 
#         57, 
#         |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring. mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
#     ).unwrap();
#     return <CiphertextRing as RingStore>::Type::new(
#         number_ring,
#         zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING) as u64)). collect(), BigIntRing::RING)
#     );
# }
# fn create_plaintext_ring(ring_degree: usize, t: u64) -> PlaintextRing {
#     return <PlaintextRing as RingStore>::Type::new(
#         Pow2CyclotomicNumberRing::new(ring_degree * 2), 
#         zn_64::Zn::new(t)
#     ); 
# }
# fn key_gen(ciphertext_ring: &CiphertextRing) -> El<CiphertextRing> {
#     let mut rng = thread_rng();
#     let sk = ciphertext_ring.from_canonical_basis(
#         (0..ciphertext_ring.rank()).map(|_| ciphertext_ring.base_ring().int_hom().map(
#             (rng.next_u32() as i32 % 3) - 1
#         ))
#     );
#     return sk;
# }
# fn rlwe_sample(ciphertext_ring: &CiphertextRing, sk: &El<CiphertextRing>) -> (El<CiphertextRing>, El<CiphertextRing>) {
#     let mut rng = thread_rng();
#     let a = ciphertext_ring.random_element(|| rng.next_u64());
#     let e = ciphertext_ring.from_canonical_basis(
#         (0..ciphertext_ring.rank()).map(|_| ciphertext_ring.base_ring().int_hom().map(
#             (rng.sample::<f64, _>(StandardNormal) * 3.2).round() as i32
#         ))
#     );
#     return (ciphertext_ring.add(e, ciphertext_ring.mul_ref(&a, sk)), ciphertext_ring.negate(a));
# }
# fn enc_sym(
#     plaintext_ring: &PlaintextRing, 
#     ciphertext_ring: &CiphertextRing, 
#     x: &El<PlaintextRing>, 
#     sk: &El<CiphertextRing>
# ) -> (El<CiphertextRing>, El<CiphertextRing>) {
#     let (b, a) = rlwe_sample(ciphertext_ring, sk);
#     let q = ciphertext_ring.base_ring().modulus();
#     let t = int_cast(*plaintext_ring.base_ring().modulus(), BigIntRing::RING, StaticRing::<i64>::RING);
#     let Δ = ciphertext_ring.base_ring().can_hom(&BigIntRing::RING).unwrap().map(BigIntRing::RING.rounded_div(BigIntRing::RING.clone_el(q), &t));
#     let x_mod_q = ciphertext_ring.from_canonical_basis(
#         plaintext_ring.wrt_canonical_basis(x).iter().map(|c| ciphertext_ring.base_ring().int_hom().map(
#             plaintext_ring.base_ring().smallest_lift(c) as i32
#         ))
#     );
#     return (ciphertext_ring.add(ciphertext_ring.inclusion().mul_map(x_mod_q, Δ), b), a);
# }
# fn dec(
#     plaintext_ring: &PlaintextRing, 
#     ciphertext_ring: &CiphertextRing, 
#     ct: &(El<CiphertextRing>, El<CiphertextRing>), 
#     sk: &El<CiphertextRing>
# ) -> El<PlaintextRing> {
#     let decryption_with_noise = ciphertext_ring.add_ref_fst(&ct.0, ciphertext_ring.mul_ref(&ct.1, sk));
#     let t = int_cast(*plaintext_ring.base_ring().modulus(), BigIntRing::RING, StaticRing::<i64>::RING);
#     let modulo_t = plaintext_ring.base_ring().can_hom(&BigIntRing::RING).unwrap();
#     return plaintext_ring.from_canonical_basis(
#         ciphertext_ring.wrt_canonical_basis(&decryption_with_noise).iter().map(|c| modulo_t.map(BigIntRing::RING.rounded_div(
#             BigIntRing::RING.mul_ref_fst(&t, ciphertext_ring.base_ring().smallest_lift(c)),
#             ciphertext_ring.base_ring().modulus()
#         )))
#     );
# }
# fn hom_add(
#     ciphertext_ring: &CiphertextRing, 
#     lhs: &(El<CiphertextRing>, El<CiphertextRing>), 
#     rhs: &(El<CiphertextRing>, El<CiphertextRing>)
# ) -> (El<CiphertextRing>, El<CiphertextRing>) {
#     return (ciphertext_ring.add_ref(&lhs.0, &rhs.0), ciphertext_ring.add_ref(&lhs.1, &rhs.1));
# }
# fn create_multiplication_ring(ciphertext_ring: &CiphertextRing) -> CiphertextRing {
#     let number_ring = ciphertext_ring.get_ring().number_ring().clone();
#     let rns_factors = extend_sampled_primes(
#         &ciphertext_ring.base_ring().as_iter().map(|RNS_factor| int_cast(*RNS_factor.modulus(), BigIntRing::RING, StaticRing::<i64>::RING)).collect::<Vec<_>>(),
#         BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 10, 
#         BigIntRing::RING.abs_log2_ceil(ciphertext_ring.base_ring().modulus()).unwrap() * 2 + StaticRing::<i64>::RING.abs_log2_ceil(&(number_ring.rank() as i64)).unwrap() + 67, 
#         57, 
#         |bound| largest_prime_leq_congruent_to_one(int_cast(bound, StaticRing::<i64>::RING, BigIntRing::RING), number_ring.mod_p_required_root_of_unity() as i64).map(|p| int_cast(p, BigIntRing::RING, StaticRing::<i64>::RING))
#     ).unwrap().into_iter().map(|p| 
#         int_cast(p, StaticRing::<i64>::RING, BigIntRing::RING)
#     ).collect::<Vec<_>>();
#     return <CiphertextRing as RingStore>::Type::new(
#         number_ring,
#         zn_rns::Zn::new(rns_factors.into_iter().map(|p| zn_64::Zn::new(p as u64)).collect(), BigIntRing::RING)
#     );
# }
# fn hom_mul_three_component(
#     plaintext_ring: &PlaintextRing, 
#     ciphertext_ring: &CiphertextRing, 
#     multiplication_ring: &CiphertextRing,
#     lhs: &(SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>), 
#     rhs: &(SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>)
# ) -> (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>) {
#     let (c0, c1) = (&lhs.0, &lhs.1);
#     let (c0_prime, c1_prime) = (&rhs.0, &rhs.1);
#     let lift_to_multiplication_ring_rnsconv = AlmostExactBaseConversion::new_with(
#         ciphertext_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(), 
#         multiplication_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(),
#         Global
#     );
#     debug_assert!(lift_to_multiplication_ring_rnsconv.input_rings().iter().zip(ciphertext_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
#     debug_assert!(lift_to_multiplication_ring_rnsconv.output_rings().iter().zip(multiplication_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
#     let lift_to_multiplication_ring = |x: &SmallBasisEl<_, _>| {
#         let mut result = multiplication_ring.get_ring().zero_non_fft();
#         lift_to_multiplication_ring_rnsconv.apply(ciphertext_ring.get_ring().as_matrix_wrt_small_basis(&x), multiplication_ring.get_ring().as_matrix_wrt_small_basis_mut(&mut result));
#         return multiplication_ring.get_ring().do_fft(result);
#     };
#     let unscaled_result = (
#         multiplication_ring.mul(lift_to_multiplication_ring(&c0), lift_to_multiplication_ring(&c0_prime)),
#         multiplication_ring.add(
#             multiplication_ring.mul(lift_to_multiplication_ring(&c0), lift_to_multiplication_ring(&c1_prime)),
#             multiplication_ring.mul(lift_to_multiplication_ring(&c1), lift_to_multiplication_ring(&c0_prime))
#         ),
#         multiplication_ring.mul(lift_to_multiplication_ring(&c1), lift_to_multiplication_ring(&c1_prime))
#     );
#     let scale_down_rnsconv = AlmostExactRescalingConvert::new_with(
#         multiplication_ring.base_ring().as_iter().map(|Zp| zn_64::Zn::new(*Zp.modulus() as u64)).collect::<Vec<_>>(), 
#         vec![ zn_64::Zn::new(*plaintext_ring.base_ring().modulus() as u64) ], 
#         ciphertext_ring.base_ring().as_iter().map(|Zp| multiplication_ring.base_ring().as_iter().position(|Zp2| Zp2.modulus() == Zp.modulus()).unwrap()).collect::<Vec<_>>(),
#         Global
#     );
#     debug_assert!(scale_down_rnsconv.input_rings().iter().zip(multiplication_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
#     debug_assert!(scale_down_rnsconv.output_rings().iter().zip(ciphertext_ring.base_ring().as_iter()).all(|(lhs, rhs)| lhs.get_ring() == rhs.get_ring()));
#     let scale_down = |x: El<CiphertextRing>| {
#         let mut result = ciphertext_ring.get_ring().zero_non_fft();
#         scale_down_rnsconv.apply(multiplication_ring.get_ring().as_matrix_wrt_small_basis(&multiplication_ring.get_ring().undo_fft(x)), ciphertext_ring.get_ring().as_matrix_wrt_small_basis_mut(&mut result));
#         return result;
#     };
#     return (
#         scale_down(unscaled_result.0),
#         scale_down(unscaled_result.1),
#         scale_down(unscaled_result.2)
#     );
# }
# fn gen_relin_key(
#     ciphertext_ring: &CiphertextRing, 
#     sk: &El<CiphertextRing>, 
#     digits: usize
# ) -> RelinKey {
#     let sk_sqr = ciphertext_ring.pow(ciphertext_ring.clone_el(sk), 2);
#     let mut result0 = GadgetProductRhsOperand::new(ciphertext_ring.get_ring(), digits);
#     let mut result1 = GadgetProductRhsOperand::new(ciphertext_ring.get_ring(), digits);
#     let modulo_q = ciphertext_ring.base_ring().can_hom(&BigIntRing::RING).unwrap();
#     
#     let gadget_vector_len = result0.gadget_vector(ciphertext_ring.get_ring()).len();
#     for i in 0..gadget_vector_len {
#         let (b, a) = rlwe_sample(ciphertext_ring, sk);
#         let factor = result0.gadget_vector(ciphertext_ring.get_ring()).at(i);
#         let (key0, key1) = (ciphertext_ring.add(b, ciphertext_ring.inclusion().mul_ref_map(&sk_sqr, &factor)), a);
#         result0.set_rns_factor(ciphertext_ring.get_ring(), i, key0);
#         result1.set_rns_factor(ciphertext_ring.get_ring(), i, key1);
#     }
#     return (result0, result1);
# }
# fn relinearize(
#     ciphertext_ring: &CiphertextRing, 
#     three_component_ciphertext: (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>), 
#     relin_key: &RelinKey
# ) -> (SmallBasisEl<Pow2CyclotomicNumberRing>, SmallBasisEl<Pow2CyclotomicNumberRing>) {
#     let c2_decomposition = GadgetProductLhsOperand::from_double_rns_ring_with(ciphertext_ring.get_ring(), &three_component_ciphertext.2, relin_key.0.gadget_vector_digits());
#     let mut result0 = three_component_ciphertext.0;
#     ciphertext_ring.get_ring().add_assign_non_fft(&mut result0, &ciphertext_ring.get_ring().undo_fft(c2_decomposition.gadget_product(&relin_key.0, ciphertext_ring.get_ring())));
#     let mut result1 = three_component_ciphertext.1;
#     ciphertext_ring.get_ring().add_assign_non_fft(&mut result1, &ciphertext_ring.get_ring().undo_fft(c2_decomposition.gadget_product(&relin_key.1, ciphertext_ring.get_ring())));
#     return (result0, result1);
# }
let C = create_ciphertext_ring(1 << 8, 100);
let C_mul = create_multiplication_ring(&C);
let P = create_plaintext_ring(1 << 8, 5);
let digits = 2;
let sk = key_gen(&C);
let relin_key = gen_relin_key(&C, &sk, digits);

let message = P.int_hom().map(2);
let ciphertext = enc_sym(&P, &C, &message, &sk);

// we now have to explicity change from double-RNS to small-basis representation to use hom-mul
let ciphertext = (C.get_ring().undo_fft(ciphertext.0), C.get_ring().undo_fft(ciphertext.1));
let ciphertext_sqr = hom_mul_three_component(&P, &C, &C_mul, &ciphertext, &ciphertext);
let ciphertext_sqr_relin = relinearize(&C, ciphertext_sqr, &relin_key);

// finally, we have to explicitly change representation back
let ciphertext_sqr_relin = (C.get_ring().do_fft(ciphertext_sqr_relin.0), C.get_ring().do_fft(ciphertext_sqr_relin.1));
let result = dec(&P, &C, &ciphertext_sqr_relin, &sk);
assert_el_eq!(&P, P.pow(message, 2), result);
```

## Performance

Plugging in larger values in the above example, I find that on my desktop PC, homomorphic multiplication for `N = 16384` and `q ~ 2^100` takes only 50 ms!
This is clearly an impressive speedup compared to the 7.7 seconds we got in the first example [`crate::examples::bfv_impl_v1`].
In fact, a little more benchmarking shows that our implementation almost matches state-of-the-art HE libraries!
For example, for `N = 32768` and `q ~ 800`, I get 0.88 seconds, which is close to the 0.5 seconds that SEAL version 4.1.2 takes on my system.

Nevertheless, there does remain some optimization potential:
 - Currently, the native NTT of HE-Ring is the same as in `feanor-math`, and it is not quite as fast as others.
   You can use the HEXL library (using the `feanor-math-hexl` library), which will give you an even faster NTT!
 - Apart from multiplication, decryption can also profit from a careful use of RNS conversions.
 - When doing the first RNS conversion `q -> qq'` during multiplication, note that the `mod q` part of the result is the same as the input.
   Hence, one can replace [`crate::rnsconv::lift::AlmostExactBaseConversion`] with [`crate::rnsconv::shared_lift::AlmostExactSharedBaseConversion`] which avoids recomputing these values and thus is slightly faster.

Implementing these points is left as an exercise.