wolfssl-sys 4.0.0

System bindings for WolfSSL
# Intel QuickAssist Adapter Asynchronous Support

The wolfSSL / wolfCrypt libraries support hardware crypto acceleration using the Intel QuickAssist adapter. This software has been tested using the Intel DH8970 and DH8950 QuickAssist adapters.

## Overview

wolfCrypt supports QuickAssist acceleration for RSA public/private (CRT/non-CRT), AES CBC/GCM, ECDH/ECDSA, DH, DES3, SHA, SHA224, SHA256, SHA384, SHA512, MD5 and HMAC. RSA padding is done in software. The wolfCrypt tests and benchmarks have asynchronous support, and the wolfCrypt benchmark tool supports multi-threading. The wolfSSL SSL/TLS async support has been extended to include all PKI, encryption/decryption and hashing/HMAC. An async hardware simulator has been added to test the asynchronous support without hardware.

The Intel QuickAssist port files are located in `wolfcrypt/src/port/intel/quickassist.c` and `wolfssl/wolfcrypt/port/intel/quickassist.h`. The QuickAssist memory handling for NUMA and normal malloc is in `wolfcrypt/src/port/intel/quickassist_mem.c`.

The asynchronous crypto files are located at `wolfcrypt/src/async.c` and `wolfssl/wolfcrypt/async.h`. These files are not in the public repository. Please contact info@wolfssl.com if interested in our asynchronous support to request an evaluation.


## Building

1. Download Driver: The latest driver for QAT can be found here: https://www.intel.com/content/www/us/en/download/19734

2. Notes:

* If you have the older driver installed you may need to remove it or unload the module and reboot.
* If you are using QAT hardware hashing, you may need to disable the parameter checking, which doesn't support a last partial with a 0 length source input. The code runs and works, but the parameter check will fail. Use `./configure --disable-param-check && sudo make install`.
* If you want to use legacy algorithms such as 1024-bit RSA, use `./configure --enable-legacy-algorithms`.
* We recommend not using `make -j` due to synchronization issues on dependencies.

3. Set up `QAT` and `wolfssl` next to each other in the same folder.

4. Build QAT Driver

Prerequisites Ubuntu:
`sudo apt-get install libudev-dev pciutils-dev g++ pkg-config libssl-dev`
OR
Prerequisites CentOS:
`sudo yum install pciutils libudev-devel kernel-devel-$(uname -r) gcc openssl-devel`
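The choice between the two prerequisite commands can be scripted. The following is a minimal sketch and is not part of wolfSSL or the QAT driver: the function names `qat_prereq_cmd` and `detect_pkg_manager` are hypothetical, and the package lists are copied from the commands above.

```sh
# Print the prerequisite install command for a given package manager.
# The package lists are the ones given in the prerequisites above.
qat_prereq_cmd() {
    case "$1" in
        apt-get)
            echo "sudo apt-get install libudev-dev pciutils-dev g++ pkg-config libssl-dev"
            ;;
        yum)
            echo "sudo yum install pciutils libudev-devel kernel-devel-$(uname -r) gcc openssl-devel"
            ;;
        *)
            echo "unsupported package manager: $1" >&2
            return 1
            ;;
    esac
}

# Report which of the two package managers is available (apt-get is checked first).
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo apt-get
    elif command -v yum >/dev/null 2>&1; then
        echo yum
    else
        return 1
    fi
}
```

Running `qat_prereq_cmd "$(detect_pkg_manager)"` prints the command for the current system without executing it, so it can be reviewed before it is run.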

```sh
mkdir QAT
cd QAT
tar -zxof QAT.L.4.23.0-00001.tar.gz
./configure
sudo make install
...
There is 3 QAT acceleration device(s) in the system:
 qat_dev0 - type: c6xx,  inst_id: 0,  node_id: 1,  bsf: 0000:84:00.0,  #accel: 5 #engines: 10 state: up
 qat_dev1 - type: c6xx,  inst_id: 1,  node_id: 1,  bsf: 0000:85:00.0,  #accel: 5 #engines: 10 state: up
 qat_dev2 - type: c6xx,  inst_id: 2,  node_id: 1,  bsf: 0000:86:00.0,  #accel: 5 #engines: 10 state: up
```

```sh
$ lspci -d 8086: | grep QuickAssist
84:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
85:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
86:00.0 Co-processor: Intel Corporation C62x Chipset QuickAssist Technology (rev 04)
```
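As a quick sanity check, the number of devices reported as `state: up` can be compared with the number of QuickAssist devices that `lspci` lists. This is a minimal sketch that assumes the output formats shown above; the helper names `qat_devices_up` and `qat_lspci_count` are hypothetical. Both helpers read from stdin, so they make no assumption about which command produced the device listing.

```sh
# Count devices in the driver's device listing that report "state: up".
qat_devices_up() {
    awk '/qat_dev[0-9]+/ && /state: up/ { n++ } END { print n + 0 }'
}

# Count QuickAssist entries in lspci output.
qat_lspci_count() {
    awk '/QuickAssist/ { n++ } END { print n + 0 }'
}
```

For example, `lspci -d 8086: | qat_lspci_count` gives the number of devices on the PCI bus. If the driver listing shows fewer devices up than that, some devices did not start.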

5. Build wolfSSL:

```sh
cd ../wolfssl
./configure --with-intelqa=../QAT --enable-asynccrypt
make
```


## Usage

Running wolfCrypt test and benchmark must be done with `sudo` to allow hardware access. By default the QuickAssist code uses the "SSL" process name via `QAT_PROCESS_NAME` in quickassist.h to match up to the hardware configuration.

Note: `sudo make check` will fail because the default QAT configuration doesn't allow multiple concurrent processes to use the hardware. You can run each of the `make check` scripts individually with `sudo`. To allow multiple processes, customize the hardware configuration by editing the `QAT/build/dh895xcc_qa_dev0.conf` file.
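Before editing the device configuration it can help to look at the section that matches the process name. The sketch below assumes the file is INI-style, with a section header such as `[SSL]` corresponding to `QAT_PROCESS_NAME`; the helper name `qat_show_section` is hypothetical. Which keys inside that section control the process count depends on the driver version, so check the driver documentation before changing them.

```sh
# Print one section of a QAT device config file.
# $1: path to the config file
# $2: section name (defaults to "SSL", the default process name)
qat_show_section() {
    conf="$1"
    name="${2:-SSL}"
    awk -v sec="[$name]" '
        /^\[/ { in_sec = ($0 == sec) }   # every section header turns printing on or off
        in_sec { print }
    ' "$conf"
}
```

For example, `qat_show_section QAT/build/dh895xcc_qa_dev0.conf SSL` prints only the `[SSL]` section of the file named above.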

Here are some build options for tuning your use:

1. `QAT_USE_POLLING_CHECK`: Enables polling check to ensure only one poll per crypto instance.
2. `WC_ASYNC_THREAD_BIND`: Enables binding of thread to crypto hardware instance.
3. `WOLFSSL_DEBUG_MEMORY_PRINT`: Enables verbose malloc/free printing. This option is used along with `WOLFSSL_DEBUG_MEMORY` and `WOLFSSL_TRACK_MEMORY`.
4. `WC_ASYNC_THRESH_NONE`: Disables the default thresholds for determining if software AES/DES3 is used. Otherwise you can define `WC_ASYNC_THRESH_AES_CBC`, `WC_ASYNC_THRESH_AES_GCM` and `WC_ASYNC_THRESH_DES3_CBC` with your own values. The defaults are AES CBC: 1024, AES GCM 128, DES3 1024. If the symmetric operation is over this size it will use QAT hardware. Otherwise software.
5. `WC_ASYNC_NO_CRYPT`: When defined, disables QAT use for AES/DES3.
6. `WC_ASYNC_NO_HASH`: When defined, disables QAT use for hashing (MD5, SHA, SHA256, SHA512).
7. `WC_ASYNC_NO_RNG`: When defined, disables the QAT DRBG (default for QAT v1.7).
8. `WC_NO_ASYNC_THREADING`: Disables the thread affinity code for optionally linking a thread to a specific QAT instance. To use the thread affinity feature you must also define `WC_ASYNC_THREAD_BIND`.
9. `WC_ASYNC_BENCH_THREAD_COUNT`: Use specific number of threads for benchmarking.
10. `QAT_HASH_ENABLE_PARTIAL`: Enables partial hashing support, which allows sending blocks to hardware prior to final. Otherwise all hash updates are cached.
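These options are compile-time defines, so they are passed to `./configure` through `CFLAGS`. The sketch below assembles such a string from a list of defines. The helper name `qat_cflags` is hypothetical and the threshold value 4096 is only an illustrative choice, not a recommendation.

```sh
# Build a CFLAGS string from a list of NAME or NAME=VALUE defines.
qat_cflags() {
    flags=""
    for def in "$@"; do
        # Prefix each define with -D and separate entries with a single space.
        flags="${flags:+$flags }-D$def"
    done
    printf '%s\n' "$flags"
}

# Example: set the AES-CBC threshold to 4096 bytes and disable QAT hashing.
QAT_CFLAGS=$(qat_cflags WC_ASYNC_THRESH_AES_CBC=4096 WC_ASYNC_NO_HASH)

# Print the resulting configure command for review instead of running it.
echo "./configure --with-intelqa=../QAT --enable-asynccrypt CFLAGS=\"$QAT_CFLAGS\""
```

Printing the command rather than executing it keeps the sketch free of side effects. Drop the `echo` wrapper to run the configure step directly.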

The QuickAssist v1.6 driver uses its own memory management system in `quickassist_mem.c`. This can be tuned using the following defines:

1. `USE_QAE_STATIC_MEM`: Uses a global pool for the list of allocations. This improves performance, but consumes extra up front memory. The pre-allocation size can be tuned using `QAE_USER_MEM_MAX_COUNT`.
2. `USE_QAE_THREAD_LS`: Uses thread-local storage and removes the mutex. This can improve performance in a multi-threaded environment, but it uses extra memory.

For QuickAssist v1.7 or later the newer usdm memory driver is used directly.

### Recommended wolfSSL Build Options

```sh
$ ./configure --with-intelqa=../QAT --enable-asynccrypt \
    --enable-aesni --enable-intelasm \
    --enable-sp --enable-sp-asm \
    CFLAGS="-DWC_ASYNC_NO_HASH"
```

* `--with-intelqa=../QAT`: Enables the Intel QuickAssist mode.
* `--enable-asynccrypt`: Enables asynchronous cryptography mode.
* `--enable-aesni`: Enables the Intel AES-NI assembly speedups.
* `--enable-intelasm`: Enables the Intel ASM (AVX/AVX2) speedups.
* `--enable-sp`: Enables Single Precision math to speed up standard key sizes and curves.
* `--enable-sp-asm`: Enable Single Precision assembly speedups.
* `WC_ASYNC_NO_HASH`: Disable the QAT hashing and use Intel AVX accelerated software hashing. Overhead for using QAT hashing is not yet well tuned.


### wolfCrypt Test with QAT
```
sudo ./wolfcrypt/test/testwolfcrypt
IntelQA: Instances 2
...
RSA      test passed!
```

### wolfCrypt Benchmark with QAT 8970 (multi-threaded)

Multiple concurrent threads will be started based on the number of CPUs available. If you want to exclude the software benchmarks use `./configure CFLAGS="-DNO_SW_BENCH"`.

```
Intel QuickAssist DH8950 on Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz:

Recommended wolfSSL build options when benchmarking.
$ ./configure --enable-sp --enable-sp-asm --enable-aesni --enable-intelasm --enable-intelrand --enable-keygen --enable-sha3 --enable-asynccrypt --with-intelqa=../QAT CFLAGS="-DWC_ASYNC_THRESH_NONE -DQAT_MAX_PENDING=40 -DWC_ASYNC_BENCH_THREAD_COUNT=2"
$ make

$ sudo ./wolfcrypt/benchmark/benchmark -rsa_sign -base10 -threads 2 -print
------------------------------------------------------------------------------
 wolfSSL version 4.5.0
------------------------------------------------------------------------------
IntelQA: Instances 18
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
CPUs: 2
RNG             SW    79 mB took 1.030 seconds,   76.388 mB/s Cycles per byte =  32.65
RNG             SW    79 mB took 1.042 seconds,   75.456 mB/s Cycles per byte =  33.05
AES-128-CBC-enc SW   729 mB took 1.006 seconds,  724.266 mB/s Cycles per byte =   3.44
AES-128-CBC-enc SW   729 mB took 1.007 seconds,  723.825 mB/s Cycles per byte =   3.45
AES-128-CBC-dec SW  5185 mB took 1.000 seconds, 5184.260 mB/s Cycles per byte =   0.48
AES-128-CBC-dec SW  5190 mB took 1.000 seconds, 5189.351 mB/s Cycles per byte =   0.48
AES-192-CBC-enc SW   608 mB took 1.003 seconds,  606.175 mB/s Cycles per byte =   4.11
AES-192-CBC-enc SW   608 mB took 1.004 seconds,  605.855 mB/s Cycles per byte =   4.12
AES-192-CBC-dec SW  4325 mB took 1.000 seconds, 4325.333 mB/s Cycles per byte =   0.58
AES-192-CBC-dec SW  4331 mB took 1.001 seconds, 4325.809 mB/s Cycles per byte =   0.58
AES-256-CBC-enc SW   524 mB took 1.005 seconds,  521.465 mB/s Cycles per byte =   4.78
AES-256-CBC-enc SW   524 mB took 1.006 seconds,  521.190 mB/s Cycles per byte =   4.79
AES-256-CBC-dec SW  3707 mB took 1.000 seconds, 3705.767 mB/s Cycles per byte =   0.67
AES-256-CBC-dec SW  3707 mB took 1.001 seconds, 3703.024 mB/s Cycles per byte =   0.67
AES-128-CBC-enc HW  2443 mB took 1.000 seconds, 2442.819 mB/s Cycles per byte =   1.02
AES-128-CBC-enc HW  2443 mB took 1.000 seconds, 2442.770 mB/s Cycles per byte =   1.02
AES-128-CBC-dec HW  2380 mB took 1.001 seconds, 2378.716 mB/s Cycles per byte =   1.05
AES-128-CBC-dec HW  2380 mB took 1.001 seconds, 2378.657 mB/s Cycles per byte =   1.05
AES-192-CBC-enc HW  2365 mB took 1.002 seconds, 2359.520 mB/s Cycles per byte =   1.06
AES-192-CBC-enc HW  2365 mB took 1.002 seconds, 2359.471 mB/s Cycles per byte =   1.06
AES-192-CBC-dec HW  2417 mB took 1.002 seconds, 2411.874 mB/s Cycles per byte =   1.03
AES-192-CBC-dec HW  2417 mB took 1.002 seconds, 2411.831 mB/s Cycles per byte =   1.03
AES-256-CBC-enc HW  2223 mB took 1.001 seconds, 2221.082 mB/s Cycles per byte =   1.12
AES-256-CBC-enc HW  2218 mB took 1.001 seconds, 2215.793 mB/s Cycles per byte =   1.13
AES-256-CBC-dec HW  2113 mB took 1.002 seconds, 2108.506 mB/s Cycles per byte =   1.18
AES-256-CBC-dec HW  2113 mB took 1.002 seconds, 2108.354 mB/s Cycles per byte =   1.18
AES-128-GCM-enc SW  1919 mB took 1.001 seconds, 1916.366 mB/s Cycles per byte =   1.30
AES-128-GCM-enc SW  2595 mB took 1.001 seconds, 2591.465 mB/s Cycles per byte =   0.96
AES-128-GCM-dec SW  2611 mB took 1.000 seconds, 2610.093 mB/s Cycles per byte =   0.96
AES-128-GCM-dec SW  2218 mB took 1.002 seconds, 2213.073 mB/s Cycles per byte =   1.13
AES-192-GCM-enc SW  2317 mB took 1.001 seconds, 2315.896 mB/s Cycles per byte =   1.08
AES-192-GCM-enc SW  2286 mB took 1.002 seconds, 2281.953 mB/s Cycles per byte =   1.09
AES-192-GCM-dec SW  2207 mB took 1.001 seconds, 2206.098 mB/s Cycles per byte =   1.13
AES-192-GCM-dec SW  1589 mB took 1.002 seconds, 1586.020 mB/s Cycles per byte =   1.57
AES-256-GCM-enc SW  2071 mB took 1.001 seconds, 2069.342 mB/s Cycles per byte =   1.21
AES-256-GCM-enc SW  2108 mB took 1.002 seconds, 2103.268 mB/s Cycles per byte =   1.19
AES-256-GCM-dec SW  2108 mB took 1.001 seconds, 2105.715 mB/s Cycles per byte =   1.18
AES-256-GCM-dec SW  2108 mB took 1.002 seconds, 2103.563 mB/s Cycles per byte =   1.19
AES-128-GCM-enc HW  2427 mB took 1.002 seconds, 2422.522 mB/s Cycles per byte =   1.03
AES-128-GCM-enc HW  2433 mB took 1.002 seconds, 2427.722 mB/s Cycles per byte =   1.03
AES-128-GCM-dec HW  1861 mB took 1.001 seconds, 1860.039 mB/s Cycles per byte =   1.34
AES-128-GCM-dec HW  1861 mB took 1.001 seconds, 1860.019 mB/s Cycles per byte =   1.34
AES-192-GCM-enc HW  2380 mB took 1.000 seconds, 2379.218 mB/s Cycles per byte =   1.05
AES-192-GCM-enc HW  2386 mB took 1.000 seconds, 2384.418 mB/s Cycles per byte =   1.05
AES-192-GCM-dec HW  1971 mB took 1.002 seconds, 1966.480 mB/s Cycles per byte =   1.27
AES-192-GCM-dec HW  1971 mB took 1.002 seconds, 1966.458 mB/s Cycles per byte =   1.27
AES-256-GCM-enc HW  2254 mB took 1.002 seconds, 2249.535 mB/s Cycles per byte =   1.11
AES-256-GCM-enc HW  2254 mB took 1.002 seconds, 2249.487 mB/s Cycles per byte =   1.11
AES-256-GCM-dec HW  1746 mB took 1.001 seconds, 1744.049 mB/s Cycles per byte =   1.43
AES-256-GCM-dec HW  1746 mB took 1.001 seconds, 1744.018 mB/s Cycles per byte =   1.43
CHACHA          SW  1478 mB took 1.000 seconds, 1478.220 mB/s Cycles per byte =   1.69
CHACHA          SW  1347 mB took 1.003 seconds, 1342.833 mB/s Cycles per byte =   1.86
CHA-POLY        SW   949 mB took 1.002 seconds,  946.915 mB/s Cycles per byte =   2.63
CHA-POLY        SW   949 mB took 1.005 seconds,  944.670 mB/s Cycles per byte =   2.64
MD5             SW   603 mB took 1.003 seconds,  601.383 mB/s Cycles per byte =   4.15
MD5             SW   613 mB took 1.005 seconds,  610.413 mB/s Cycles per byte =   4.09
MD5             HW   409 mB took 1.002 seconds,  408.088 mB/s Cycles per byte =   6.11
MD5             HW   409 mB took 1.003 seconds,  407.845 mB/s Cycles per byte =   6.12
POLY1305        SW  2621 mB took 1.000 seconds, 2620.709 mB/s Cycles per byte =   0.95
POLY1305        SW  2616 mB took 1.001 seconds, 2613.824 mB/s Cycles per byte =   0.95
SHA             SW   377 mB took 1.003 seconds,  376.342 mB/s Cycles per byte =   6.63
SHA             SW   383 mB took 1.011 seconds,  378.592 mB/s Cycles per byte =   6.59
SHA             HW   535 mB took 1.005 seconds,  531.941 mB/s Cycles per byte =   4.69
SHA             HW   535 mB took 1.006 seconds,  531.644 mB/s Cycles per byte =   4.69
SHA-224         SW   351 mB took 1.010 seconds,  347.715 mB/s Cycles per byte =   7.17
SHA-224         SW   351 mB took 1.014 seconds,  346.285 mB/s Cycles per byte =   7.20
SHA-224         HW   414 mB took 1.012 seconds,  409.434 mB/s Cycles per byte =   6.09
SHA-224         HW   419 mB took 1.012 seconds,  414.387 mB/s Cycles per byte =   6.02
SHA-256         SW   351 mB took 1.011 seconds,  347.292 mB/s Cycles per byte =   7.18
SHA-256         SW   315 mB took 1.013 seconds,  310.424 mB/s Cycles per byte =   8.03
SHA-256         HW   419 mB took 1.004 seconds,  417.688 mB/s Cycles per byte =   5.97
SHA-256         HW   419 mB took 1.005 seconds,  417.427 mB/s Cycles per byte =   5.98
SHA-384         SW   530 mB took 1.001 seconds,  529.040 mB/s Cycles per byte =   4.71
SHA-384         SW   530 mB took 1.003 seconds,  528.139 mB/s Cycles per byte =   4.72
SHA-384         HW   357 mB took 1.001 seconds,  356.156 mB/s Cycles per byte =   7.00
SHA-384         HW   367 mB took 1.010 seconds,  363.498 mB/s Cycles per byte =   6.86
SHA-512         SW   530 mB took 1.002 seconds,  528.589 mB/s Cycles per byte =   4.72
SHA-512         SW   446 mB took 1.009 seconds,  441.540 mB/s Cycles per byte =   5.65
SHA-512         HW   367 mB took 1.004 seconds,  365.434 mB/s Cycles per byte =   6.83
SHA-512         HW   367 mB took 1.005 seconds,  365.224 mB/s Cycles per byte =   6.83
SHA3-224        SW   236 mB took 1.014 seconds,  232.784 mB/s Cycles per byte =  10.71
SHA3-224        SW   236 mB took 1.018 seconds,  231.794 mB/s Cycles per byte =  10.76
SHA3-224        HW   220 mB took 1.006 seconds,  218.860 mB/s Cycles per byte =  11.40
SHA3-224        HW   236 mB took 1.015 seconds,  232.538 mB/s Cycles per byte =  10.73
SHA3-256        SW   163 mB took 1.000 seconds,  162.463 mB/s Cycles per byte =  15.35
SHA3-256        SW   225 mB took 1.023 seconds,  220.278 mB/s Cycles per byte =  11.32
SHA3-256        HW   692 mB took 1.004 seconds,  689.291 mB/s Cycles per byte =   3.62
SHA3-256        HW   692 mB took 1.007 seconds,  687.092 mB/s Cycles per byte =   3.63
SHA3-384        SW   173 mB took 1.022 seconds,  169.214 mB/s Cycles per byte =  14.74
SHA3-384        SW   173 mB took 1.024 seconds,  168.878 mB/s Cycles per byte =  14.77
SHA3-384        HW   173 mB took 1.023 seconds,  169.202 mB/s Cycles per byte =  14.74
SHA3-384        HW   173 mB took 1.024 seconds,  168.948 mB/s Cycles per byte =  14.76
SHA3-512        SW   121 mB took 1.026 seconds,  117.548 mB/s Cycles per byte =  21.22
SHA3-512        SW   121 mB took 1.027 seconds,  117.375 mB/s Cycles per byte =  21.25
SHA3-512        HW   121 mB took 1.026 seconds,  117.585 mB/s Cycles per byte =  21.21
SHA3-512        HW   121 mB took 1.028 seconds,  117.335 mB/s Cycles per byte =  21.26
HMAC-MD5        SW   608 mB took 1.000 seconds,  608.096 mB/s Cycles per byte =   4.10
HMAC-MD5        SW   613 mB took 1.004 seconds,  611.102 mB/s Cycles per byte =   4.08
HMAC-MD5        HW   414 mB took 1.001 seconds,  413.762 mB/s Cycles per byte =   6.03
HMAC-MD5        HW   414 mB took 1.004 seconds,  412.554 mB/s Cycles per byte =   6.05
HMAC-SHA        SW   383 mB took 1.011 seconds,  378.446 mB/s Cycles per byte =   6.59
HMAC-SHA        SW   383 mB took 1.013 seconds,  377.729 mB/s Cycles per byte =   6.60
HMAC-SHA        HW   535 mB took 1.008 seconds,  530.760 mB/s Cycles per byte =   4.70
HMAC-SHA        HW   514 mB took 1.009 seconds,  509.292 mB/s Cycles per byte =   4.90
HMAC-SHA224     SW   267 mB took 1.008 seconds,  265.316 mB/s Cycles per byte =   9.40
HMAC-SHA224     SW   351 mB took 1.012 seconds,  346.982 mB/s Cycles per byte =   7.19
HMAC-SHA224     HW   404 mB took 1.003 seconds,  402.579 mB/s Cycles per byte =   6.20
HMAC-SHA224     HW   393 mB took 1.011 seconds,  388.951 mB/s Cycles per byte =   6.41
HMAC-SHA256     SW   294 mB took 1.007 seconds,  291.426 mB/s Cycles per byte =   8.56
HMAC-SHA256     SW   351 mB took 1.012 seconds,  347.205 mB/s Cycles per byte =   7.18
HMAC-SHA256     HW   419 mB took 1.004 seconds,  417.677 mB/s Cycles per byte =   5.97
HMAC-SHA256     HW   419 mB took 1.009 seconds,  415.514 mB/s Cycles per byte =   6.00
HMAC-SHA384     SW   530 mB took 1.002 seconds,  528.479 mB/s Cycles per byte =   4.72
HMAC-SHA384     SW   530 mB took 1.007 seconds,  526.093 mB/s Cycles per byte =   4.74
HMAC-SHA384     HW   367 mB took 1.004 seconds,  365.498 mB/s Cycles per byte =   6.82
HMAC-SHA384     HW   367 mB took 1.006 seconds,  364.878 mB/s Cycles per byte =   6.84
HMAC-SHA512     SW   530 mB took 1.002 seconds,  528.616 mB/s Cycles per byte =   4.72
HMAC-SHA512     SW   530 mB took 1.006 seconds,  526.513 mB/s Cycles per byte =   4.74
HMAC-SHA512     HW   367 mB took 1.003 seconds,  365.816 mB/s Cycles per byte =   6.82
HMAC-SHA512     HW   367 mB took 1.007 seconds,  364.560 mB/s Cycles per byte =   6.84
RSA     1024 key gen   SW     40 ops took 1.191 sec, avg 29.780 ms, 33.580 ops/sec
RSA     1024 key gen   SW     40 ops took 1.428 sec, avg 35.694 ms, 28.016 ops/sec
RSA     2048 key gen   SW     40 ops took 4.154 sec, avg 103.853 ms, 9.629 ops/sec
RSA     2048 key gen   SW     40 ops took 5.687 sec, avg 142.172 ms, 7.034 ops/sec
RSA     1024 key gen   HW    120 ops took 1.064 sec, avg 8.866 ms, 112.790 ops/sec
RSA     1024 key gen   HW    120 ops took 1.072 sec, avg 8.932 ms, 111.953 ops/sec
RSA     2048 key gen   HW     40 ops took 1.389 sec, avg 34.717 ms, 28.804 ops/sec
RSA     2048 key gen   HW     40 ops took 1.437 sec, avg 35.935 ms, 27.828 ops/sec
RSA     2048 sign      SW   1000 ops took 1.046 sec, avg 1.046 ms, 956.197 ops/sec
RSA     2048 sign      SW   1000 ops took 1.052 sec, avg 1.052 ms, 950.320 ops/sec
RSA     2048 verify    SW  32300 ops took 1.001 sec, avg 0.031 ms, 32271.670 ops/sec
RSA     2048 verify    SW  32200 ops took 1.003 sec, avg 0.031 ms, 32117.110 ops/sec
RSA     2048 sign      HW  12300 ops took 1.001 sec, avg 0.081 ms, 12288.056 ops/sec
RSA     2048 sign      HW  19600 ops took 1.003 sec, avg 0.051 ms, 19537.967 ops/sec
RSA     2048 verify    HW 116000 ops took 1.000 sec, avg 0.009 ms, 115971.935 ops/sec
RSA     2048 verify    HW 118000 ops took 1.000 sec, avg 0.008 ms, 117962.707 ops/sec
DH      2048 key gen   SW   2080 ops took 1.000 sec, avg 0.481 ms, 2079.830 ops/sec
DH      2048 key gen   SW   2120 ops took 1.016 sec, avg 0.479 ms, 2086.548 ops/sec
DH      2048 agree     SW   2100 ops took 1.023 sec, avg 0.487 ms, 2053.478 ops/sec
DH      2048 agree     SW   2100 ops took 1.026 sec, avg 0.489 ms, 2046.644 ops/sec
DH      2048 key gen   HW  43720 ops took 1.000 sec, avg 0.023 ms, 43712.257 ops/sec
DH      2048 key gen   HW  43320 ops took 1.000 sec, avg 0.023 ms, 43299.560 ops/sec
DH      2048 agree     HW  32500 ops took 1.001 sec, avg 0.031 ms, 32471.874 ops/sec
DH      2048 agree     HW  39400 ops took 1.001 sec, avg 0.025 ms, 39351.757 ops/sec
ECC      256 key gen   SW  41320 ops took 1.001 sec, avg 0.024 ms, 41298.692 ops/sec
ECC      256 key gen   SW  41280 ops took 1.001 sec, avg 0.024 ms, 41258.674 ops/sec
ECC      256 key gen   HW  41320 ops took 1.000 sec, avg 0.024 ms, 41309.127 ops/sec
ECC      256 key gen   HW  41280 ops took 1.001 sec, avg 0.024 ms, 41244.118 ops/sec
ECDHE    256 agree     SW  13400 ops took 1.005 sec, avg 0.075 ms, 13328.731 ops/sec
ECDHE    256 agree     SW  13300 ops took 1.006 sec, avg 0.076 ms, 13221.465 ops/sec
ECDSA    256 sign      SW  29900 ops took 1.002 sec, avg 0.034 ms, 29841.744 ops/sec
ECDSA    256 sign      SW  30000 ops took 1.003 sec, avg 0.033 ms, 29910.091 ops/sec
ECDSA    256 verify    SW  10700 ops took 1.006 sec, avg 0.094 ms, 10641.471 ops/sec
ECDSA    256 verify    SW  10700 ops took 1.009 sec, avg 0.094 ms, 10604.105 ops/sec
ECDHE    256 agree     HW  26600 ops took 1.000 sec, avg 0.038 ms, 26594.522 ops/sec
ECDHE    256 agree     HW  19000 ops took 1.002 sec, avg 0.053 ms, 18964.479 ops/sec
ECDSA    256 sign      HW  22300 ops took 1.001 sec, avg 0.045 ms, 22286.137 ops/sec
ECDSA    256 sign      HW  22000 ops took 1.002 sec, avg 0.046 ms, 21963.146 ops/sec
ECDSA    256 verify    HW  12600 ops took 1.002 sec, avg 0.080 ms, 12569.531 ops/sec
ECDSA    256 verify    HW  12600 ops took 1.005 sec, avg 0.080 ms, 12542.829 ops/sec
Benchmark complete
RNG             SW  151.844 mB/s
AES-128-CBC-enc SW 1448.090 mB/s
AES-128-CBC-dec SW 10373.612 mB/s
AES-192-CBC-enc SW 1212.030 mB/s
AES-192-CBC-dec SW 8651.141 mB/s
AES-256-CBC-enc SW 1042.655 mB/s
AES-256-CBC-dec SW 7408.791 mB/s
AES-128-CBC-enc HW 4885.588 mB/s
AES-128-CBC-dec HW 4757.373 mB/s
AES-192-CBC-enc HW 4718.991 mB/s
AES-192-CBC-dec HW 4823.705 mB/s
AES-256-CBC-enc HW 4436.875 mB/s
AES-256-CBC-dec HW 4216.860 mB/s
AES-128-GCM-enc SW 4507.831 mB/s
AES-128-GCM-dec SW 4823.166 mB/s
AES-192-GCM-enc SW 4597.849 mB/s
AES-192-GCM-dec SW 3792.119 mB/s
AES-256-GCM-enc SW 4172.610 mB/s
AES-256-GCM-dec SW 4209.278 mB/s
AES-128-GCM-enc HW 4850.244 mB/s
AES-128-GCM-dec HW 3720.058 mB/s
AES-192-GCM-enc HW 4763.636 mB/s
AES-192-GCM-dec HW 3932.937 mB/s
AES-256-GCM-enc HW 4499.022 mB/s
AES-256-GCM-dec HW 3488.068 mB/s
CHACHA          SW 2821.053 mB/s
CHA-POLY        SW 1891.585 mB/s
MD5             SW 1211.796 mB/s
MD5             HW  815.933 mB/s
POLY1305        SW 5234.533 mB/s
SHA             SW  754.934 mB/s
SHA             HW 1063.586 mB/s
SHA-224         SW  694.001 mB/s
SHA-224         HW  823.821 mB/s
SHA-256         SW  657.716 mB/s
SHA-256         HW  835.115 mB/s
SHA-384         SW 1057.178 mB/s
SHA-384         HW  719.655 mB/s
SHA-512         SW  970.129 mB/s
SHA-512         HW  730.657 mB/s
SHA3-224        SW  464.579 mB/s
SHA3-224        HW  451.398 mB/s
SHA3-256        SW  382.741 mB/s
SHA3-256        HW 1376.382 mB/s
SHA3-384        SW  338.092 mB/s
SHA3-384        HW  338.150 mB/s
SHA3-512        SW  234.923 mB/s
SHA3-512        HW  234.921 mB/s
HMAC-MD5        SW 1219.198 mB/s
HMAC-MD5        HW  826.316 mB/s
HMAC-SHA        SW  756.175 mB/s
HMAC-SHA        HW 1040.052 mB/s
HMAC-SHA224     SW  612.297 mB/s
HMAC-SHA224     HW  791.530 mB/s
HMAC-SHA256     SW  638.631 mB/s
HMAC-SHA256     HW  833.191 mB/s
HMAC-SHA384     SW 1054.571 mB/s
HMAC-SHA384     HW  730.376 mB/s
HMAC-SHA512     SW 1055.130 mB/s
HMAC-SHA512     HW  730.377 mB/s
RSA   1024 key gen   SW 61.596 ops/sec
RSA   2048 key gen   SW 16.663 ops/sec
RSA   1024 key gen   HW 224.743 ops/sec
RSA   2048 key gen   HW 56.632 ops/sec
RSA   2048 sign      SW 1906.517 ops/sec
RSA   2048 verify    SW 64388.780 ops/sec
RSA   2048 sign      HW 31826.022 ops/sec
RSA   2048 verify    HW 233934.642 ops/sec
DH    2048 key gen   SW 4166.378 ops/sec
DH    2048 agree     SW 4100.122 ops/sec
DH    2048 key gen   HW 87011.816 ops/sec
DH    2048 agree     HW 71823.630 ops/sec
ECC    256 key gen   SW 82557.366 ops/sec
ECC    256 key gen   HW 82553.245 ops/sec
ECDHE  256 agree     SW 26550.196 ops/sec
ECDSA  256 sign      SW 59751.835 ops/sec
ECDSA  256 verify    SW 21245.576 ops/sec
ECDHE  256 agree     HW 45559.001 ops/sec
ECDSA  256 sign      HW 44249.283 ops/sec
ECDSA  256 verify    HW 25112.360 ops/sec
IntelQA: Stop
```
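In the output above, each throughput line appears once per thread, and the totals printed after `Benchmark complete` match the sum of the per-thread values (for example, RNG: 76.388 + 75.456 = 151.844 mB/s). The sketch below reproduces that aggregation for the throughput lines. It assumes the line format shown above, and the helper name `qat_sum_throughput` is hypothetical.

```sh
# Sum per-thread throughput (mB/s) for each algorithm / mode pair.
# Reads benchmark output on stdin; only lines containing " took " and "mB/s" are used.
qat_sum_throughput() {
    awk '
        / took / && /mB\/s/ {
            key = $1 " " $2                       # e.g. "AES-128-CBC-enc SW"
            for (i = 3; i <= NF; i++)
                if ($i == "mB/s") total[key] += $(i - 1)
            if (!(key in seen)) { seen[key] = 1; order[++n] = key }
        }
        END {
            # Print the totals in the order the keys first appeared.
            for (i = 1; i <= n; i++)
                printf "%s %.3f mB/s\n", order[i], total[order[i]]
        }
    '
}
```

For example, `sudo ./wolfcrypt/benchmark/benchmark -threads 2 | qat_sum_throughput` prints one total per algorithm and mode. Because the sketch adds values that were already rounded to three decimals, the last digit can differ slightly from the totals the benchmark prints itself.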

### wolfCrypt Benchmark with QAT (single-threaded)

To use the benchmark tool against hardware in single-threaded mode, build the library with `CFLAGS="-DWC_NO_ASYNC_THREADING"`.

```
sudo ./wolfcrypt/benchmark/benchmark -rsa_sign -dh -ecc
IntelQA: Instances 2
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
RSA     2048 public    HW 161000 ops took 1.000 sec, avg 0.006 ms, 160989.829 ops/sec
RSA     2048 private   HW  18600 ops took 1.002 sec, avg 0.054 ms, 18566.416 ops/sec
DH      2048 key gen   HW  48945 ops took 1.000 sec, avg 0.020 ms, 48931.782 ops/sec
DH      2048 agree     HW  43300 ops took 1.001 sec, avg 0.023 ms, 43248.876 ops/sec
ECDHE    256 agree     HW  26400 ops took 1.001 sec, avg 0.038 ms, 26382.639 ops/sec
ECDSA    256 sign      HW  23900 ops took 1.004 sec, avg 0.042 ms, 23810.849 ops/sec
ECDSA    256 verify    HW  13800 ops took 1.000 sec, avg 0.072 ms, 13799.878 ops/sec
IntelQA: Stop
```

### wolfSSL Asynchronous Test Mode

Enable `--enable-asynccrypt` alone, without `--with-intelqa`, to use the async simulator.
`./configure --enable-asynccrypt`


## Debugging

To enable debug messages:
`./configure --enable-asynccrypt --with-intelqa=../QAT --enable-debug --disable-shared CFLAGS="-DQAT_DEBUG" && make`


## Support

For questions or issues email us at support@wolfssl.com.