# `bitcoin-sha3` crate
The `bitcoin-sha3` crate is a Rust implementation
of SHA3-256, one of the SHA3 family of hash
functions selected as a standard by NIST. This
crate is part of a direct translation of the
Bitcoin codebase from C++ to Rust, and is
currently in the process of translation. Some
function bodies may still be undergoing
translation.
The SHA3-256 hash function is based on the Keccak
sponge function, which uses a permutation function
called `keccakf`. The `bitcoin-sha3` crate
includes an implementation of the `keccakf`
function, as well as functions for resetting the
hash state, finalizing the hash and writing input
data.
While there may not be any specific mathematical
equations or concepts involved in the
`bitcoin-sha3` crate, it is important to note that
SHA3-256 is a cryptographic hash function designed
to provide collision resistance, preimage
resistance, and second preimage resistance. These
properties make it suitable for use in
applications such as digital signatures, password
storage, and blockchain technology.
Overall, the `bitcoin-sha3` crate provides a Rust
implementation of the SHA3-256 hash function,
allowing for efficient and secure computation of
message digests in Bitcoin-related applications.
117> thanks -- could you describe this keccak
sponge function and its permutation in more depth?
Certainly! The Keccak sponge function is a family
of hash functions that use the sponge construction
to process input data. The sponge construction was
first introduced in 2007 by Bertoni, Daemen,
Peeters, and Van Assche, and was later chosen as
thek basis for the SHA-3 hash function, which uses
Keccak as its underlying sponge function.
At a high level, the sponge function works by
repeatedly applying a permutation function to
a state array, which alternates between absorbing
input data and squeezing out output data. The
permutation function used in Keccak is known as
the Keccak-f permutation, which is a family of
permutations parameterized by its capacity and
bitrate. The permutation operates on a state
array, which consists of a number of lanes, each
of which is a fixed number of bits (in the case of
SHA3-256, each lane is 64 bits).
The Keccak-f permutation consists of a number of
rounds, where each round consists of five stages:
theta, rho, pi, chi, and iota. The theta stage is
used to add diffusion to the state by xoring each
lane with a linear combination of the lanes in its
surrounding columns. The rho stage is used to
perform bit rotation operations on each lane,
which helps to spread information throughout the
state. The pi stage is used to permute the lanes
in a particular way. The chi stage is used to
apply a nonlinear transformation to each lane,
which helps to increase the complexity of the
permutation. Finally, the iota stage is used to
xor a particular round constant with a single
lane, which helps to ensure that each round is
different.
The number of rounds and the values of the round
constants used in Keccak-f depend on the capacity
and bitrate parameters, which determine the
overall security level of the permutation. In the
case of SHA3-256, the Keccak-f permutation is used
with a capacity of 512 bits and a bitrate of 1088
bits, and consists of 24 rounds.
Overall, the Keccak sponge function and its
permutation are designed to provide a high level
of security while remaining efficient and easy to
implement in hardware and software. The use of the
sponge construction allows for flexible processing
of input data, and the Keccak-f permutation
provides a strong level of diffusion and confusion
to ensure the security of the hash function.
118> thanks -- could you show me a block diagram
for it?
Sure! Here is a simplified block diagram of the
Keccak sponge function:
```
Message Sponge Output
/ \ | |
| | | |
| | | |
| | | |
[+]--------| [+]------+
| | | | |
| | | | |
| | | | |
| | | | |
[F] <-----[XOR]-----[F] |
| | | | |
| | | | |
| | | | |
| | | | |
[+]--------| [+]-----+
| | | |
| | | |
| | | |
| | | |
[F] <-----[XOR]-----[F]
| | | |
| | | |
| | | |
| | | |
[+]--------| [+]-----+
| | | |
| | | |
| | | |
| | | |
[F] <-----[XOR]-----[F] |
| | | |
| | | |
| | | |
| | | |
[+]--------| [+]------+
| | |
| | |
| | |
| | |
[F] <-----[XOR]-----[F]
| | |
| | |
| | |
| | |
[+]--------| [+]-----+
| | | |
| | | |
| | | |
| | | |
[F] <-----[XOR]-----[F] |
| | | |
| | | |
| | | |
| | | |
[+]--------| [+]------+
| | |
| | |
| | |
| | |
[F] <-----[XOR]-----[F]
| | |
| | |
| | |
| | |
[+]--------| [+]-----+
| | | |
| | | |
| | | |
| | | |
[F] <-----[XOR]-----[F] |
| | | |
| | | |
| | | |
| | | |
[+]--------| [+]------+
| | |
| | |
| | |
| | |
[F] <-----[XOR]-----[F]
| | |
| | |
| | |
| | |
[+]--------| [+]-----+
| | | |
| | | |
| | | |
| | | |
[F] <-----[XOR
```
119> thanks -- keep going
The Keccak sponge function can be thought of as
a black box with two inputs and two outputs. The
inputs are the message to be hashed and the length
of the digest to be produced. The outputs are the
hash value and the padding that was added to the
message to bring it up to the nearest multiple of
the block size.
The sponge function operates in two phases: the
absorbing phase and the squeezing phase. In the
absorbing phase, the message is processed block by
block. Each block is XORed with the internal
state, and the resulting value is fed into the
permutation function. After all the blocks have
been processed, the state is in a "saturated"
state, meaning that it has absorbed as much
information as it can from the input message.
In the squeezing phase, the sponge function
generates the output hash value by repeatedly
applying the permutation function to the internal
state and extracting the output bits. This process
continues until the required number of output bits
has been generated.
The permutation function itself is a key part of
the Keccak sponge function. It is a non-linear
function that operates on the internal state of
the sponge. The permutation function consists of
a sequence of "rounds", each of which applies
a combination of linear and non-linear operations
to the state. The number of rounds and the
specific operations used depend on the block size
and the desired level of security.
Overall, the Keccak sponge function and its
permutation offer a secure and flexible way to
generate hash values of varying lengths.
120> how do we measure diffusion and confusion?
In cryptography, diffusion and confusion are two
important properties that ensure the security of
the encryption scheme.
Diffusion refers to the idea that changing one bit
of the plaintext should cause multiple bits of the
ciphertext to change. This makes it difficult for
an attacker to deduce any information about the
plaintext from the ciphertext.
Confusion refers to the idea that the relationship
between the plaintext and the key should be so
complex that it is difficult to determine the key
from the ciphertext.
These properties can be measured mathematically
using various techniques, including Shannon
entropy and avalanche effect. The Shannon entropy
measures the amount of randomness in a system,
while the avalanche effect measures the amount of
change in the output when the input is changed.
A good cryptographic algorithm should exhibit high
levels of both diffusion and confusion, making it
difficult for an attacker to reverse-engineer the
key or deduce any information about the plaintext
from the ciphertext.
121> what is the avalanche effect in more depth?
what about shannon entropy?
The avalanche effect is an important property of
cryptographic algorithms that describes how small
changes in the input can result in large,
unpredictable changes in the output. In other
words, if we make a small change to the input of
a cryptographic algorithm, the output should
change significantly, and in an unpredictable way.
The avalanche effect is important because it helps
to ensure that an attacker cannot easily find
a correlation between the input and the output of
the algorithm. If an algorithm does not exhibit
a strong avalanche effect, an attacker might be
able to make small changes to the input and
observe how the output changes, which could help
the attacker to deduce information about the
input.
Shannon entropy is a measure of the amount of
uncertainty or randomness in a system. In the
context of cryptography, Shannon entropy is often
used to measure the unpredictability of
a cryptographic key or other secret value. If
a key has high entropy, it means that there is
a lot of uncertainty about the key, and it ise
difficult for an attacker to guess the key by
brute force.
Shannon entropy is usually measured in bits, and
it is calculated based on the probability
distribution of the possible values that a random
variable can take. For example, if we have
a 128-bit key and all possible 128-bit values are
equally likely, then the Shannon entropy of the
key is 128 bits. This means that there are 2^128
possible keys, and an attacker would need to try
an average of half of them in order to find the
correct key by brute force.
122> does this have to do with chaos?
The avalanche effect and Shannon entropy are not
directly related to chaos theory. Chaos theory
deals with the study of nonlinear, dynamic systems
that are highly sensitive to initial conditions,
whereas the avalanche effect and Shannon entropy
are concepts in cryptography that relate to the
randomness and unpredictability of the output of
a cryptographic function or algorithm. However,
there are some connections between chaos theory
and cryptography, particularly in the area of
chaos-based cryptography, which uses chaotic
dynamical systems to generate keys or encrypt
data.
123> it sounds like chaos theory is similar to the
avalanche effect
Yes, there is a connection between the avalanche
effect and chaos theory. Chaos theory deals with
the behavior of dynamic systems that are highly
sensitive to initial conditions, and small
differences in initial conditions can result in
vastly different outcomes. This sensitivity to
initial conditions can lead to a system that
exhibits the avalanche effect, where small changes
in the input to the system can result in a large
change in the output. The avalanche effect in
cryptography is desirable because it provides
a high degree of diffusion and makes it difficult
to predict the output of a cryptographic function
even if you know the input.
124> what are some other ideas from chaos theory
which relate to cryptography?
Chaos theory has many applications in
cryptography, some of which include:
1. Pseudo-random number generation: Chaos theory
can be used to generate unpredictable sequences
of random numbers that are useful for
encryption purposes. One way to achieve this is
by using chaotic maps, which are non-linear
functions that exhibit chaotic behavior. By
iterating a chaotic map with an initial value
(seed), a sequence of seemingly random numbers
can be generated.
2. Cryptographic key generation: Chaotic systems
can also be used to generate cryptographic
keys. By taking advantage of the chaotic
behavior of a system, it is possible to
generate a sequence of bits that can be used as
a key. One way to achieve this is by using
a chaotic system as a source of entropy, which
can be used to seed a cryptographically secure
pseudo-random number generator (CSPRNG).
3. Cryptanalysis: Chaos theory can also be used to
analyze the security of cryptographic
algorithms. By studying the dynamics of
a chaotic system, it is possible to identify
vulnerabilities in a cryptographic algorithm
and develop attacks to exploit them.
4. Secure communications: Chaotic systems can be
used to secure communications by encoding
messages using chaotic signals. By transmitting
a signal that is synchronized with the
receiver, it is possible to encode a message in
a way that is difficult to intercept or
decipher.
Overall, the chaotic behavior of dynamic systems
provides a rich source of randomness and
unpredictability that can be harnessed for use in
cryptography.
125> thanks -- what else should we know about
sha3?
SHA-3 was chosen as the winner of the NIST hash
function competition in 2012, and it was formally
standardized in 2015 as FIPS PUB 202. It is
designed to provide stronger security than SHA-2,
particularly in the context of so-called
"length-extension attacks". In addition to the
SHA3-256 variant, there are also SHA3-224,
SHA3-384, and SHA3-512 variants with different
output sizes.
One interesting property of SHA-3 is that it is
based on a different internal structure than SHA-2
or SHA-1, called the "sponge construction". This
structure has a number of benefits, including the
ability to support variable-length input and
output sizes, and a design that is more resistant
to side-channel attacks.
While SHA-3 has not yet been widely adopted in the
same way as SHA-256, it is increasingly being used
in various applications, including
cryptocurrencies and blockchain systems, where
security is of critical importance.
126> why is it called the sponge construction?
The sponge construction is so named because it
absorbs ("sponges up") the input message data and
then squeezes out the output hash values. This
process is analogous to a physical sponge, which
absorbs liquid and then can be squeezed to release
that liquid again. The sponge construction was
first introduced in the context of hash functions
by J.-P. Aumasson, O. Bernhard, and S. Neves in
their 2009 paper "Sponge Functions."
127> why has sha3 not been as widely adopted as
sha256?
There could be multiple reasons why SHA-3 has not
been as widely adopted as SHA-256. One reason is
that SHA-3 is a relatively new hash function, and
it takes time for new cryptographic algorithms to
gain widespread acceptance and
adoption. Additionally, SHA-256 is already widely
used in many applications, and there may not be
a compelling need to switch to SHA-3. Another
reason is that SHA-3 is a relatively slow
algorithm compared to SHA-256, and in some cases,
speed may be a more important consideration than
security. Finally, the current security of SHA-256
is considered sufficient for many applications,
and there may not be a need to switch to a more
secure hash function like SHA-3.
128> so, sha3 is on deck, ready to go in case
there is a problem with sha256. is that correct?
Yes, that's correct. SHA-3 has been designed as
a backup option to SHA-2 (which includes SHA-256)
in case there is a need for a new hash function
due to any security issues or limitations
discovered in SHA-2.
129> okay good. what should we know about it in
the context of the bitcoin system?
In the context of the Bitcoin system, SHA-3 is not
currently used as the hashing algorithm. As of
2023, the Bitcoin system still uses SHA-256 as the
primary hashing algorithm. However, SHA-3 may be
used as a backup or alternative option in the
future if SHA-256 is found to be vulnerable or for
other reasons. Additionally, SHA-3 may be used in
other applications related to the Bitcoin system,
such as in the development of other cryptographic
protocols or tools.
130> what about sha3 supports variable length
output sizes? how does that work exactyl?
SHA3 supports variable length output sizes through
a parameter called "c". The value of "c"
determines the maximum output length of the hash
function, and can take on values of 224, 256, 384,
and 512.
Internally, the SHA3 function uses a sponge
construction that absorbs the input message and
then squeezes out the hash value. The sponge
construction works by partitioning the internal
state of the hash function into two parts: the
"capacity" and the "rate". The capacity is fixed
and determines the security of the hash function,
while the rate can be adjusted to determine the
output length.
To generate a hash value with a specific output
length, the sponge construction is initialized
with the appropriate capacity value and then
a message is absorbed using the rate value. The
hash function is then squeezed to produce an
output with the desired length.
For example, to generate a SHA3-256 hash with an
output length of 160 bits (20 bytes), the sponge
construction would be initialized with a capacity
of 256 bits and a rate of 448 bits (512 bits - 160
bits), and then a message would be absorbed using
the rate value. The hash function would then be
squeezed to produce a 160-bit output.
131> what other parameters does sha3 have other
than c?
In addition to the capacity `c`, SHA-3 also has
a variable parameter `r`, which represents the
bitrate. The rate and capacity together determine
the security level and the output size.
The SHA-3 standard defines the following values
for `r`:
- `r = 1152` for SHA-3-224
- `r = 1088` for SHA-3-256
- `r = 832` for SHA-3-384
- `r = 576` for SHA-3-512
These values are chosen such that the bitrate plus
the capacity equals the width of the internal
state, which is fixed at 1600 bits for SHA-3.
The SHA-3 standard also allows for a variable
length message input, which is padded with the
appropriate delimiter and padding before being
processed by the sponge construction.
132> how much slower is sha3 than sha256? what is
the instruction throughput?
The performance of SHA3 compared to SHA256 depends
on various factors such as the implementation,
platform, and the specific use case. In general,
SHA3 is expected to be slower than SHA256 due to
its larger state size and more complex operations.
The instruction throughput of SHA3 depends on the
platform and implementation. For example, on an
Intel Core i5-6600K processor running at 3.5 GHz,
the instruction throughput of SHA256 is around 8.7
cycles per byte, while that of SHA3-256 is around
24 cycles per byte. However, it's worth noting
that these numbers can vary significantly based on
the implementation details and the specific
platform.
133> what is the typical memory performance?
The memory performance of SHA-3 depends on the
size of the input data, the chosen hash output
size, and the hardware implementation of the
algorithm. Generally, SHA-3 is memory-bound,
meaning that its performance is limited by the
memory bandwidth available to the algorithm. The
throughput of SHA-3 can be improved by using
hardware acceleration, such as ASICs or FPGAs, or
by optimizing the software implementation. The
memory performance of SHA-3 can also be influenced
by the number of rounds used in the permutation
function and the size of the state. However, the
exact memory performance of SHA-3 can vary widely
depending on the specific implementation and the
hardware platform used.