Courtois 2014
Z. Kotulski et al. (Eds.): CSS 2014, CCIS 448, pp. 131–144, 2014.
© Springer-Verlag Berlin Heidelberg 2014
132 N.T. Courtois, M. Grajek, and R. Naik
certain very hard cryptographic puzzles based on hash functions. However these
solutions are NOT bitcoins. The puzzles are rather part of the bitcoin trust
infrastructure. In fact the puzzles are connected together to form a chain and
as the length of this chain grows, so does the security level. Bitcoins are simply
awarded to people who produce these “Proofs of Work”. Ownership of bitcoins
is achieved through digital signatures: the owner of a certain private key is the
owner of a certain quantity of bitcoins. This private key is the unique way to
transfer the bitcoin to another computer or person.
The operation of so-called bitcoin mining, or creating bitcoins out of thin
air, is not only possible: it is essential, it is encouraged, and it is a
necessary part of the Bitcoin ecosystem.
Cryptographic computations executed in a peer-to-peer network with tens of
thousands of independent hashing nodes are the heart of the security assurance
provided by this virtual currency system. It would be very difficult and extremely
costly for one entity to corrupt all these independent miners. The sum of all
this collective computational work provides some sort of solid cryptographic
proof and prevents attacks on this system. This is also how the network polices
itself: miners are expected to approve only correctly formed transactions. Bitcoin
implements a specific sort of distributed and decentralized electronic notary
system without a central authority. Well, almost. Certain decisions about how
the system works, and what exactly the bitcoin software does and how [3], are
still fairly centralized. In particular, mining activity is concentrated in very
few, very large mining pools, one of which controls more than 40% of the hash
power, see Table 2 in [9].
In a nutshell, bitcoin miners make money when they find a 32-bit value which,
when hashed together with the data from other transactions using a standard
hash function, gives a hash with a sufficient number of leading zeros (64 or
more at the time of writing). This is an extremely rare event. It is generally
believed that there is no way to produce such data other than by very long and
costly computations. How to improve this process is the central question of
this paper.
The goal of the miner is to solve a certain cryptographic puzzle which we will
later call a CISO Hash Problem. The solution will be called a CISO block.
The great majority of miners do not know exactly what they are doing: they
either run open source software or have purchased specialized hardware which
mines very efficiently. However, miners must know that the operation is very
time-critical and that they need to be permanently connected to the network. The solutions
to these puzzles are linked to each other and form a unique chain of blocks.
This is usually called the block chain. The whole block chain is public and the
whole of it can for example be consulted at http://blockexplorer.com/. All
new blocks which are found need to be broadcast to all network participants as
soon as possible. The miners need to be very reactive and they do it because
it is in their interest. They need to listen to broadcasts in order to receive the
Optimizing SHA256 in Bitcoin Mining 133
data about recent transactions which they are expected to approve. Then they
need to broadcast any solution (CISO block) which they have found as soon as
they find it, because their solution is likely to be part of the "main chain of
blocks" only if it is widely known. Once the solution is known it "discourages"
other miners from searching for the same block. Instead they can concentrate on
searching for the next block, which will confirm the present block and will
allow the miner who found it to claim a reward for producing this CISO block.
Our goal is to clarify how this system works. In this paper we mostly consider
this to be a static problem which needs to be solved. We refer to [8] for a
discussion on how the difficulty of this problem changes with time.
The problem of bitcoin mining is very closely related to well-known problems
in cryptography. One crucial question is as follows: how does bitcoin mining
differ from traditional questions in the cryptanalysis of block ciphers and hash
functions, and is there a more efficient way to mine bitcoins?
First we are going to briefly describe the problem as a static computation
problem about a certain block cipher. Then we are going to look at how the
problem evolves in time and how solutions to the CISO problem are converted
to shares in the bitcoin currency. Finally we are going to study what the possible
solutions and optimizations are.
SHA-256 is a hash function built from a block cipher following the so-called
Davies-Meyer construction. The principle of the Davies-Meyer construction is
that the input value is added to the output at the end, which transforms an
encryption algorithm into a “hashing” algorithm, a building block of a standard
hash function. The underlying block cipher has 64 rounds and thus a 2048-bit
expanded internal key (64 × 32 bits). This key is obtained from the message block
to be compressed, which has 512 bits at the input and is expanded by a factor of
four to form this 2048-bit internal key for our block cipher. In a sense, in Fig.
1 we convert the problem of bitcoin mining, or of solving CISO hash puzzles,
into a specific problem with three distinct applications of the block cipher
which underlies SHA-256, connected together to form a certain circuit.
Fig. 1. The Block Hashing Algorithm of bitcoin revisited and seen as a Constrained
Input Small Output (CISO) problem. We see two applications of SHA-256 together with
internal details of the Davies-Meyer construction. We can view it as a triple application
of a specific block cipher. An interesting question is whether there is a more efficient
cryptographic shortcut or inversion attack, or some non-trivial optimizations which
would save a constant factor. Such optimizations, if they exist, could be worth some
serious money, as they would make it cheaper to produce bitcoins.
the future owner of this freshly created portion of bitcoin currency which
this block is intended to embody. The current CISO block which the miner
is trying to create by solving the current CISO problem and all subsequent
CISO blocks will provide an accumulation of evidence about all these events.
This security guarantee increases with time. In principle miners can decide
which transactions are going to be recognized by the system. However the
transaction fees which will be obtained by the miners for the transactions
included in this block are an incentive to include every single transaction.
4. Timestamp on 32 bits. This is the current time in seconds.
5. Target on 32 bits. More precisely, the global variable target is on 256 bits
and what is stored here is a compressed version of target, which is frequently
called difficulty. We have difficulty = 6,119,726,089 ≈ 2^32.5 as of 14
April 2014. We have difficulty · 2^32 = 1/probability = 2^256 / target.
6. Nonce on 32 bits. This nonce is freely chosen by the miner. Interestingly,
the nonce has only 32 bits, while the current value of target means that the
probability of obtaining a suitable H2 by accident is as low as 2^-64.5.
This means that the miner needs to be able to generate different versions of
the puzzle with a different Merkle root (or with other differences).
7. Padding + Len has 384 bits for H1 and 256 bits for H2. These are two constants
due to the specification of the SHA-256 hash function, which is used here twice
with data of different sizes: the input hashed has 640 and 256 bits respectively
in each application of SHA-256. These two values never change.
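The relation between difficulty, probability and target in item 5 can be checked numerically. The following is only a sketch of that arithmetic: it takes the relation difficulty · 2^32 = 1/probability = 2^256 / target exactly as stated above, and the variable names are illustrative assumptions.

```python
import math

# Difficulty value quoted in the text for 14 April 2014.
difficulty = 6_119_726_089

# Relation from the text: difficulty * 2^32 = 1/probability = 2^256 / target.
inverse_probability = difficulty * 2**32
target = 2**256 // inverse_probability  # approximate 256-bit target

log2_difficulty = math.log2(difficulty)
log2_inverse_probability = math.log2(inverse_probability)

# A 256-bit value below target must start with this many zero bits.
leading_zero_bits = 256 - target.bit_length()

print(f"log2(difficulty)    = {log2_difficulty:.2f}")
print(f"log2(1/probability) = {log2_inverse_probability:.2f}")
print(f"leading zero bits   = {leading_zero_bits}")
```

Since one header offers only 2^32 nonces, roughly `difficulty` (about 2^32.5) distinct header variants are needed on average, which is why item 6 requires varying the Merkle root or other fields.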
With respect to the input data requirements and constraints above and the
output constraint H2 < target we have:
Fig. 2. One compression function in SHA-256. It comprises a 256-bit block cipher with
64 rounds, a key expansion mechanism from 512 to 2048 bits, and a final set of eight
32-bit additions
to twice the output size. In our case we have a compression function from 512
to 256 bits, cf. Fig. 2.
The block size of this block cipher is 256 bits. The key size is 512 bits, and
the key is expanded to 64 subkeys of 32 bits each, one for each of the 64 rounds
of the cipher. The first 16 subkeys, for the first 16 rounds, are identical to
the message and are copied in the same order, cf. [15] and later (Algorithm [1]).
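To make the block-cipher view concrete, here is a minimal Python sketch of one SHA-256 compression function written as key expansion, 64-round encryption and the Davies-Meyer feed-forward. The function names (`expand_key`, `encrypt`, `compress`) and the way the constants are derived are choices made for this illustration, not notation from the paper; the result is compared with Python's `hashlib` on a 256-bit message.

```python
import hashlib
import math

MASK = 0xFFFFFFFF

def first_primes(n):
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

def icbrt(n):
    """Floor of the cube root of n, by binary search on integers."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo

PRIMES = first_primes(64)
# Round constants: first 32 fractional bits of the cube roots of the primes.
K = [icbrt(p << 96) & MASK for p in PRIMES]
# Initial state: first 32 fractional bits of the square roots of 8 primes.
IV = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK

def expand_key(block):
    """Expand a 512-bit block into 64 subkeys; the first 16 are the message."""
    w = [int.from_bytes(block[4 * i:4 * i + 4], "big") for i in range(16)]
    for t in range(16, 64):
        s0 = rotr(w[t - 15], 7) ^ rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
        s1 = rotr(w[t - 2], 17) ^ rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
        w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
    return w

def encrypt(state, subkeys):
    """The 64-round block cipher acting on a 256-bit state."""
    a, b, c, d, e, f, g, h = state
    for t in range(64):
        t1 = (h + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[t] + subkeys[t]) & MASK
        t2 = ((rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [a, b, c, d, e, f, g, h]

def compress(state, block):
    """Davies-Meyer: encrypt the state, then add the input state (8 additions)."""
    out = encrypt(state, expand_key(block))
    return [(x + y) & MASK for x, y in zip(state, out)]

# A 256-bit message fits in one padded block, as in the computation of H2.
message = bytes(range(32))
block = message + b"\x80" + b"\x00" * 23 + (256).to_bytes(8, "big")
digest = b"".join(x.to_bytes(4, "big") for x in compress(IV, block))
matches_hashlib = digest == hashlib.sha256(message).digest()
print(matches_hashlib)
```

The final list comprehension in `compress` is the set of eight 32-bit additions mentioned in the caption of Fig. 2.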
In addition, in order to hash a full message, SHA-256 applies a Merkle-Damgård
padding and length extension which makes it a secure hash function for messages
of variable length. In the pre-processing stage, we must append one binary 1 and
many zeros to the message in such a way that the resulting length is equal to
448 modulo 512, cf. [15]. Then we append the length of the message in bits as a
64-bit big-endian integer. Full SHA-256 is applied twice. In the first application
of SHA-256 in bitcoin mining the message has a fixed length of 640 bits which
requires two applications of the compression function. In the second application
SHA-256 is applied to 256 bits.
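The pre-processing rule above can be sketched in a few lines of Python. The helper name `sha256_pad` and the all-zero placeholder inputs are assumptions made for illustration; only the padding rule itself comes from the text and [15].

```python
def sha256_pad(message: bytes) -> bytes:
    """Append a 1 bit, zeros up to 448 mod 512 bits, then the 64-bit length."""
    bit_length = 8 * len(message)
    padded = message + b"\x80"                        # binary 1 followed by 7 zeros
    padded += b"\x00" * ((56 - len(padded)) % 64)     # 56 bytes = 448 bits
    return padded + bit_length.to_bytes(8, "big")     # 64-bit big-endian length

header = bytes(80)       # placeholder for the 640-bit input of the first SHA-256
first_hash = bytes(32)   # placeholder for the 256-bit input of the second SHA-256

padded_header = sha256_pad(header)
padded_hash = sha256_pad(first_hash)

print(len(padded_header) // 64, "blocks, length word", padded_header[-4:].hex())
print(len(padded_hash) // 64, "block, length word", padded_hash[-4:].hex())
```

The 640-bit input occupies two 512-bit blocks and the 256-bit input one block, which gives the three compression function calls counted in the next paragraph.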
It may therefore seem that a bitcoin miner needs to compute the compression
function 3.0 times for each nonce and for each Merkle hash. In the following
sections we are going to work on reducing this figure down to about 1.89.
We recall from Section 2.1 that new bitcoins can be created when the miner
succeeds in hashing some data from the bitcoin network together with a 32-bit
random nonce and obtains a 256-bit number which starts with a sufficient number
of zeros (64 or more). We called it the Constrained Input Small Output problem,
or the CISO problem for short, cf. Fig. 1. The process needs to be iterated with
different values of MerkleRoot and different 32-bit nonces until a suitable
“CISO configuration” is found in which the output satisfies H2 < target
as explained in Section 2.3.
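The search loop just described can be sketched with Python's standard `hashlib`. This is a toy illustration, not a miner: the function name `mine`, the all-zero 76-byte header prefix and the very easy target are assumptions, and the little-endian encoding of the nonce and of H2 follows the convention of the Bitcoin reference client rather than anything stated in this section. The state after the first 64 bytes does not depend on the nonce, so it is computed once and copied for each attempt.

```python
import hashlib
import struct

def mine(header_prefix: bytes, target: int, max_nonce: int = 2**32):
    """Search the 32-bit nonce space for one 76-byte header prefix.

    Returns (nonce, H2) with H2 < target, or None if no nonce works.
    """
    # First compression function: shared by all nonces, computed only once.
    midstate = hashlib.sha256(header_prefix[:64])
    tail = header_prefix[64:]  # 12 bytes preceding the nonce
    for nonce in range(max_nonce):
        h1 = midstate.copy()
        h1.update(tail + struct.pack("<I", nonce))
        h2 = hashlib.sha256(h1.digest()).digest()
        # Assumed convention: H2 is read as a little-endian 256-bit integer.
        if int.from_bytes(h2, "little") < target:
            return nonce, h2
    return None

header_prefix = bytes(76)   # hypothetical placeholder header data
easy_target = 1 << 244      # only 12 leading zero bits, so the demo ends quickly
result = mine(header_prefix, easy_target)
print(result)
```

With the real target of about 2^191.5 the loop would almost always return None, and the caller would have to change the Merkle root or timestamp and try again.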
Remark 1. In Section 3.5 above and here in Section 3.6 we saved 18+2 additions.
However, these savings are illusory, because we can save many more additions
by another method. In Fig. 3 we show that one needs essentially only 2 additions
in order to implement the whole round function of SHA256, instead of 6+1
full adders in each compression function, cf. Fig. 1.
Table 1. Key in the first 16 rounds out of 64 in each computation and their provenance

Computation of H1                                        Computation of H2
Round t  32-bit W_t  Description                         Round t  32-bit W_t  Description
0        XXXXXXXX    last 32 bits of hash Merkle root    0        XXXXXXXX    H1_0
1        XXXXXXXX    timestamp                           1        XXXXXXXX    H1_1
2        XXXXXXXX    target                              2        XXXXXXXX    H1_2
3        XXXXXXXX    nonce (00000000 to FFFFFFFF)        3        XXXXXXXX    H1_3
4        0x80000000  Padding starts                      4        XXXXXXXX    H1_4
5        0x00000000  |                                   5        XXXXXXXX    H1_5
6        0x00000000  |                                   6        XXXXXXXX    H1_6
7        0x00000000  |                                   7        XXXXXXXX    H1_7
8        0x00000000  |                                   8        0x80000000  Padding starts
9        0x00000000  |                                   9        0x00000000  |
10       0x00000000  |                                   10       0x00000000  |
11       0x00000000  |                                   11       0x00000000  |
12       0x00000000  |                                   12       0x00000000  |
13       0x00000000  Padding ends                        13       0x00000000  Padding ends
14       0x00000000  length H                            14       0x00000000  length H
15       0x00000280  length L                            15       0x00000100  length L

We are going to use Carry Save Adders (CSA) in order to delay the propagation
of carries and save a lot of circuit area. The main idea, which is attributed to
John von Neumann, is to propagate the carries only locally, delaying a complete
propagation to the very end. This allows a dramatic reduction in the cost of
implementing multiple additions: three or more additions do NOT cost much
more than one single addition.
More precisely, Carry Save Adders (CSA) make it possible to add n numbers for
any n ≥ 3 by first reducing them to two numbers, which are then added to obtain
the final result. This is achieved by successively transforming 3 numbers into 2
numbers with a Carry Save Adder (CSA), which has a very low cost, followed by a
final addition of 2 numbers. A Carry Save Adder takes 3 integers a, b, c on k
bits written in binary and outputs two numbers ps (partial sum) and sc
(shift-carry) as follows:
\[
\begin{aligned}
ps_i &= a_i \oplus b_i \oplus c_i, \\
sc_{i+1} &= a_i b_i \vee a_i c_i \vee b_i c_i.
\end{aligned}
\tag{3}
\]
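Equation (3) can be modelled in software on 32-bit words, which may help to see why the two outputs always sum to a + b + c. This is a bit-level sketch only: the names `carry_save_add` and `add_many` are illustrative, and in hardware the gain comes from the absence of carry propagation, which Python code cannot show.

```python
import random

MASK = 0xFFFFFFFF  # arithmetic modulo 2^32, as in SHA-256

def carry_save_add(a, b, c):
    """Reduce three 32-bit words to (ps, sc) with ps + sc = a + b + c mod 2^32."""
    ps = a ^ b ^ c                                     # bitwise sum, no carries
    sc = (((a & b) | (a & c) | (b & c)) << 1) & MASK   # carries moved to bit i+1
    return ps, sc

def add_many(values):
    """Add n words using CSA reductions and a single final full addition."""
    values = list(values)
    while len(values) > 2:
        ps, sc = carry_save_add(*values[:3])
        values = [ps, sc] + values[3:]
    return sum(values) & MASK

rng = random.Random(2014)
words = [rng.getrandbits(32) for _ in range(7)]
csa_total = add_many(words)
plain_total = sum(words) & MASK
print(csa_total == plain_total)
```

Each call to `carry_save_add` uses only bitwise operations on independent bit positions; the only place where a carry must ripple across the word is the final `sum` of two values.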
Fig. 3. How to compute one round of SHA-256 with just two full adders
Theorem 1. [Hash Speed] The amortized average cost of trying one output H2
to see if it has 64 or more leading zeros is at most about 1.89 computations
of the compression function of SHA-256 instead of 3.0, which represents an
improvement by 37%.
Justification: We have saved about 7 rounds and many additions. However known
ASIC implementations also save many additions and actually the designs which
achieve the lowest possible area are not necessarily the fastest. Therefore we
are just going to estimate the RELATIVE savings w.r.t. best standard ASIC
implementations of full SHA256 such as in [5,12,13,16,17,18,19,20,21,25,29,30].
Thus overall cost minus savings are equivalent to a total of
\[
\frac{64 + 64}{64} - \frac{7}{64} = 2 - \frac{7}{64} \approx 1.89 \tag{5}
\]
compression functions.
This cost of 1.89 compression functions is equivalent to a saving of 37% compared
to the initial cost of 3.0 compression functions as per Fig. 1. It also shows how
much can be gained in bitcoin mining compared to using an optimized SHA256
ASIC implementation three times.
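The figures in Theorem 1 can be reproduced with a few lines of arithmetic. The sketch below simply restates equation (5) and the 37% claim; the variable names are illustrative.

```python
ROUNDS = 64          # rounds in one SHA-256 compression function
ROUNDS_SAVED = 7     # rounds saved according to the justification above

naive_cost = 3.0     # three full compression functions per trial, cf. Fig. 1
amortized_cost = (ROUNDS + ROUNDS) / ROUNDS - ROUNDS_SAVED / ROUNDS
relative_saving = 1 - amortized_cost / naive_cost

print(f"amortized cost  = {amortized_cost:.4f} compression functions")
print(f"relative saving = {relative_saving:.1%}")
```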
Remark 2. Our problem is essentially the same as a brute force attack on a block
cipher. The same computation is done a very large number of times, yet cheaper,
maybe just a small factor cheaper. It is not correct to believe that block ciphers
are well understood in cryptography. On the contrary, it appears that for more
or less any block cipher there may exist an attack which will be just slightly
faster than brute force, see [24]. An efficient low-data software algebraic attack
could also be a solution to this problem, cf. [10,23,28].
5 Conclusion
In this paper we explain how bitcoin electronic currency works and show that
the profitability of bitcoin mining depends on a certain cryptographic constant
which we showed to be at most 1.89. Normally very few people care about this
sort of fine cryptographic engineering details. However here it is different. This
observation allows bitcoin miners to save many millions of dollars each year.
References
1. Aumasson, J.-P., Khovratovich, D.: First Analysis of Keccak (2009),
http://131002.net/data/papers/AK09.pdf
2. Barber, S., Boyen, X., Shi, E., Uzun, E.: Bitter to Better — How to Make Bitcoin a
Better Currency. In: Keromytis, A.D. (ed.) FC 2012. LNCS, vol. 7397, pp. 399–414.
Springer, Heidelberg (2012)
3. Nakamoto, S., et al.: Bitcoin QT: http://bitcoin.org/en/download
4. Boyar, J., Matthews, P., Peralta, R.: Logic Minimization Techniques with Appli-
cations to Cryptology. Journal of Cryptology 26, 280–312 (2013)
5. Chaves, R., Kuzmanov, G., Sousa, L., Vassiliadis, S.: Improving SHA-2 hardware
implementations. In: Goubin, L., Matsui, M. (eds.) CHES 2006. LNCS, vol. 4249,
pp. 298–310. Springer, Heidelberg (2006)
6. Courtois, N.T., Hulme, D., Mourouzis, T.: Solving Circuit Optimisation Problems
in Cryptography and Cryptanalysis. In: Proceedings of SHARCS 2012 Workshop,
UK, pp. 179–191 (2011)
7. Courtois, N.T., Hulme, D., Mourouzis, T.: Multiplicative Complexity and Solv-
ing Generalized Brent Equations With SAT Solvers. In: COMPUTATION TOOLS
2012, The Third International Conference on Computational Logics, Algebras, Pro-
gramming, Tools, and Benchmarking. ARIA, Nice (2012)
8. Courtois, N.T., Grajek, M., Naik, R.: The Unreasonable Fundamental Incertitudes
Behind Bitcoin Mining (2013), http://arxiv.org/abs/1310.7935
9. Courtois, N.T., Bahack, L.: On Subversive Miner Strategies and Block Withholding
Attack in Bitcoin Digital Currency (2014), http://arxiv.org/abs/1402.1718
10. Courtois, N.T., Bard, G.V.: Algebraic Cryptanalysis of the Data Encryption Stan-
dard. In: Galbraith, S.D. (ed.) Cryptography and Coding 2007. LNCS, vol. 4887,
pp. 152–169. Springer, Heidelberg (2007)
11. Courtois, N.T., Mourouzis, T.: Black-Box Collision Attacks on the Compression
Function of the GOST Hash Function. In: Proceedings of 6th International Con-
ference on Security and Cryptography SECRYPT, Spain (2011)
12. Dadda, L., Macchetti, M., Owen, J.: An ASIC design for a high speed
implementation of the hash function SHA-256 (384, 512). In: ACM Great Lakes
Symposium on VLSI, pp. 421–425. ACM (2004)
13. Dadda, L., Macchetti, M., Owen, J.: The Design of a High Speed ASIC Unit for
the Hash Function SHA-256 (384, 512). In: DATE 2004, pp. 70–75. IEEE (2004)
14. Virtual currencies: Mining digital gold, From the print edition: Finance and eco-
nomics, The Economist (2013)
15. National Institute of Standards and Technology (NIST). FIPS PUB 180-2, SHA256
Standard (2002),
http://csrc.nist.gov/publications/fips/
fips180-2/fips180-2withchangenotice.pdf
16. Feldhofer, M., Rechberger, C.: A Case Against Currently Used Hash Functions in
RFID Protocols. In: Meersman, R., Tari, Z., Herrero, P. (eds.) OTM 2006 Work-
shops. LNCS, vol. 4277, pp. 372–381. Springer, Heidelberg (2006)
17. Knezevic, M.: Efficient Hardware Implementations of Cryptographic Primitives.
PhD thesis, Katholieke Universiteit Leuven (2011)
18. Lee, Y.K., Chan, H., Verbauwhede, I.: Iteration bound analysis and throughput
optimum architecture of SHA-256 (384,512) for hardware implementations. In:
Kim, S., Yung, M., Lee, H.-W. (eds.) WISA 2007. LNCS, vol. 4867, pp. 102–114.
Springer, Heidelberg (2008)
19. Macchetti, M., Dadda, L.: Quasi-Pipelined Hash Circuits. In: IEEE Symposium on
Computer Arithmetic, pp. 222–229 (2005)
20. Michail, H.E., Athanasiou, G., Kritikakou, A., Goutis, C.E., Gregoriades, A., Pa-
padopoulou, V.G.: Ultra High Speed SHA-256 Hashing Cryptographic Module for
IPSec Hardware/Software Codesign. In: SECRYPT, pp. 309–313 (2010)
21. Michail, H.E., Athanasiou, G., Gregoriades, A., Panagiotou, C.L., Goutis, C.E.:
High Throughput Hardware/Software Co-design Approach SHA-256 Hashing
Cryptographic Module. Global Journal of Computer Science and Technology 10,
15 (2010)
22. Guo, J., Matusiewicz, K.: Preimages for Step-Reduced SHA-2 (2008),
http://eprint.iacr.org/2009/477.pdf
23. Heusser, J.: SAT solving - An alternative to brute force bitcoin mining (2013),
http://jheusser.github.io/2013/02/03/satcoin.html
24. Huang, J., Lai, X.: What is the Effective Key Length for a Block Cipher: an Attack
on Every Block Cipher (2012), http://eprint.iacr.org/2012/677
25. Kim, M., Ryou, J., Jun, S.: Efficient Hardware Architecture of SHA-256 Algorithm
for Trusted Mobile Computing. In: Yung, M., Liu, P., Lin, D. (eds.) Inscrypt 2008.
LNCS, vol. 5487, pp. 240–252. Springer, Heidelberg (2009)
26. Matusiewicz, K., Pieprzyk, J., Pramstaller, N., Rechberger, Ch., Rijmen, V.:
Analysis of simplified variants of SHA-256:
http://www2.mat.dtu.dk/people/K.Matusiewicz/papers/SimplifiedSHA256.pdf
27. Nakamoto, S.: Bitcoin: A Peer-to-Peer Electronic Cash System:
http://bitcoin.org/bitcoin.pdf
28. Raddum, H., Semaev, I.: New Technique for Solving Sparse Equation Systems. In:
ECRYPT STVL (2006), http://eprint.iacr.org/2006/475/
29. Sklavos, N., Koufopavlou, O.G.: On the hardware implementations of the SHA-2
(256, 384, 512) hash functions. ISCAS 5, 153–156 (2003)
30. Tillich, S., Feldhofer, M., Kirschbaum, M., Plos, T., Schmidt, J.-M.,
Szekely, A.: Uniform Evaluation of Hardware Implementations of the Round-Two
SHA-3 Candidates. In: Second SHA-3 Conference (2010),
http://csrc.nist.gov/groups/ST/hash/sha-3/Round2/
Aug2010/documents/papers/TILLICH_sha3hw.pdf