An Analysis of Anonymity in the Bitcoin System
1 Introduction
The first Bitcoins were transacted in January 2009, and by June 2011 there were
6.5 million Bitcoins in circulation among an estimated 10,000 users [27]. In recent
months, the currency has seen rapid growth in both media attention and market
price relative to existing currencies. At its peak, a single Bitcoin traded for more
than US$30 on popular Bitcoin exchanges. At the same time, U.S. Senators
and lobby groups in Germany, such as Der Bundesverband Digitale Wirtschaft
(the Federal Association of Digital Economy), raised concerns regarding the
untraceability of Bitcoins and their potential to harm society through tax evasion,
money laundering and illegal transactions. The implications of the decentralized
nature of Bitcoin for the authorities' ability to regulate and monitor the
flow of currency are as yet unclear.
Many users adopt Bitcoin for political and philosophical reasons, as much as
pragmatic ones. There is an understanding amongst Bitcoin's more technical users
that anonymity is not a primary design goal of the system; however, opinions vary
widely as to how anonymous the system is in practice. Jeff Garzik, a member of
Bitcoin's development team, is quoted as saying that it would be unwise to attempt
major illicit transactions with Bitcoin, given existing statistical analysis techniques
deployed in the field by law enforcement.¹ However, prior to the present work,
no analysis of anonymity in Bitcoin was publicly available to substantiate or
refute these claims. Furthermore, many other users of the system do not share this
belief. For example, WikiLeaks, an international organization for anonymous
whistleblowers, recently advised its Twitter followers that it now accepts
anonymous donations via Bitcoin (see Fig. 1) and states the following:²
"Bitcoin is a secure and anonymous digital currency. Bitcoins cannot be easily tracked back
to you, and are [sic] safer and faster alternative to other donation methods."
¹ http://www.theatlantic.com/technology/archive/2011/06/libertarian-dream-a-site-where-you-buy-drugs-with-digital-dollars/239776 Retrieved 2011-11-12.
² http://wikileaks.org/support.html Retrieved 2011-07-22.
Our motivation for this analysis is not to de-anonymize individual users of the
Bitcoin system. Rather, it is to demonstrate, using a passive analysis of a publicly
available dataset, the inherent limits of anonymity when using Bitcoin. This will
help ensure that users do not hold expectations of anonymity that the system does not fulfill.
In security-related research, there is considerable disagreement over how best to
disclose vulnerabilities [7]. Many researchers favor full disclosure wherein all
information regarding a vulnerability is promptly released. This enables informed
users to promptly take defensive measures. Other researchers favor limited
disclosure; while this provides attackers with a window in which to exploit uninformed
users, a mitigation strategy can be prepared and implemented before public
announcement, thus limiting damage (e.g. through a software update). Our analysis
illustrates some potential risks and pitfalls with regard to anonymity in the Bitcoin
system. However, there is no central authority that can fundamentally change the
system's behavior. Furthermore, it is not possible to prevent analysis of the existing
transaction history.
There are also two noteworthy features of the dataset when compared to
contentious social network datasets (e.g. the Facebook profiles of Harvard University
students) [18]. First, the delineation between what is considered public and private
is clear: the entire history of Bitcoin transactions is publicly available. Second,
the Bitcoin system does not have a usage policy. After joining Bitcoin's
peer-to-peer network, a client can freely request the entire history of Bitcoin transactions;
no crawling or scraping is required.
Thus, we believe the best strategy to minimize the threat to user anonymity is to
be descriptive about the risks of the Bitcoin system. We do not identify individual
users apart from those in the case study, but we note that it is not difficult for other
groups to replicate our work. Indeed, given the passive nature of our analysis, other
parties may already be conducting similar analyses.
200 F. Reid and M. Harrigan
2 Related Work
this dataset as a proxy for studying multi-scale human mobility and as a tool for
computing geographic borders inherent to human mobility.
Grinberg [13] considered some of the legal issues that may be relevant to
Bitcoin in the United States. For example, does Bitcoin violate the Stamp Payments
Act of 1862? The currency can be used as a token "for a less sum than $1, intended
to circulate as money or to be received or used in lieu of lawful money of the United
States". However, the authors of the act could not have conceived of digital
currencies at the time of its writing and therefore Bitcoin may not fall under
its scope. Grinberg believes that Bitcoin is unlikely to be a security, or more
specifically an investment contract, and therefore does not fall under the
Securities Act of 1933. He also believes that the Bank Secrecy Act of 1970 and
the Money Laundering Control Act of 1986 pose the greatest risk for Bitcoin
developers, exchanges, wallet providers, mining pool operators, and businesses
that accept Bitcoins. These acts require certain kinds of financial businesses,
even if they are located abroad, to register with a bureau of the United States
Department of the Treasury known as the Financial Crimes Enforcement Network
(FinCEN). The legality of Bitcoin is outside the scope of our work, but is interesting
nonetheless.
2.2 Anonymity
Previous work has shown the difficulty in maintaining anonymity in the context
of networked data and online services that expose partial user information.
Narayanan and Shmatikov [21] and Backstrom et al. [4] considered privacy attacks
that identify users using the structure of networks, and showed the difficulty in
guaranteeing anonymity in the presence of network data. Crandall et al. [9] infer
social ties between users where none are explicitly stated by looking at patterns of
coincidences or common off-network co-occurrences. Gross and Acquisti [14]
discuss the privacy of early users in the Facebook social network, and how
information from multiple sources could be combined to identify pseudonymous
network users. Narayanan and Shmatikov [20] de-anonymized the Netflix Prize
dataset using information from IMDB³ that had similar user content, showing
that statistical matching between different but related datasets can be used to
attack anonymity. Puzis et al. [23] simulated the monitoring of a communications
network using strategically located monitoring nodes. They showed that, using
real-world network topologies, a relatively small number of nodes could
collaborate to pose a significant threat to anonymity. Korolova et al. [17] studied
strategies for efficiently compromising network nodes to maximize link information
observed. Altshuler et al. [1] discussed the increasing dangers of attacks targeting
similar types of information, and provided measures of the difficulty of such
attacks on particular networks. All of this work points to the difficulty of
maintaining anonymity where network data on user behavior is available, and
illustrates how seemingly minor information leaks can be aggregated to pose
significant risks. Security researcher Dan Kaminsky independently performed an
investigation of some aspects of anonymity in the Bitcoin system, and presented his
findings at a security conference [15] shortly after an initial draft of our work was
made public. He investigated the linking problem that we analyze and describe in
Sect. 4.2. In addition to the analysis we conducted, his work investigated the
Bitcoin system from an angle we did not consider: the TCP/IP operation of the
underlying peer-to-peer network. Kaminsky's TCP/IP layer findings strengthen
the core claims of our work that Bitcoin does not anonymize user activity.
We provide a summary of Kaminsky's findings in Sect. 5.2.
³ http://www.imdb.com
The following is a simplified description of the Bitcoin system (see Nakamoto [19]
for a more thorough treatment). Bitcoin is an electronic currency with no central
authority or issuer. There is no central bank or fractional reserve system controlling
the supply of Bitcoins. Instead, they are generated at a predictable rate such that the
eventual total number will be 21 million. There is no requirement for a trusted third
party when making transactions. Suppose Alice wishes to send a number of
Bitcoins to Bob. Alice uses a Bitcoin client to join the Bitcoin peer-to-peer network.
She then makes a public transaction or declaration stating that one or more
identities that she controls (which can be verified using public-key cryptography),
and which previously had a number of Bitcoins assigned to them, wish to reassign
those Bitcoins to one or more other identities, at least one of which is
controlled by Bob. The participants of the peer-to-peer network form a collective
consensus regarding the validity of this transaction by appending it to the public
history of previously agreed-upon transactions (the block-chain). This process
involves the repeated computation of a cryptographic hash function so that the
digest of the transaction, along with other pending transactions, and an arbitrary
nonce, has a specific form. This process is designed to require considerable
computational effort, from which the security of the Bitcoin mechanism is derived.
To encourage users to pay this computational cost, the process is incentivized
using newly generated Bitcoins and/or transaction fees, and so this whole process
is known as mining.
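The hash search described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the Bitcoin client's implementation: the real system hashes a fixed-size block header rather than raw transaction text, and the function names, the string encoding of transactions, the 8-byte nonce and the difficulty of 12 bits are all hypothetical choices made here for illustration. The "specific form" is modelled as the digest, read as an integer, falling below a target.

```python
import hashlib


def block_digest(previous_digest: bytes, transactions: list, nonce: int) -> bytes:
    """Hash the previous digest, the pending transactions and a nonce together."""
    payload = previous_digest + "\n".join(transactions).encode("utf-8")
    payload += nonce.to_bytes(8, "big")  # 8-byte nonce is an illustrative choice
    # Two rounds of SHA-256, the hash construction Bitcoin uses for blocks.
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def meets_target(digest: bytes, difficulty_bits: int) -> bool:
    """The 'specific form': the digest, read as an integer, is below a target."""
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def mine(previous_digest: bytes, transactions: list, difficulty_bits: int) -> tuple:
    """Try nonces in increasing order until the digest has the required form."""
    nonce = 0
    while True:
        digest = block_digest(previous_digest, transactions, nonce)
        if meets_target(digest, difficulty_bits):
            return nonce, digest
        nonce += 1


# Hypothetical pending transactions; real transactions are signed structures.
previous = bytes(32)
pending = ["alice->bob:5", "carol->dave:2"]
nonce, digest = mine(previous, pending, difficulty_bits=12)
print(nonce, digest.hex())
```

Finding a suitable nonce takes about 2^12 attempts on average at this difficulty, whereas checking a claimed nonce takes a single call to block_digest. This asymmetry between search and verification is what makes the computational effort costly to produce but cheap for other participants to confirm.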
In this chapter, three features of the Bitcoin system are of particular interest.
First, the entire history of Bitcoin transactions is publicly available. This is
necessary in order to validate transactions and to prevent double-spending in the absence
of a central authority. The only way to confirm the absence of a previous transaction
⁴ http://www.bitcoin.org
⁵ http://github.com/gavinandresen/bitcointools