Bitcoin foundations

July 28, 2017

While listening to an episode of Radiolab about cryptocurrencies the other day, I realized that despite having a general understanding of how Bitcoin and other cryptocoins work, I couldn't explain any details of the algorithms involved. That seemed like exactly the sort of thing the Recurse Center is great for learning, so I decided to spend a couple days reading up on Bitcoin and trying to implement a simple cryptocurrency of my own.

My first stop was the Bitcoin whitepaper, the original research paper that introduced the idea for Bitcoin. It was published by a figure known as Satoshi Nakamoto (real identity or identities unknown) on a cryptography mailing list in October of 2008, a few months before Bitcoin was actually released. Despite my having only a casual knowledge of cryptography, I found the paper to be clear and elegant. Of course, I as soon as I sat down to implement some of the most important ideas from the paper I realized that I hadn't understood it as well as I thought, but by going back and forth between the code and the paper and discussing it with fellow Recursers I think I understand the basics well enough to try to explain them in a blog post.

Aside: hashes and public key cryptography

Before we begin, there are a couple of cryptographic concepts that are important background for understanding how Bitcoin works: cryptographic hash functions and public key cryptography.

A cryptographic hash function is one of a family of functions that satisfies a particular set of useful properties. The function must take an arbitrary amount of data and produce an output of a fixed size – the SHA256 function, for example, produces 32 bytes of output. The function must always produce the same output for a given input, and a "hash collision" – when two different pieces of data produce the same hash – must be exceedingly unlikely. Moreover, any small change in the input must produce a completely different ouput, so that there is no efficient way to discover the input given only the output. These functions can be used in various ways to confirm the identity of a piece of data. For example, if you have a downloaded file and know what the hash of the file should be, you can run the file through the hash function to verify that it hasn't been tampered with. Changing even one bit of the file would result in a totally different hash value.

Public-key cryptography refers to a set of encryption methods that do not require the parties involved to securely share a key. Unlike traditional encryption methods in which the same key is used to encrypt and decrypt the data, in public-key encryption you need one key to encrypt a message and a different one to decrypt it. That means you can share one key (the public one) but not the other (the private one). These ciphers are a crucial foundation of the internet. Without them, you wouldn't be able to communicate securely without setting up a secure channel ahead of time by agreeing on a secret key – and if you have a channel available to agree on a key, why not just communicate using that? With public-key cryptography, however, it's possible to send someone a message securely knowing only their public key – no secret channel needed.

Normally you would generate a keypair and then share your encryption key, allowing anyone to encrypt a message for you, while keeping your decryption key private so that only you can read a message. But it's also possible to reverse this: you can share the decryption key, meaning anyone can read the message, but keep the encryption key secret, so that only you can write it. This is known as a "signature," and can be used to prove authorship of a document rather than keep the contents of the document secret. Bitcoin relies on the signature variant to ensure that only one person can "spend" (sign) a coin at a time.

A virtual currency

There's an obvious problem with a purely digital currency: what prevents you from making copies of it? There are fields of software dedicated to preventing copying (DRM on movies and music being the one most people have encountered) but they are far from perfect in practice and wouldn't be good enough for a virtual currency. They also rely on someone (the owner) being able to copy the files in question (otherwise no one could ever access them) which is something Bitcoin is designed to avoid.

A Bitcoin is not a file you can pass around or make copies of. It's not like cash, in other words. It's more like paying by check, with the role of the bank being replaced by a public ledger of all transactions that anyone can inspect. If you "own" a Bitcoin, it means that the ledger has recorded that you have the right to spend that Bitcoin. You "give" someone a Bitcoin by transfering that right to spend. Bitcoins have no existence outside of this public ledger.

This raises a few questions. First, how do you actually transfer a coin? Second, who keeps the ledger? That second question gets to the heart of how Bitcoin works, but it will be easier to explain if we first assume there is a public ledger and describe how Bitcoin transactions would work, given that assumption.

Bitcoin transactions rely, as I mentioned, on digital signatures implemented using public-key cryptography. You must have the private key from a keypair in order to sign (encrypt) a piece of data, but anyone who has only the public key can verify (decrypt) it. The Bitcoin ledger keeps track of which keys are allowed to spend (i.e. sign over to another key) which Bitcoins. You transfer a Bitcoin to someone else by signing their public key with your private key, and then adding that to the public ledger (more on how the ledger works shortly). Anyone looking at the ledger can see whether you are still entitled to make that transaction, or whether you've already spent the coin and the transaction should be considered invalid.

Minor extensions to this basic model, namely allowing transactions to have multiple inputs and outputs, enable transactions to send multiple coins or fractions of a coin in addition to exactly one coin – for example, if you have received 15 coins in one transaction and 20 in another and you want to pay someone 21 bitcoins, you can create a transaction that has both of those transactions as inputs and outputs 21 coins to your recipient's public key and 14 (the change) back to your key.

Proof of work

In order to support this transaction system, there are two important requirements the ledger must satisfy. It should be possible to tell whether one transaction preceeded another. That way, you can figure out, if someone tries to spend a coin twice, which one should be considered legitimate. It should also be as difficult as possible to fake the time of a transaction, to prevent fraud. If we allowed ourselves to rely on a trusted third party, this would be as easy as posting all transactions to some trustworthy public site – Facebook, say – where anyone could verify whether a transaction is valid or whether that particular coin has already been spent in a different transaction. You couldn't fake the time of a post unless you had inside access to Facebook (this is where the "trust" part comes in).

Without the trusted third party, things become difficult. I could, for example, post all transactions I personally make on my blog, but I could go back at any time to reorder, forge, or delete transactions, so no one else could rely on that record witout trusting me. But if I'm not willing to trust someone else, I won't be willing to rely on their site either.

Bitcoin solves this problem without relying on any trusted party using a concept called "proof-of-work" and a structure known as the blockchain. It works like this: anyone can run a node in the Bitcoin network. In order to initiate a transaction, you send it to a node, and that node will rebroadcast it to other nodes in the network. Meanwhile, all nodes are working independently to create a "block" in the chain – a group of valid transactions1 to be added to the ledger. The block consists of three important components:

  1. A set of transactions to be added to the ledger.
  2. A nonce value – a number which can be incremented until the hash of the block begins with a certain number of zeroes. The difficulty of finding this nonce rises exponentially proportional to the number of zero bits required.2 This is the "proof-of-work" – there's no way to create a block without doing the computation necessary to find this nonce value.
  3. The hash value of the previous block in the chain, which guarantees that the block was created after the previous block, which gives the block chain the important ordering property I mentioned earlier.

The second and third items together make it exceedingly difficult to forge a copy of the blockchain. Since each block references a previous block, in order to forge a copy of the blockchain that altered a past transaction you'd have to recreate the blockchain all the way back to that transaction – but there's no way to recreate the chain without calculating a new nonce for each block, which is very time-consuming. Meanwhile, the real blockchain will continue to grow. That means, as long as no one person controls more than half of the computers in the network,3 it's very unlikely that the forged blockchain would be able to catch up. Therefore the longest blockchain will be the real one. All an individual node has to do in order to support the original blockchain is to work on the search for the next block for that chain – there's no voting or other explicit consensus mechanism needed.

That's it! There's more to Bitcoin's actual implementation than this simple model of transactions and a blockchain, but that was the crucial innovation that enabled Bitcoin to happen and remains the foundation of its design.

  1. Precisely which transactions get included in a block is a more involved question, and Bitcoin has some additional features to encourage nodes to include transactions in their blocks (e.g. transaction fees). The main restriction is that all the transactions must be valid, because otherwise other nodes won't accept the block and it won't become part of the blockchain, but it is possible for a node to mine an empty block of transactions and this occasionally happens.

  2. The difficulty of mining a new block (i.e. the number of zero bits the block's hash must begin with) can be adjusted over time so that as the network grows, the time it takes for some node to find the next block remains approximately constant. (For Bitcoin, it takes around 10 minutes of work by all the computers in the world mining Bitcoins for one of them to find a nonce that satisfies the proof-of-work.)

  3. You might be thinking: how can we count on more than half of computers in the network to be honest? What incentive do people have to run a network node? Bitcoin incentivizes nodes to search for the next block ("mine" a block, in cryptocurrency jargon) by allowing the node that finds a block to create a certain number of new Bitcoins. All Bitcoins in existence were created this way. It's also possible to introduce transaction fees, so that whoever mines a block gets paid the fees for all the transactions in that block, which Bitcoin did a few years into its existence.