What is a Blockchain?
If you travel the cryptocurrency sphere for long enough, you will notice that certain terms are unavoidable. While some, like HODL and “whales,” can be understood from their context, others are more challenging. Take blockchain, for example. If you ask a casual crypto geek what a blockchain is, he or she will likely answer “a distributed ledger where cryptocurrencies are stored and recorded.”
However, if you ask said geek to explain what that means, you are likely to get a glazed look or – at best – an abbreviated restatement of the same definition. This article will explain what exactly a blockchain is and why this technology is poised to change the conversation on modern commerce.
The Magic of Blockchain
To start, blockchains are math. It is not very complicated math; in fact, a person could do the calculations by hand with paper and pencil, if he or she were so inclined. However, the math is deterministic: given the same input, it is guaranteed to produce the same output.
This is important, since the two defining characteristics of blockchains are that they are decentralized and that they are fault tolerant.
Blockchains do not need a host computer or server to house the index, as a traditional database does. Even if a database uses remote or offsite repositories, there still needs to be a computer that manages the log-in and querying functions and that hosts the index or the addresses of the searchable resources. As this is a centralized, controlled asset, the person or persons who control the server control not only access to the database, but also what the database contains.
For a democratized network, this allocation of control is unacceptable. In a monetary framework, the server manager would be a de facto bank. The manager would be able to set monetary policies for assets stored or served by the framework and could limit or block certain potential customers. For something like bitcoin, which is supposed to be open, a traditional database could not work.
Blockchain is different in that it is distributed and managed by its users. This sounds weird, but it is a sophisticated answer to the issue of trust. Even with a bank, the average customer has suspicions about whether his or her best interests are being served. Incidents such as the 2007–2008 financial crisis have established a healthy suspicion of institutional intentions. With a bank, however, deposits are protected by the federal government. In a decentralized environment, there are no safety nets.
By placing responsibility for safeguarding the blockchain on those who use it, one creates a trustless environment. In other words, there is no third party that must be trusted. Blockchains do this through cryptography, the math mentioned previously.
The addresses for the record blocks – which are simply called “blocks” – on the blockchain, as well as the records themselves, are cryptographically hashed. The idea behind the hashing is to create a situation – like a calculus or geometric equation – where it is much simpler to check that a proposed answer is correct than to find that answer in the first place. This creates an imbalance in the energy expended between verifying and solving the puzzle, making it energy- and time-“expensive” to produce a valid block but cheap to confirm one.
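The gap between finding an answer and checking one can be seen in a minimal sketch. It assumes SHA-256 as the hash and a puzzle of the form "find a number that gives the hash a set count of leading zero bits"; the function names and the 8-byte nonce encoding are illustrative choices, not taken from any particular blockchain.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def puzzle_hash(data: bytes, nonce: int) -> bytes:
    """Hash the data together with a candidate nonce (illustrative encoding)."""
    return hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()


def solve(data: bytes, zero_bits: int) -> tuple[int, int]:
    """Search nonces in order; return (nonce, number of hashes tried)."""
    for nonce in count():
        if leading_zero_bits(puzzle_hash(data, nonce)) >= zero_bits:
            return nonce, nonce + 1


def verify(data: bytes, nonce: int, zero_bits: int) -> bool:
    """Check a claimed solution with a single hash."""
    return leading_zero_bits(puzzle_hash(data, nonce)) >= zero_bits


nonce, attempts = solve(b"pay Alice 1 coin", zero_bits=12)
print(f"solved after {attempts} hashes; one hash to verify:",
      verify(b"pay Alice 1 coin", nonce, 12))
```

Solving takes thousands of hashes on average at this small difficulty, while verifying always takes exactly one.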
While this imbalance may not show itself in an individual transaction, consider this: say that a hacker wishes to “double-spend,” that is, cancel the transactions for a set of bitcoin so that the coins can be spent again. As it is easier to double-spend new coins than coins with established transaction histories (bitcoin are not perfectly fungible, as each coin carries a traceable history), and since there is a 100-block waiting period from the time a block is discovered until its coins are issued, the minimum our hacker will have to wait to target the coins is 1,000 minutes (about 16 hours and 40 minutes), assuming a generation rate of ten minutes per block.
Block addresses are composed in part of a hash of the previous block’s address and content. So, to change a previous block, one must also change all of its successor blocks. This article will explain later why this would be challenging, but doing so would effectively create a new branch of the chain – or fork – that must be accepted by more than half of the active network.
So, a hacker must rediscover at least 100 blocks in less time than it takes the rest of the network to discover one and must get more than half of the bitcoin nodes to accept the new fork. Therefore, blockchain hacking is thought to be virtually impossible. However, if our hacker were to gather more than half of the hashing power on the blockchain network, he could force consensus on the fork, effectively allowing the double-spending to be approved until the network reasserts control. This is known as a “51 Percent Attack.”
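A toy chain shows why editing one block disturbs every block after it. This is a sketch under simplifying assumptions: the `Block` class, its two fields, and the all-zero hash for the first block are invented for illustration, and real block headers carry more data.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    prev_hash: str   # hash of the block before this one
    content: str     # stand-in for a batch of transactions

    def digest(self) -> str:
        payload = (self.prev_hash + self.content).encode()
        return hashlib.sha256(payload).hexdigest()


def build_chain(contents: list[str]) -> list[Block]:
    chain, prev = [], "0" * 64  # the first block has no predecessor
    for content in contents:
        block = Block(prev, content)
        chain.append(block)
        prev = block.digest()
    return chain


def first_broken_link(chain: list[Block]) -> Optional[int]:
    """Index of the first block whose prev_hash no longer matches, or None."""
    for i in range(1, len(chain)):
        if chain[i].prev_hash != chain[i - 1].digest():
            return i
    return None


chain = build_chain(["a pays b", "b pays c", "c pays d"])
print(first_broken_link(chain))   # None: every link matches
chain[0].content = "a pays mallory"
print(first_broken_link(chain))   # 1: block 1 no longer points at block 0
```

To hide the edit, the tamperer would have to recompute block 1 with the new hash, which changes block 1's own hash and breaks block 2, and so on to the tip of the chain.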
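Nakamoto's paper treats the race between an attacker and the honest network as a gambler's-ruin problem. The sketch below uses the simplest form of that estimate: if the attacker holds a share q of the hashing power and the honest nodes hold p = 1 - q, the chance of ever catching up from z blocks behind is (q/p)^z when q < p, and 1 otherwise. The function name is an illustrative choice.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance that an attacker ever erases a deficit of `blocks_behind` blocks."""
    q = attacker_share
    p = 1.0 - q
    if q >= p:
        return 1.0  # with half or more of the power, catching up is certain
    return (q / p) ** blocks_behind


for share in (0.10, 0.30, 0.45, 0.51):
    print(f"share {share:.2f}: {catch_up_probability(share, 100):.3g}")
```

The probability collapses toward zero for any minority share as the deficit grows, and jumps to certainty once the attacker passes the halfway mark, which is why the 51 percent threshold matters.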
A blockchain offers portability and security in a way that a database cannot. How this works is the genius of the bitcoin revolution.
Understanding How Blockchain Works
Bitcoin, when introduced by the pseudonymous Satoshi Nakamoto, depended on a proof-of-work system first developed for the junk mail control protocol Hashcash, which was proposed in 1997. Hashcash worked by adding a timestamp-based hash to the header of an email when it was sent. This hash represented a certain amount of computing work, which had to be completed for the email to be accepted as legitimate. The recipient could easily confirm the hash.
As a valid hash could only be produced through brute force – the random guessing of values until a matching value is found – it would be expensive for spammers and mass mailers to send out large numbers of such emails. Each hashed email would represent an investment that must be considered, as if it carried traditional postage.
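A Hashcash-style stamp can be sketched in a few lines. The version below is deliberately simplified: the `date:recipient:counter` layout is an assumption made for readability and is not the real Hashcash header format, although Hashcash did use SHA-1 as its hash.

```python
import hashlib
from itertools import count


def zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a digest."""
    return len(digest) * 8 - int.from_bytes(digest, "big").bit_length()


def mint_stamp(recipient: str, date: str, bits: int) -> str:
    """Try counters until the stamp's SHA-1 digest starts with `bits` zero bits."""
    for counter in count():
        stamp = f"{date}:{recipient}:{counter}"
        if zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def check_stamp(stamp: str, recipient: str, bits: int) -> bool:
    """One hash, plus a check that the stamp was minted for this recipient."""
    parts = stamp.split(":")
    addressed_to_us = len(parts) == 3 and parts[1] == recipient
    digest = hashlib.sha1(stamp.encode()).digest()
    return addressed_to_us and zero_bits(digest) >= bits


stamp = mint_stamp("alice@example.com", "2024-01-01", bits=12)
print(stamp, check_stamp(stamp, "alice@example.com", 12))
```

Because the recipient and date are part of the hashed text, a stamp cannot be reused for a different mailbox, so a bulk sender has to pay the minting cost once per message.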
Using a cryptographic algorithm designed and developed by the National Security Agency, Nakamoto modified this approach for his bitcoin proposal. “We need a way for the payee to know that the previous owners did not sign any earlier transactions,” Nakamoto wrote in his paper “Bitcoin: A Peer-to-Peer Electronic Cash System.” “For our purposes, the earliest transaction is the one that counts, so we don’t care about later attempts to double-spend. The only way to confirm the absence of a transaction is to be aware of all transactions. In the mint based model, the mint was aware of all transactions and decided which arrived first. To accomplish this without a trusted party, transactions must be publicly announced, and we need a system for participants to agree on a single history of the order in which they were received. The payee needs proof that at the time of each transaction, most nodes agreed it was the first received.”
“The solution we propose begins with a timestamp server. A timestamp server works by taking a hash of a block of items to be timestamped and widely publishing the hash, such as in a newspaper or Usenet post. The timestamp proves that the data must have existed at the time, obviously, to get into the hash. Each timestamp includes the previous timestamp in its hash, forming a chain, with each additional timestamp reinforcing the ones before it.”
“To implement a distributed timestamp server on a peer-to-peer basis, we will need to use a proof-of-work system like Adam Back’s Hashcash, rather than newspaper or Usenet posts. The proof-of-work involves scanning for a value that when hashed, such as with SHA-256, the hash begins with several zero bits. The average work required is exponential in the number of zero bits required and can be verified by executing a single hash.”
“For our timestamp network, we implement the proof-of-work by incrementing a nonce in the block until a value is found that gives the block’s hash the required zero bits. Once the CPU effort has been expended to make it satisfy the proof-of-work, the block cannot be changed without redoing the work. As later blocks are chained after it, the work to change the block would include redoing all the blocks after it.”
A nonce is a number, often random, that is used only once to make a unique hash. Nonces are used in cryptography so that the same data can yield a fresh, unpredictable hash on each attempt. At the time of writing, a successful bitcoin hash needed 13 leading zeros, a figure that shifts as the network adjusts its difficulty.
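The phrase "exponential in the number of zero bits" can be made concrete. If every hash is equally likely to land anywhere, one attempt succeeds with probability 2^-k for k required zero bits, so about 2^k attempts are needed on average. The sketch below tabulates that; the last row assumes the 13 zeros mentioned above are hexadecimal digits (4 bits each, so 52 bits), which the text does not state.

```python
def expected_hashes(zero_bits: int) -> int:
    """Average attempts when each try succeeds with probability 2**-zero_bits."""
    return 2 ** zero_bits


# 52 bits corresponds to 13 hexadecimal zeros (an assumption about the figure).
for bits in (8, 16, 32, 52):
    print(f"{bits:>2} zero bits -> about {expected_hashes(bits):,} hashes on average")
```

Each extra zero bit doubles the average work, while checking a finished answer still costs a single hash.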
To paraphrase, a blockchain works like this: imagine you run an online club that sells sweaters. Every member is required not only to record every transaction, but also to verify that each one is valid. Your group does this through a complicated register timestamp that is energy-expensive to produce and impractical to fake. However, the veracity of the resulting code can be tested just by plugging the code back into the algorithm. A certain number of members must announce successful completion of the transaction before the sweater ships, although all members automatically verify the transaction.
Groups of transactions are stored in blocks, which are hashed, with the hash stored in the address of the successor block. After a certain amount of time, the block’s “receipts” can be destroyed, as the hash is enough to verify past transactions. This is to save space. As each receipt “block” is directly connected to the previous one, this forms a chain. In case of disagreements, the longest chain – that is, the chain with the most confirmation work invested in it – wins.
As an incentive to spend the time and energy to do these confirmations, the club holds a contest in which members compete for club points based on their confirmation activity. This is not the case for all blockchains: many proof-of-stake blockchains, for example, distribute their tokens at the launch of the chain and reward nodes with transaction fees, while permissioned blockchains may offer no rewards at all. The contest is to guess the address of the next block, subject to certain rules: the address must have at least a certain number of leading zeros, must be based on the content hash of the current block, and must be lower than a set target established by the club. The first member to guess a valid answer gets the reward, but only after enough time has passed to avoid fraud.
The thing is that it is possible to have multiple correct answers simultaneously. In this case, the member who confirmed the most transactions to get the answer gets the reward. All other blocks – depending on the rules of the club – are discarded and treated as “orphans” or are marked as “uncles,” side blocks that can be used to speed up or simplify the confirmation process. Only the winning block is used to advance the chain, however.
It is up to the members to agree on which block is the winner. This is known as “consensus.” Should a member somehow manage to win the influence of 51 percent of the processing power of the group, he can decide which block wins consensus. With that control, he can promote the adoption of an alternative chain, say one that dictates he won the reward for a certain block or blocks. Even though that reward could have already been spent, the new chain gives custody of those coins to the member, allowing double-spending. Once the membership reasserts consensus, it could reset the chain to the right fork, but if the member has already transferred the coins to a private wallet, it may require significantly altering the chain to get the coins back.
This would continue until the chain no longer has anything left to reward, at which point the chain would need to redefine its consensus protocol.
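The "leading zeros" rule and the "lower than a target" rule are two views of the same test, which a short sketch can show. It assumes a single SHA-256 hash read as a big-endian integer; bitcoin's actual header hashing differs in detail, and the function names here are illustrative.

```python
import hashlib
from itertools import count


def hash_value(data: bytes) -> int:
    """Interpret a SHA-256 digest as one large integer."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def target_for_zero_bits(bits: int) -> int:
    """Exclusive upper bound: any hash below it starts with at least `bits` zero bits."""
    return 1 << (256 - bits)


def first_nonce_below(header: bytes, target: int) -> int:
    """Increment a nonce until the hash value falls below the target."""
    for nonce in count():
        if hash_value(header + nonce.to_bytes(8, "big")) < target:
            return nonce


easy = target_for_zero_bits(8)
hard = target_for_zero_bits(12)   # a lower target, so fewer hashes qualify
print(first_nonce_below(b"block 42", easy), first_nonce_below(b"block 42", hard))
```

Expressing the rule as a target lets the club tune the contest in fine steps: halving the target doubles the average number of guesses, without being limited to whole extra zeros.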
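The rule members use to settle on one chain when forks compete can be sketched as a simple selection. This assumes each block records how much work went into it in a hypothetical `work` field; real nodes derive that figure from the block's difficulty rather than trusting a stored number.

```python
def chain_work(chain: list[dict]) -> int:
    """Total work in a chain, taking each block's 'work' field at face value."""
    return sum(block["work"] for block in chain)


def choose_chain(candidates: list[list[dict]]) -> list[dict]:
    """Pick the candidate chain with the most accumulated work."""
    return max(candidates, key=chain_work)


honest = [{"id": "a", "work": 1}, {"id": "b", "work": 1}, {"id": "c", "work": 1}]
fork = [{"id": "a", "work": 1}, {"id": "b2", "work": 1}]
print([block["id"] for block in choose_chain([fork, honest])])  # ['a', 'b', 'c']
```

A member with most of the group's processing power can always make his fork the heaviest candidate, which is exactly the lever the 51 percent scenario relies on.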
The World of Blockchain
Before we go further, we should define some terms one may have come across:
- Blockchain 1.0: A blockchain defined using Nakamoto’s definition for the bitcoin chain. Blockchain 1.0 systems are self-contained virtual environments where all elements of the data serving process (the data, the query processor, the servicing devices, the user interface, the programming shell, etc.) are part of the blockchain system. This includes bitcoin, Ethereum, and their successor clones and chains.
- Blockchain 2.0: A blockchain that requires an off-chain oracle to work. An oracle is a theoretical “black box” that can be used to make decisions. In the case of blockchains, this oracle monitors and communicates with real-world elements and external data. Blockchain 2.0 is the key component behind “Blockchain as a Service” or “Blockchain as a Platform” products, which can be used for real-world applications such as inventory maintenance, shipment tracking, money transfers, real-time vote tracking, identification confirmation, and other uses. Technically, smart contracts can be used as an oracle, but as Blockchain 2.0 systems are typically permissioned, a traditional server program can be used.
- Permissioned blockchains: Privately-owned and controlled blockchains that have limited access. Blockchains used to replace corporate or institutional processes are usually permissioned to limit access to sensitive data. Typically, permissioned blockchains are called distributed ledgers.
- Smart contracts: Autonomous programs saved to and running on the blockchain’s virtual space on each node. These programs serve as the executor of a contract, checking the conditions and enforcing the transaction without the need for a third party. Smart contracts can be programmed to automate transactions, like a self-servicing vending machine.
- Virtual machine: The virtual environment inside a node’s active memory where a blockchain lives and operates. This virtual machine is created and maintained by a program on the node that prevents interference by the node’s owner or other programs with the operation of the machine.
- Hash: A hexadecimal string of characters that represents the encoding of a set of data. Regardless of the size of the data set, a hash of a fixed length will be generated. Changing any element of the set, or the order of its elements, changes the hash. This makes hashes useful for ensuring data integrity. The hash algorithms are not complicated, but as the answer range sought in mining is so narrow, millions or billions of hash calculations may be needed – all with different nonces – before an acceptable answer is found. Due to the repetitiveness of the calculations and since transaction volume is rewarded, significant processing power is needed.
- Blocks: Merkle tree-encoded batches of accepted transactions. The header of a block includes the hash of the previous block, creating an iterative progression. Blocks forged at the same time create temporary forks, which are abandoned once consensus determines the correct block.
- Open blockchain: A permissionless blockchain open to the public, provided the node seeking entry meets entry requirements. Open blockchains typically feature at-will enrollment and departure. Open blockchains are usually decentralized, with the blockchain distributed and running on all nodes. Some blockchains, however, give special privileges and responsibilities to nodes that have “staked” a certain number of tokens. As any node – theoretically – can make such a stake, such blockchains are still considered “open.”
- Node: A redistribution point or a communication endpoint. In computer networks, it refers to a connected electronic device that can create, receive, or transmit relevant data. Regarding blockchains, it is an electronic device running a valid copy of the blockchain software.
- Consensus: The process of making decisions for a blockchain. For an open blockchain, each node represents a vote. For some decisions, such as transaction and block confirmations and winning chain calculations, the decision is made by the blockchain software automatically based on the blockchain’s established rules. Other decisions, such as rules or software changes, are proposed, with the blockchain software prompting node operators to vote. Approved proposals cause a fork in the blockchain, with non-complying nodes submitting blocks to the “minority” fork. Under certain circumstances, the “minority” fork is not abandoned, but spun off into a new blockchain with rules and tokens different from those of the “majority” fork.
- Fork: A disagreement among nodes manifested in the temporary or permanent branching of the blockchain. There are different types of forks:
  - Hard fork: A fork created due to a rule change where blocks made by nodes running the unmodified software are considered invalid by nodes running the modified software. An example of a hard fork is the decision to make victims of the DAO hack “whole” by reversing the transfer of tokens seized by the hacker. The nodes opposed to the fork broke off from Ethereum to form Ethereum Classic.
  - Soft fork: A fork where blocks made by those that refused the change are still considered valid by those that have changed. This could create a fork that would merge with the main chain once the node accepts the change.
  - User activated soft fork: A type of soft fork where a rule change is pushed through by node operators rather than waiting for majority support from the miners.
- Consortium blockchain: A permissioned blockchain that allows multiple businesses and organizations to run a node. These nodes may or may not have full access to the blockchain’s data.
- Clone: A blockchain based on a modification of an existing blockchain’s operating software. As most open blockchains are open-sourced, it is considered easy and common to clone a popular blockchain platform. The most cloned blockchain is the bitcoin blockchain.
- Miner: A transaction verifier that participates in the block-naming contest for proof-of-work blockchains. On other protocols, miners may be called verifiers or other names.
- Proof-of-work: A blockchain consensus protocol that dictates that the amount of work used to hash a transaction or block keeps it from being faked or hacked. Proof-of-work blockchains are mined, with tokens allocated gradually.
- Proof-of-stake: A blockchain consensus protocol that dictates that a verifier’s posted stake in the blockchain’s tokens attests to the verifier’s trustworthiness. Proof-of-stake blockchains are often pre-mined, with the tokens available for distribution from the start.
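Two of the terms above, hashes and Merkle tree-encoded blocks, are easier to picture with a small sketch. It shows that a SHA-256 hash has the same length whatever the input size, that reordering the input changes the hash, and how a batch of transactions folds down to one root hash. The pairing scheme (duplicating the last hash when a level has an odd count) follows bitcoin's approach, but the single SHA-256 pass and the function names are simplifying assumptions.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hexadecimal SHA-256 hash of the data."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(items: list[bytes]) -> str:
    """Hash the items pairwise, level by level, until one hash remains."""
    if not items:
        raise ValueError("need at least one item")
    level = [hashlib.sha256(item).digest() for item in items]
    while len(level) > 1:
        if len(level) % 2:            # odd count: pair the last hash with itself
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()


print(len(sha256_hex(b"a")), len(sha256_hex(b"a" * 10_000)))   # 64 64
print(sha256_hex(b"ab") == sha256_hex(b"ba"))                   # False
print(merkle_root([b"tx1", b"tx2", b"tx3"]))
```

Because the root depends on every transaction and on their order, a block header only needs to carry this one hash to commit to the whole batch.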
As stated before, blockchains were created to resolve the issue of trust. How can you trust someone that is possibly lying to you?
“Commerce on the Internet has come to rely almost exclusively on financial institutions serving as trusted third parties to process electronic payments,” Satoshi Nakamoto, the creator of bitcoin, wrote in his paper, “Bitcoin: A Peer-to-Peer Electronic Cash System.” “While the system works well enough for most transactions, it still suffers from the inherent weaknesses of the trust based model. Completely non-reversible transactions are not really possible, since financial institutions cannot avoid mediating disputes. The cost of mediation increases transaction costs, limiting the minimum practical transaction size and cutting off the possibility for small casual transactions, and there is a broader cost in the loss of ability to make non-reversible payments for non-reversible services. With the possibility of reversal, the need for trust spreads. Merchants must be wary of their customers, hassling them for more information than they would otherwise need. A certain percentage of fraud is accepted as unavoidable. These costs and payment uncertainties can be avoided in person by using physical currency, but no mechanism exists to make payments over a communications channel without a trusted party.”
“What is needed is an electronic payment system based on cryptographic proof instead of trust, allowing any two willing parties to transact directly with each other without the need for a trusted third party. Transactions that are computationally impractical to reverse would protect sellers from fraud, and routine escrow mechanisms could easily be implemented to protect buyers. [We] propose a solution to the double-spending problem using a peer-to-peer distributed timestamp server to generate computational proof of the chronological order of transactions. The system is secure as long as honest nodes collectively control more CPU power than any cooperating group of attacker nodes.”
Nakamoto’s solution represents the replacement of trust with an automatic process. While a person can be trusted to act on his or her own behalf, that person cannot be trusted to act altruistically. The genius of blockchain is that altruism is not needed or solicited. As the blockchain represents trustless transactions, it can be “trusted” to securely enable processes away from the whims of human nature. As such, blockchain represents a new way to look at data and automation.