Ethereum Constantinople is getting activated by the end of January, and it has been a long time coming. The second hard fork of the Metropolis update has a lot of interesting features, which we will discuss in detail in this article.
Alright, yes, we did say “hard fork” and that’s probably going to trigger you a lot, seeing the utter fiasco that hard forks have caused in the past. The crypto-community is still reeling from the ETH-ETC fork and the “hash wars”. However, by the time you are done reading the article, your fears will be put to rest, and you will learn all that you need to know about the Metropolis update.
The Four Stages of Ethereum
For those who don’t know, Ethereum is not just a mode of currency but it is primarily a platform for decentralized applications. The idea is that developers from around the world can rent computational power from Ethereum to run their DApps by executing smart contracts. Since it is such a complex system, the Ethereum developers decided to introduce it in 4 different stages. With each stage, Ethereum “levels up” by incorporating more and more properties making its system more robust and seamless.
The 4 stages are as follows:
- Frontier: The initial release of the network.
- Homestead: The first stable release.
- Metropolis: The upcoming phase. Since there are so many updates in Metropolis, it was subdivided into Byzantium and Constantinople.
- Serenity: The final stage.
Each of these updates is introduced to the system in the form of a “hard fork”. Now, as we have said before, the term conjures up a negative image, but there is much more to it than meets the eye.
What are Forks?
Most people see a fork as being an eating utensil, an alternative to eating with chopsticks or with your fingers. There is another sense of the word, as in a divergence in a path. A fork in the road or a fork in the river indicate that a choice must be made to proceed; should one go left or should one go right?
With blockchains, the same is true. In the simplest terms, a crypto fork is a disagreement, permanently etched in code. It represents a fight where the token or coin failed to reach consensus, and as a result, the two sides opted to act instead of “talk it out.”
A fork is what happens when the system breaks down, either intentionally or accidentally.
To understand this, one must look at how blockchains work. A blockchain is a connected series of cryptographically linked records. Each successive block contains the hash of the previous block, which is what chains the blocks together; within a block, the transactions are typically summarized by a Merkle tree. The consensus protocol determines which block is the rightful successor.
A common misunderstanding with blockchains is that only one block is ever proposed for each position in the chain. The truth is that several blocks can connect to the chain at each level, and each of these blocks can have blocks that link to them. As mining is a competitive activity, there are many “runners-up.” As the link to the chain is included in the new block itself, each of these blocks “connects” to the chain.
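To make the "hash of the previous block" idea concrete, here is a minimal sketch in Python. The block layout (a dictionary with `parent` and `records` fields) and the helper names are assumptions made for illustration; real blockchains use far richer block headers.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, including the hash of its parent."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_block(parent_hash: str, records: list) -> dict:
    # Each block stores its parent's hash; that reference is the "link".
    return {"parent": parent_hash, "records": records}


def chain_is_valid(chain: list) -> bool:
    """Check that every block points at the hash of the block before it."""
    return all(
        chain[i]["parent"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


genesis = make_block("0" * 64, ["genesis"])
b1 = make_block(block_hash(genesis), ["alice pays bob 1"])
b2 = make_block(block_hash(b1), ["bob pays carol 1"])
chain = [genesis, b1, b2]
print(chain_is_valid(chain))      # True

# Tampering with an earlier block changes its hash and breaks the next link.
tampered = [genesis, dict(b1, records=["alice pays bob 100"]), b2]
print(chain_is_valid(tampered))   # False
```

Two competing blocks that both name the same parent hash would each "connect" in this sense, which is exactly why a consensus rule is needed to pick one.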
It is up to the consensus protocol to determine which of these blocks is the “official” one. The remainder – orphans – are treated differently depending on the blockchain. For bitcoin, for example, the “orphans” – or “stales” – with inferior proofs-of-work are ignored. “Orphans” in Ethereum – called “uncles” – however, are used to help improve transaction times and still receive mining rewards despite not being “official.”
“Orphans” can also be the results of a failed attempt to seize control of the blockchain.
However, when nodes disagree about which block is the “official” one, forks happen. This can be because of a disagreement over a change of protocol, or because of an intentional software shift, or by malicious means. Whatever the cause, two “official” blocks are named. This creates two new branches, with “official” blocks being named by that branch’s version of the consensus protocol. Each of these branches can be considered a separate blockchain – depending on whether the fork is “hard” or “soft” – with both sharing the blocks before the protocol change. Chains that split off from bitcoin, such as Bitcoin Cash, share bitcoin’s “genesis block” as the first block of their respective blockchains.
With bitcoin, for example, there have been six forks:
- August 2015: Bitcoin Core (the network’s operating software) forked with Bitcoin XT over disagreements about block size,
- February 2016: Bitcoin Core forked with Bitcoin Classic over disagreements about block size,
- May 2017: Bitcoin Core forked with Bitcoin Unlimited over disagreements about block size,
- August 2017: Bitcoin blockchain (due to blockchain code change and not a consensus software change) forked with Bitcoin Cash over disagreements about block size,
- October 2017: Bitcoin blockchain forked with Bitcoin Gold over proposed switch from Application-Specific Integrated Circuits (ASICs) for mining to Graphics Processing Units (GPUs), and
- February 2018: Bitcoin blockchain forked with Bitcoin Private over whether users should be able to cloak transaction metadata.
There are also “lesser” forks: intentional deviations of the Bitcoin Core software, which is open-source, made to create a new coin. These include Litecoin and Dogecoin (which is a fork of Luckycoin, which is a fork of Litecoin). Most bitcoin derivatives, however, involve new implementations of the respective blockchains.
There are four types of forks: hard forks, soft forks, software forks (permanent and temporary), and a user-activated soft fork.
The most common type of fork is a hard fork. Per Investopedia, “A hard fork (or sometimes hardfork), as it relates to blockchain technology, is a radical change to the protocol that makes previously invalid blocks/transactions valid (or vice-versa). This requires all nodes or users to upgrade to the latest version of the protocol software. Put differently, a hard fork is a permanent divergence from the previous version of the blockchain, and nodes running previous versions will no longer be accepted by the newest version. This essentially creates a fork in the blockchain: one path follows the new, upgraded blockchain, and the other path continues along the old path. Generally, after a short period of time, those on the old chain will realize that their version of the blockchain is outdated or irrelevant and quickly upgrade to the latest version.”
The most infamous hard fork is the Ethereum Classic fork. The fork involved the contentious decision to restore Ethers stolen from the DAO, a decentralized autonomous organization that ran a venture capital fund. “‘The DAO’ is the name of a particular DAO, conceived of and programmed by the team behind German startup Slock.it – a company building ‘smart locks’ that let people share their things (cars, boats, apartments) in a decentralized version of Airbnb,” blockchain strategist David Siegel wrote in an editorial. “The DAO launched on 30th April, 2016, with a 28-day funding window. For whatever reason, The DAO was popular, raising over $100m by 15th May, and by the end of the funding period, The DAO was the largest crowdfunding in history, having raised over $150m from more than 11,000 enthusiastic members. The DAO raised far more money than its creators expected. It can be said that the marketing was better than the execution, for during the crowdsale, several people expressed concerns that the code was vulnerable to attack.”
“Unfortunately, while programmers were working on fixing this and other problems, an unknown attacker began using this approach to start draining The DAO of ether collected from the sale of its tokens. By Saturday, 18th June, the attacker managed to drain more than 3.6m ether into a ‘child DAO’ that has the same structure as The DAO. The price of ether dropped from over $20 to under $13. Several people made attempts to split The DAO to prevent more ether from being taken, but they couldn’t get the votes necessary in such a short time. Because the designers didn’t expect this much money, all the ether was in a single address (bad idea), and we believe the attacker stopped voluntarily after hearing about the fork proposal.”
The decision to restore the ethers violated a key tenet of cryptocurrency: that the code is law, and that actions to correct outcomes that were “lawful” under the consensus protocol should be avoided. By overriding the consensus protocol, opponents of the move argued, the rules of the game were being changed mid-game. When the blockchain was changed to reverse the transaction, a vocal minority (about three percent of the community) continued to follow the original blockchain, forming Ethereum Classic.
There is no way to bring Ethereum and Ethereum Classic back together. Additionally, a user cannot move between the two blockchains with the same token, as the original Ethereum token split along with the blockchain. A contentious hard fork like this one results in a new token and a new blockchain, even if the two blockchains share the same blocks before the split.
In the case of the DAO hack, on the original blockchain, the tokens were not restored, and the hack was considered an unethical, but legal transfer. The stolen ethers were not exchanged to ETC. The new Ethereum blockchain invalidated the transactions, restoring the tokens to their original owners.
Conversely, a backward-compatible fork is called a “soft fork.” “In terms of blockchain technology, a soft fork (or sometimes softfork) is a change to the software protocol where only previously valid blocks/transactions are made invalid,” per Investopedia. “Since old nodes will recognize the new blocks as valid, a softfork is backward-compatible. This kind of fork requires only a majority of the miners upgrading to enforce the new rules, as opposed to a hard fork which requires all nodes to upgrade and agree on the new version.”
Unlike a hard fork, which rejects nodes that choose not to follow the consensus protocol change, a soft fork only puts those nodes at a disadvantage: they still accept the new blocks, but blocks they produce under the old rules may be rejected. “New transaction types can often be added as soft forks, requiring only that the participants (e.g. sender and receiver) and miners understand the new transaction type. This is done by having the new transaction appear to older clients as a “pay-to-anybody” transaction (of a special form), and getting the miners to agree to reject blocks including these transactions unless the transaction validates under the new rules. This is how pay to script hash (P2SH) was added to Bitcoin.”
The way to think about a soft fork is like this: a cryptocurrency issues a new protocol rule. The individual node can opt to not follow the new rule – either intentionally or due to ignorance. If that node opts not to follow the new rule, its blocks are automatically made stale and transactions verified by it are rejected by the consensus. As stated previously, this does not stop the node’s blocks from being accepted according to the old protocol and forming a new chain. Once the node adopts the rule, however, its blocks will be recognized by the new consensus and the node’s “rebel chain” will be abandoned.
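One way to see the difference between the two fork types is to treat a protocol as a validity rule. The sketch below uses a made-up block-size rule with made-up numbers (none of these limits come from a real network): a soft fork tightens the rule, a hard fork loosens it.

```python
OLD_MAX_SIZE = 1_000_000   # hypothetical original rule: blocks up to 1 MB


def valid_under_old_rules(block_size: int) -> bool:
    return block_size <= OLD_MAX_SIZE


def valid_under_soft_fork(block_size: int) -> bool:
    # Tighter rule: everything valid here is also valid under the old rules.
    return block_size <= 500_000


def valid_under_hard_fork(block_size: int) -> bool:
    # Looser rule: some blocks valid here are invalid under the old rules.
    return block_size <= 2_000_000


soft_fork_block = 400_000
hard_fork_block = 1_500_000
old_style_block = 800_000

# A non-upgraded node still accepts blocks produced under the soft-fork rules.
print(valid_under_old_rules(soft_fork_block))   # True
# But a block it produces under the old rules can be rejected by upgraded nodes.
print(valid_under_soft_fork(old_style_block))   # False
# A non-upgraded node rejects this hard-fork block, so the chain splits.
print(valid_under_old_rules(hard_fork_block))   # False
```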
The DAO hack could have been solved by a soft fork, invalidating the affected tokens. Doing so, however, would create a worse security loophole.
Andreas Antonopoulos describes the difference between hard and soft fork like this:
“If a vegetarian restaurant would choose to add pork to their menu it would be considered to be a hard fork. if they would decide to add vegan dishes, everyone who is vegetarian could still eat vegan, you don’t have to be vegan to eat there, you could still be vegetarian to eat there and meat eaters could eat there too so that’s a soft fork.”
Software forks – also known as a “git fork” – are intentional deviations from the consensus protocol and the operating software to develop new protocols and software or to test changes to the existing protocol. These can be temporary, as in protocol changes to the existing blockchain, or permanent, as in the development of a new blockchain.
Software forks are always suggestions. Developers cannot mandate a software fork to be embraced as changes to the consensus protocol. Software forks can be adopted or rejected at the community’s discretion.
Ethereum, for example, grew out of proposals to extend bitcoin before becoming a permanently separate project. It would return to bitcoin for funding, raising money in bitcoin through one of the earliest ICOs.
User-Activated Soft Forks
The most controversial type of fork is a user-activated soft fork, where a soft fork is pushed through without majority approval of the mining nodes, who are responsible for transaction verification. Instead, the fork seeks the approval of the consensus of full nodes. These nodes must represent an economic majority on the network.
Point to Note
Alright, one thing that you should understand is that hard forks happen all the time. Any project that has gone through any form of updates has most likely hard forked multiple times. More often than not, the community agrees with the updates. The problem arises when a significant portion of the community doesn’t agree with the updates.
So, having said that, let’s look into Ethereum Metropolis.
Ethereum Metropolis = Byzantium + Constantinople
The four categories in which Metropolis is bringing lots of changes are:
- Smart contract optimization
Let’s go through each one of them one by one.
#1 Smart Contract Optimization
Before understanding the optimizations that Metropolis is bringing, let’s understand what smart contracts and gas mean.
What are smart contracts?
Contracts are an integral part of the fabric that makes modern society. A smart contract is an idea of establishing an agreement between two complete strangers in a trustless way and to make sure that the contract is honored via self-executing code. Two people can get into a contractual agreement without the need for a third party.
The best way to picture a smart contract is…the humble vending machine.
Smart contracts, as with vending machines, work on the IFTTT logic, aka the IF-THIS-THEN-THAT logic. In simpler terms, the contract will execute the next instructions if and only if the earlier ones have been executed.
Let’s see how this works in a vending machine. Each and every step that you take acts like a trigger for the next step to execute itself. It is kinda like the domino effect. Suppose you need to take some food from the machine, this is what you will need to do.
Step 1: You give the vending machine some money.
Step 2: You punch in the button corresponding to the item that you want.
Step 3: The item comes out and you collect it.
Ok, so if you review all these steps, then there is one thing that you are going to notice. Really think about it. Will any of the steps work if the previous one hasn’t been executed yet? Each and every one of those steps is directly related to the previous step.
Oh, and there is one more thing that you need to think about.
In the entirety of your interaction with the vending machine, there were only two parties involved, you and the machine. There was no need to have a third party involved, and this is one of the most critical features of smart contracts. Only you and the party that you are dealing with directly need to be part of the interaction. There is absolutely no need to have third parties involved.
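The vending machine steps above can be sketched as a tiny state machine. This is only an illustration in Python (the class and method names are invented here, and real smart contracts are written in contract languages such as Solidity), but it shows the "each step unlocks the next" behaviour with no third party in the loop.

```python
class VendingMachine:
    """Each step only runs if the previous step has already happened."""

    def __init__(self, price: int):
        self.price = price
        self.paid = 0
        self.selected = None

    def insert_money(self, amount: int) -> None:
        # Step 1: pay.
        self.paid += amount

    def select_item(self, item: str) -> None:
        # Step 2: only allowed once enough money has been inserted.
        if self.paid < self.price:
            raise RuntimeError("pay first")
        self.selected = item

    def collect(self) -> str:
        # Step 3: only allowed once an item has been selected.
        if self.selected is None:
            raise RuntimeError("select an item first")
        item = self.selected
        self.selected = None
        self.paid -= self.price
        return item


machine = VendingMachine(price=2)
machine.insert_money(2)
machine.select_item("chips")
print(machine.collect())   # chips
```

Calling `select_item` before `insert_money` raises an error, which is the code equivalent of the machine refusing to budge until you have paid.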
What is Ethereum Gas?
Gas is the lifeblood of the Ethereum ecosystem. It is the unit that measures the amount of computational effort that it takes to execute certain operations. Every single thing in Ethereum, be it a transaction, smart contract, or even an ICO takes a certain amount of gas. In fact, gas is what is used to calculate the amount of ether you will need to pay to the network to execute an operation.
Let’s look at an analogy to better understand how gas works. Suppose you need to fill up gas before you go to work, what are you going to do?
- Go to the gas station and specify how much gas you want to fill up in your car.
- Fill up the tank with the specified amount of gas.
- Pay the amount of money you owe.
In the analogy, the car is the operation that you want to execute, like a transaction or smart contract. The gas station is an Ethereum miner, the gas is Ethereum gas, and the money that you pay is the miner fee. Gas itself is paid for in ether: the smallest unit of ether is the “wei”, and gas prices are usually quoted in gwei (one billion wei).
So, as you may have inferred by now, to get an operation done in Ethereum, the person initiating the transaction or the smart contract must specify a gas limit before they submit it to the miners. If a gas limit has not been specified, the miners will not execute the operation. Now when submitting a gas limit, the following two cases will occur:
- The gas limit is too low
- The gas limit is too high
If the gas limit is too low, execution stops as soon as the gas runs out. However, the contract initiator must still pay for the computation that has taken place up to that point.
If the gas limit is too high, the contract will be executed and the leftover gas will be refunded immediately.
So, in theory, it should make sense to always submit contracts or transactions with a bloated gas limit, right? Unfortunately, it doesn’t work like that in reality. Miners in Ethereum are limited by a block gas limit of around 6,700,000 gas, and even the simplest transaction (a plain ETH transfer) needs at least 21,000 gas. So, now consider the following two scenarios that a miner may face:
- Executing one simple transaction with a gas limit of 48,000.
- Executing two simple transactions with gas limits of 21,000 each.
Obviously, the miner will choose the second scenario because it makes more sense to them economically: unused gas is refunded to the sender, so the first transaction reserves 48,000 gas of block space while paying fees on only the 21,000 it actually uses, whereas the two transactions together pay fees on 42,000.
How Ethereum Gas Works
To better understand how gas works in Ethereum, let’s use an analogy. Suppose you are going on a road trip. Before you do so you go through these steps:
- You go to the gas station and tell them exactly how much gas you want to fill up in your car
- The gas station attendant fills up your car with the amount of gas
- Once your car has been filled up, you pay the amount of money that you owe them for your car.
Now, let’s draw parallels with Ethereum.
- Driving the car is the operation that you want to execute, like executing a function of a smart contract.
- The gas is well….gas.
- The gas station is your miner.
- The money that you paid them is the miner fees.
So, for any operation that users need to execute in Ethereum, they have to supply enough gas to cover the following:
- The general execution of the smart contract.
- Intrinsic gas, i.e. the gas needed to cover the data.
A Developer’s Relationship with Gas
If you are a developer, then you don’t really need to care about Gas Price at all. All that you should be concerned with is your smart contract’s Gas Limits. The gas limit needs to fall within reasonable limits to make sure that you are not compromising on user experience.
Let’s take an example.
If you are creating a smart contract to handle custody of a million dollars worth of ETH, you may not be concerned with a $5 gas cost. However, if you were building a social platform that cost $5 in gas for every post, you would bankrupt your users very quickly. You need to design your smart contracts to fit your use case, as processing power and storage come at a high premium in Ethereum right now.
Now, this is where we bump into our problem.
The way it stands right now, it is extremely expensive for a developer to host their smart contracts on the Ethereum blockchain. Metropolis is actually optimizing smart contracts for developers, via various methods.
Revert and Returndata
So, we have told you how gas limits work in Ethereum, now consider the following two scenarios:
- The specified gas limit is too low.
- The specified gas limit is too high.
Scenario #1: The Gas Limit is too low
If an operation runs out of gas, then it is reverted back to its original state like nothing actually happened. Having said that, the contract initiator needs to pay the miners fees for their computational costs and, even if it is not executed, the operation still gets added to the blockchain.
This will be clear the moment we go back to our road trip analogy.
Suppose you need to have 35 liters of gas in your car, but you only put in 30 liters. Since you haven’t filled up enough gas in your car, you will not be able to reach your destination. However, you will still need to pay for the 30 liters that your car has consumed right? Burning up fuel is an irreversible process after all.
So, how will this work in the context of our smart contract? Let’s bring up our smart contract and see how it works:
- First, you store a number in a variable, which costs, let’s say, 45 units of gas.
- You add the number to itself, which costs 10 gas.
- Then you store the result again, which costs 45 gas.
So, overall, you need 100 gas. What happens if you just set a gas limit of only 90 gas?
In this scenario, the miner will do 90 gas worth of computation and then charge the sender fees for the 90 gas. At an assumed gas price of 20 Gwei per unit of gas, that comes to (90 * 20 Gwei) = 1,800 Gwei = 0.0000018 ETH.
Also, the contract reverts back to its original state and the transaction is included in the blockchain.
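The arithmetic of this scenario can be replayed in a few lines of Python. This is a toy model, not how a real Ethereum client meters gas: the step costs and the 20 Gwei gas price are the assumed numbers from the example, and the function name is made up.

```python
GWEI_PER_ETH = 10**9

# Hypothetical operations and costs, taken from the example above.
STEPS = [("store number", 45), ("add number to itself", 10), ("store result", 45)]


def run_with_gas_limit(steps, gas_limit, gas_price_gwei):
    """Return (succeeded, gas_charged, fee_in_gwei) for a list of (name, cost) steps."""
    gas_used = 0
    for _name, cost in steps:
        if gas_used + cost > gas_limit:
            # Out of gas: the state is rolled back, but the sender is
            # still charged for the whole limit.
            return False, gas_limit, gas_limit * gas_price_gwei
        gas_used += cost
    # Success: only the gas actually used is charged, the rest is refunded.
    return True, gas_used, gas_used * gas_price_gwei


ok, gas, fee_gwei = run_with_gas_limit(STEPS, gas_limit=90, gas_price_gwei=20)
print(ok, gas, fee_gwei / GWEI_PER_ETH)   # False 90 1.8e-06
```

With a limit of 120 instead, the same call succeeds, charges 100 gas, and the remaining 20 gas goes back to the sender.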
Scenario #2: The Gas Limit is too high
Now, let’s look at the second scenario. What happens if the gas limit is too high?
Setting a high limit seems to make sense, right? Whatever gas is left over gets refunded to the sender. Of course, that sounds like the way to go. However, in reality, it doesn’t really work like that.
Miners are limited by the block gas limit, which is around 6,700,000 gas. A basic transaction (simple transfer of ETH) has at least a gas requirement of 21,000 gas. Miners can only include transactions which add up to be less than or equal to the block gas limit.
During contract execution, going back to an earlier state requires manually triggering an exception. For example, someone who wants to cancel a transaction has to double spend to stop it from going through. To go back to the original state of the contract, developers can use the “throw” function.
The thing with the “throw” function is that while it does take the contract’s state back to the previous one, it ends up eating up all the gas that the developer has set as the limit. So, in order to counter this, Ethereum is beefing up the “revert” function, which helps developers go back in state without eating up more gas than is required. The unused gas is refunded to the sender. Along with that, Ethereum is also bringing in the “returndata” opcodes, which enable contracts to return variable-sized values.
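The gas difference between the two ways of aborting fits in one small function. This is a simplified model in Python with invented names; it only captures the billing behaviour described above, not the EVM itself.

```python
def gas_charged(gas_limit: int, gas_used_so_far: int, how: str) -> int:
    """Gas billed to the sender when execution aborts.

    "throw"  : the old behaviour described above, the whole limit is consumed.
    "revert" : state is rolled back, but unused gas is refunded.
    """
    if how == "throw":
        return gas_limit
    if how == "revert":
        return gas_used_so_far
    raise ValueError(f"unknown abort type: {how}")


# Hypothetical numbers: a 100,000 gas limit, aborting after 30,000 gas of work.
print(gas_charged(100_000, 30_000, "throw"))    # 100000
print(gas_charged(100_000, 30_000, "revert"))   # 30000
```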
Bitwise Shifting
Constantinople also adds native bitwise shifting instructions to the Ethereum Virtual Machine. A bit shift is a bitwise operation which moves each digit in a number’s binary representation to the left or right a given number of times. “<<” is the logical left shift operator while “>>” is the logical right shift operator.
The way it works is actually pretty straightforward. We are going to convert all the numbers into binary format and use 8-bit notation.
So, 5 will be written like this: 00000101
If we want to right shift this by 1, then it will look like this:
00000101 >> 1
This will push the digits to the right side by one. The last digit, which in this case is “1” gets completely removed and lost and a “0” will always get appended in the beginning.
Hence, 00000101 >> 1 will become 00000010 or 2.
Similarly, if we left shift 5 by 1, we will get:
00000101 << 1
Conversely, here the “0” at the beginning will get pushed out and a “0” will then get appended at the end. So, 00000101 << 1 will give us 00001010 or 10.
Ok, now think about this for a second.
5<<1 becomes 10 and 5>>1 becomes 2.
Did you notice something here?
On left shift, our number got multiplied by 2, while on right shift, it got divided by 2 (dropping the remainder). In fact, whenever you left shift a binary number N times, you multiply that number by 2^N. On the other hand, if you right shift a number N times, you divide the number by 2^N and the quotient is the answer.
Do you know why this is so useful? It is extremely resource efficient. With just one move, you are not only executing the exponent operation but doing the multiplication or division as well. This will help in efficient gas usage in your contracts.
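You can check the shift examples yourself in Python, which uses the same `<<` and `>>` operators. The `to_bits` helper is just a convenience written for this sketch to print numbers in 8-bit notation.

```python
def to_bits(n: int, width: int = 8) -> str:
    """Format n as a fixed-width binary string, e.g. 5 -> 00000101."""
    return format(n & (2**width - 1), f"0{width}b")


n = 5
print(to_bits(n), "=", n)             # 00000101 = 5
print(to_bits(n >> 1), "=", n >> 1)   # 00000010 = 2
print(to_bits(n << 1), "=", n << 1)   # 00001010 = 10

# Left shift by k multiplies by 2**k; right shift by k floor-divides by 2**k.
for k in range(4):
    assert n << k == n * 2**k
    assert n >> k == n // 2**k
```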
Optimizing Large-Scale Code
Currently, we have an opcode called EXTCODECOPY. Do you know why we use it? Think about this for a second.
Many contracts need to perform checks on a contract’s bytecode, but do not necessarily need the bytecode itself. For example, a smart contract may need to check whether another contract’s bytecode belongs to a particular group, or it may perform certain analyses on code and whitelist any contract with matching bytecode if the analysis passes.
EXTCODECOPY will help you do exactly that, but it is extremely expensive for large contracts. This is why Ethereum is bringing in the EXTCODEHASH opcode. Instead of returning the complete bytecode, EXTCODEHASH will return the keccak256 hash of a contract’s code. The thought process behind this is to allow a short fingerprint of the contract code to be checked rather than the entirety of the code itself. Once again, this is an economical solution for programmers.
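The whitelist idea can be sketched off-chain in Python. One loud caveat: the standard library has no Keccak-256, so this sketch uses SHA3-256 as a stand-in (the two differ in padding, so the hashes below will not match real on-chain code hashes). The bytecode and function names are placeholders invented for the example.

```python
import hashlib


def code_hash(bytecode: bytes) -> str:
    # Stand-in: SHA3-256 from the standard library. Ethereum uses Keccak-256,
    # which pads differently, so real on-chain hashes will not match these.
    return hashlib.sha3_256(bytecode).hexdigest()


# Placeholder for a large contract: 5,000 bytes of made-up bytecode.
approved_code = bytes.fromhex("6080604052") * 1000

# The whitelist holds only short hashes, not full copies of the bytecode.
WHITELIST = {code_hash(approved_code)}


def is_whitelisted(bytecode: bytes) -> bool:
    return code_hash(bytecode) in WHITELIST


print(is_whitelisted(approved_code))             # True
print(is_whitelisted(approved_code + b"\x00"))   # False
```

The checker only ever stores and compares a 32-byte value per contract, no matter how large the contract is.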
Net Gas Metering
Ethereum is also changing how the existing SSTORE opcode is charged, which is known as “net gas metering”. Not only will this reduce excessive gas costs where they don’t matter, it will also charge users for holding data that is permanently stored on the blockchain.
What is the main advantage of this?
It will prevent blockchain bloat for Ethereum.
Ethereum is trying to bring in privacy via zk-SNARKs.
zk-SNARK stands for Zero-Knowledge Succinct Non-Interactive Argument of Knowledge. So, we guess the first question you should ask yourself is….
What are Zero-Knowledge Proofs?
Zero-knowledge proofs came about in the 1980s thanks to the work of researchers Shafi Goldwasser, Silvio Micali and Charles Rackoff. In any proof system, there are two parties, the prover and the verifier. Prior to their work, it was always assumed that the prover would be the malicious one rather than the verifier. These researchers flipped this idea around and questioned the morality of the verifier instead of the prover. They asked two simple questions:
- How can anyone know for sure that the verifier is not going to leak any information?
- How much about the prover should the verifier actually know?
The latter actually has some very severe real-life implications. Take the example of password protection.
Imagine you are trying to log in to a website using a password. The standard protocol is that the client, i.e. you, types in the password and sends it to the server. The server then hashes your password and compares it with the hash in their system. If it is a match, then the login succeeds.
This system works smoothly, save for one problem.
The server sees the plain, original version of your password when you log in, which compromises your privacy. If the server gets attacked in any way, then the attacker could get hold of your password. This is why innovations like ZKP are essential. Using ZKP, the client would only need to reveal as much information as necessary to the server in order to get in.
That, in essence, is the idea behind ZKPs.
A ZKP has the following three properties:
- Completeness: If the statement is true then an honest verifier can be convinced of it by an honest prover.
- Soundness: If the prover is dishonest, they can’t convince the verifier of the soundness of the statement by lying.
- Zero-Knowledge: If the statement is true, the verifier learns nothing beyond the fact that the statement is true.
Let’s do a thought experiment and see how ZKP works.
The Billiard Balls
So we have a prover and a verifier, but the verifier is color-blind. The prover has two billiard balls, red and green. Now, color-blind people can’t tell the difference between the two colors.
So, this is the situation right now. The verifier believes that both the balls are of the same color, while the prover wants to prove that the colors are different. How are we going to do this?
The verifier takes both the balls and hides them behind his back. Now, he can either switch the balls in his hands or keep them as is. After he is done switching the balls (or not), he presents them to the prover. The prover can obviously see the actual color of the balls and will know instantly whether the switch has been made or not.
The verifier can then repeat this experiment as many times as he wants before he is satisfied with the fact that the prover wasn’t lying about the color of the balls.
Let’s look at the three properties of the ZKP in the experiment given above:
- Completeness: Since the statement was true, the honest prover convinced the honest verifier.
- Soundness: If the prover was dishonest, they couldn’t have fooled the verifier because the test was done multiple times.
- Zero-Knowledge: The verifier never learned which ball is red and which is green; he only learned that the two balls are different.
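The soundness point is easy to see in a quick simulation. The sketch below is an illustrative Python model of the experiment (the function name and the number of rounds are choices made here, not part of any protocol): an honest prover always notices a swap, while a prover who cannot actually tell the balls apart has to guess.

```python
import random


def run_protocol(prover_can_see_colors: bool, rounds: int, rng: random.Random) -> bool:
    """Return True if the prover answers every round correctly."""
    for _ in range(rounds):
        swapped = rng.choice([True, False])      # verifier's secret choice
        if prover_can_see_colors:
            answer = swapped                     # honest prover sees the swap
        else:
            answer = rng.choice([True, False])   # cheating prover must guess
        if answer != swapped:
            return False                         # caught: verifier rejects
    return True


rng = random.Random(0)
print(run_protocol(True, rounds=20, rng=rng))    # True: honest prover always passes

# A guessing prover survives n rounds with probability (1/2)**n.
trials = 10_000
passed = sum(run_protocol(False, rounds=5, rng=rng) for _ in range(trials))
print(passed / trials)                           # should land near 1/32
```

Every extra round halves a cheater's chance of slipping through, which is why the verifier can repeat the test until he is satisfied.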
Zk-SNARKs: An Introduction
So, now that you know what zero-knowledge proofs are and how they can be used in cryptography, let’s introduce ourselves to zk-SNARKs aka “Zero-Knowledge Succinct Non-Interactive Argument of Knowledge” aka a fine example of why developers shouldn’t be allowed to name anything.
Its use in blockchain technology is immense. Let’s check this with an example. Suppose Alice and Bob enter a smart contract agreement, where Bob has to do certain tasks in a row and in proper order in order to receive the payment from Alice. Obviously, Alice wants to be shown proof that Bob is following each and every step sequentially, however, what if Bob works for a security company and is dealing with highly classified data?
Using zk-SNARKs, Bob can show just enough proof to Alice as to what he is doing, without explicitly showing her the actual parts. That way, Bob can be credible without giving away any confidential information.
Functionality of Zk-SNARKS
A typical zk-SNARK consists of three algorithms: G, P, and V.
- G: A key generator which takes an input “lambda” and a program C. It generates two public keys, a proving key (pk) and a verification key (vk). The value “lambda” should be confidential and not revealed under any circumstances whatsoever.
- P: The prover takes three items as input: pk, a public input x, and the private witness w, i.e. the secret that they want to prove knowledge of without revealing the details. The P algorithm generates a proof prf such that: prf = P(pk, x, w)
- V: This is the verifier algorithm which checks the prover’s claim. It takes in three inputs, vk, x, and prf, and returns the value “TRUE” or “FALSE”: V(vk, x, prf).
Christian Lundkvist shows how the algorithms will work in unison in an example function. Consider the following:
function C(x, w) {
  return ( sha256(w) == x );
}
The function takes in two values x and w. It returns TRUE if the SHA-256 hash of w is the same as x otherwise it returns FALSE.
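For readers who want to run it, here is the same program C written in Python. The choice to pass x as a hex string and w as bytes, and the sample secret, are assumptions made for this sketch. Note that this is only the program being proven, not a zk-SNARK: calling it directly requires knowing w.

```python
import hashlib


def C(x: str, w: bytes) -> bool:
    """Return True if the SHA-256 hash of w equals x (given as a hex string)."""
    return hashlib.sha256(w).hexdigest() == x


secret = b"hunter2"                             # hypothetical witness w
public_hash = hashlib.sha256(secret).hexdigest()

print(C(public_hash, secret))                   # True
print(C(public_hash, b"wrong guess"))           # False
```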
So, how does a zk-SNARK work with the above program?
Step #1: Key Generation
The verifier will have to first generate the proving and verifying key using the generator G. The first thing they will have to do is to generate lambda. As mentioned above, lambda is a secret value so they need to be extra careful while doing so. Generation will look like this:
G(C, lambda) = (pk , vk).
Now that the keys are generated, we go to the next stage.
Step #2: Proof Generation
Alright, so the two keys have been generated. The Prover must now generate a proof to validate her statement. She is going to use the proving algorithm P. Refer to the example function given above. Her secret input is “w” and its SHA-256 hash is the public value “x”.
prf = P( pk, x, w).
Step #3: Verification
With the proof now generated, it is up to the Verifier to verify its validity. He does it through the verification function V(vk, x, prf), which returns TRUE if the proof is valid and FALSE otherwise.
That’s how zk-SNARKs work, and this is how they will help Ethereum gain privacy.
Before we get into what account abstraction means, let’s understand what abstraction means. Abstraction means that anyone can use any system or protocol without completely knowing the ins and outs and all the technical details.
For example, when you use your TV, you don’t need to be an engineer to operate it. You simply press a button on your remote to turn the TV on. You don’t need to know how pressing the button activates the circuit inside the TV. Abstraction makes a complex technology accessible to the masses by hiding the complexities.
Abstraction is what Ethereum plans to achieve in the future. In a hypothetical decentralized future, its developers envision everyone using DAPPs without even realizing that they are using a DAPP based on Ethereum. They basically want Ethereum to “disappear” into the background. Ethereum is taking a major step towards doing just that by introducing “Account Abstraction”.
As of right now, Ethereum uses the Proof-of-Work (POW) consensus mechanism, the same kind of mechanism that Bitcoin uses, although the underlying hashing algorithms differ.
What is Proof-of-Work?
This is how POW works:
- The miners try to solve cryptographic puzzles to add a block to the blockchain.
- The process requires a lot of effort and computational power.
- The miners then present their block to the network.
- The network then checks the authenticity of the block by simply checking the hash. If it is correct, the block gets appended to the blockchain.
- So, discovering the required nonce and hash should be difficult, however checking whether it is valid or not should be simple. That is the essence of proof-of-work.
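The asymmetry between finding a nonce and checking it can be shown with a toy miner. This is a minimal sketch, assuming SHA-256 and a simple "leading zero bits" target. Ethereum's actual mining algorithm is different, and the names `mine`, `verify`, and `difficulty_bits` are invented for the example.

```python
import hashlib
from typing import Tuple


def block_hash(block_data: bytes, nonce: int) -> bytes:
    """Hash the block contents together with a candidate nonce."""
    return hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()


def verify(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap check: one hash, one comparison against the target."""
    target = 1 << (256 - difficulty_bits)
    return int.from_bytes(block_hash(block_data, nonce), "big") < target


def mine(block_data: bytes, difficulty_bits: int) -> Tuple[int, str]:
    """Expensive search: try nonces one by one until the hash is below the target."""
    nonce = 0
    while not verify(block_data, nonce, difficulty_bits):
        nonce += 1
    return nonce, block_hash(block_data, nonce).hex()


# 16 leading zero bits takes about 65,000 attempts on average.
nonce, digest = mine(b"example block", 16)
print("nonce found:", nonce)
print("hash:", digest)
print("valid:", verify(b"example block", nonce, 16))
```

Raising `difficulty_bits` by one doubles the expected work for the miner, while the cost of `verify` stays the same.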
By solving these problems, miners get to add a block to the Ethereum blockchain and get a block reward in return, which as of now is 3 ETH.
What is Difficulty?
So far, you know that miners use their mining power to mine blocks. Even though there is no upper cap on the total amount of Ether, the amount of Ether entering the system still needs to be regulated to make sure that the supply-demand equation doesn’t go out of balance.
This is why, to keep the block creation time consistent, the developers have hardcoded a “difficulty” parameter into the system, which makes block creation more or less difficult. If blocks are being created too quickly, the difficulty goes up, and if they are being created too slowly, it goes down.
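The feedback loop can be sketched as follows. This is a simplified illustration and not Ethereum's actual difficulty formula: the 15-second target, the 5% step, and the function name `adjust_difficulty` are all assumptions made for the example.

```python
TARGET_BLOCK_TIME = 15   # seconds; an assumed target for this sketch
ADJUSTMENT_DIVISOR = 20  # each step moves the difficulty by 1/20 (5%); assumed


def adjust_difficulty(parent_difficulty: int, block_time: int) -> int:
    """Raise the difficulty after a fast block, lower it after a slow one."""
    step = parent_difficulty // ADJUSTMENT_DIVISOR
    if block_time < TARGET_BLOCK_TIME:
        return parent_difficulty + step
    if block_time > TARGET_BLOCK_TIME:
        return max(1, parent_difficulty - step)
    return parent_difficulty


difficulty = 1000
for observed_time in (9, 10, 22, 15):
    difficulty = adjust_difficulty(difficulty, observed_time)
    print(f"block took {observed_time}s -> difficulty {difficulty}")
```

Two fast blocks push the difficulty up, the slow block pulls it back down, and a block that arrives exactly on target leaves it unchanged.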
The following chart shows the difficulty in Ethereum’s network from December 11, 2018, to January 11, 2019.
Current Ethereum Difficulty stands at 2,691,878,354,888,980.00000000
Problems with POW Mining
POW, of course, is not without its flaws. Let’s look at these problems from the context of Bitcoin because it is pretty similar to Ethereum’s.
#1 Energy Wastage
The biggest problem of Proof-of-work is the energy wastage. This is how much energy Bitcoin has consumed over the last few months. As you can see, Bitcoin has a voracious appetite:
In December 2017, research showed that Bitcoin consumes more energy than 159 individual countries. That is a pretty crazy statistic!
#2 Centralization
As we have already told you, Bitcoin uses ASICs for mining. The problem with that is ASICs are expensive, and pools with more money tend to have more ASICs and, consequently, more mining power. In fact, check out the hashrate distribution chart for Bitcoin via Blockchain.info.
As you can see, three pools (BTC.com, AntPool, and SlushPool) alone own more than 50% of the network’s hashrate, aka mining power. This completely defeats the purpose of decentralization. Plus, there is one more big issue: individual users simply can’t compete with big money pools.
This is even worse in Ethereum’s case:
As you can see, 4 pools account for 75% of the network hashrate.
A solution was needed to achieve more decentralization. This is why Ethereum looked at Proof of Stake as a possible solution.
What is Proof of Stake (POS)?
The main draw of POS is that it makes mining a purely virtual process. There are many different kinds of proof-of-stake implementations; however, the general idea goes like this:
- POS uses “validators” instead of miners.
- Validators lock up a certain amount of Ether as stake.
- Once they find a block which they think can be added to the chain, they place bets on it.
- If the block gets appended, they receive a reward which is proportionate to their bets.
Since no hardware or extensive computation is needed, the process is not wasteful. However, having said that, general POS suffers from one major issue. It is called the “Nothing at Stake Problem.”
Consider this scenario.
Suppose you are watching a football match between two equally matched teams. Suppose the betting price for both teams is similar (say $5 apiece) and the profit that you get from backing the winning team is pretty high (say $20). It is in your interest to bet on both teams and pocket whatever profit you make.
Let’s use the same logic here. What is there to stop a malicious validator from betting on multiple blocks, hoping to cash in on one of them? In the worst case scenario, a bunch of malicious validators can bet on multiple blocks and cause the entire system to hard fork. Something needed to be done in order to stop these kinds of attacks.
Enter the Casper Protocol
What is the Casper Protocol?
Casper is the name of the POS protocol that will be used by Ethereum. Casper uses the same principles that most other POS protocols use, with one major difference. Casper has introduced a punishment mechanism to stop malicious validators from taking advantage of the system. In Casper, if a validator attempts to act in a malicious manner, their stake gets completely slashed off. Now, remember, this stake that they have to put up is pretty significant, so it makes no sense economically for validators to act in a malicious manner.
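A toy payoff comparison shows why slashing changes the incentives. This is a deliberately simplified model and not Casper's actual reward or penalty rules: the stake of 32, the reward of 1, the 50% chance of an honest bet landing on the winning block, and the assumption that betting on both blocks is always detected are all made up for illustration.

```python
def payoff(stake: float, reward: float, bets_on_both: bool, slashing: bool) -> float:
    """Expected net gain when exactly one of two competing blocks is appended."""
    if bets_on_both:
        if slashing:
            return -stake  # assumed: the double bet is detected and the stake is slashed
        return reward      # one of the two bets always pays, at no extra cost
    # An honest validator bets on a single block; assume a 50% chance it wins.
    return 0.5 * reward


STAKE, REWARD = 32.0, 1.0  # hypothetical values

print("no slashing, honest:    ", payoff(STAKE, REWARD, False, False))
print("no slashing, bets both: ", payoff(STAKE, REWARD, True, False))
print("slashing, honest:       ", payoff(STAKE, REWARD, False, True))
print("slashing, bets both:    ", payoff(STAKE, REWARD, True, True))
```

Without slashing, betting on both blocks strictly beats honest behavior. With slashing, it becomes the worst option available.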
It must also be noted that Casper is not just one project. It is a combination of two different projects that are being undertaken currently by the Ethereum dev team.
- Casper the Friendly Finality Gadget (FFG)
- Casper the Friendly GHOST: Correct-by-Construction (CBC)
Casper FFG is also known as “Vitalik’s Casper” because the main man behind it is Ethereum co-founder Vitalik Buterin. Casper FFG is not a full-blown POS protocol, but rather a hybrid POW/POS one. Casper has a proof-of-stake layer running on top of the normal POW protocol and the way it is going to work is like this:
- New blocks are still mined using POW.
- Every 50th block is finalized by POS. Finality means that once a particular operation has been done, it will forever be etched in history and nothing can revert that operation.
The second project is called Casper CBC aka “Vlad’s Casper” because the person behind it is Vlad Zamfir, the poster child of Casper. To understand how Correct-by-Construction works, let’s compare a normal protocol to a CBC protocol.
A normal protocol:
- Formally specify the protocol
- Define the properties that the protocol must satisfy
- Prove that the protocol fulfills the properties beyond any reasonable doubt
A CBC protocol:
- Formally but partially specify the protocol
- Define the properties that the protocol must satisfy
- Derive the protocol in a way that it satisfies all the properties that it was stated to satisfy
So, what are the advantages of the Casper protocol over the traditional POW that is used by Bitcoin?
- Malicious validators don’t have any incentive to game the system thanks to the slashing protocol.
- POS is far less wasteful and uses fewer resources than POW.
- POW mining was getting increasingly centralized.
- Doing a 51% attack on POS will cost much more than POW.
- Since in POS, validators need to actually stake their own money in the system, it is in their best interest to always work for the best of the system (i.e. Ethereum) to make sure that the value of their investment rises.
- POS system is far more likely to scale as opposed to POW.
The Difficulty Bomb
In order to make this transition happen, Ethereum introduced a difficulty bomb into its system on 7th September 2015, which is going to increase the difficulty exponentially. This will make mining extremely difficult and push Ethereum’s POW into the “ice age”. Miners will basically have no incentive to mine on Ethereum anymore.
This all sounds good on paper, but the Casper Protocol is not ready yet for full implementation and there is still some work left to be done. This is why, instead of just delaying Constantinople as a whole, they have delayed the difficulty bomb by 29 million seconds or approx 12 months.
Now, the problem here is that since this delay will make the mining process simpler, something needed to be done to keep Ether supply under control. So, EIP 1234, authored by Afri Schoedon, is going to reduce the block reward from 3 ETH to 2 ETH. This reduction can be thought of as a “stop-gap” to the “supply bleed”.
So, how is this reduction going to affect the overall economics? Let’s check it out.
According to the average data from etherscan over the past year, the current supply of ether increases by 20,300 ETH/day. To break this down:
5,900 reg blocks * 3 ETH/block = 17,700 new ETH/day from reg. blocks
1,090 uncle blocks * 2.42 ETH/block = 2,600 new ETH/day from uncles
17,700 + 2,600 = 20,300 total new ETH/day in rewards
20,300 * 365 = 7,400,000 total new ETH/year in rewards
So the reduction of the reward from 3 ETH/block to 2 ETH/block means that each block will produce 33% fewer ETH, decreasing the uncle block rewards from ~2.42 ETH/block to ~1.63 ETH/block. Assuming the number of blocks remains fairly consistent, we can determine:
(Daily Rewards with 3 ETH) * (0.66) = Daily Rewards with 2 ETH
(20,300 ETH/day) * (0.66) = Daily Rewards with 2 ETH
13,400 ETH/day = total new ETH/day in rewards
13,400 * 365 = 4,890,000 total new ETH/year in rewards
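The arithmetic above can be reproduced in a few lines. The sketch uses the article's own rounded inputs. Note that 0.66 is a rounding of the exact ratio 2/3, so the exact figure comes out slightly higher than the rounded one.

```python
# Inputs taken from the figures quoted above.
REGULAR_BLOCKS_PER_DAY = 5_900
UNCLE_BLOCKS_PER_DAY = 1_090
UNCLE_REWARD_AT_3_ETH = 2.42     # average ETH per uncle block
DAILY_ISSUANCE_AT_3_ETH = 20_300  # rounded daily total used in the article

regular_per_day = REGULAR_BLOCKS_PER_DAY * 3
uncle_per_day = UNCLE_BLOCKS_PER_DAY * UNCLE_REWARD_AT_3_ETH

daily_rounded = DAILY_ISSUANCE_AT_3_ETH * 0.66   # the article's factor
daily_exact = DAILY_ISSUANCE_AT_3_ETH * 2 / 3    # exact ratio of 2 ETH to 3 ETH

print(f"regular blocks: {regular_per_day:,} ETH/day")
print(f"uncle blocks:   {uncle_per_day:,.0f} ETH/day")
print(f"yearly at 3 ETH: {DAILY_ISSUANCE_AT_3_ETH * 365:,} ETH")
print(f"daily at 2 ETH (x0.66): {daily_rounded:,.0f} ETH")
print(f"daily at 2 ETH (x2/3):  {daily_exact:,.0f} ETH")
print(f"yearly at 2 ETH (x0.66): {daily_rounded * 365:,.0f} ETH")
```

Either way, the yearly issuance lands a little under 5 million ETH, down from roughly 7.4 million.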
So, what does all this mean?
It means that after the Constantinople hard fork, total new ETH supply will reduce from 20,300 ETH/day to 13,400 ETH/day and from 7.4m ETH/year to 4.9m ETH/year.
Ok, so what does that do to Ethereum’s inflation rate?
The table below demonstrates rough inflation rates of ETH supply year-to-year, projected forward to 2020 and 2021 given the calculated issuance rate of 4,890,000 ETH/year (and assuming there are no future block reward adjustments).
According to the table, after Constantinople, Ether’s inflation rate will drop from 7.7% to 4.8%, a change of -2.9 percentage points.
The image below demonstrates the inflation changes alongside hard fork and protocol updates in the past and planned for the future. As issuance and supply eventually stabilize in parallel, the limit of ether will rely on the demand the market has for it.
This proposal has caused quite a controversy. Afri Schoedon has called it “the best proposal to stabilize issuance while simultaneously delaying the bomb.” However, there are other members of the Ethereum community who have opposed this EIP, arguing that the reduction of the difficulty will lead to more centralization.
Scalability has always been a priority for Ethereum. Let’s look into how they are planning to do so.
A state channel is a two-way communication channel between participants which enables them to conduct interactions, which would normally occur on the blockchain, off the blockchain. So, what this eventually does is drastically reduce the time taken for a transaction to go through.
Can you guess why?
When you create a state channel between two parties, they can engage in a (theoretically) unlimited number of microtransactions between themselves without interacting with the blockchain. Since everything is happening off-chain, the miners only need to put the final state of the channel into the blockchain.
So what are the requirements to do an off-chain state channel?
A section of the blockchain state is locked up via multi-signature or some sort of a smart contract and it is agreed upon by a set of participants. These participants interact with each other without submitting any data to the miners. As we have stated earlier, the final state is presented to the miners and added to the blockchain.
The state channels can be closed at a point which is predetermined by the participants via one of the following methods:
- Either the participants can agree beforehand to close the state channel after a given amount of time has lapsed.
- Or, it could be based on the total value of transactions done, e.g. close the channel after $1000 worth of transactions have taken place.
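The lifecycle described above (lock up state, update it off-chain, settle the final state on-chain) can be sketched with a small class. This is a toy model under loud assumptions: the `StateChannel` class and its fields are invented for the example, amounts are in cents, and the signatures that real channels rely on are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class StateChannel:
    balances: dict           # deposits locked when the channel is opened
    close_after_volume: int  # predetermined closing condition, in cents
    volume: int = 0          # total value moved so far
    updates: int = 0         # number of off-chain state updates
    is_open: bool = True

    def pay(self, sender: str, receiver: str, amount: int) -> None:
        """Update the shared state off-chain; nothing is sent to the miners."""
        if not self.is_open:
            raise RuntimeError("channel is closed")
        if self.balances[sender] < amount:
            raise ValueError("insufficient balance in channel")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.volume += amount
        self.updates += 1
        if self.volume >= self.close_after_volume:
            self.is_open = False  # closing condition reached

    def settle(self) -> dict:
        """Return the final state, the only result that goes on-chain."""
        self.is_open = False
        return dict(self.balances)


on_chain = ["open channel"]  # stand-in for the blockchain
channel = StateChannel({"car": 5000, "charger": 0}, close_after_volume=3919)
while channel.is_open:
    channel.pay("car", "charger", 1)  # one cent per micro-payment
on_chain.append(("close channel", channel.settle()))

print("off-chain updates:", channel.updates)
print("on-chain entries: ", len(on_chain))
print("final state:      ", on_chain[-1][1])
```

Thousands of micro-payments happen inside the channel, while the chain only records the opening and the final state.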
Let’s understand the concept of state channels using a real-life (sorta) analogy. This example and the following diagram were presented by Stephen Tual.
Look at the image above.
So, you have an electric car which has opened up a state channel with an electric charger to do $39.19 worth of total transactions. When this series of transactions is done, the final state is added to the blockchain. Now, can you imagine how much time it would have taken if they had to run every single transaction through the blockchain?
Important Features of State Channels
State channels are useful in many applications, where they are a strict improvement over doing operations on-chain. Having said that, they are not without their limitations. Let’s look at some of the features and limitations of state channels.
- State Channels are reliant on availability. So, if Alice is taking part in a state channel and she loses her internet connection at that very moment, she might not be able to uphold her role in the channel. Having said that, it could be possible for her to pay someone to maintain availability on her behalf.
- State channels are the most useful when participants are going to be exchanging many state updates over a long period of time.
- State channels are best used for applications with a defined set of participants.
- State channels have extremely high-quality privacy properties because everything is happening “inside” a channel between participants, rather than broadcast publicly and recorded on-chain. Only the opening and closing transactions must be public.
- State channels have instant finality, meaning that as soon as both parties sign a state update, it can be considered final. Both parties have a very high guarantee that, if necessary, they can “enforce” that state on-chain. We will cover privacy in more detail in a bit.
That is the main use case of a state channel: it will immensely help in blockchain scalability. In fact, Bitcoin’s lightning network is essentially a fancy state channel which deals only with payments.
Vitalik Buterin himself is working on implementing state channels via EIP 1014. This EIP will allow interactions to be made with addresses which do not yet exist on the blockchain but can be relied upon to “only possibly eventually contain code that has been created by a particular piece of init code.”
State channel developer Liam Horne has described EIP 1014 as “a significant performance increase in state channels.”
One of the most exciting ways that Ethereum is looking to implement scalability is “sharding.” In sharding, you break down a huge database into smaller and more manageable chunks called “shards.” The way blockchains are designed right now, all the nodes are forced to work on the same issue at the same time, which is a highly inefficient mechanism. However, via shard implementation, the nodes can be distributed among various shards and you can parallelize the tasks.
Sharding will make processing faster by splitting a state into different shards. However, if we are using POW, the smaller shards will be in danger of being taken over by malicious miners because of their low hashrate. In fact, this is the biggest reason why POW blockchains can never implement sharding: any and all small shards can be easily taken over.
This is why it is important for Ethereum to implement POS before they can execute sharding.
Sharding in the Context of Blockchain
Consider the state of the Ethereum blockchain, which we shall call the “Global State” and which is visible to everyone. Let’s consider the Merkle root of this global state. This state root is going to be broken up into shard roots, and each of these shard roots is going to have its own state. These states are going to be represented in the form of a Merkle tree.
Now, let’s get into the internal mechanics.
So what happens after sharding is activated?
- The state is split into shards
- Each unique account is in one shard
- Accounts can only transact with other accounts in the same shard
At Devcon, Vitalik Buterin explained shards like this:
“Imagine that Ethereum has been split into thousands of islands. Each island can do its own thing. Each of the islands has its own unique features and everyone belonging on that island i.e. the accounts, can interact with each other AND they can freely indulge in all its features. If they want to contact with other islands, they will have to use some sort of protocol.”
The Two Levels of Shard Interaction
Upon the implementation of Shards, the Ethereum blocks will be broken into two levels of interaction, as opposed to one single level of interaction.
The First Level
The first level is the transaction group. Each shard has its own group of transactions.
The transaction group is divided into the transaction group header and the transaction group body.
Transaction Group Header
The header is divided into distinct left and right parts.
The Left Part:
- Shard ID: The ID of the shard that the transaction group belongs to (shard 43 in this example).
- Pre-state root: This is the state root of shard 43 before the transactions were applied.
- Post-state root: This is the state root of shard 43 after the transactions are applied.
- Receipt root: The receipt root after all the transactions in shard 43 are applied.
The Right Part:
The right part lists the validators who need to verify the transactions in the shard. They are all randomly chosen.
Transaction Group Body
It has all the transaction IDs in the shard itself.
Properties of Level One:
- Every transaction specifies the ID of the shard it belongs to.
- A transaction belonging to a particular shard shows that it has occurred between two accounts which are native to that particular shard.
- Transaction group has transactions which belong to only that shard ID and are unique to it.
- Specifies the pre and post state root.
The Second Level
There is the normal blockchain, but now it contains two primary roots:
- The state root
- The transaction group root
The state root represents the entire state, and as we have seen before, the state is broken down into shards, which contain their own substates.
The transaction group root contains all the transaction groups inside that particular block.
Properties Of Level Two:
Level two is like a simple blockchain, which accepts transaction groups rather than transactions.
A transaction group is valid only if:
- Pre-state root matches the shard root in the global state.
- The signatures in the transaction group are all validated.
If the transaction group gets in, then the root of that particular shard ID in the global state becomes the group’s post-state root.
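The two validity rules and the resulting state update can be sketched like this. It is only an illustration of the logic described above: the `TransactionGroup` class, the `apply_group` function, and the string roots are invented for the example, and signature checking is reduced to a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class TransactionGroup:
    shard_id: int
    pre_state_root: str
    post_state_root: str
    signatures_valid: bool  # stand-in for verifying the validators' signatures


def apply_group(global_state: dict, group: TransactionGroup) -> bool:
    """Accept the group only if both validity conditions hold, then update the shard root."""
    if global_state.get(group.shard_id) != group.pre_state_root:
        return False  # pre-state root does not match the shard root in the global state
    if not group.signatures_valid:
        return False  # signatures in the transaction group failed validation
    global_state[group.shard_id] = group.post_state_root
    return True


# Hypothetical roots for shards 42 and 43.
global_state = {42: "root_x", 43: "root_a"}
group = TransactionGroup(43, "root_a", "root_b", signatures_valid=True)

print(apply_group(global_state, group))  # accepted: shard 43 moves to root_b
print(apply_group(global_state, group))  # rejected: pre-state root is now stale
print(global_state)
```

Replaying the same group fails because its pre-state root no longer matches, and other shards are left untouched.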
What are the challenges of implementing sharding?
- There needs to be a mechanism to know which node implements which shard. This needs to be done in a secure and efficient way to ensure parallelization and security.
- Proof of stake needs to be implemented first to make sharding easier according to Vlad Zamfir.
- The nodes work on a trustless system, meaning node A doesn’t trust node B and they should both come to a consensus regardless of that trust. So, if one particular transaction is broken up into shards and distributed to node A and node B, node A will have to come up with some sort of proof mechanism that they have finished work on their part of the shard.
So there you have it. Those are all the changes that Ethereum Metropolis has brought in or is bringing in. The following few weeks are going to be extremely interesting and critical for Ethereum. Especially with the market behaving the way it is, it will be interesting to see how it reacts to these new changes. Regardless, on paper, Constantinople looks like a home run. However, the mining reward reduction seems extremely contentious. Let’s wait and watch how the community reacts to it in practice.