Bitcoin
$5,566.72
-7.16
Ethereum
$173.21
-1.47
Litecoin
$41.08
-1.16
DigitalCash
$131.45
-5.79
Monero
$88.15
-1.86
Nxt
$0.05
0
Ethereum Classic
$7.21
-0.26
Dogecoin
$0.00
-0

Intro to Cryptography

Cryptocurrencies like Bitcoin and Ethereum use a peer-to-peer decentralized system to conduct transactions. Now, since the entire process happens online, what is going to protect it from malicious hackers and attackers? Something needed to be put in place to safeguard the integrity of the entire transaction. This is where cryptography comes in. Bitcoin and Ethereum use the following two forms of cryptography primarily.

  • Cryptographic hash functions
  • Asymmetric key cryptography

So, before we go deep into them, let’s understand what cryptography means and where it comes from.

#1 Cryptographic Hash Functions

Hashing or Hash function is a function which will give you a fixed length of output every time, no matter what length of data you give as an input to the function. The values returned by a hash function are called digests, hash values, hash codes, or simply hashes. The benefit of this fixed length output is obvious.

Suppose, you have a hash function that always produces 32-bit output. So instead of dealing with inputs of various lengths (say 8 bit or 128 bit), which can be huge, you only have to deal with 32-bit output every time. This makes computation and keeping tab easy.

Now do not think that this is the hashing blockchain uses. Though the Cryptographic hash is referred to as hash, it is not technically correct. The Cryptographic Hash is different rather it is a specialized case of the general hash function. For example, Cyclic Redundancy Check (CRC) is a hash function but not a cryptographic one.

The cryptographic hash function needs to have mainly five properties that make it a secure and therefore can be considered for cryptographic use.

The Properties of Cryptographic Hash Function

Property 1: Deterministic

What this means is for a particular hash function, the hash or result of a particular input message is always same, no matter how many times you run it through.

Property 2: Quick Computation

The hash function needs to procure output fast enough i.e. process quickly. This is definitely a desired property since it needs to process multiple messages or data, and this allows the system to be efficient.

Property 3: The Avalanche Effect

What happens to the hash if you change a letter of your message or a digit of your data? For a hash function to be a cryptographic hash this slight change should result in a huge change in the output hash.

What happens here is simple. We know that any message, text or numerical, is represented in bits (1 or 0). So, a single character or number change alter the bitstream altogether. The cryptographic hash function now gives an output that is distinctively different than before – a single bit change reflects in more than half the output bits getting flipped. This is also mathematically called the Butterfly Effect.

Check the below image for example:

Property 4: Preimage Resistance

Let us first state what Preimage Resistance is, then we will break it down for you further.

If you know the hash value H(M) it is infeasible to determine the input M. This is what we mentioned earlier make the cryptographic hash function ‘one-way’. Functions that don’t have this property are open to First PreImage Attacks.

On a broader perspective, we should also mention that if H(M1) = H(M2) i.e. the hash value of two different message is the same (will discuss this next), it should be infeasible to find out M2 though you know M1. Lacking this property, such hash functions are vulnerable to the Second Preimage Attacks.

Now we would like to draw your attention to the word ‘infeasible’, not impossible but simply means very difficult. So why do we say it is not impossible?

Well, take the example of rolling a die. For each of the 6 outcomes, we have 6 different hash values. Though in general there is no way to find out the actual outcome from the hashes, if we find out or create a table of hashes for each number (1 – 6), we can consult the table and find out for sure the actual input.

Now, what if this number of input increases say a 32 bit or 128 bit or 256-bit data stream? As you can easily guess, it gets exponentially difficult to create the table, called Rainbow Table, for so many input data. The only option you have now is, take a random input, hash it, and hope that it is the one you are looking for. This is called the “brute-force method.”

In the best case, you will select the input on your first attempt. If this is a 32-bit input data, then the probability of this happening is 1/(2^32) =  2.33 x 10^-10 = 0.000000000233%; now if this is a 128-bit data then chances of the best case scenario is 1/(2^128) = 2.99 x 10^-39% !! See our point, the chances get astronomically slimmer as the range of input increases.

For the worst case scenario, you will only find the hash you need at your last selection. So, for 32-bit data it means after 2^32 – 1 = 4.2949 x 10^9 attempts, and for 128-bit data your luck smiles after you have exhausted 2^128 – 1 = 3.403 x 10^38 options.

Even we consider the average scenario, you find what you need after (2^32)/2 = 2^31 = 2.147 x 10^9 attempts for 32-bit and (2^128)/2 = 2^127 = 1.7 x 10^38 attempts for 128-bit data.

To summarize, though academically not impossible to break the pre-image resistance by using use brute-force method, the number of attempts and therefore time (yes, even with huge computing power) needed makes someone cracking it nearly impossible.

Property 5: Collision Resistant

This is where we address the issue of second pre-image resistance as promised.

The cryptographic hash function demands that for any two given input M1 and M2, their hashes shouldn’t be equal i.e. H(M1) H(M2) or should be very difficult to find such pair, which is called cryptographic hash collision. Note here that a hash function having collision resistance ensures second pre-image resistance, but not the first.

Now since there is a chance of finding such pair, and hackers can do that with what is known as “the birthday attack.” So how do we avoid this?

Birthday Attack

The birthday attack exploits what is known in probability theory as birthday paradox or birthday problem. Let us explain as simply as possible.

The chance of two people in a group of 367 people sharing the same birthday is 100%. Astonishingly, the probability of two people sharing their birthdays in a group of 70 people is 99.9%. And the chance of 50% probability of same birthday occurring happens for just about 23 people!

According to the probability theory, for a 50% chance of collision to happen, you need √N (square root) number of items, where N is the total number of possibilities.

For birthday problem, out of 365 different possibilities, you need √365 i.e. about 23 people to have a 50% chance of having the same birthday.

So, let us check what this means in the context for Cryptographic Hash function.

For 128-bit hash, there are 2^128 possibilities, and as per our probability rule, the 50% chance of breaking the collision resistance happens at √(2^128) = 2^64-h instance.

Therefore it is evident that it is much much easier to break the collision resistance of a hash function than to break its preimage resistance. But since even the time needed to do that is so long, we can safely assume that if two hashes match they refer to the same input i.e. if H(M1)=H(M2), then M1=M2.

In order to achieve this collision resistance, hash functions use a paradigm called  Merkle–Damgård construction.

Merkle–Damgård Construction

The Merkle–Damgård construction or paradigm works on a very simple philosophy – if we have a collision-resistant hash function for short messages, then we can construct a collision-resistant hash function for long messages as well.

Keeping in mind the diagram above, take note of the following points:

  • A larger message M can be broken down into smaller blocks of messages, m[i].
  • The hash function H consists of lots of smaller hash functions “h”.
  • “h” is a smaller hash function which is called compression function. It takes in a small block of the message and returns a hash.
  • The first hash function “h” (circled in the diagram above) takes in the first block of the message (m[0]) and adds to it a fixed value IV and returns the hash.
  • The hash now gets added to the second block of the message and goes through another hash function h
  • This goes on till the last message block where a padding block PB is also added to the message.
  • The output of each compression function h is called a chaining variable.
  • The padding block (PB) is a string of 1s and 0s. In the SHA-256 algorithm, the PB is 64-bits long.
  • The output of the last (rightmost) hash compression function “h” is the output of the large message M.

We will now discuss how does this follow collision resistance.

Now, if “h” is collision-resistant, then “H” should be collision resistant as well.

In order to prove this theorem, we will go in the reverse direction, i.e. if we can prove that collision happens for H, then h also should have a collision.

If there are two messages M and M’, and we are having collisions on both of their hashes that means

H(M) = H(M’)

Therefore we can use it to find out the collision in h. For now, let us find out the chaining variables for both H(M) and H(M’) messages.

For H(M), IV= H(0), H(1),….., H(t), H(t+1) = H(M)                 : t is any arbritrary number

Now for H(M’),  IV= H’(0), H’(1),.., H’(r), H’(r+1) = H(M’)    : r is any arbritrary number which may  

                                                                                               or may not be equal to t

 

Though both hash functions may not have equal number of chaining variables, in both the cases – H(t+1) = H(M) And H’(r+1) = H(M’)

As we know that H(M) = H(M’), and since collision exists, therefore H’(r+1) = H(t+1).

Now let’s refer back to the above diagram again –

Now, let’s check if H(t+1) and H’(r+1) are the last hash values or digests of their respective hash functions then what would be the input message for the last compression function for each

For H(t+1) it would be h(H(t), M(t)||PB)

Similarly, for H’(r+1) that would be h(H’(r), M’(r)||PB’)

So, now the question is under what conditions the compression function “h” will be non-collision resistant. The answer is if and only if it gives the same output to inputs which are not similar i.e. –

  • H(t) H’(r)
  • M(t) M’(r)
  • PB PB’

Now if these outputs are same then we have to dig deeper.

If PB = PB’ then we know for sure that both have the same number of message blocks i.e. t = r which means:

M(t) = M’(r) And H(t) = H’(r)

In this case, since the value of t and r are the same, we can rewrite H(t) and H’(r) as H’(t)  

So let’s see what would be the value of H(t) and H’(t) –

H(t) = h(H(t-1), M(t-1)) and H’(t) = h(H’(t-1), M’(t-1))

Since H(t) = H’(t), therefore h(H(t-1), M(t-1)) = h(H’(t-1), M’(t-1))

For a collision to occur, the following conditions must be met:

H(t-1) H’(t-1)

M(t-1) M’(t-1)

Now, if these conditions are held true, then we know for sure that we have fianlly found a collision on ‘h’

Let’s see what happens if those conditions are true.

Here H(t-1) = H’(t-1), and since we already know that M(t) = M’(t),

Therefore M(t-1) = M’(t-1).

We can now continue the calculations for H(t-1) and H’(t-1) all the way back until we reach the very first input message. In that case any one of the following two conditions must hold:

  • Find collision for ‘h’.
  • All message blocks of M and M’ have to be the same.

Now, since we already know that the prime condition for a collision to happen is that the input message M and M’ cannot be the same, the only way the collision of H happens is if the compression function h also has a collision.

Therefore, it is proven that for every hash function H with a collision, we have a compression function h that has a collision, and thus the Merkle-Damgard holds true.

All these are the main five properties of cryptographic hash functions, but this last one is also has a very high value.

Property 6. Puzzle-friendly

This property states that if you take any random value (message or number) k from a distribution of high min-entropy, for every output Y, it is infeasible to find another message x so that H(k|x) = Y.

Let us explain what min-entropy is then we will elaborate on this property.

A distribution or set of message or data is said to have min-entropy if the distribution is so large that the probability of us picking up a certain message or number while choosing randomly can be considered very very low or negligible.

What this means is, the distribution of selecting a random number from a dice (1 -6) has low min-entropy. But the distribution of selecting a lottery number from say 6-digit string has high min-entropy.

Now, the I sign denotes concatenation. This simply means joining the two strings together, i.e. in this case when ‘k’ is concatenation with ‘x’ it becomes ‘xk’. A better example would be (Crypto | Currency) = CryptoCurrency.

So what the property means is a hash function will be called cryptographic if, for its any output Y,  it is infeasible to find a string x such that there is no string k when concatenated with x and then hashed will produce Y.

Length Extension Attack

As mentioned earlier to avoid collision resistance, Merkle–Damgård construction is used, but using it purely also exposes such hash functions to the Length Extension attack.

In simplest terms, if an attacker knows a message M2 and only the length of another message M1, then he or she can use the hash of M1 i.e. H(M1) to find out H(M1|M2) i.e. the hash value of M1 and M1 concatenated.

What the attacker can do now is he or she can alter the internal state of the message. This attacker now adds extra information (bit string) at the end of the message and create a valid hash.

Popular cryptographic hash functions like the SHA-256, SHA-512 or other functions like MD5, SHA-1, RIPEMD-160, Whirlpool all use the Merkle–Damgård paradigm and therefore are vulnerable to the Length Extension Attack.

However hash functions like SHA-3, truncated SHA-2, HMAC and BLAKE-2 are not at all vulnerable to the length extension attack, making them heavy security.

That is why the Merkle–Damgård construction are tweaked a bit to safeguard against such attacks.

Narrow Pipe to Wide Pipe Construction

In Merkle–Damgård construction’s straightforward usage, the size of the hash output is equal to the internal state size (between each compression step), resulting in a what is called narrow-pipe hash design. This invites for the discussed length-extension attacks, as well as multiple security issues such as multi-collisions, long message attacks, generate-and-paste attacks etc.

This is where the Wide Pipe construction of Merkle–Damgård comes in. In this case, it has a larger internal state size i.e. the bit-length that is internally used is larger than the output bit-length.

Apart from the tweak t of the Merkle–Damgård construction, the Wide Pipe construction also includes new constructions like the sponge construction and HAIFA construction.

Some Examples of Cryptographic Hash Functions

Before we delve in details about a few of the popular cryptographic hash functions currently in use, let us give you near-exhaustive examples of cryptographic hash functions –

All of these are unkeyed cryptographic hash examples –

  • BLAKE-256 – This is a 256 bits hash function based on HAIFA structure.
  • BLAKE-512 – This is a 512 bits hash function based on HAIFA structure.
  • BLAKE2s – This is a type of cryptographic hash function that goes up to 256 bits and it’s based on HAIFA structure.
  • BLAKE2b – This is a type of cryptographic hash function that goes up to 512 bits and it’s based on HAIFA structure. It is used by ZCash and faster than the recent SHA-3.
  • GOST – This is a 256 bits hash.
  • HAS-160 – This is a 160 bits hash.
  • HAVAL – This is a type of hash that has a range of 128 to 256 bits.
  • JH – This is a type of hash that has a range of 224 to 512 bits.
  • MD2 – This is a 128 bits hash.
  • MD4 – This is a 128 bits hash.
  • MD6 – This is a type of cryptographic hash function that goes up to 512 bits and it’s based on Merkle tree NLFSR.
  • MD5  –This is a 128 bits hash function based on Merkle–Damgård construction. It’s complex hence slower than MD-4.
  • RIPEMD – This is a 128 bits hash.
  • RIPEMD-128 – This is also a 128 bits hash.
  • RIPEMD-160 – This is a 160 bits hash. This is used by Bitcoin Script.
  • RIPEMD-320 – This is a 320 bits hash.
  • SWIFFT – This is a 512 bits hash.
  • Whirlpool – This is a 512 bits hash as well.
  • SHA-1 – This is a 160 bits hash function based on Merkle–Damgård construction.
  • SHA-224 – This is a 224 bits hash function also based on Merkle–Damgård construction.
  • SHA-256 – This is a 256 bits hash function based on Merkle–Damgård construction. Both Bitcoin and Bitcoin Script also uses this hash function.
  • SHA-384 – This is a 384 bits hash function based on Merkle–Damgård construction.
  • SHA-512 – This is a 512 bits hash function based on Merkle–Damgård construction.
  • SHA-3 (a.k.a Keccak) – This is a hash function based on Sponge function and has an arbitrary bits length. Ethereum uses this one.
  • Streebog – Another Merkle–Damgård construction-based cryptographic hash function with a range from 256 to 512 bits.
  • Tiger – This is a 192 bits hash function based on Merkle–Damgård construction.
  • Skein – This is a cryptographic hash function based on Unique Block Iteration and has an arbitrary bits length.
  • Spectral Hash – This is a 512 bits hash function based on Wide pipe Merkle–Damgård construction.

You can easily figure out after how many hashes or instances you can break collision resistance for each of them. For example, we would like to mention that though falls under the family of very popular hash functions, MD4, MD5, and SHA-1 are not recommended anymore as their collision resistances are already broken.

Hashing and Data Structures

A data structure is a specialized way of storing data. There are two data structure properties that are critical if you want to understand how a blockchain works. They are as follows:

  • Pointers.
  • Linked Lists.

Pointers

Variables are used in programming to store data. So, if we have something like this in programming:

int a = 1;

What this means is that we are declaring an integer variable a and are storing the value 1 in it.

Pointers are variables in programming which stores the address of another variable. This is why they are called pointers because they are literally pointing towards the location of other variables.

Also, a hash pointer is a variable that saves the address of another variable along with the hash of the data stored in the variable.

Linked Lists

You see the data structure above, that’s called a linked list. It is a sequence of blocks, each containing data which is linked to the next block via a pointer. The pointer variable, in this case, contains the address of the next node in it and hence the connection is made. The last node, as you can see, has a null pointer which means that it has no value.

There is one important thing to note here.

There is a pointer in each block which contains the address of the next block. That is how the pointing is achieved and the blocks are linked to one another.

So, you might be thinking that the structure looks pretty familiar right?

Turns out, that a blockchain is basically a linked list where each block is pointing to the next one with a hash pointer. Each block in the chain stores the hash of the data of the previous block.

This is how the blockchain gains its immutability.

  • Suppose a hacker tries to tamper with the data in the 3rd block of the blockchain
  • Because of the snowball property, it drastically changes the hash of the block which is stored in block 4
  • This goes on until the end of the chain until it freezes the whole chain
  • Since that is a theoretical impossibility, the blockchain is immutable.

Alright, so that’s what hashing does and how cryptographic hashing algorithm works. Now let’s move on to asymmetric key cryptography.

#2 Asymmetric Key Cryptography

Cryptography is the study and practice of techniques, grounded in advanced mathematical principles, which used for storing and transmitting information from one point to another. The data is hidden in such a way, that only the intended recipient can unlock and read the message.

In Crypto lexicon, the sender of the message is often known as Alice “A” which the intended receiver is Bob “B”, and the adversary/eavesdropper is Eve (E for eavesdropper).

Cryptography has been used for thousands of years by people to relay messages free form the influence of the enemy. In fact, the earliest known use of cryptography was seen in the tomb taken from the Old Kingdom in Egypt circa 1900 BCE.

In fact, let’s take a look at the Caesar cipher. Caesar cipher happens to be one of the earliest known ciphers. Even though mathematicians took nearly 800 years to break it, the operation is actually very simple. All that you needed to do was to shift each letter in your message forward by a given number. That number happens to be the key to unlock your code.  If you want to send this code to your friend, then you will need to send the key to them to unlock your message.

Suppose you want to send a message “ATTACK” to your friend along with a key. Let’s say the key is 3. So, to encrypt the message, you will need to shift each letter of your message forward by 3.

You will then send “DWWDFN” to your friend. Once, they get it, they will shift each alphabet of the code back by 3.

Cryptography historically just meant encrypting a piece of information in plaintext into unintelligible text which is the cyphertext. Cyphertext is basically an encrypted version of the original plaintext. In order to read the message, the intended recipient must be able to decrypt the ciphertext and revert it back to plaintext.

So, how exactly does this encryption and decryption happen?

To do this, we use something called a “cipher”. A cipher is an algorithm or a device which takes care of both encryption and decryption. While a classical cipher was strong enough to pass simple messages, the problem is that it is still vulnerable to simple brute force attacks. Brute force attacks simply means going through all the possible iterations before stumbling on the correct one to open up the code.

However, the cryptography game changed with the addition of technology. World War 1 saw the development of rotor cipher machines and World War 2 introduced computers to cryptography, taking it to a whole new level of complexity. Modern cryptography is an amalgamation of mathematics, computer science, electrical engineering, communication science, and quantum physics. In fact, computers have made it possible to have new forms of encryption of any kind of digital information, not only language texts.

So, with that being said, let’s get into the cryptography that is being used by blockchains and cryptocurrencies. Let’s begin with understanding what digital signatures mean.

What are Digital Signatures?

What is a signature and what does it do? What are the “properties” of a typical signature? When you put your signature on a piece of paper, what should it be able to do?

  • It should be able to provide proper verification. So, the signature should basically be able to verify that it is you who actually signed the paper.
  • A good signature should be impossible, or at least very hard to forge. This maintains the uniqueness of the signature
  •  The final property is non-repudiation. In simpler terms, it means that If you have signed something with your signature, then you should not be able to take it back or claim that someone else has done it instead of you.

Now, we know that this is what a signature SHOULD do, but having said that, the reality is that in the real world, no matter how intricate the signature, there are always chances of forgery, and you cannot really verify signatures using simple visual aids, it is very inefficient and non-reliable.

Cryptography gives us a very elegant solution to this predicament via “digital signatures”. In fact, it happens to be one of the most important cryptographical tools that are used in cryptocurrency.

So, what are keys? And how are they used in the blockchain?

There are two kinds of key cryptographies:

  • Symmetric cryptography
  • Asymmetric cryptography

Symmetric Cryptography

Symmetric cryptography is one of the earliest and simplest cryptographic methods known to man. If you were to break down the steps, then it will look kinda like this:

  • You want to send a message M to your friend Bob
  • You use a key to encrypt the message and turn it into ciphertext
  • Bob gets the ciphertext and then uses the same Key to decrypt the ciphertext and retrieves the original message M

There are two kinds of ciphers that we use in symmetric cryptography:

  • Block Ciphers.
  • Stream Ciphers.

Block Ciphers

Up first we have Block Ciphers. Block ciphers are a form of symmetric cryptography which uses a key of a fixed length to encrypt a block of fix length.

Let’s see this with a simple example. Consider the following table:

So, if you have received a ciphertext which says “EFBD”, you can simply look up the table and substitute those ciphers with plain text, which comes out to be “FACE”.

However, let’s go a little deeper. Check the tables:

Plain: A B C D E F
Cipher: F A B C D E

According to the cipher, it is basically the plain text which has shifted to the right by 1. So, in other words:

EFBD = FACE shifted by 1

This is the basics of the block cipher. Also, if you have the key, anyone can decipher the ciphertext from the plain text and vice versa.

 

Block ciphers are usually used to deal with humongous chunks of data.

One of the most important properties of the block cipher is that the moment the key changes, the output ciphertext will change drastically. Consider the following:

So, we have three keys which are generating 3 different ciphertexts.

  • In ciphertext 1 we are shifting to the right once.
  • In ciphertext 2 we are shifting to the right twice.
  • In ciphertext 3 we are shifting to the right thrice.

When we put the input i.e. “FACE” through all these ciphers, we get the following outputs respectively

  • EFBD
  • DEAC
  • CDFB

Do you see how drastic the output ciphertext is every single time you change the key? Imagine if the input was humongous instead of “FACE”, the ciphers would have been indistinguishable from one another.

There are two rules for a block cipher to be considered valid:

  • You must be able to derive the plaintext from the ciphertext and vice versa given a key.
  • The function must be efficiently computable.

The block sizes in block ciphers are fixed, so the input text has to be the same size as the block. If the input in bigger than the block, then it should be broken down to get the correct size. If, on the other hand, it is smaller, then some junk data is padded to the block to get it to correct size to fit the block.

Examples of Block Ciphers

Data Encryption Standard (DES)

Up first we have the Data Encryption Standard which was the government standard till 2001. t has the following properties:

  • Block sizes of 64 bits.
  • Key size of 56 bits.

Advanced Encryption Standard (AES)

Advanced Encryption Standard is extremely secure and widely used nowadays. It has the following properties:

  • 128-bit block size.
  • 128, 192 or 256-bit key size.

Stream Ciphers

Stream cipher uses a fixed key to replace the plaintext message with a pseudorandom string of characters. Each and every letter in plaintext is encrypted one letter at a time. Let’s check out three specific kinds of stream ciphers to give you an idea of how they work:

  • One-time pad with alphabets.
  • One-time pad with XOR gate.
  • Linear feedback shift register.

One-time pad with alphabets

This encryption requires a key which has the same number of characters as the input message. The idea is to use each character of the key to pad and encrypt each alphabet of the input message. This key can be used for one-time use only.

Let’s take an example.

Suppose we have a message which goes like this: “MEET ME OUTSIDE”.

We want to send this message to our friend Bob, however, we don’t want anyone else to intercept our message. You and Bob meet beforehand to decide on a one-time pad which will go like this:

“B D U F G H W E I U F G W”

The first thing that you will notice here is that the pad and the original message both have exactly the same number of character, i.e. 13.

Alright, so how does the process happen?

Each and every alphabet will be replaced by its numeric equivalent in during the process.

The numerical mapping goes like this:

During this entire process, we will be utilizing 6 pieces of data:

  • The original message or OM which in this case is “MEET ME OUTSIDE”.
  • The Numerical Original Message or NOM which is the numerical equivalent of the original message
  • The one-time pad or the OTP
  • The Numerical OTP or the NOTP
  • The NCT or the numerical ciphertext which is calculated like this: NOM+NOTP mod 26
  • CT i.e. the ciphertext which is the alphabet equivalent on the NCT

So, we need to send the message “MEET ME OUTSIDE” and we need to use the one-time pad to encrypt it.

Part 1: The Encryption

Look at the table below to understand how we will get the ciphertext,

First up, we will put the message “MEET ME OUTSIDE” in the OM row.

Up next, we will use the numerical mapping table to get the numerical equivalent of each alphabet.

Let’s see what that looks like:

Now that we have the numerical equivalent of each alphabet in the original message, we are going to put that in the row “NOM”.

Now, remember the OTP that we are going to use as key:  “B D U F G H W E I U F G W”.

Let’s find the numerical equivalent of each and put them in the NOTP row: “1, 3, 20, 5, 6, 7, 22, 4, 8, 20, 5, 6, 22”.

So, far it is pretty straightforward and uncomplicated, right? Now, it is time to calculate the Numerical Ciphertext or NCT. NCT is calculated like this: NOM + NOTP mod 26:

Now that we have the NCT, we will find the alphabetical equivalent of the numbers and calculate our CT which comes out to be: “N H Y Y S L K Y B M N J A”.

That ends our encryption process and gives us our ciphertext.

Part 2: The Decryption

Alright, so you have encrypted the message and sent the ciphertext to your friend Bob. So, how is he going to Decrypt using the same key?

Let’s see the data that Bob has with him:

  • He has the encrypted message aka the ciphertext that he has gotten from you.
  • He has the key that both of you share.
  • He has the mapping table to find the numerical equivalents.

So, how will he decrypt the message using this data?

Firstly, they will calculate the NCT from the CT by simply using the numerical equivalent of each alphabet that is present in the CT.

So, now that he has the NCT and NOTP, he will calculate the NOM (Numerical value of the original message) by doing this calculation:

NOM = NCT – NOTP mod 26.

Once that is calculated, he will use the table to get back the original message:

So, let’s see how the NOM calculation work:

Now that you have the NOM, you can get the alphabetical equivalent to retrieve the original message, which is: “MEET ME OUTSIDE”

One-time pad with XOR gate

XOR or “Exclusive OR” is a logic gate.

What is a logic gate?

In electronics, a logic gate is an idealized or physical device implementing a Boolean function; that is, it performs a logical operation on one or more binary inputs and produces a single binary output. Usually, a logic gate will take in 2 inputs and gives out 1 output. The inputs and outputs will be in binary values… in other words, it can only be 1 or 0. 1 being “truth” and 0 being “false”.

So, what does the XOR operator do? The XOR operator returns “truth” only if the following conditions are met:

  • A is false and B is true
  • B is false and A is true

In other words, this will only return true if A and B are not equal. A XOR truth table looks something like this:

Ok, so how do we encrypt and decrypt using XOR gates? Let’s take a look.

The Encryption Process

Now, this process works only with binary inputs. So, whatever message you want to send to your friend, you will need to get a binary form of it. Also, as with the earlier method, you will need a key which is of the same length as your input text. Anyway, for this example, let’s suppose that the input message that you are giving is: 00011110.

You and your friend, let’s rope in Bob again, have decided on key, whose binary equivalent is: 01001010.

Keep in mind how the key and the value is of the same length.

Alright, so now how are we going to get out ciphertext? We are sure that you have figured that part out already.

Cipher Text = Plain Text XOR Key.

On XORing each element of the PT and the Key with each other, we get:

So, we get our new ciphertext which is: 01010100

This is what you are going to send to Bob.

The Decryption Process

Alright, so how is Bob going to decrypt the ciphertext with the key?

Well, by XORing again of course. Consider the following example.

So, we have our input 0,1 and the key 1,1 and the Cipher 1,0.

Turns out, that if you simply XOR the cipher with your key, you will get back the input!

Pretty darn awesome right?

That is exactly what Bob is going to do now.

That is how Bob will retrieve your original message.

Linear Feedback Shift Register

Linear feedback shift register is a function whose future output completely depends on its earlier (or current) state.

Let us explain and it will be clear to you soon.

The idea of the shift register is to predetermine with your friend (as with any symmetric encryption). This key will be a linear feedback shift register function which will be used by you to determine the code.

Suppose you and Bob decided that the formula you want to go with is: E(i+3) = E(i+1) + 2E(i+2) mod 26.

Wherein E(1) = 2 and E(2) = 4 and so on and so forth.

Let’s examine the equation that we are working with again:

E(i+3) = E(i+1) + 2E(i+2) mod 26

As you can see, each consequent output is directly dependent on the previous outputs.

Ok so all that makes sense, now how do we actually use this to get our ciphertext?

The message you want to send Bob is “MEET ME”. There are 6 characters in this message and so you will need 6 values of E() to form your key. You have already met with Bob beforehand wherein you decided the following predetermined values:

  • E(1) will be 2
  • E(2) will be 4

We know that in order to code the input message, we will need E(1) to E(6). Let’s calculate E(3) to E(6).

E(3) = E(1) + 2E(2) mod 26 = 10.
E(4) = E(2) + 2E(3) mod 26 = 24.
E(5) = E(3) + 2E (4) mod 26 = 6.
E(6) = E(4) + 2E(5) mod 26 = 10.

Ok, we have the keys, let’s see how the encryption and decryption works.

The Encryption Process

Let’s create our table.

We will put in the keys and the original message which is “MEET ME”. After that, we are going to create a NOM which has the numerical equivalent of each alphabet of the original message.

Ok, now that all obvious data has been inputted, let’s calculate our ciphertext by first calculating our Numerical Ciphertext aka NCT.

To get the numerical value of the ciphertexts, add the key and the numerical value of the original message and mod with 26.

Now you’ll get:

Ok, so you have: 14 8 14 17 18 14

If you map these numbers to their corresponding alphabets and you get “OIORSO”. That’s the encrypted message.

The decryption of this message is really hard especially if you don’t have the key.

Examples of Stream Ciphers

Rivest Cipher 4 (RC4)

  • Used in WEP aka wired equivalent protocol for wireless network security.
  • Also an option in TLS/HTTPS for encrypting web traffic.
  • It is not recommended to use this cipher anymore since it has been cracked so many times.

The A5/1

  • It is used for encrypting GSM (Global System for Mobile communication) phone data and communication.
  • The renowned whistleblower Edward Snowden revealed that the NSA routinely keeps breaking GSM for surveillance purposes so that doesn’t really qualify it as secure now does it?

Types of Symmetric Encryption Algorithms

Since we have looked into symmetric cryptography, let’s go through some of the most famous symmetric encryption algorithms out there.

#1 DES and Triple DES

The Triple Data Encryption Standard, or Triple DES algorithm, evolved from the original DES algorithm introduced as a standard in 1976.

DES makes the use of 56 bits of its 64-bit key to encrypt the messages in fixed-sized blocks of data. In the 1970s it was considered extremely secure, the advancement in computer technology has led to a lot of hacks with the DES cryptography. In fact, in the late 1990’s, the DES encryption was regularly broken.

To counter this vulnerability, developers started using a newer standard, i.e. Triple DES.

This new standard increased the overall strength of the algorithm by:

  • Using two or three 64-bit keys
  • Performing encryption three times on each message
  • Every time the result is passed, it is used as the source for the next one.

#2 RC2

Ron Rivest developed the RC2 algorithm in the late 1980s as a replacement for DES. RC2 encrypts data in 64-bit blocks and has a variable key size of 8 to 128 bits in 8-bit increments.

Lotus Development requested Rivest’s help in creating RC2 for the company’s Lotus Notes software.

However, the truth is that the strength of this algorithm comes mainly from the length of its keys, which makes it vulnerable and easily compromised.

#3 Blowfish and Twofish

The Blowfish algorithm was developed by security researcher Bruce Schneier in the early 90’s. Quite like RC2, Blowfish breaks messages up into equal-sized 64-bit blocks and encrypts the blocks. However, unlike the RC2, its key sizes range from 32 to 448 bits. Schneier released Blowfish as a public-domain algorithm so that anyone can use it to encrypt data.

Schneier then went on to develop Twofish, which is an improvement on Blowfish and uses 128-bit blocks and keys up to 256 bits long. Despite its theoretical vulnerabilities, no one has broken it in real-life yet and it also happens to be one of the fastest growing fixed-block algorithms currently available.

#4 Serpent

The Serpent algorithm was developed in 2000 by Cambridge researchers Ross Anderson, Eli Biham, and Lars Knudsen. These gentlemen thought that the other algorithms had theoretical flaws that rendered their encryption vulnerable to shortcut attacks. They wanted to make an encryption algorithm which was flawless as possible.

Serpent was the result of this collaboration and it uses a 128-bit block and 256-bit keys. Similar to the Blowfish and Twofish algorithms, the Serpent algorithm is in the public domain. Serpent is highly rated for “safety factor,” i.e. trustworthiness against attacks.

Advantages of Symmetric Cryptography

Even though symmetric cryptography has some major problems the biggest advantage of symmetric cryptography is that it requires very little overhead. All that you need to do is to share one single key with your intended recipient.

Even now, many softwares use this method in conjunction with asymmetric cryptography to provide fast and efficient encryption/decryption services.

The Problems With Symmetric Cryptography

Despite its simplicity, there are a lot of problems with symmetric cryptography.

Problem #1: The shared key

Encryption and decryption are done by one single key and that is a huge problem. The very fact that you need to share your key with your recipient in an extremely secure way makes the process very risky because one single leak can compromise everything.

Problem #2: It is not scalable

Another problem is that this method is not scalable at all.

Suppose you run a company which deals with storing sensitive data. If you used symmetric key cryptography, you will need to share a unique key with all your clients. Now this is ok if you have 2 or 3 clients. But what happens when you get popular and have 100s of clients? It will be impossible for you to save that many keys.

Since symmetric key cryptography had so many issues, a solution was needed, and in the 1970’s that solution finally came.

James Ellis and His Breakthrough

In 1970, a British mathematician and engineer named James Ellis thought of an idea which laid down the foundation of modern asymmetric cryptography.

He theorized that what if encryption and decryption were inverse operations based on 2 different keys? In Ellis’s hypothesis, the receiver of the message was not a passive party and they needed to take part in the process as well. The two different keys will be the encryption key and the decryption key. If the original plaintext message is the safe, then the encryption key is the padlock and the decryption key is the key.

The padlock could be sent to anyone in the world but the key had to be kept private. So, anyone can send a message to the receiver by locking it with their padlock and since only the receiver has the key, only they can open it.

So, this was the theory, but it was still only a theory. Two things were needed to make it more practical and those two things were:

  • The trapdoor function.
  • The Diffie–Hellman key exchange.

The Trapdoor Function

A trapdoor function(or a one-way function) is a function where it is easy to go from one state aka the domain to the other state aka the range but it is hard to go back from the range to the domain unless you have knowledge of a key which is called the trapdoor function.

 

The idea of trapdoor functions is completely based on the idea of keys. When you have a public key (K) which is used to go from the domain and range.

Ok, we guess you should know what domains and range are. Consider the following:

f(x) = {k1, k2, k3, k4}.

In this function “x” is the domain and {k1, k2, k3, k4} is the range. The domain is the input which can cause the function to create different outputs which are the range.

So, in trapdoors, the public key (K) is used to go from the domain to range. However, in order to go from the range to the domain, you will need to use a trapdoor function which is also known as the private key (k). Since the private key and public key need to be inversely related to each other so the trapdoor function f() will work like this:

K= f(k)

Because we are using a trapdoor function, it is near infeasible for the private key to be determined by its corresponding public key.

A very simple example of the trapdoor functions is the mathematical operation of large numbers. Think of addition. Let’s add 145231 and 98918391283 you will get 98918536514. If you have the two numbers from the beginning then you will figure out how you will get the final number, however, if you just have 98918536514 you won’t be able to know which numbers were added to get that result.

Now, that’s the trapdoor function, let’s look into the Diffie-Hellman key exchange.

What is the Diffie-Hellman key exchange?

Before understanding what the Diffie-Hellman key exchange is, let’s do a simple thought experiment.

  • There are two people Alice and Bob and they want to communicate.
  • They are on either side of the eavesdropper Eve
  • Alice and Bob can only communicate with each other via a shared line which is being tapped by Eve.

The main thing about this thought experiment is that every single thing that Alice and Bob say to each other will be eavesdropped upon by the bank. So, how exactly can they communicate without Eve knowing what the plan is?

This where the Diffie-Hellman key exchange comes in. The idea is to make sure that two parties can get hold of secret information without sharing it. One of the best examples of the Diffie-Hellman is the idea of the secret color exchange.

For this there are 3 things that you need to keep in mind:

  • Suppose that Alice and Bob both publicly agree that yellow is the common paint that they are going to be using
  • Alice  then secretly decides that she is going to use orange along with yellow
  • Bob then secretly decides that he is going to use aqua along with yellow.

Step One

Both Alice and Bob have declared publicly that they are going to be using the color yellow. So, Eve, Alice, Bob all have the knowledge of Yellow

Eve: Yellow
Alice: Yellow
Bob: Yellow

Step Two

Do you remember that Alice and Bob have their own private colors, which was orange and aqua respectively? In this step, Alice mixes yellow with her private color, aka orange, which will create a new composite color. This is Alice’s color aka CA.

Similarly, Bob mixes his private aqua with yellow and gets a new composite color, which we will call CB.

So, now our equation looks like this:

Eve: Yellow
Alice: CA
Bob: CB

Step Three

Now that they have the composite colors, Alice and Bob will send it to each other. Now, since Eve is tapping this line, they will get their hands on CA and CB.

But this is where they face an obstacle.

Color combinations are classic trapdoor functions. Given two colors, you can easily get the third color, however, without those two colors, it is nearly infeasible to tell what made this new composite color.

To put it simply, Eve will get a hold of CA and CB but will have no idea which are the colors that have gone into its creation.

So, this is what things are looking like right now:

Eve: Yellow, CA, CB.
Alice: CB
Bob: CA.

Step Four

The magic happens in this step. Alice and Bob have combined their respective secret colors into yellow and given it to the other person. Right now, Alice and Bob and are going to once again mix their secret colors into the composite colors.

Just like that, both of them now share the same color aka Yellow + Aqua + Orange, which happens to be Brown.

However, Eve will still only have CA and CB because they don’t know what the secret colors are that they should put into the mix.

So, this is what things finally stand:

Eve: Yellow, CA and CB.
Alice: Brown.
Bob: Brown.

This is the beauty of Diffie-Hellman. Without exchanging private and secret data (i.e. their secret colors) with one another, they were able to reach the same conclusion, i.e. the color brown. In the process, anyone who was eavesdropping wasn’t able to get any information.

This is what the diagram of this entire exchange looks like:

This is the representation of the Diffie-Hellman exchange, but a mathematical means was needed to make sure that there could be practical applications of this as well. For this reason, the modulus function was used.

Combining Mathematics with Diffie-Hellman Exchange

Suppose there is a generator g for a finite field of size n. In this field, we need two random value a and b. This is how this is structured: Given only g, g^a, and g^b, it will be impossible for anyone apart from the intended recipient to determine g^ab.

This single condition activates the trapdoor function.

Given this condition, two parties can exchange messages and reach the same conclusion without explicitly communicating it with each other.

So what is the mathematical voodoo that happens behind the scenes? (Note: “n” is the range of value from which Alice and Bob choose their random values).

  • Alice chooses a random value “a” from the field n and determines a message M1 such that:
    M1 = g^a mod n.
  • Now, it is Bob’s turn to choose a random value and he chooses “b”. Using this he creates a message M2 such that:
    M2 = g^b mod n
  • Both Alice and Bob now exchanges their messages with each other.
  • Alice will now determine a special message K by using her secret “a” such that:
    K = g^ab mod n
  • Bob will now determine the special message K by using his secret “b” such that:
    K = g^ab mod n

As we have seen, Alice and Bob have reached the same conclusion without sharing their secret information.

The Trapdoor Function and Diffie-Hellman key exchange laid down the foundation of asymmetric cryptography.

Asymmetric Cryptography

Asymmetric cryptography utilizes two keys, a public key and a private to encrypt and decrypt a particular data. The use of one key cancels out the use of the other.

There is two real-world utilization of asymmetric cryptography that we are going to look into:

  • The Rivest-Shamir-Adleman algorithm aka the RSA.
  • The Elliptical Curve Cryptography.

#1 What is the RSA Algorithm?

The RSA algorithm was named after the MIT professors Rivest, Shamir, and Adleman who discovered this algorithm. The RSA algorithm is the most widely used and popular asymmetric cryptographic algorithm in history.

So how does it exactly work? In order to understand this, there will be some variables that we shall be working with.

  • A secret message “m”
  • “m” raised to the power of a random number “e”
  • The modulus of the above with a random number N which gives us the ciphertext c.

So, what we are looking at is: c = m^e mod N

First thing first, the trapdoor function condition. Which it is easy to get c given the equation “m^e mod N”, but it is given ONLY c, e, and N, it is impossible to get the message “m”.

So, how will our intended recipient decrypt the ciphertext? For this, they will need a key. This is why they choose a random variable “d” such that:

c^d mod N = m.

Previously, we found that: c = m^e mod N.

Now let’s substitute the value of c above and we will get: m ^ e ^ d mod N = m OR m ^ ed mod N = m.

In the equation above:

Public key = e and N.
Private key = d.

Now, let’s use some numbers and see how the equations work.

Suppose the message that you have to send is 42. In other words, m=42.

Along with that:

e = 17.
N = 3233.
d = 2753

The encryption process

c = m^e mod N.

Using simple substitution:

c = 42^17 mod 3233 = 2557.

So the ciphertext is 2557.

The decryption process

Let’s do c^d mod N.

2557^2753 mod 3233 = 42

And just like that, we got our original message back.

It is quite simple and pretty ingenious. As we have seen above, the private and public key needs to be mathematical derivatives of each other in a way that:

F(private key) = public key, where F() is another trapdoor function.

It should be difficult for anyone to determine the private key from the public key. In fact, it should be so difficult that it will take the world’s most powerful computer decades upon decades to derive one from the other.

Now, this won’t work in real life. How are we going to reap the fruits of asymmetric cryptography when it is nearly infeasible to derive the private key from a public key?

Well, this guy over here answered that question for us:

This guy right here, with a pretty enviable beard, is one of the greatest (if not the greatest) mathematician who ever lived, Euclid. One of his many areas of research was prime factorization. In fact, that is exactly what we are going to look into next.

Euclid and Prime Factorization

Euclid found out centuries ago that any number > 1 can be written as a product of prime numbers. Eg. 289 can be written as 17*17.

Let’s take a look at our equation again: C= m^e mod N.

In the equation, N is the key that opens this trapdoor function. However, by just knowing what N is, it is nearly infeasible to know the prime factors that make up this number. This especially true if N is a large number. Let’s do a little experiment.

Go on google and try to multiply two big numbers. More often than not, your browser should do this in a second.

However, if you try to ask your computer to find the prime factors of a huge number, that will take a long time (Don’t do this on your computer). In fact, it may take days, months and even years to find the prime factors.

And just like that, cryptographers were able to focus in on the heart of the trick. They were able to determine how N can be calculated, via prime factors. So, this begs the question. What exactly do you need to use the RSA algorithm?

  • Firstly, you need a big random prime number P1
  • Generate another big random prime number P2
  • Calculate N by multiplying P1 and P2. N has to be such a big number that it will take machines hours upon hours to find its prime factors.
  • Make N public while hiding the value of P1 and P2. So, N is the trapdoor and P1 and P2 are the keys that will open the trapdoor

Ok, so what do we have so far?

We have determined how to calculate N via its prime factors P1 and P2. But, we still have determined the value of “e” and “d” and we don’t know how exactly we are going to generate our private key from the public key.

In order to obtain all these values, we are going to take the help of this guy.

The Phi Function of Leonhard Euler

The guy over there with that terrible headdress is one of the greatest mathematicians who ever lived, Leonard Euler. In 1760, Euler did some research which proved to be path-breaking in every sense of the word. He studied the nature of the breakability of numbers, which he called the phi function.

Given phi(N) where N is a random integer, the value of N will be the number of numbers between 1 and N which do not share any common factors with N.

Let’s see how it works. If N=8 then the numbers from 1 to 8 are: 1,2,3,4,5,6,7 and 8.

In this sequence, only the following numbers don’t share any factors with 8 except 1: 1,3,5, and 7.

So, phi(8) = 4.

The calculation of the phi function difficult, but for one unique case. Check out the following graph:

The graph above tracks the distribution of phi values over integers upto 1000.

See that straight green line at the top which is conveniently arranged? That is the phi of prime numbers. So, why is it so conveniently arranged?

Remember the definition of a prime number?

A prime number can be factorized only by itself. So, if p is a prime factor then phi(p) will be p-1.

Let’s take an example.

Suppose the prime number in question is 3. The numbers between 1 and 3 are 1,2 and 3.

The only number that can factorize 3 is 3. So phi(3) = 2, which happens to be 3-1.

Similarly, if you were to find the phi of a large prime number, say, 541 then:

Phi(541) = 541-1 = 540.

So as you can see, it doesn’t matter how big the prime number is, it is extremely simple to calculate its phi.

Ok, so all this sounds pretty good, but what is the point of all this?

Well, you will see that in a second, once we look into the multiplicative nature of phi functions. What is the multiplicative nature of phi functions?

Multiplication of Phi

So, this is how the multiplication of phi works: phi(A*B) = phi(A) * phi(B).

Let’s take an example. We are using two prime numbers P1 and P2. We need to determine N by multiplying P1 and P2.

Following the multiplicative property of phi functions:

phi(N) = phi(P1) * phi(P2)

So, phi(N) = (P1-1) * (P2-1)

And just like that, we have discovered the trapdoor function for phi. If we know the prime factors of N then it is easy to calculate the phi(N).

For example, the number 221 has prime factors 17 and 13.

So phi(221) = (17-1)*(13-1) = 192.

This way, it becomes very simple to know the number of prime factors of any number N.

Ok, so what we have so far are phi functions and modular exponentiation functions. What we need to do is to bring these two together with a little bit of mathematical wizardry courtesy of our good friend Euler again.

The Euler’s theorem

Euler’s theorem states that:

For any two numbers m and n that don’t share a factor:

m ^ phi(n) ≡ 1 mod n

Meaning, for any two numbers m and n, as long as they don’t share a factor, m raised to the phi(n) divided by n will always leave a remainder of 1. Let’s see this in an example.

Suppose, m= 8 and n = 5.

Phi(5) = 4

So, 8 ^ 4 = 4096.

Replacing this in the Euler’s theorem equation:

4096 ≡ 1 mod 5 holds true because 4096 on being divided by 5 leaves a remainder of 1.

That is pretty ingenious, isn’t it?  However, we still need to tweak this a wee bit before you can finally get our solution.

Tweak #1

1^k = 1 for all k.

So, keeping this in mind, if in m ^ phi(n) ≡ 1 mod n we multiply the exponent phi(n) with any number k, the final solution will be 1^k which is still 1.

Now, this modifies the equation like this:

m ^ k*phi(n) ≡ 1 mod n

Tweak #2

For all m, m*1 = m.

So, in our modified equation, if we multiply both sides by m we get:

m*m ^ k*phi(n) ≡ m*1 mod n

Which becomes:

m ^ k*phi(n)+1 ≡ m mod n

That right here is the final form of the equation.

To round up everything, let’s bring back the old equations that we cam across in the guide earlier:

  • c = m^e mod N.
  • m = c^d mod N
  • m ^ e*d mod N = m

Check out the last equation on the list above:

m ^ k*phi(n)+1 ≡ m mod n

Doesn’t that look familiar to something we just saw?

Wait for a second, it does look pretty similar to the equation we came up with after the second tweak:

m ^ k*phi(n)+1 ≡ m mod n

This right here is where we get our breakthrough. If you compare the two equations then we will get:

e*d = k*phi(n) + 1

Do you realize what just happened here?

Finally, we have an equation where the values of e and d depend on phi(n). In other words, we can finally derive the public key from our private key.

Since we already know the value of e, it is easy to calculate d, the private key, ONLY if the factorization of N is known (which is a secret that Alice has kept to herself).

So, d= (k*phi(n) + 1)/e.

This is the trap door that will undo the encryption done by her private keys e and n.

Let’s see how this works.

So, you are exchanging messages with your friend Bob. The message that Bob wants to send you is M where M = 89.

You now need to generate your keys to initiate the encryption process.

You use two prime numbers p1 and p2 where:

P1 = 53.

P2 = 59.

Now you can generate N where N is the product of the two prime numbers.

N = P1 * P2 = 53 * 59 = 3127.

Using Euler’s multiplicative property:

Phi (N) = Phi(P1) * Phi (P2) = (P1 – 1) * (P2 – 1) = 52 * 58 = 3016

Now, you will need to generate a value e which will have no factors with the value of phi(N).

Let’s say you chose e = 3.

Now, it is time to generate your private key d, such that:

d = (k*phi(N) + 1)/e

Taking k = 2 we get:

d = (2* 3016 + 1) / 3 = 2011.

Now, she will lock up all the values except N and e which are her public key and make the knowledge of these two global. Alright, so let’s start with the encryption.

Bob encrypts the message

Now, Bob needs to send message M, which is 89, and he needs to calculate the cipher text c such that:

c = M^e mod N.

Now, we know that: M = 89, e = 3 and N = 3127.

So: c = 89^3 mod 3127 = 1394.

He then sends it over to you.

You decrypt the message

You get the ciphertext and all that you have to do is to decrypt it using her private key d which we know to be 2011.

So, you now do this calculation: c^d mod N

1394^2011 mod 3127 which is 89 aka the original message M.

And this, is the RSA algorithm, one of the most popular cryptographic algorithm ever made.

#2 What is Elliptical Curve Cryptography?

Bitcoin, Ethereum and most of the cryptocurrencies use elliptical curve cryptography for encryption purposes. An elliptical curve is any curve that follows this equation:

Y^2 = x^3 + ax + b

Where (x,y) is a point on the curve and a and b are constants.

There are infinite curves that you can make following this equation and more often than not, the following is how one of these curves, in general, will look like:

What are the properties of an elliptic curve?

  • The curve is symmetric across the x-axis i.e. the horizontal axis
  • Any line that goes through 2 points on the curve will intersect the curve on a third point.
  • Any tangent on the curve will intersect the curve on one more point.

Mathematical properties of the curve.

#1 Addition property of the curve

Like we saw above, the second property of the curve is that any line that goes through 2 points on the curve will intersect it at a third point. Look at the graph below.

So, we have a line that is going through two points on the curve, V and A and it intersects the curve on a third point X. We will reflect the point X back into the curve and the curve will finally look something like this:

The reflection of X on the curve is going to be the result of the addition of V and A. So, the reflection of X is V+A. That, right there, is the additive property of the curve.

By the way, there is an interesting feature here.

If we add the a point with its reflection, i.e. if we added X to V+A, then the result will be infinity.

Can you tell why?

Any line that passes through X and (V+A) will intersect the club at infinity.

Multiplication property of the curve

Alright, so we have seen the additive property. But did you know that these curves have multiplication properties as well?

Think about it, when we multiply a number by two, what exactly are we doing? We are adding the number to itself, right? Similarly, when we multiply a number by 3, we are adding it to itself three times. Let’s use the same logic here.

Suppose we have a point V, what do we do to find 2V?

All that we have to do is to run a tangent through the point V. The point where it intersects the curve is reflected back onto the curve. The reflected point is 2V.

Now suppose we want to find 3V. We will join V and 2V and then reflect the point of intersection, like this:

Do you see how each multiplied point is cycling around the curve? That is how it gets its multiplicative property.

Mathematical properties of an elliptical curve

Property #1: The points on the curve form an Abelian group

The properties of the Abelian group are as follows:

  • They have an identity.
  • The have inverses aka reflections.
  • The points are associative meaning for three points A, B and C on the curve: (A+B) + C = A + (B+C).
  • The points are closed on the curve.
  • The points are commutative meaning for two points A and B. A+B = B+A.

Property #2: Multiplication on the curve is fast

We have seen how the multiplication property in the curve works. An interesting fact is that multiplication in the curve happens extremely fast. Suppose we have a point P and we want to find 100P. Instead of adding the number to itself 100 times we can do the following:

  • Add the point P to itself to get 2P.
  • Add 2P and P to get 3P.
  • Add 3P to itself to get 6P.
  • Add 6P to itself to get 12P.
  • Add 12P to itself to get 24P.
  • Add 24P and P to get 25P.
  • Add 25P to itself to get 50P.
  • Add 50P to itself to get 100P.

Instead of going through 99 steps we need to just go through 8 steps.

Property #3: Division on the curve is slow

Whilst multiplication is fast, the division is very slow. Suppose we have Q = nP and we want to find the value of n by dividing Q by P, there is no simple way of doing that. In fact, the only that you can go about it is by manually go through the numbers one by one to find a value which satisfies the equation.

This kind of a brute force solution is obviously extremely tedious and not recommended.

This is called the discrete logarithmic problem and this is what gives the curves its trapdoor function. So, while it is easy and fast to multiply, it is not really that practical and infeasible to divide

The elliptical curve Diffie-Hellman key exchange

So, until now we have seen the various properties of the curve and we have also seen that the curve has a trapdoor function. Now how do we determine whether it is usable for cryptography or not?

For that, we need to get back our Diffie-Hellman key exchange.

How will you and your friend Bob come up with a common secret without explicitly knowing the information that the other person possesses? How are you two going to do this by using elliptical curves?

  • The two of you will publicly agree on a point on the curve that you want to use. Let’s say that this point is P. This will be public knowledge and known to everyone.
  • In secret, you will choose another point “a” and Bob will choose another point “b”
  • You will then compute the value “aP” and send it over to Bob. Bob will then calculate the value “bP” and send it over to you
  • The aP and bP values will be public knowledge, however, no one will be able to break it down and get the secret values a and b, because division in these curves in infeasible. That’t the trapdoor function after all.
  • You will then multiply a with the bP you have received. Bob will multiply b to the aP that he has received.
  • You will end up with abP while Bob ends up with baP. Since the points are Abelian in nature, abP = baP.

So as we can see. The curve satisfies the Diffie-Hellman key exchange.

Signature Verification Work On The Elliptical Curves?

Signature verification is a critical process for cryptocurrencies. So, let’s look at how this happens on elliptical curves.

Before we see how the process works let’s checkout certain variables and their meaning that we will be using the following equations.

  • Private key = d.
  • Message = z.
  • Public key = Q.

G will be a constant point on the graph which will be provided by bitcoin.

  • “k” is a random number which will be generated automatically for every unique signature.
  • “n” is another constant that will be provided by Bitcoin.

Alright, so now that we have all the basics set up, let’s see how it actually works.

Signing a message

Public key Q = dG. (it is impossible to get the private key from Q and G because, as we have already seen, division is infeasible in elliptical curves).

Let’s multiply G with a random number “k” and plot that point on the graph. The coordinates of the point is (x,y) where (x,y) = kG.

Next, we need to determine two values r and s such that:

r = x mod n.

s = (z + rd) k^-1 mod n

The reason why we generate r and s is because these are the co-ordinates of our signature.

When all this is set up, we send the point (r,s) for verification.

Verifying a message

The verifiers will use a simple equation:

z*s^-1*G + r*s^-1*Q

The value of this equation will give us the point (x,y).

Remember one thing, the verifiers don’t have the x coordinate given implicitly to them from the message sender. The only have the values r and n.

But, since we already know that r = x mod n, we can simply solve for x. If the value of x matches with the sender’s then we know that the signature has been verified positively.


What’s Happening Behind The Scenes

Let’s check out the equation that the verifiers will have to do once again:

  • Step 1: z*s^-1*G + r*s^-1*Q

We know that Q = d*G, let’s simply substitute the value.

  • Step 2: z*s^-1*g + r*s^-1*d*G

We can take (z + r*d) common

  • Step 3: (z + r*d)*s^-1*G

Now remember, we have already established that s = (z+r*d)*k^-1 mod n ,let’s substitute the values here:

  • Step 4: (z+r*d)*(z+r*d)^-1*k*G

The (z+r*d)*(z+r*d)^-1 cancel each other out and we are left with:

  • Step 5: k*G which is the co-ordinate (x,y) that the sender originally sent.

Weakness in Elliptical curves?

Elliptical curves have proven time and again that they are the best for of cryptography out there but, this doesn’t mean that they are free from any vulnerabilities.

  • If we choose a wrong curve, which ultimately loops into itself, then there is a chance that some iteration of a point will go back to the original point. So, we could have a situation where 500P = P, where P is any point on the graph.
  • A weak curve can also be chosen, which hackers can easily break into.

Having said that, while EEC may have weaknesses but they are pretty manageable weaknesses.

RSA vs EEC. Why did Bitcoin and Ethereum Go With EEC?

The reason why EEC was chosen over RSA is that it offers the same level of security as RSA by consuming far fewer bits.

In fact, a 256-bit key in EEC offers the same level of security as an RSA 3072-bit key! Similarly, a for a 384-bit key in EEC the RSA will have to provide a 7680- bit key to provide the same level of security.

Also, consider this.

The NSA has declared that a 384-bit key in EEC is strong and secure enough to encrypt top-level secret documents.

Keys In CryptoCurrency

As mentioned above, bitcoin and ethereum use elliptical curve cryptography. So, how exactly do transactions work in Bitcoin and Ethereum? Suppose you want to send some Bitcoin to your friend Bob, you send them to his public address which is a hash of his public key along with some extra information. The public key is derived mathematically from your private key. You should never share your private key because whoever gets them has full control of your friends.

Whenever you sign off on transactions, you use your private keys to validate the fact that it is indeed you who is sending the transaction. Oh and btw there is another thing that you should take a look at.

This is what a public address looks like this: 1CmWotWnsCyczUqbrgqQg8TP9M6S2VUdRM

Its corresponding private key looks like this: 5JSVWbHUTWn16i15vGWC7ENPSvhzd6FBrpTREiU6uFw7mSreAEC

Did you notice how the private key is significantly longer than the public address? So, how is the public address derived from the private key? Well, let’s take an example:

  • First, you will generate your 256-bit private key. You can either do it manually or you can use a random key generator.
  • After that, your wallet will automatically generate your public address for you via an inbuilt algorithm. Let’s see how that happens.
  • The private key will be passed through a SHA-256 algorithm to get the first hash
  • The hash then goes through a RIPE MD 160 hash function. It generates a new hash and a copy of it will be kept aside, let’s call this H1
  • The hash is then again passed through SHA 256 to get another hash.
  • This hash is…well…passed through SHA 256 again to generate another hash. The first 7 bits of this hash is saved and let’s call it H2.
  • H1 and H2 are added up and the result is your public address.

As you can see, the process to generate the public address from the private key is so complicated, it is infeasible for this process to be reversed. In fact, it will take the world’s most powerful computer 40000000000000000000000000000000 years to complete this calculation!

Bonus: Privacy Coin Cryptography

Privacy coins have some unique cryptography as well which we should look into. The two coins that you are looking into are:

  • Monero
  • ZCash

Monero

Unlike Bitcoin, which requires 1 public key and 1 private key, Monero needs multiple keys to function. The keys are broadly divided into two categories:

  • View Keys
  • Spend Keys

View Keys

The view keys are for the receiver of a particular transaction. Monero has a public view key and a private view key. The public view key will generate the stealth public address i.e. the one-time address where the funds will be sent for the receiver. The private view key will be used by the receiver to scan the blockchain and locate their funds.

Spend Keys

If the View Keys were all about the receiver, the spend keys are about the sender. Quite like view keys, there are also two kinds of spend keys, public spend key and private spend key. The public spend key will help the sender do ring transactions. The private spend key will help in key image creation and enable them to send transactions.

The Monero address btw is a 95-character string which is made of the public spend and public view key.

The Three Pillars of Monero

Ok, so till now what we have done is just look at Monero on the surface. We need to go deep and see what makes it so special. For that, we need to familiarize ourselves with what we like to call the “Three Pillars of Monero.”

  • Ring Signatures.
  • Stealth Address
  • Ring Confidential Transactions

Ring Signatures

How do normal signatures works in real life?

You get a pen, put your signature on the paper and that’s that right? However, having said that, there is a very real problem of someone forging your signature and misusing it for malicious purposes. Now, what happens if you combine your signature with 5 random people’s signature? Won’t it make your signature more difficult to decipher?

That’s the logic behind ring signatures (for simplicity’s sake, we are going to consider only pre-RingCT transactions.)

Imagine Alice wants to send 100 XMR to Bob. How is she going to utilize ring signatures? Well, firstly she will have to determine her ring size. The ring size is random outputs which are taken from the Monero blockchain. So, if she were to send 100 XMR, she will need to find other similar outputs. The bigger the ring size, the more expensive the transaction (i.e. transaction fees). After choosing the outputs for her ring size she signs them with her private spend key.

So, how does this save her identity?

If Alice chooses a ring size of 5,

So, suppose Alice chooses a ring size of 5 i.e. 4 decoy outputs and her own transaction. For an outsider, any of these decoys have as much chance of being the transaction as Alice’s real one.

In a ring signature transaction, any of the decoys is as likely of being an output as the actual output because of which any unintended third party (including the miners) won’t be able to know who the sender is. Oh, and btw, miners are included in this “unintended third party” group. This, however, causes a problem.

Now, this brings us to a problem.

It is critical for these miners to actually see the transaction to prevent double spend attacks. In simple words, suppose you own 1 BTC, and you try to send that Bitcoin to your friend. After doing so, you cannot send the same Bitcoin to another person. In cryptocurrencies like bitcoin and ethereum, such action can be denied via miners. Since the miners can see the bitcoins that you are using to transact, they can immediately spot a potential double spending scenario,

However, in Monero, the miners can’t see your XMR. So, how do you prevent double spend without compromising privacy?

Via more cryptography of course.

Every transaction in Monero has its own unique key image (more on this later). Since the key image is unique for each transaction, the miners can simply check the key image and know whether it has been used or not and prevent double spending.

So, this is how the sender’s identity gets protected. So what about the receiver?

Stealth Address

If Alice is sending Monero to Bob, then what is protecting Bob’s identity? Remember that there are two public keys for each user in Monero? There is a public spend key and a public view key.

So, Alice’s wallet will use these two public keys to generate a unique one-time public key. The computation of this one-time public key (P) works like this:

P = H(rA)G + B

In this equation:

r = Random scalar chosen by Alice.

A = Bob’s public view key.

G = Cryptographic constant.

B = Bob’s public spend key.

H() = The Keccak hashing algorithm used by Monero.

This one-time public key creates a one-time public address which is called “stealth address”. The stealth address is a random location in the Monero blockchain where Alice had sent the Monero to. Now, how is the Bob going to find this random location and get her Monero?

Remember that Bob also has a private spend key? This is exactly where it comes to play. Bob’s private spend key will act like a tracker to trace the entire Blockchain to locate and unlock Alice’s transaction.

This is where it comes into play. The private spend key basically helps Bob scan the blockchain for his transaction. When Bob comes across the transaction, he can calculate a private key which corresponds to the one-time public key and retrieves his Monero. So Alice paid Bob in Monero without anyone getting to know.

Before we go any further, remember the key image that we were talking about? How Monero prevents any and all double spends because of unique key image generation?

We have seen how the one-time public key(P) was calculated. Suppose the private spend key of the sender is “x”. The key image (I) will be:

I = xH(P).

Note the following:

  • It is infeasible to derive the one-time public address P from the key image “I”.
  • What prevents the creation of multiple key images? Remember that “I” is a hash of P. P is for one-time use only and hash functions are deterministic, meaning H(P) will ALWAYS be I.

Ring Confidential Transactions

Alright, so now we have seen how the identity of the sender and the receiver gets protected. What about the identity of the transactions?

Earlier, if Alice had to send 12.5 XMR to Bob, the 12.5 will get broken down into smaller denominations, which would get ring signatures. Let’s say the 12.5 breaks down into 10, 2 and 0.5. So, Monero would now go through the blockchain to look for similar outputs and then create rings for each of the denominations.

There was a small problem though. While this did protect the identities, the fact remained that it didn’t really protect the transactions. Suppose, Alice wanted to send Bob 0,9 XMR and there weren’t many such outputs available, she wouldn’t be able to get her desired ring size.

This is where Ring CT comes in.

What RingCT does is simple, it hides the transaction amounts in the blockchain. What this also means is that any transaction inputs don’t need to be broken down into known denominations, a wallet can now pick up ring members from any Ring CT outputs.

Because all the outputs now get hidden, it is impossible to know the value of the transaction.

These three pillars working in harmony, Monero gets its privacy.

ZCash

Zcash is a decentralized peer-to-peer cryptocurrency. It was created as a fork of Bitcoin and quite like bitcoin it also has a hard limit of 21 million coins. However, that’s where the comparisons end. Zcash gives you complete and total privacy through the utilization of Zk-SNARKs. In order to understand Zcash, you must understand Zk-SNARKs. Zk-SNARKS stand for Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge. So, we guess the first question you should ask yourself is….

What are Zero-Knowledge Proofs?

In any trusted system, there are two parties, the prover, and the verifier. Prior to this discovery, it was always assumed that it was the “prover” who would be a malicious one instead of the verifier. These researchers flipped this idea around and questioned the morality of the verifier instead of the prover. They asked two simple questions:

  • How can anyone know for sure that the verifier is not going to leak any information?
  • How much about the prover should the verifier actually know?

The later actually has some very severe real-life implications. Take the example of password protection.

Imagine you are trying to login to a website using a password. The standard protocol is that the client i.e. you, writes in the password and you send it to the server, the server will then hash your password and then compare it with the hash in their system. If it is a match, then the password is deemed successful.

This system works smoothly, save for one problem.

The server has the plain, original version of your password, which completely compromises your privacy. If the server gets attacked in any way, then the attacker will get hold of your password. This is why innovations like ZKP are essential. Using ZKP, the client would only need to reveal as much information as necessary to the server in order to get in.

That, in essence, is the idea behind ZKPs.

A ZKP has the following three properties:

  • Completeness: If the statement is true then an honest verifier can be convinced of it by an honest prover.
  • Soundness: If the prover is dishonest, they can’t convince the verifier of the soundness of the statement by lying.
  • Zero-Knowledge: If the statement is true, the verifier will have no idea what the statement actually is.

Let’s do a thought experiment and see how ZKP works.

The Billiard Balls

So we have a prover and a verifier, but the verifier is color-blind. The prover has two billiard balls, red and green. Now, color-blind people can’t tell the difference between the two colors, as you can see from the following image:

So, this is the situation right now. The verifier believes that both the balls are of the same color, while the prover wants to prove that the colors are both the same. How are we going to do this?

The verifier takes both the balls and hides it behind their back. Now, he can either switch the balls in his hands or keep them as is. After he is done switching the balls (or not), he presents them to the prover. The prover can obviously see the actual color of the balls and will know instantly whether the switch has been made or not.

The verifier can then repeat this experiment as many times as he wants before he is satisfied with the fact that the prover wasn’t lying about the color of the balls.

Let’s look up the three properties of the ZKP in the experiment given above:

  • Completeness: Since the statement was true, the honest prover convinced the honest verifier.
  • Soundness: If the prover was dishonest, they couldn’t have fooled the verifier because the test was done multiple times.
  • Zero-Knowledge: The prover never saw the verifier switching the balls in his hand.

Zk-SNARKs: An Introduction

So, now that you know what zero-knowledge proofs are and how it can be used in cryptography, let’s introduce ourselves with Zk-SNARKS aka “Zero-Knowledge Succinct Non-Interactive Argument of Knowledge” aka a fine example why developers shouldn’t be allowed to name anything.

Its use in blockchain technology is immense. Let’s check this with an example. Suppose Alice and Bob enter a smart contract agreement, where Bob has to do certain tasks in a row and in a proper order in order to receive the payment from Alice. Obviously, Alice wants to be shown proof that Bob is following each and every step sequentially, however, what if Bob works for a security company and is dealing with highly classified data?

Using Zk-Snarks, Bob can show just enough proof to Alice as to what he is doing, without explicitly showing her the actual parts. That way, Bob can be credible without giving away any confidential information.

Functionality of Zk-SNARKS

A typical Zk-Snark has three algorithms: G, P, and V.

  • G: A key generator which takes an input “lambda” and a program C. It generates two public keys, a proving key (pk) and a verification key (vk). The value “lambda” should be confidential and not revealed under any circumstances whatsoever.
  • P: The prover will be taking three items as input. The inputs being: pk, random input x (another public item), and the actual statement (w) whose validity they want to prove without revealing the details. The P algorithm will generate proof prf such that: prf = P(pk, x, w)
  • V: This is the verifier algorithm which checks the prover’s claim. It returns the value “TRUE” or “FALSE”. This algorithm takes in three inputs: vk, x, and prf. The function will act thus: V(vk, x, prf).

Christian Lundkvist shows how the algorithms will work in unison in an example function. Consider the following:

function C(x, w)

{

return ( sha256(w) == x );

}

The function takes in two values x and w. It returns TRUE if the SHA-256 hash of w is the same as x otherwise it returns FALSE.

So, how does Zk-Snark work with the above program?

Step #1: Key Generation

The verifier will have to first generate the proving and verifying key using the generator G. The first thing they will have to do is to generate lambda. As mentioned above, lambda is a secret value so they need to be extra careful while doing so. Generation will look like this:

G(C, lambda) = (pk , vk).

Now that the keys are generated, we go to the next stage.

Step #2: Proof Generation

Alright, so the two keys have been generated, the Prover must now generate a proof to validate her statement. She is going to use the proving algorithm P. Refer to the example function given above. Her statement is “w” and the SHA-256 hash is “x”.

prf = P( pk, x, w).

Step #3: Verification

With the proof now generated, it is upto the Verifier to verify the validity. He does it through the verification function V.

Conclusion

So, as can be seen, public key cryptography aka asymmetric cryptography is one of the backbones of cryptocurrency. It is impossible to even imagine how bitcoin and ethereum would have been secure without it. Every time you make a transaction, be thankful to all the mathematicians and cryptographers who have made this wonderful medium possible.

 

post quantum

Leave A Reply

Your email address will not be published.