One disadvantage of symmetric cryptosystems which require private key like AES is that the private key must be exchanged. And Diffie-Hellman key exchange algorithm enables exchange private keys over a public channel. So it can solves following dilemma.

Alice and Bob want to share a secret key which is going to be used in a symmetric cipher, but all of their communication channel are insecure, furthermore every infomation that is exchanged over channel is observed by their adversary. In this situation how could they share a key without making it available to their adversary?

And one of the solutions is Diffie-Hellman key exchange, and this is not about encryption or decryption **but to securely exchange the private keys for symmetric cryptosystems.**

# How Algorithm Proceed?

- First sender Alice generate huge prime numbers
and*n*, and it is best to choose numbers which*g*is the*g***primitive root**oftypically both numbers are over 1024 bits. Then sender (Alice) sends it to the receiver (Bob). And everyone can see*n,*and*n*.*g*

Before explaining about why it is best to choose primitive root, what is primitive root?

## primitive root

If ** g** is the primitive root of

**, then**

*n***,**

*g mod n***…**

*g² mod n***generates**

*gⁿ⁻¹ mod n***all**the integers within the range

**[1,**. For example, let’s take

*n*-1]**and**

*n*=11

*g*=8**8¹ mod 11 = 8**

**8²**

*mod*11 = 9**8³**

*mod*11 = 6**8⁴**

*mod*11 = 4**8⁵**

8⁶

8⁷

8⁸

8⁹

8¹⁰

*mod*11 = 108⁶

*mod*11 = 38⁷

*mod*11 = 28⁸

*mod*11 = 58⁹

*mod*11 = 78¹⁰

*mod*11 = 1We can see that** 8¹ mod 11, 8² mod 11, … 8¹⁰ mod 11** generates integers set which values are in the range **[1, 10]**, so we can conclude that 8 is a primitive root of 11.

Then how about ** n=11** and

**?**

*g*=10**10¹ mod 11 = 1010² mod 11 = 110³ mod 11 = 1010⁴ mod 11 = 110⁵ mod 11 = 1010⁶ mod 11 = 110⁷ mod 11 = 1010⁸ mod 11 = 110⁹ mod 11 = 1010¹⁰ mod 11 =1**

In the case of ** n=11** and

**, the result integer set is**

*g*=10**{1, 10}**which is

**not all**the integers within the range of

**[1, 10]**. So 10 is not a primitive root of 11.

2. Both sender (Alice) and receiver (Bob) generate a random number which is less than ** n-1**, let’s assume that

**Alice**generates

**and**

*x***Bob**generates

**, and these values are going to be their private keys.**

*y*3. Alice calculates A = ** gˣ mod n **with its own private key

**in the same way Bob calculates**

*x,***with its own private key**

*B = gʸ mod n***and send these to each other.**

*y,*4. Alice and Bob then can calculate the shared secret key X:

- Alice calculates:
*Bˣ mod n = (gʸ mod n)ˣ mod n = gˣʸ mod n = X* - Bob calculates:
*Aʸ mod n = (gˣ mod n)ʸ mod n = gˣʸ mod n = X*

## Insight

And here’s important insight. Both Alice and Bob knows nothing about counterpart’s private key, but they can calculate same value.

One more important thing to know is that A can be calculated by Alice exclusively, B by Bob exclusively. And X includes both A and B.

As a result both of them know nothing about each other’s private key, but at the same time, they can calculate (share) a new secret key which include each private key.

## Example

Let’s follow the algorithm with concrete number.

- Alice generate huge prime number, in this example for simplicity, let’s assume n=37 and g=13, then sends these to Bob.
- Both Alice and Bob generates its own private key. Now Alice generates x=23, Bob generates y=14.
- Alice calculates A = 13²³ mod 37 = 2, Bob calculates B = 13¹⁴ mod 37 = 25. And send these values to each other.
- Alice calculates
, which is*Bˣ mod n*, Bob calculates*25²³ mod 37 = 30*, which is*Aʸ mod n*, both calculates same value and now they can use this value for symmetric cryptosystem private key like AES.*2¹⁴ mod 37 = 30*

## Why choose primitive root for n and g?

The size of keyspace is crucial in cryptography, if there are few possible keys someone malicious can crack the system with brute-force attack quite fast. So we have to make the size of the keyspace as large as possible.

Let’s think about what would happen if we choose ** n** and

**and**

*g***is not primitive root of**

*g***. Take**

*n***and**

*n*=11**as example (we talked 10 is not primitive root of 11).**

*g*=10Then when Alice and Bob calculates new private key X, they use ** gˣʸ mod n** modular formula. But as you know

**result integer set is just**

*g mod n***{1, 10}**, so

**there’s only two possible keys.**

What about ** n=11** and

**is primitive root of 11, we know**

*g*=8,*g***result integer set is**

*g mod n***{1, 2 …. 10}, which is the maximum set.**

# Cracking the Diffie-Hellman Key Exchange

The Diffie-Hellman cryptosystem relies on the fact that there is no efficient algorithm to calculate the discrete logarithm.

The attacker may know n, g, A and B because those values are being sent over a public channel. But in the modular formula which calculates ** X**,

**or**

*Aʸ mod n***, there’s no fast algorithm to calculate**

*Bˣ mod n***or**

*x***.**

*y*So it is known that Diffie-Hellman cryptosystem is secure because of the fact that the discrete logarithm problem is extremely hard to solve.

## Man in the middle attack

One problem in Diffie-Hellman key exchange algorithm is that **there’s no authentication** when exchanging ** n**,

**,**

*g***and**

*A***values. So attacker can use man-in-the-middle attack.**

*B*This is not about cracking the cryptosystem directly because it is practically impossible to solve discrete logarithm problem, but about exploiting the fact that Diffie-Hellman doesn’t provide any authentication.

- After Eve got
**A**, Eve generate a random number**z**, which is smaller than. And he is going to pretend to Bob that he is Alice.*n*-1

2. Instead of sending ** A** to Bob, Eve calculates

**, and sends C to Bob.**

*gᶻ mod n = C*3. At this point Bob thinks that he ** Cʸ mod n = X **is the shared secret key with Alice, but it is with Eve. And Bob sends

**to Eve.**

*B*4. After Eve got ** B**, Eve generates a random number

**, which is smaller than**

*w***, calculates**

*n*-1**, sends**

*gʷ mod n = D***to Alice, pretends he is Bob.**

*D*5. Alice calculates ** Dˣ mod n = X’**,and think that it is shared secret key with Bob but it is with Eve.

6. After that Eve will going to use ** X’** to encrypt/decrypt message when communicating with Alice, and

**for Bob.**

*X*One solution for this attack can be using SHA256 hashes for authentication. But this topic will not covered in this post.

And That’s it. In this post, we covered followings:

- What problem Diffie-Hellman key exchange algorithm solves
- How Diffie-Hellman key exchange algorithm works
- The reason why Diffie-Hellman key exchange is hard to crack

Thanks for reading 🙂

p.s. avatar images from https://getavataaars.com