Hash Collisions for Diffie-Hellman Keys

Back to the Advanced FAQ »

The Problem

UPD: These calculations are for the earlier versions of Telegram that used a 128-bit fingerprint. Currently, 288-bit fingerprints are used, which means the already infeasible attack is now completely impossible.

Suppose somebody (M) wants to implement a MiTM-attack on Diffie—Hellmann Key Exchange between A and B, so that the spoofed shared keys between A and M and between B and M have same hash value (used for key visualization and verification).

Let us denote the hash function in question by H(x); it takes b-bit values. For our application, b=128 and H(x) = first 128 bits of SHA1(x).

Let (p,g) be the public Diffie-Hellman parameters; it makes sense to use the same parameters between A and M and between B and M, which can be accomplished if the parameters are supplied by M (as in our case). Denote by n the number of bits in p. In our case, n=2048.

Now, once A supplies M with its g_1 = g^u mod p and B supplies M with g_2 = g^v mod p, the problem is to find x and y, such that H(g_1^x)=H(g_2^y).

Birthday paradox

An obvious approach, based on the birthday paradox, is to scan 2^{b/2} values of x (say, from an arithmetic progression, or by taking powers of two modulo (p-1)/2, so that the next value of g_1^x mod p is computed from the previous by one multiplication modulo p ), store and sort resulting H(g_1^x), repeat the same process for 2^{b/2} values of y, then compare sorted lists looking for the collision sought.

However, this requires O(2^{b/2}) modular multiplications and hash (SHA-1) computations, and, more importantly, O(b.2^{b/2}) bits of memory to keep the intermediate results.

There are several approaches which require less memory, but still O(2^{b/2}) modular operations, this time exponentiation mod p (which is usually a thousand times more complex operation). They use cycle finding in some form.

Cycle Finding

Cycle finding is usually applied for detecting collisions for one hash function. In our case, we have to find collision between two hash functions H_1(x):=H(g_1^x) and H_2(y):=H(g_2^y) from {0,1}^n into {0,1}^b.

However, this problem can be reduced to that for one hash function, if we are ready to replace n with n+1: consider H_0 from {0,1}^{n+1} to {0,1}^b, given by H_0(2x):=H_1(x), H_0(2y+1):=H_2(y). Once we find a hash collision for H_0 on arguments of different parity, we obtain a hash collision between H_1 and H_2. On average, one collision in two will have the required property, so we'll have to find two-three collisions for H_0.

Next, we need a projection function g from {0,1}^b into {0,1}^{n+1}, so as to reduce finding collisions for H_0 to finding collisions for f(x):=H_0(g(x)). An obvious choice is to map {0,1}^b into {0,1}^{n+1} by adding n+1-b leading zero bits. This has the advantage of making the exponents for modular power elevation smaller.

So we are reduced to finding pairs distinct b-bit values z and z', such that f(z)=f(z'). We might start from a random z_0, construct z_n:=f(z_{n-1}), and use, say, a Pollard rho method to find a loop in this sequence. This would require O(1) memory, but O(2^{b/2}) unparallelalizable computations of f(z).

Distinguished values

A better approach, allowing some parallelization at the expense of some memory, is to start from a randomly chosen z_0, construct sequence z_n as before, stopping if a loop is detected by Pollard rho, but in addition storing triple (z_n,z_0,n) into a special table whenever z_n happens to be distinguished (say, has t leading bits equal to 0, for suitably chosen t < b ). Once a distinguished value is reached, processing stops, and a new random value of z_0 is chosen to construct a new sequence z_n.

This might be done in parallel, starting from several randomly chosen values of z_0.

Once the same value of first component in the results table appears twice, we are likely to have found a collision, reconstructing the two colliding sequences from their (distinct) values of z_0.

This requires O(2^{b/2}) n-bit modular power elevation operations with a table for O(2^{b/2-t}) b-bit values. The computation can be done in parallel on up to O(2^{b/2-t}) processor cores if t is suitably chosen. For example, we might take t=32 for b=128, thus requiring several gigabytes of memory, but allowing parallel computation on billions of cores.

Required Resources

We still need to compute O(2^{b/2}) = O(2^{64}) powers with 128-bit exponents modulo 2048-bit prime p, and the same amount of SHA-1 hashes.

Each such power computation would require several million clock cycles on a modern CPU core (tens of thousands cycles for each modular multiplication). Let us assume that it requires two million cycles on a 2GHz core, so that one core can perform thousand such operations per second.

A modern 8-core CPU (with hyperthreading, which will barely help us here since the problem is CPU-intensive) costs about 500$ and consumes about 50 Watt of power.

In this way, we can perform 8000 (let us replace this with 10^4) such operations per second on 500$ equipment consuming 50 Watt of electric power.

We have to perform about 2^{64} ~ 2 * 10^{19} operations (actually, several times more — say, twice as much) to find a collision of requested form.

If we wish to achieve this, say, in one month (clearly too long a time to expect the parties not to worry about the delay of their DH key exchange), i.e. in 2 * 10^6 seconds, we'll need 2 * 10^9 (two billion) 500$ CPUs, worth a trillion dollars. They will consume approximately 50 billion kW.h of electricity in the process, worth hundred millions of dollars. Deployment of FPGA and ASICs might lower this estimate by an order of magnitude, still leaving such an attack to be infeasible.

All this to intercept one secret chat!