Lecture 10-11: General attacks on LFSR based stream ciphers


Thomas Johansson

T. Johansson (Lund University) 1 / 23

Introduction

$z = z_1, z_2, \ldots, z_N$ is a known keystream sequence.

Goal: find a distinguishing attack, or find a key recovery attack, with complexity lower than exhaustive key search.


Linear complexity and Berlekamp-Massey algorithm

The keystream sequence $z = z_1, z_2, \ldots$ will be periodic (after possibly removing some of the first symbols).

Any such sequence can be generated by an LFSR.

One possible approach would then be to replace the entire generator with an (in general very long) LFSR.


Definition

The linear complexity of a sequence $s$ (finite or semi-infinite), denoted $L(s)$, is the length of the shortest LFSR that can produce the sequence.

To find the shortest LFSR producing a certain sequence we use the Berlekamp-Massey algorithm.


Berlekamp-Massey algorithm

IN: A sequence $s = (s_0, s_1, \ldots, s_{N-1})$ of length $N$.

OUT: The shortest LFSR $\langle C(D), L \rangle$ generating $s$.

1. Initialization: $C(D) = 1$, $L = 0$, $C^*(D) = 1$, $d^* = 1$, $m = -1$, $n = 0$.

2. While $n < N$ do the following:

2.1 Compute the discrepancy

$$d = s_n - \sum_{i=1}^{L} (-c_i)\, s_{n-i}.$$

2.2 If $d \neq 0$ do the following:
$T(D) = C(D)$, $C(D) = C(D) - d \cdot (d^*)^{-1} \cdot C^*(D) D^{n-m}$.
If $L \le n/2$ then $L = n + 1 - L$, $C^*(D) = T(D)$, $d^* = d$, $m = n$.

2.3 $n = n + 1$.

3. Return $\langle C(D), L \rangle$.
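The steps above can be sketched in Python for the binary case. This is a minimal sketch, assuming the sequence is over $F_2$, so that subtraction is XOR and the factor $d \cdot (d^*)^{-1}$ is always 1. The function name `berlekamp_massey` and the coefficient-list representation of $C(D)$ are illustrative choices, not part of the lecture.

```python
def berlekamp_massey(s):
    """Return (C, L) for a binary sequence s (list of 0/1), over F_2.

    C is the coefficient list [c_0, c_1, ..., c_L] of C(D), with c_0 = 1.
    """
    C = [1]        # current connection polynomial C(D)
    C_star = [1]   # C*(D): the polynomial saved at the last length change
    L, m = 0, -1
    for n in range(len(s)):
        # Step 2.1: discrepancy d = s_n + sum_{i=1..L} c_i s_{n-i} (mod 2)
        d = s[n]
        for i in range(1, L + 1):
            d ^= C[i] & s[n - i]
        if d == 1:
            # Step 2.2: C(D) = C(D) + C*(D) * D^(n-m)
            T = C[:]
            shift = n - m
            if len(C) < len(C_star) + shift:
                C += [0] * (len(C_star) + shift - len(C))
            for i, c in enumerate(C_star):
                C[i + shift] ^= c
            if 2 * L <= n:
                L, C_star, m = n + 1 - L, T, n
    return C[:L + 1], L


# Two periods of the sequence defined by s_n = s_{n-1} + s_{n-3}
s = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0]
C, L = berlekamp_massey(s)
print(C, L)
```

For this input the sketch returns the polynomial $1 + D + D^3$ as `[1, 1, 0, 1]` with `L = 3`.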


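[Figure: flowchart of Massey's algorithm. Initialization: $N \leftarrow 0$, $e \leftarrow 1$, $d_0 \leftarrow 1$, $C_0(z) \leftarrow 1$, $L \leftarrow 0$, $C(z) \leftarrow 1$. Each pass computes $d \leftarrow s_N - \sum_{i=1}^{L} (-c_i)\, s_{N-i}$ and branches on "$d = 0$?" and "$N < 2L$?" (yes/no) before updating $C(z)$ and incrementing $N$.]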


Properties of Berlekamp-Massey algorithm

The running time of the Berlekamp-Massey algorithm for determining the linear complexity of a length $n$ sequence is $O(n^2)$ operations.

It delivers one connection polynomial and the length $L$ of the LFSR.

If (and only if) $L \le N/2$ there is a unique connection polynomial.

The proof is left out.


Properties of Berlekamp-Massey algorithm

We want to find the shortest LFSR generating $s$, a periodic sequence with period $T$.

The Berlekamp-Massey algorithm can provide the solution if the input is the length $2T$ sequence $(s_0, s_1, \ldots, s_{T-1}, s_0, s_1, \ldots, s_{T-1})$.

You only need to process the first $T + k$ symbols of the sequence, where $k$ is the first positive integer such that $(s_0, s_1, \ldots, s_{T-1}, s_0, s_1, \ldots, s_{k-1})$ has linear complexity $\le k$.


Linear complexity

Let $s^1 = s^1_0, s^1_1, s^1_2, \ldots$ and $s^2 = s^2_0, s^2_1, s^2_2, \ldots$ be two sequences, and let $f(x_1, x_2)$ be a function in two variables, $x_1, x_2 \in \mathbb{F}_q$.

By

$$s = f(s^1, s^2)$$

we mean the sequence $s = f(s^1_0, s^2_0), f(s^1_1, s^2_1), f(s^1_2, s^2_2), \ldots$.


Linear complexity

Theorem

Let $s^1$ and $s^2$ be two sequences with linear complexity $L(s^1)$ and $L(s^2)$ respectively. Then

If $f(x_1, x_2) = x_1 + x_2$ then $L(f(s^1, s^2)) \le L(s^1) + L(s^2)$.

If $f(x_1, x_2) = x_1 x_2$ then $L(f(s^1, s^2)) \le L(s^1) L(s^2)$.
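The two bounds can be checked numerically on a small case. The sketch below is an illustration only: the two toy LFSRs (recurrences $s_n = s_{n-1} + s_{n-3}$ and $s_n = s_{n-1} + s_{n-4}$ over $F_2$), the sequence length 210, and the helper names `lfsr_sequence` and `linear_complexity` are all assumptions made for the example. `linear_complexity` is a binary Berlekamp-Massey routine that returns only $L$.

```python
def lfsr_sequence(taps, init_state, n):
    """Extend init_state to n bits with s_i = XOR of s_{i-l} for l in taps."""
    s = list(init_state)
    for i in range(len(init_state), n):
        bit = 0
        for l in taps:
            bit ^= s[i - l]
        s.append(bit)
    return s[:n]


def linear_complexity(s):
    """Linear complexity of a binary sequence (Berlekamp-Massey over F_2)."""
    C, C_star, L, m = [1], [1], 0, -1
    for n in range(len(s)):
        d = s[n]
        for i in range(1, L + 1):
            d ^= C[i] & s[n - i]
        if d == 1:
            T, shift = C[:], n - m
            if len(C) < len(C_star) + shift:
                C += [0] * (len(C_star) + shift - len(C))
            for i, c in enumerate(C_star):
                C[i + shift] ^= c
            if 2 * L <= n:
                L, C_star, m = n + 1 - L, T, n
    return L


N = 210  # two full periods of the combined sequences (7 * 15 = 105)
s1 = lfsr_sequence([1, 3], [1, 0, 0], N)
s2 = lfsr_sequence([1, 4], [1, 0, 0, 0], N)

L1, L2 = linear_complexity(s1), linear_complexity(s2)
L_sum = linear_complexity([a ^ b for a, b in zip(s1, s2)])
L_prod = linear_complexity([a & b for a, b in zip(s1, s2)])

print(L1, L2, L_sum, L_prod)
```

The printed values of `L_sum` and `L_prod` should not exceed `L1 + L2` and `L1 * L2` respectively.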


Example

$$z = f(s^1, s^2, s^3, s^4),$$

where $s^i$, $i = 1, \ldots, 4$, are LFSR sequences with period $2^{L_i} - 1$ (m-sequences). Let

$$f(x_1, x_2, x_3, x_4) = x_1 + x_2 x_3 + x_3 x_4 + x_2 x_4.$$

Find a bound on the linear complexity of the keystream sequence, $L(z)$.

Solution: Using Theorem 2 we get

$$L(z) = L(f(s^1, s^2, s^3, s^4)) \le L(s^1) + L(s^2)L(s^3) + L(s^3)L(s^4) + L(s^2)L(s^4) \le L_1 + L_2 L_3 + L_3 L_4 + L_2 L_4.$$


Correlation attacks - idea

A common method to build a keystream generator is to combine several linear feedback shift registers to get a keystream with desired statistical properties.

Correlation attack: If one can detect a correlation between $z$ and the output of one individual LFSR, this can be used in a "divide-and-conquer" attack on the individual LFSR.


Project 3

The values of $L_i$ and $C_i(D)$ for the different LFSRs are

$L_1 = 13$, $C_1(D) = 1 + D + D^2 + D^4 + D^6 + D^7 + D^{10} + D^{11} + D^{13}$,

$L_2 = 15$, $C_2(D) = 1 + D^2 + D^4 + D^6 + D^7 + D^{10} + D^{11} + D^{13} + D^{15}$,

$L_3 = 17$, $C_3(D) = 1 + D^2 + D^4 + D^5 + D^8 + D^{10} + D^{13} + D^{16} + D^{17}$.

The secret key $K$ is the initial state of the three LFSRs: $K = (K_1, K_2, K_3)$, where $K_i$ is the initial state of the $i$th LFSR.


Project 3

Exercise 1:

Each group is given a keystream $z_1, z_2, \ldots, z_N$ of some length $N$. Find the key $K$ that was used to produce this keystream.

Exercise 2:

Assume that the attack takes $T$ seconds. How long would it take to attack by an exhaustive search over the entire keyspace?


Project 3 - the correlation attack

The attack assumes a correlation between the output of one of the shift registers and the keystream, i.e., $P(u_i = z_i) \neq 0.5$.


Project 3 - the correlation attack

Let $u^{(j)}_i$ be the output of the $j$th LFSR and assume that

$$P(u^{(j)}_i = z_i) = p, \quad \text{where } p \neq 0.5.$$

What is the exact value of $p$?

Guess that the initial state of the $j$th LFSR is

$$u_0 = (u^{(j)}_1, u^{(j)}_2, \ldots, u^{(j)}_{L_j}).$$

We can calculate an LFSR output sequence $u = (u_1, u_2, \ldots, u_N)$, where

$$u_i = u^{(j)}_i, \quad 0 < i \le L_j,$$

$$u_i = \sum_{l=1}^{L_j} c_l\, u_{i-l}, \quad L_j < i \le N.$$
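The recursion above can be sketched as a short Python function. This is an illustration under stated assumptions: the sequence is binary, $c_l$ is taken to be the coefficient of $D^l$ in $C_j(D)$, and the function name `lfsr_sequence`, the tap-list representation, and the example initial state are hypothetical. Python lists are 0-indexed, so `u[k]` holds $u_{k+1}$.

```python
def lfsr_sequence(taps, init_state, n):
    """Extend a guessed initial state to n output bits.

    taps       : the indices l with c_l = 1
    init_state : the guessed bits (u_1, ..., u_L)
    n          : total number of output bits N
    """
    u = list(init_state)
    for i in range(len(init_state), n):
        bit = 0
        for l in taps:
            bit ^= u[i - l]   # u_i = sum of c_l * u_{i-l} (mod 2)
        u.append(bit)
    return u[:n]


# Taps read off C_1(D) = 1 + D + D^2 + D^4 + D^6 + D^7 + D^10 + D^11 + D^13
TAPS_1 = [1, 2, 4, 6, 7, 10, 11, 13]
guess = [1] + [0] * 12            # an arbitrary example state, L_1 = 13
u = lfsr_sequence(TAPS_1, guess, 50)
print(u)
```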


Project 3 - the correlation attack

The Hamming distance between $x$ and $y$, $d_H(x, y)$, is defined to be the number of coordinates in which $x$ and $y$ differ.

Estimate the correlation $p$ with $p^*$, where

$$p^* = 1 - \frac{d_H(u, z)}{N}.$$

If the guessed initial state, $u_0$, is correct, we get $p^* \approx p$, otherwise $p^* \approx 0.5$.


Algorithm

1. For each possible initial state, calculate $p^*$;

2. Output the initial state for which $p^*$ is maximal.
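The two steps can be sketched on a toy instance. Everything specific here is an assumption for illustration, not the project setup: a single length-10 binary LFSR with recurrence $u_i = u_{i-3} + u_{i-10}$, a fixed example secret state, and a keystream modelled as the LFSR output with each bit flipped with probability 0.25 (so $p = 0.75$). The helper names are hypothetical.

```python
import random
from itertools import product


def lfsr_sequence(taps, init_state, n):
    """Extend init_state to n bits with u_i = XOR of u_{i-l} for l in taps."""
    u = list(init_state)
    for i in range(len(init_state), n):
        bit = 0
        for l in taps:
            bit ^= u[i - l]
        u.append(bit)
    return u[:n]


def estimate_p(u, z):
    """p* = 1 - d_H(u, z) / N."""
    d_h = sum(a != b for a, b in zip(u, z))
    return 1 - d_h / len(z)


def correlation_attack(taps, length, z):
    """Try every nonzero initial state and keep the one with maximal p*."""
    best_state, best_p = None, -1.0
    for state in product([0, 1], repeat=length):
        if not any(state):
            continue  # the all-zero state only produces the zero sequence
        p_star = estimate_p(lfsr_sequence(taps, state, len(z)), z)
        if p_star > best_p:
            best_state, best_p = state, p_star
    return best_state, best_p


# Toy keystream: LFSR output with 25% of the bits flipped (assumed noise model)
rng = random.Random(1)
TAPS, LENGTH, N = [3, 10], 10, 400
secret = (1, 0, 1, 1, 0, 0, 1, 0, 1, 1)
z = [bit ^ (rng.random() < 0.25) for bit in lfsr_sequence(TAPS, secret, N)]

recovered, p_star = correlation_attack(TAPS, LENGTH, z)
print(recovered, p_star)
```

The search costs about $2^{L_j}$ sequence generations for a register of length $L_j$, which is what makes attacking each register separately cheaper than searching all registers jointly.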


Linear distinguishing attacks

Introduce linear approximations of all nonlinear operations in a specific "path" of the cipher.

The path should connect some known values, i.e., keystream symbols.

If the linear approximation is true, this leads to a linear relationship among the known keystream symbols.

If it is not true, we can think of the error introduced by the linear approximation as truly random noise.

A linear combination of keystream symbols as above can be viewed as a sample from a very noisy (but not uniform) distribution. By collecting many such samples, we can eventually distinguish the distribution they are drawn from, from the uniform distribution.


Example

A long LFSR with connection polynomial $C(D) = 1 + D^{34} + D^{69}$ is generating a sequence $s = (s_0, s_1, s_2, \ldots)$ in $F_2$. The output of the generator is generated as

$$z_i = f(s_i, s_{i+1}, s_{i+2}, s_{i+3}), \quad i = 0, 1, \ldots,$$

where $f$ is the Boolean function $f(x_1, x_2, x_3, x_4) = x_1 + x_2 + x_3 + x_1 x_2 x_3 x_4$ (a filter generator).

Describe a linear distinguishing attack on the generator.


Example

The LFSR sequence obeys the recursion

$$s_i + s_{i+35} + s_{i+69} = 0, \quad i = 0, 1, \ldots.$$

Replace $f$ with the linear function $g(x_1, x_2, x_3, x_4) = x_1 + x_2 + x_3$. After replacement, we have

$$z_i = g(s_i, s_{i+1}, s_{i+2}, s_{i+3}) = s_i + s_{i+1} + s_{i+2}, \quad i = 0, 1, \ldots.$$

Now we try to find a set of dependent linear equations. As $s_i + s_{i+35} + s_{i+69} = 0$ we also have

$$s_i + s_{i+1} + s_{i+2} + s_{i+35} + s_{i+36} + s_{i+37} + s_{i+69} + s_{i+70} + s_{i+71} = 0, \quad i = 0, 1, \ldots.$$


Example (cont.)

But $z_i = s_i + s_{i+1} + s_{i+2}$, so we must have $z_i + z_{i+35} + z_{i+69} = 0$ (assuming that $g$ always gives the result of $f$).

So in our attack we create a sequence of sample values $Q_i = z_i + z_{i+35} + z_{i+69}$, $i = 0, 1, \ldots$. Calculating $P(Q_i)$ gives $P(Q_i = 0) = 0.835$.

The number of zeros in $Q$ has a binomial distribution with probability 0.835, whereas for a random $Q$ the probability is 0.5. Apply hypothesis testing!
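The value 0.835 and the test itself can be sketched as follows. This is an illustration under stated assumptions: the error term $x_1 x_2 x_3 x_4$ equals 1 with probability $1/16$, and the three error terms in $Q_i$ are treated as independent (an idealization, so a simulated fraction may differ slightly). The sample size, the random seed, the midpoint threshold and all function names are choices made for the example.

```python
import random


def lfsr_bits(n, rng):
    """n bits obeying s_{i+69} = s_i + s_{i+35}, from a random 69-bit state."""
    s = [rng.randint(0, 1) for _ in range(69)]
    for i in range(n - 69):
        s.append(s[i] ^ s[i + 35])
    return s


def keystream(s):
    """z_i = f(s_i, ..., s_{i+3}) with f = x1 + x2 + x3 + x1*x2*x3*x4."""
    return [s[i] ^ s[i + 1] ^ s[i + 2] ^ (s[i] & s[i + 1] & s[i + 2] & s[i + 3])
            for i in range(len(s) - 3)]


def zero_fraction(z):
    """Fraction of samples Q_i = z_i + z_{i+35} + z_{i+69} that equal 0."""
    q = [z[i] ^ z[i + 35] ^ z[i + 69] for i in range(len(z) - 69)]
    return q.count(0) / len(q)


# Each error term is 1 with probability 1/16; Q_i is the XOR of three of them.
# Treating them as independent gives P(Q_i = 0) = 1/2 + (1/2) * (1 - 2/16)^3.
p_zero = 0.5 + 0.5 * (1 - 2 / 16) ** 3


def looks_like_generator(z):
    """Simple hypothesis test: compare the zero fraction to a midpoint threshold."""
    threshold = (0.5 + p_zero) / 2
    return zero_fraction(z) > threshold


rng = random.Random(2024)
z_cipher = keystream(lfsr_bits(20000, rng))
z_random = [rng.randint(0, 1) for _ in range(20000)]

print(round(p_zero, 3))
print(zero_fraction(z_cipher), zero_fraction(z_random))
print(looks_like_generator(z_cipher), looks_like_generator(z_random))
```

With the two hypotheses this far apart (about 0.835 against 0.5), far fewer samples than used here would already separate them; the large sample only makes the demonstration stable.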
