
Two-Stage Optimisation in the Design of Boolean Functions

John A Clark and Jeremy L Jacob
Dept. of Computer Science
University of York, UK

[email protected]
[email protected]

Overview

Optimisation: general introduction, hill-climbing, simulated annealing.
Boolean function design (reprise).
Experimental approach and results.
Conclusions and future work.

Optimisation

A subject of huge practical importance. An optimisation problem may be stated as follows:

Given a domain D and a function z : D -> R, find the value x in D that maximises z, i.e. find x such that

    z(x) = sup { z(y) : y in D }

Optimisation

Traditional optimisation techniques include:

calculus (e.g. solve differential equations for extrema), for example:
    f(x) = -3x^2 + 6x; solving f'(x) = -6x + 6 = 0 gives x = 1, with maximum f(1) = 3

hill-climbing: inspired by the calculus notion of gradient ascent etc.

(quasi-)enumerative: brute force (a crypto favourite), linear programming, branch and bound, dynamic programming
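The calculus example above can be checked directly in code. A minimal sketch; the function names here are illustrative, not from the slides:

```python
# Numeric check of the worked example: f(x) = -3x^2 + 6x has
# derivative f'(x) = -6x + 6, so its maximum is f(1) = 3.

def f(x):
    return -3 * x ** 2 + 6 * x

def f_prime(x):
    return -6 * x + 6

assert f_prime(1) == 0                    # x = 1 is the stationary point
assert f(1) == 3                          # the maximum value is 3
assert f(0.9) < f(1) and f(1.1) < f(1)    # and it is a maximum, not a minimum
```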

Optimisation Problems

Traditional techniques are not without their problems:

Assumptions may simply not hold, e.g. for non-differentiable or discontinuous functions, or non-linear functions.

The problem may suffer from the 'curse (joy?) of dimensionality' - it is simply too big to handle exactly (e.g. by brute force or dynamic programming); NP-hard problems.

Some techniques tend to get stuck in local optima on non-linear problems (see later).

These various difficulties have led researchers to investigate heuristic techniques, typically inspired by natural processes, that usually give good solutions to optimisation problems (but forgo guarantees).

Heuristic Optimisation

A variety of techniques have been developed to deal with non-linear and discontinuous problems.

The highest-profile is probably genetic algorithms: they work with a population of solutions and breed new solutions by aping the processes of natural reproduction and Darwinian survival of the fittest. GAs have proven very robust across a huge range of problems and can be very efficient.

Simulated annealing is a local search technique based on the cooling processes of molten metals; it is the technique used in this paper.

We will illustrate the problems caused by non-linearity and then describe simulated annealing.

Local Optimisation - Hill Climbing

Let the current solution be x. Define the neighbourhood N(x) to be the set of solutions that are 'close' to x.

If possible, move to a neighbouring solution that improves the value of z(x); otherwise stop.

Choose any y in N(x) as the next solution provided z(y) >= z(x): loose hill-climbing.

Choose y in N(x) such that z(y) = sup { z(v) : v in N(x) }: steepest gradient ascent.
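The steepest-ascent variant above can be sketched as follows. This is an illustrative implementation, not code from the paper; the example objective and neighbourhood are our own:

```python
def hill_climb(z, x, neighbours, max_steps=10_000):
    """Steepest gradient ascent: move to the best neighbour until no
    neighbour improves z(x), i.e. until a local optimum is reached."""
    for _ in range(max_steps):
        best = max(neighbours(x), key=z)
        if z(best) <= z(x):
            return x            # no improving neighbour: stop
        x = best
    return x

# Example: maximise z(x) = -(x - 7)^2 over the integers with the
# neighbourhood N(x) = {x - 1, x + 1}.
z = lambda x: -(x - 7) ** 2
print(hill_climb(z, 0, lambda x: [x - 1, x + 1]))   # prints: 7
```

On this unimodal example the climb reaches the global optimum; the next slide shows how it gets stuck when the landscape has several peaks.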

Local Optimisation - Hill Climbing

[Figure: plot of z(x). The neighbourhood of a point x might be N(x) = {x+1, x-1}. The hill-climb goes x0, x1, x2 since z(x0) < z(x1) < z(x2) > z(x3), and gets stuck at the local optimum x2. Really we want to obtain the global optimum xopt.]

Simulated Annealing

[Figure: plot of z(x) with a search trajectory x0 ... x13. Simulated annealing allows non-improving moves, so it is possible to go down in order to rise again and reach the global optimum.]

Simulated Annealing

Allows non-improving moves to be taken in the hope of escaping from a local optimum.

The previous slide gives the idea. In practice the neighbourhood may be very large, and a candidate neighbour is typically selected at random.

It is quite possible to accept a worsening move when an improving move exists.

Simulated Annealing

Improving moves are always accepted.

Non-improving moves may be accepted probabilistically, in a manner depending on the temperature parameter Temp. Loosely:
    the worse the move, the less likely it is to be accepted;
    a worsening move is less likely to be accepted the cooler the temperature.

The temperature Temp starts high and is gradually cooled as the search progresses.

Initially virtually anything is accepted; at the end only improving moves are allowed (the search effectively reduces to hill-climbing).

Simulated Annealing

The annealing loop, for current candidate x and cost function f:

    current x := x(0)
    Temp := Temp(0)
    until frozen do
        do 400 times:
            y := generateNeighbour(x)
            delta := f(y) - f(x)
            if delta <= 0 then current x := y                        (accept)
            else if U(0,1) < exp(-delta / Temp) then current x := y  (accept)
            else reject
        Temp := 0.95 * Temp
    the solution is the best x seen so far

At each temperature consider 400 moves.

Always accept improving moves.

Accept worsening moves probabilistically: it gets harder to do this the worse the move, and harder still as Temp decreases through the temperature cycle.
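The temperature cycle can be sketched as follows. This is an illustrative minimising implementation, not the authors' code: the 400 moves per temperature and the 0.95 cooling factor come from the slides, while the initial temperature, number of cooling steps, example cost, and all names are assumptions:

```python
import math
import random

def anneal(cost, x0, neighbour, t0=10.0, alpha=0.95,
           moves_per_temp=400, n_temps=100):
    """Simulated annealing minimising `cost`: 400 candidate moves per
    temperature, geometric cooling Temp <- 0.95 * Temp."""
    x, best = x0, x0
    temp = t0
    for _ in range(n_temps):                  # "until frozen"
        for _ in range(moves_per_temp):
            y = neighbour(x)
            delta = cost(y) - cost(x)
            # improving moves always accepted; worsening moves accepted
            # with probability exp(-delta/Temp), which shrinks as the
            # move gets worse and as the temperature falls
            if delta <= 0 or random.random() < math.exp(-delta / temp):
                x = y
                if cost(x) < cost(best):
                    best = x                  # track the best seen so far
        temp *= alpha
    return best

# Example: minimise a simple one-dimensional cost over the reals.
best = anneal(lambda x: (x - 7) ** 2, 0.0,
              lambda x: x + random.uniform(-1, 1))
```

Returning the best candidate seen, rather than the final one, matches the "solution is best so far" line of the algorithm.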

Crypto and Heuristic Optimisation

Most work has been on cryptanalysis: attacking a variety of simple ciphers - simple substitution and transposition through to poly-alphabetic ciphers etc.

More recent work has attacked NP-hard problems.

But perhaps the most successful work has been in the design of cryptographic elements.

Most work is rather direct in its application.

Boolean Function Design

A Boolean function f : {0,1}^n -> {0,1}.

For present purposes we shall use the polar representation f̂(x) = (-1)^f(x):

    x      f(x)   f̂(x)
    000     1      -1
    001     0       1
    010     0       1
    011     0       1
    100     1      -1
    101     0       1
    110     1      -1
    111     1      -1

We will talk only about balanced functions, where there are equal numbers of 1s and -1s.

Preliminary Definitions

Definitions relating to a Boolean function f of n variables.

Linear function:
    L_ω(x) = ω1·x1 ⊕ … ⊕ ωn·xn
    L̂_ω(x) = (-1)^(L_ω(x))    (polar form)

Walsh-Hadamard transform:
    F̂(ω) = Σ_{x=0}^{2^n - 1} f̂(x) · L̂_ω(x)

Preliminary Definitions

Non-linearity:
    N_f = 2^(n-1) - (1/2) · max_ω | F̂(ω) |

Auto-correlation:
    AC_f = max_{s ≠ 0} | Σ_x f̂(x) · f̂(x ⊕ s) |

For present purposes we need simply note that these are easily evaluated given a function f. They can therefore be used as the functions to be optimised, and traditionally they are.
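Both measures are indeed easy to evaluate from the polar truth table. A sketch (function names are our own), using the fast Walsh-Hadamard transform and the balanced 3-variable example function from these slides:

```python
def walsh_hadamard(fhat):
    """Fast Walsh-Hadamard transform of a polar truth table
    (length a power of two).  Returns [F(w) for all w]."""
    F = list(fhat)
    h = 1
    while h < len(F):
        for i in range(0, len(F), 2 * h):
            for j in range(i, i + h):
                F[j], F[j + h] = F[j] + F[j + h], F[j] - F[j + h]
        h *= 2
    return F

def nonlinearity(fhat):
    """N_f = 2^(n-1) - (1/2) max_w |F(w)|."""
    n = len(fhat).bit_length() - 1          # table has 2^n entries
    return 2 ** (n - 1) - max(abs(v) for v in walsh_hadamard(fhat)) // 2

def autocorrelation(fhat):
    """AC_f = max over s != 0 of |sum_x fhat(x) * fhat(x XOR s)|."""
    m = len(fhat)
    return max(abs(sum(fhat[x] * fhat[x ^ s] for x in range(m)))
               for s in range(1, m))

fhat = [-1, 1, 1, 1, -1, 1, -1, -1]         # balanced example function
print(nonlinearity(fhat), autocorrelation(fhat))   # prints: 2 8
```

Balance shows up as F̂(0) = 0, and Parseval's theorem (next slide) can be checked on the transform output.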

Using Parseval's Theorem

Parseval's theorem:
    Σ_{ω=0}^{2^n - 1} F̂(ω)² = 2^(2n)

Loosely: push down on F̂(ω)² for some particular ω and it must appear elsewhere.

This suggests that arranging for uniform values of F̂(ω)² - i.e. |F̂(ω)| = 2^(n/2) for all ω - will lead to good non-linearity. This is the initial motivation for our new cost function:

    cost(f) = Σ_{ω=0}^{2^n - 1} ( |F̂(ω)| - 2^(n/2) )²

NEW FUNCTION!

Moves Preserving Balance

Start with a balanced (but otherwise random) solution. The move strategy preserves balance.

Define the neighbourhood of a particular function f to be the set of all functions obtained by exchanging (flipping) any two dissimilar values.

Here we have swapped f̂(2) and f̂(4):

    x      f̂(x)   ĝ(x)
    000    -1      -1
    001     1       1
    010     1      -1
    011     1       1
    100    -1       1
    101     1       1
    110    -1      -1
    111    -1      -1
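The balance-preserving move can be sketched as follows (an illustrative implementation; the function name is our own):

```python
import random

def balanced_move(fhat):
    """One neighbourhood move: swap a randomly chosen +1 entry with a
    randomly chosen -1 entry of the polar truth table, so a balanced
    function stays balanced."""
    g = list(fhat)
    i = random.choice([k for k, v in enumerate(g) if v == 1])
    j = random.choice([k for k, v in enumerate(g) if v == -1])
    g[i], g[j] = g[j], g[i]
    return g

fhat = [-1, 1, 1, 1, -1, 1, -1, -1]     # the balanced example function
g = balanced_move(fhat)
assert sum(g) == 0                                  # still balanced
assert sum(a != b for a, b in zip(fhat, g)) == 2    # exactly two entries flipped
```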

Getting in the Right Area

Previous work (QUT) has shown strongly that heuristic techniques can be very effective for cryptographic design synthesis: Boolean function design, S-box design etc.

Hill-climbing works far better than random search. Combining heuristic search and hill-climbing generally gives the best results.

Aside - the notion applies more generally too, and has led to the development of memetic algorithms in GA work. GAs are known to be robust but not suited to 'fine tuning'.

We will adopt this strategy too: use simulated annealing to get in the 'right area', then hill-climb. But we will adopt the new cost function for the first stage.

Hill-climbing With Traditional CF (n=8)

Rows: autocorrelation; columns: non-linearity.

           106   108   110   112   114   116
    104      0     1     0     0     0     0
     96      0     0     0     0     0     0
     88      0     2     1     0     0     0
     80      0     5     7     2     0     0
     72      1    19    31     6     0     0

Varying the Technique (n=8)

Rows: autocorrelation; columns: non-linearity.

Simulated annealing with traditional CF:

           108   110   112   114   116
     80      0     0     0     0     0
     72      0     0    10     0     0
     64      0     0    59     0     0
     56      0     0   186     0     0
     48      0     0   140     1     0
     40      0     0     2     0     0
     32      0     0     0     0     0
     24      0     0     0     0     0

Simulated annealing with new CF:

           108   110   112   114   116
     80      0     0     0     0     0
     72      0     0     0     0     0
     64      0     0     0     0     0
     56      0     1     0     0     0
     48      2     7    35     5     0
     40      4    39   158    27     0
     32      4    26    79    11     0
     24      0     0     2     0     0

Simulated annealing with new CF + hill-climbing with traditional CF:

           108   110   112   114   116
     80      0     0     0     0     0
     72      0     0     0     0     0
     64      0     0     0     0     0
     56      0     0     1     7     1
     48      0     0    14    56     2
     40      0     0    27   176    18
     32      0     0    23    64    11
     24      0     0     0     0     0

Tuning the Technique

Experience has shown that experimentation is par for the course with optimisation.

The initial cost function was motivated by theory, but the real issue is how the cost function and the search technique interact.

We have generalised the initial cost function to give a parametrised family of new cost functions:

    cost(f) = Σ_ω | |F̂(ω)| - (2^(n/2) + K) |^R
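The parametrised family can be sketched as follows. This is our own illustrative code: the slides vary K, and the default exponent R chosen here is an assumption:

```python
def walsh_hadamard(fhat):
    """Fast Walsh-Hadamard transform of a polar truth table."""
    F = list(fhat)
    h = 1
    while h < len(F):
        for i in range(0, len(F), 2 * h):
            for j in range(i, i + h):
                F[j], F[j + h] = F[j] + F[j + h], F[j] - F[j + h]
        h *= 2
    return F

def cost(fhat, K=0.0, R=3.0):
    """cost(f) = sum over w of | |F(w)| - (2^(n/2) + K) |^R.
    K = 0 penalises deviation of |F(w)| from the flat value 2^(n/2)."""
    n = len(fhat).bit_length() - 1
    target = 2 ** (n / 2) + K
    return sum(abs(abs(Fw) - target) ** R for Fw in walsh_hadamard(fhat))

fhat = [-1, 1, 1, 1, -1, 1, -1, -1]   # balanced 3-variable example
print(cost(fhat, K=0, R=2))
```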

Tuning the Technique (n=8)

Illustration of how results change as K is varied (400 runs; rows: autocorrelation, columns: non-linearity).

K = 4:
           112   114   116
     56      0     0     0
     48      2     5    13
     40      9    68    80
     32     29    74   115
     24      1     1     3
     16      0     0     0

K = -12:
           112   114   116
     56      0     0     0
     48      0     1     3
     40     11    16     8
     32     41    42    24
     24    201     6     5
     16     42     0     0

Tuning the Technique (n=8)

Further illustration of how results change as K is varied (100 runs; rows: autocorrelation, columns: non-linearity).

K = -14:
           112   114   116
     48      0     0     0
     40      3     2     2
     32     19    10     6
     24     45     2     0
     16     11     0     0

K = -12:
           112   114   116
     48      0     1     0
     40      3     2     3
     32     15     8     2
     24     53     2     0
     16     11     0     0

K = -10:
           112   114   116
     48      0     0     1
     40      2     5     2
     32     15     9     3
     24     51     2     0
     16     10     0     0

K = -8:
           112   114   116
     48      0     0     0
     40      2     2     1
     32     11    12     6
     24     44     3     1
     16     18     0     0

K = -6:
           112   114   116
     48      0     0     1
     40      0     5    13
     32      2    27    42
     24      0     0    10
     16      0     0     0

K = -4:
           112   114   116
     48      0     3     2
     40      3    19    20
     32      5    17    27
     24      1     1     2
     16      0     0     0

K = -2:
           112   114   116
     48      0     8     5
     40      6    32    15
     32      3    16    15
     24      0     0     0
     16      0     0     0

K = 0:
           112   114   116
     48      5    12     1
     40     12    43     1
     32      6    19     1
     24      0     0     0
     16      0     0     0

Comparison of Results

Best non-linearity by method and number of variables n:

    Method                      4    5    6    7    8    9   10   11    12
    Least Upper Bound           4   12   26   56  118  244  494 1000  2014
    Best Known Example          4   12   26   56  116  240  492  992  2010
    Bent Concatenation          4   12   26   56  116  240  480  992  1984
    Genetic Algorithms          4   12   26   56  116  236  484  980  1976
    Our Simulated Annealing     4   12   26   56  116  238  484  984  1990

Summary and Conclusions

We have shown that local search can be used effectively for a cryptographic non-linear optimisation problem: Boolean function design.

'Direct' cost functions are not necessarily best. A cost function is a means to an end: whatever works will do.

Cost function efficacy depends on the problem, the problem parameters, and the search technique used.

You can take short cuts with annealing parameters (and computationally there may be little choice).

Experimentation is highly beneficial. Should we look to engaging theory more?

Future Work

Opportunities for expansion:

    detailed variation of parameters
    use of more efficient annealing processes (e.g. thermostatistical annealing)
    evolution of artefacts with hidden properties (you do not need to be honest - e.g. develop S-boxes with hidden trapdoors)
    experiments with different cost function families, multiple criteria etc.
    evolving sets of Boolean functions
    other local techniques (e.g. tabu search, TS); more generally, when do GAs, SA and TS work best?
    investigation of non-balanced functions