Two-Stage Optimisation in the Design of Boolean Functions
John A Clark and Jeremy L Jacob
Dept. of Computer Science, University of York
[email protected]
Overview
Optimisation: general introduction, hill-climbing, simulated annealing.
Boolean function design (reprise).
Experimental approach and results.
Conclusions and future work.
Optimisation
Subject of huge practical importance. An optimisation problem may be stated as follows:
Given a domain D and a function z: D → ℝ, find x in D such that
z(x) = sup{z(y) : y in D}
Optimisation
Traditional optimisation techniques include:
calculus (e.g. solve differential equations for extrema):
f(x) = −3x² + 6x; solve f′(x) = −6x + 6 = 0 to obtain x = 1 with maximum f(1) = 3
hill-climbing: inspired by the calculus notion of gradient ascent, etc.
(quasi-)enumerative: brute force (a crypto favourite), linear programming, branch and bound, dynamic programming
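The calculus example above can be sanity-checked numerically; a throwaway grid search, purely illustrative:

```python
# f(x) = -3x^2 + 6x has f'(x) = -6x + 6, which vanishes at x = 1,
# giving the maximum f(1) = -3 + 6 = 3.
def f(x):
    return -3 * x ** 2 + 6 * x

# A coarse grid search over [-2, 4] agrees with the analytic answer.
xs = [i / 1000 for i in range(-2000, 4001)]
best_x = max(xs, key=f)
print(best_x, f(best_x))  # 1.0 3.0
```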
Optimisation Problems
Traditional techniques are not without their problems: assumptions may simply not hold,
e.g. for non-differentiable, discontinuous, or non-linear functions.
A problem may suffer from the 'curse (joy?) of dimensionality': the problem is simply too big to handle exactly (e.g. by brute force or dynamic programming). NP-hard problems.
Some techniques tend to get stuck in local optima on non-linear problems (see later).
These various difficulties have led researchers to investigate heuristic techniques, typically inspired by natural processes, that usually give good solutions to optimisation problems (but forgo guarantees).
Heuristic Optimisation
A variety of techniques have been developed to deal with non-linear and discontinuous problems.
The highest-profile is probably genetic algorithms: works with a population of solutions and breeds new solutions by aping the processes of natural reproduction and Darwinian survival of the fittest. Proven very robust across a huge range of problems, and can be very efficient.
Simulated annealing: a local search technique based on the cooling processes of molten metals (used in this paper).
We will illustrate the problems caused by non-linearity and then describe simulated annealing.
Local Optimisation - Hill Climbing
Let the current solution be x. Define the neighbourhood N(x) to be the set of solutions that are 'close' to x.
If possible, move to a neighbouring solution that improves the value of z; otherwise stop.
Loose hill-climbing: choose any y in N(x) as the next solution provided z(y) ≥ z(x).
Steepest gradient ascent: choose y in N(x) such that z(y) = sup{z(v) : v in N(x)}.
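The two variants can be sketched as follows; a minimal sketch over a toy integer objective (the names `hill_climb`, `z`, `neighbours` are illustrative, not from the paper). Strict improvement is used so the loop terminates on plateaus:

```python
def hill_climb(x, z, neighbours, steepest=False):
    """Maximise z by local search; stops at a (possibly local) optimum."""
    while True:
        better = [y for y in neighbours(x) if z(y) > z(x)]
        if not better:
            return x                 # no strictly improving neighbour
        # Steepest gradient ascent takes the best improving neighbour;
        # loose hill-climbing takes any improving one.
        x = max(better, key=z) if steepest else better[0]

# Toy objective with a single peak at x = 7 and N(x) = {x - 1, x + 1}.
z = lambda x: -(x - 7) ** 2
print(hill_climb(0, z, lambda x: [x - 1, x + 1]))  # 7
```

On this single-peaked objective both variants reach the optimum; on the multi-peaked landscape of the next slide either can get stuck.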
Local Optimisation - Hill Climbing
[Figure: plot of z(x). With neighbourhood N(x) = {x + 1, x − 1}, the hill-climb goes x0 → x1 → x2, since z(x0) < z(x1) < z(x2) > z(x3), and gets stuck at the local optimum x2. We really want to obtain the global optimum xopt.]
Simulated Annealing
[Figure: the same landscape, with a simulated annealing trajectory x0, x1, …, x13. Allowing non-improving moves makes it possible to go down in order to rise again and reach the global optimum.]
Simulated Annealing
Allows non-improving moves to be taken in the hope of escaping from local optima.
The previous slide gives the idea. In practice the neighbourhood may be very large, and a candidate neighbour is typically selected at random.
It is quite possible to accept a worsening move when an improving move exists.
Simulated Annealing
Improving moves are always accepted. Non-improving moves may be accepted probabilistically, in a manner depending on the temperature parameter Temp. Loosely:
the worse the move, the less likely it is to be accepted;
a worsening move is less likely to be accepted the cooler the temperature.
The temperature starts high and is gradually cooled as the search progresses.
Initially virtually anything is accepted; at the end only improving moves are allowed (and the search effectively reduces to hill-climbing).
Simulated Annealing
Current candidate x; the best solution seen so far is recorded throughout.

Temp = Temp₀
until frozen do
    do 400 times
        y = generateNeighbour(x)
        Δ = f(y) − f(x)
        if Δ < 0 then current x = y
        else if U(0,1) < exp(−Δ / Temp) then current x = y
        else reject
    Temp = 0.95 × Temp

At each temperature consider 400 moves.
Always accept improving moves.
Accept worsening moves probabilistically: it gets harder to do this the worse the move, and harder as Temp decreases.
Temperature cycle: Temp is repeatedly multiplied by 0.95.
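The annealing loop above can be sketched directly; a minimal sketch minimising a cost function, where `temp0` and `n_temps` (the 'frozen' criterion) are illustrative choices, not taken from the paper:

```python
import math
import random

def anneal(x0, cost, neighbour, temp0=10.0, alpha=0.95,
           moves_per_temp=400, n_temps=100):
    """Simulated annealing, minimising `cost`."""
    x = best = x0
    temp = temp0
    for _ in range(n_temps):                # temperature cycle
        for _ in range(moves_per_temp):     # 400 moves per temperature
            y = neighbour(x)
            delta = cost(y) - cost(x)
            # Improving moves are always accepted; worsening moves
            # are accepted with probability exp(-delta / temp).
            if delta < 0 or random.random() < math.exp(-delta / temp):
                x = y
                if cost(x) < cost(best):
                    best = x                # record best-so-far solution
        temp *= alpha                       # geometric cooling (0.95 factor)
    return best

# Toy usage: minimise a bumpy integer function with many local optima.
bumpy = lambda x: (x % 7) + abs(x - 50) / 10
result = anneal(0, bumpy, lambda x: x + random.choice([-1, 1]))
```

Returning the best-so-far solution rather than the final candidate guards against the search wandering away late in the run.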
Crypto and Heuristic Optimisation
Most work has been on cryptanalysis: attacking a variety of simple ciphers, from simple substitution and transposition through poly-alphabetic ciphers, etc.
More recent work attacks NP-hard problems.
But perhaps the most successful work has been in the design of cryptographic elements.
Most work is rather direct in its application.
Boolean Function Design
A Boolean function f: {0,1}ⁿ → {0,1}
For present purposes we shall use the polar representation
x      f(x)   f̂(x)
000     1     −1
001     0      1
010     0      1
011     0      1
100     1     −1
101     0      1
110     1     −1
111     1     −1
Will talk only about balanced functions where there are equal numbers of 1s and -1s.
Preliminary Definitions
Definitions relating to a Boolean function f of n variables
Walsh-Hadamard transform:
F̂(ω) = Σ_{x=0}^{2ⁿ−1} f̂(x) L̂_ω(x)

Linear function: L_ω(x) = ω₁x₁ ⊕ … ⊕ ωₙxₙ
Polar form: L̂_ω(x) = (−1)^{L_ω(x)}
Preliminary Definitions
Non-linearity: N_f = ½ (2ⁿ − max_ω |F̂(ω)|)
Auto-correlation: AC_f = max_{s≠0} | Σ_x f̂(x) f̂(x ⊕ s) |
For present purposes we need simply note that these can be easily evaluated given a function f. They can therefore be used as the functions to be optimised; traditionally they are.
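Both measures are straightforward to evaluate from the polar truth table; a minimal sketch using a naive O(4ⁿ) Walsh sum (a fast Walsh-Hadamard transform would do better, but this suffices for small n):

```python
def walsh(f_polar):
    """Walsh-Hadamard values F(w) = sum_x f(x) * (-1)^(w.x), polar form."""
    size = len(f_polar)
    return [sum(fx * (-1) ** bin(w & x).count("1")
                for x, fx in enumerate(f_polar))
            for w in range(size)]

def nonlinearity(f_polar):
    """N_f = (2^n - max_w |F(w)|) / 2."""
    return (len(f_polar) - max(abs(v) for v in walsh(f_polar))) // 2

def autocorrelation(f_polar):
    """AC_f = max over nonzero s of |sum_x f(x) f(x xor s)|."""
    size = len(f_polar)
    return max(abs(sum(f_polar[x] * f_polar[x ^ s] for x in range(size)))
               for s in range(1, size))

# The balanced example function from the truth-table slide.
f = [-1, 1, 1, 1, -1, 1, -1, -1]
print(nonlinearity(f), autocorrelation(f))  # 2 8
```

Note that F̂(0) = 0 for any balanced function, since it is just the sum of the polar values.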
Using Parseval’s Theorem
Parseval's Theorem: Σ_{ω=0}^{2ⁿ−1} F̂(ω)² = 2²ⁿ
Loosely, push down on F̂(ω)² for some particular ω and it pops up elsewhere.
This suggests that arranging for uniform values of F̂(ω)² will lead to good non-linearity. This is the initial motivation for our new cost function:

cost(f) = Σ_{ω=0}^{2ⁿ−1} ( |F̂(ω)| − 2^{n/2} )²

NEW FUNCTION!
Moves Preserving Balance
Start with a balanced (but otherwise random) solution. The move strategy preserves balance.
Define the neighbourhood of a particular function f to be the set of all functions obtained by exchanging (flipping) any two dissimilar values.
Here we have swapped f(2) and f(4)
x      f̂(x)   ĝ(x)
000     −1     −1
001      1      1
010      1     −1
011      1      1
100     −1      1
101      1      1
110     −1     −1
111     −1     −1
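A balance-preserving move of this kind might be sketched as follows (the function name `neighbour` is illustrative):

```python
import random

def neighbour(f_polar):
    """Return a neighbour of f by exchanging one randomly chosen pair
    of dissimilar values (a +1 and a -1), which preserves balance."""
    ones = [i for i, v in enumerate(f_polar) if v == 1]
    minus = [i for i, v in enumerate(f_polar) if v == -1]
    g = list(f_polar)
    i, j = random.choice(ones), random.choice(minus)
    g[i], g[j] = g[j], g[i]       # swap the two dissimilar values
    return g

# The slide's example, where f(2) and f(4) were swapped, is one such move.
f = [-1, 1, 1, 1, -1, 1, -1, -1]
g = neighbour(f)
```

Every neighbour differs from f in exactly two positions and has the same number of +1s and −1s.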
Getting in the Right Area
Previous work (at QUT) has strongly shown:
Heuristic techniques can be very effective for cryptographic design synthesis: Boolean function and S-box design, etc.
Hill-climbing works far better than random search.
Combining heuristic search and hill-climbing generally gives the best results.
Aside: the notion applies more generally too, and has led to the development of memetic algorithms in GA work. GAs are known to be robust but not suited to 'fine tuning'.
We will adopt this strategy too: use simulated annealing to get in the 'right area', then hill-climb.
But we will adopt the new cost function for the first stage.
Hill-climbing With Traditional CF (n=8)
(rows: autocorrelation; columns: non-linearity; entries: number of functions)

        106  108  110  112  114  116
104:      0    1    0    0    0    0
96:       0    0    0    0    0    0
88:       0    2    1    0    0    0
80:       0    5    7    2    0    0
72:       1   19   31    6    0    0
Varying the Technique (n=8)
(rows: autocorrelation; columns: non-linearity)

Simulated Annealing with Traditional CF:
        108  110  112  114  116
80:       0    0    0    0    0
72:       0    0   10    0    0
64:       0    0   59    0    0
56:       0    0  186    0    0
48:       0    0  140    1    0
40:       0    0    2    0    0
32:       0    0    0    0    0
24:       0    0    0    0    0

Simulated Annealing with New CF:
        108  110  112  114  116
80:       0    0    0    0    0
72:       0    0    0    0    0
64:       0    0    0    0    0
56:       0    1    0    0    0
48:       2    7   35    5    0
40:       4   39  158   27    0
32:       4   26   79   11    0
24:       0    0    2    0    0

Simulated Annealing with New CF + Hill-Climbing with Traditional CF:
        108  110  112  114  116
80:       0    0    0    0    0
72:       0    0    0    0    0
64:       0    0    0    0    0
56:       0    0    1    7    1
48:       0    0   14   56    2
40:       0    0   27  176   18
32:       0    0   23   64   11
24:       0    0    0    0    0
Tuning the Technique
Experience has shown that experimentation is par for the course with optimisation.
The initial cost function was motivated by theory, but the real issue is how the cost function and search technique interact.
We have generalised the initial cost function to give a parametrised family of new cost functions:

cost(f) = Σ_ω | |F̂(ω)| − (2^{n/2} + K) |^R
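A sketch of this parametrised family in code, with K and R as the tuning parameters (the claim that K = 0, R = 2 recovers the earlier cost function follows the reading of that slide; the Walsh sum is recomputed naively as before):

```python
def cost(f_polar, k=0.0, r=2.0):
    """Parametrised cost: sum_w | |F(w)| - (2^(n/2) + K) |^R.
    k = 0, r = 2 gives back the initial spectrum-flattening cost."""
    size = len(f_polar)                     # size = 2^n
    target = size ** 0.5 + k                # 2^(n/2) + K
    spectrum = [sum(fx * (-1) ** bin(w & x).count("1")
                    for x, fx in enumerate(f_polar))
                for w in range(size)]
    return sum(abs(abs(v) - target) ** r for v in spectrum)

# The example function from the truth-table slide.
f = [-1, 1, 1, 1, -1, 1, -1, -1]
```

Varying k shifts the spectrum value the search is pushed towards; varying r changes how heavily large deviations are penalised.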
Tuning the Technique (n=8)
(rows: autocorrelation; columns: non-linearity)

K = 4:
        112  114  116
56:       0    0    0
48:       2    5   13
40:       9   68   80
32:      29   74  115
24:       1    1    3
16:       0    0    0

K = −12:
        112  114  116
56:       0    0    0
48:       0    1    3
40:      11   16    8
32:      41   42   24
24:     201    6    5
16:      42    0    0

Illustration of how results change as K is varied (400 runs).
Tuning the Technique (n=8)
(rows: autocorrelation; columns: non-linearity)

K = −14:
        112  114  116
48:       0    0    0
40:       3    2    2
32:      19   10    6
24:      45    2    0
16:      11    0    0

K = −12:
        112  114  116
48:       0    1    0
40:       3    2    3
32:      15    8    2
24:      53    2    0
16:      11    0    0

K = −10:
        112  114  116
48:       0    0    1
40:       2    5    2
32:      15    9    3
24:      51    2    0
16:      10    0    0

K = −8:
        112  114  116
48:       0    0    0
40:       2    2    1
32:      11   12    6
24:      44    3    1
16:      18    0    0

K = −6:
        112  114  116
48:       0    0    1
40:       0    5   13
32:       2   27   42
24:       0    0   10
16:       0    0    0

K = −4:
        112  114  116
48:       0    3    2
40:       3   19   20
32:       5   17   27
24:       1    1    2
16:       0    0    0

K = −2:
        112  114  116
48:       0    8    5
40:       6   32   15
32:       3   16   15
24:       0    0    0
16:       0    0    0

K = 0:
        112  114  116
48:       5   12    1
40:      12   43    1
32:       6   19    1
24:       0    0    0
16:       0    0    0

Further illustration of how results change as K is varied (100 runs).
Comparison of Results
(best non-linearity achieved, by number of variables n)

Method                     4   5   6   7    8    9   10   11    12
Least Upper Bound:         4  12  26  56  118  244  494 1000  2014
Best Known Example:        4  12  26  56  116  240  492  992  2010
Bent Concatenation:        4  12  26  56  116  240  480  992  1984
Genetic Algorithms:        4  12  26  56  116  236  484  980  1976
Our Simulated Annealing:   4  12  26  56  116  238  484  984  1990
Summary and Conclusions
We have shown that local search can be used effectively for a cryptographic non-linear optimisation problem: Boolean function design.
'Direct' cost functions are not necessarily best. A cost function is a means to an end; whatever works will do.
Cost function efficacy depends on the problem, the problem parameters, and the search technique used.
You can take short cuts with annealing parameters (and computationally there may be little choice).
Experimentation is highly beneficial.
Should we look to engaging theory more?
Future Work
Opportunities for expansion:
detailed variation of parameters
use of more efficient annealing processes (e.g. thermostatistical annealing)
evolution of artefacts with hidden properties (you do not need to be honest, e.g. develop S-boxes with hidden trapdoors)
experiment with different cost function families, multiple criteria, etc.
evolve sets of Boolean functions
other local techniques (e.g. tabu search, TS)
more generally, when do GAs, SA, TS work best?
investigate non-balanced functions