
1 / 22 April 2012 EvoCOP 2012, Málaga, Spain

Introduction Landscape Theory Result Implications

Conclusions & Future Work

Exact Computation of the Fitness-Distance Correlation for Pseudoboolean Functions with One Global Optimum

Francisco Chicano and Enrique Alba




•  It is a measure of the difficulty of a problem defined by Jones and Forrest

Fitness-Distance Correlation




Fitness-Distance Correlation: Definition

[Figure: plot with axes "Distance to the optimum" and "Fitness value".]

Definition 2. Given a function $f : \mathbb{B}^n \to \mathbb{R}$ the fitness-distance correlation for $f$ is defined as

\[ r = \frac{\mathrm{Cov}_{fd}}{\sigma_f \sigma_d}, \tag{10} \]

where $\mathrm{Cov}_{fd}$ is the covariance of the fitness values and the distances of the solutions to their nearest global optimum, $\sigma_f$ is the standard deviation of the fitness values in the search space and $\sigma_d$ is the standard deviation of the distances to the nearest global optimum in the search space. Formally:

\[ \mathrm{Cov}_{fd} = \frac{1}{2^n} \sum_{x \in \mathbb{B}^n} (f(x) - \bar{f})(d(x) - \bar{d}), \]

\[ \bar{f} = \frac{1}{2^n} \sum_{x \in \mathbb{B}^n} f(x), \qquad \sigma_f = \sqrt{\frac{1}{2^n} \sum_{x \in \mathbb{B}^n} (f(x) - \bar{f})^2}, \]

\[ \bar{d} = \frac{1}{2^n} \sum_{x \in \mathbb{B}^n} d(x), \qquad \sigma_d = \sqrt{\frac{1}{2^n} \sum_{x \in \mathbb{B}^n} (d(x) - \bar{d})^2}, \tag{11} \]

where the function $d(x)$ is the Hamming distance between $x$ and its nearest global optimum.

The FDC $r$ is a value between $-1$ and $1$. The lower the absolute value of $r$, the more difficult the optimization problem is supposed to be. The exact computation of the FDC using the previous definition requires the evaluation of the complete search space. It is required to determine the global optima to define $d(x)$ and compute the statistics for $d$ and $f$. If the objective function $f$ is a constant function, then the FDC is not well-defined, since $\sigma_f = 0$.
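Definition 2 can be turned directly into a brute-force computation. The following is a minimal sketch, not the authors' code: the function names (`fdc_brute_force`, `onemax`) and the `maximize` flag are assumptions made for illustration, and the enumeration of all 2^n solutions is only practical for small n.

```python
import itertools
import math


def hamming(x, y):
    """Number of positions in which bit tuples x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def fdc_brute_force(f, n, maximize=True):
    """FDC of f over {0,1}^n following Definition 2 (full enumeration)."""
    space = list(itertools.product((0, 1), repeat=n))
    values = [f(x) for x in space]
    best = max(values) if maximize else min(values)
    optima = [x for x, v in zip(space, values) if v == best]
    # d(x): Hamming distance from x to its nearest global optimum.
    dists = [min(hamming(x, o) for o in optima) for x in space]

    size = len(space)
    f_mean = sum(values) / size
    d_mean = sum(dists) / size
    cov = sum((v - f_mean) * (d - d_mean) for v, d in zip(values, dists)) / size
    sigma_f = math.sqrt(sum((v - f_mean) ** 2 for v in values) / size)
    sigma_d = math.sqrt(sum((d - d_mean) ** 2 for d in dists) / size)
    if sigma_f == 0 or sigma_d == 0:
        # A constant f gives sigma_f = 0, so r is undefined.
        raise ValueError("FDC is not well-defined (zero standard deviation)")
    return cov / (sigma_f * sigma_d)


def onemax(x):
    """Illustrative objective: number of ones in x."""
    return sum(x)


r_onemax = fdc_brute_force(onemax, 6)
```

For Onemax under maximization the distance to the optimum is n minus the fitness, so the two quantities are perfectly anti-correlated and the sketch should return a value of -1 up to rounding.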

In the following we will focus on the case in which there exists one only global optimum $x^*$ and we know the elementary landscape decomposition of $f$. The following lemma provides an expression for $\bar{d}$ and $\sigma_d$ in this case.

Lemma 1. Given an optimization problem defined over $\mathbb{B}^n$, if there is only one global optimum $x^*$, then the distance function $d(x)$ defined in Definition 2 is the Hamming distance between $x$ and $x^*$ and its average and standard deviation in the whole search space are given by

\[ \bar{d} = \frac{n}{2}, \qquad \sigma_d = \frac{\sqrt{n}}{2}. \tag{12} \]

Proof. Since there is only one global optimum, the function $d(x)$ is defined as $d(x) = H(x, x^*)$. Given an integer number $0 \le k \le n$, the number of solutions at distance $k$ from $x^*$ is $\binom{n}{k}$. Then we can compute the two first raw moments





Difficult when |r| < 0.15 (Jones & Forrest)




•  A landscape is a triple (X,N, f) where

Ø  X is the solution space

Ø  N is the neighbourhood operator

Ø  f is the objective function

Landscape Definition

The pair (X,N) is called configuration space

[Figure: example landscape drawn as a graph whose nodes s0 to s9 are solutions, each labelled with an objective value.]

•  The neighbourhood operator is a function

N: X →P(X)

•  Solution y is neighbour of x if y ∈ N(x)

•  Regular and symmetric neighbourhoods

•  d=|N(x)| ∀ x ∈ X

•  y ∈ N(x) ⇔ x ∈ N(y)

•  Objective function

f: X →R (or N, Z, Q)




•  An elementary landscape is a landscape for which

where

•  Grover’s wave equation

Elementary Landscapes

Linear relationship

Eigenvalue

Depend on the problem/instance

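The equation on this slide did not survive in the transcript. In the landscape-theory literature, Grover's wave equation is usually written as avg_{y ∈ N(x)} f(y) = f(x) + (k/d)(f̄ − f(x)), where d = |N(x)| and k is a constant that depends on the problem; that form is assumed here rather than taken from the slide. The sketch below checks it numerically for a linear pseudoboolean function under the bit-flip neighbourhood with k = 2; the coefficients and function names are illustrative.

```python
import itertools


def linear(a, b):
    """Return f(x) = sum_i a[i] * x[i] + b for a bit tuple x."""
    return lambda x: sum(ai * xi for ai, xi in zip(a, x)) + b


def neighbours(x):
    """One-bit-flip neighbourhood of x, so d = |N(x)| = n."""
    for i in range(len(x)):
        yield x[:i] + (1 - x[i],) + x[i + 1:]


def wave_equation_gap(f, n, k):
    """Largest violation of avg_{y in N(x)} f(y) = f(x) + (k/d)(mean - f(x))."""
    space = list(itertools.product((0, 1), repeat=n))
    mean = sum(f(x) for x in space) / len(space)
    gap = 0.0
    for x in space:
        avg = sum(f(y) for y in neighbours(x)) / n
        gap = max(gap, abs(avg - (f(x) + (k / n) * (mean - f(x)))))
    return gap


f = linear(a=[3.0, -1.5, 2.0, 0.5], b=7.0)  # illustrative coefficients
gap_k2 = wave_equation_gap(f, n=4, k=2)     # linear function: k = 2 fits
gap_k4 = wave_equation_gap(f, n=4, k=4)     # a wrong constant leaves a gap
```

With k = 2 the gap should vanish up to rounding, while any other constant leaves a visible residual.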




Elementary Landscapes: Examples

Problem               Neighbourhood       d           k
Symmetric TSP         2-opt               n(n-3)/2    n-1
                      swap two cities     n(n-1)/2    2(n-1)
Antisymmetric TSP     inversions          n(n-1)/2    n(n+1)/2
                      swap two cities     n(n-1)/2    2n
Graph α-Coloring      recolor 1 vertex    (α-1)n      2α
Graph Matching        swap two elements   n(n-1)/2    2(n-1)
Graph Bipartitioning  Johnson graph       n^2/4       2(n-1)
NEAS                  bit-flip            n           4
Max Cut               bit-flip            n           4
Weight Partition      bit-flip            n           4





•  What if the landscape is not elementary?

•  Any landscape can be written as the sum of elementary landscapes

•  There exists a set of elementary functions that form a basis of the function space (Fourier basis)

Landscape Decomposition

[Figure: a non-elementary function f and two elementary functions e1 and e2 (from the Fourier basis); the elementary components of f are obtained from < e1,f > and < e2,f >.]




Landscape Decomposition: Examples

Problem               Neighbourhood        d           Components
General TSP           inversions           n(n-1)/2    2
                      swap two cities      n(n-1)/2    2
Subset Sum Problem    bit-flip             n           2
MAX k-SAT             bit-flip             n           k
NK-landscapes         bit-flip             n           k+1
Radio Network Design  bit-flip             n           max. nb. of reachable antennae
Frequency Assignment  change 1 frequency   (α-1)n      2
QAP                   swap two elements    n(n-1)/2    3





Pseudoboolean functions

•  The set of solutions X is the set of binary strings with length n

•  Neighborhood used in the proof of our main result: one-change neighborhood

Ø  Two solutions x and y are neighbors iff Hamming(x,y)=1

[Figure: the solution 0 1 1 1 0 1 0 0 1 0 surrounded by its ten one-change neighbors:]

1 1 1 1 0 1 0 0 1 0
0 0 1 1 0 1 0 0 1 0
0 1 0 1 0 1 0 0 1 0
0 1 1 0 0 1 0 0 1 0
0 1 1 1 1 1 0 0 1 0
0 1 1 1 0 0 0 0 1 0
0 1 1 1 0 1 1 0 1 0
0 1 1 1 0 1 0 1 1 0
0 1 1 1 0 1 0 0 0 0
0 1 1 1 0 1 0 0 1 1




Pseudoboolean function decomposition

•  Any arbitrary pseudoboolean function can be written as

\[ f(x) = \sum_{p=0}^{n} f_{[p]}(x) \]

where

\[ f_{[0]} = \bar{f} \]

and for p > 0 each $f_{[p]}$ is an elementary landscape with eigenvalue 2p (order-p elementary landscape) with $\overline{f_{[p]}} = 0$.
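One concrete way to obtain such a decomposition for small n is through the Walsh basis: grouping the Walsh terms of f by the number of ones in their mask gives one component per order p. The sketch below assumes that construction (the slide itself does not say how the components are computed); `walsh_components` and the example function are illustrative names, and the cost grows as 4^n.

```python
import itertools


def walsh_components(f, n):
    """Split f into order-p parts f[p], p = 0..n, using the Walsh basis.

    Returns a list `parts` where parts[p] maps each bit tuple x to f[p](x).
    """
    space = list(itertools.product((0, 1), repeat=n))
    values = {x: f(x) for x in space}

    def psi(w, x):
        # Walsh function psi_w(x) = (-1)^(w . x)
        return -1 if sum(wi & xi for wi, xi in zip(w, x)) % 2 else 1

    parts = [dict.fromkeys(space, 0.0) for _ in range(n + 1)]
    for w in space:
        # Walsh coefficient of f for the mask w.
        coeff = sum(values[x] * psi(w, x) for x in space) / len(space)
        order = sum(w)  # number of ones in w = order of the Walsh function
        for x in space:
            parts[order][x] += coeff * psi(w, x)
    return parts


def example(x):
    """Illustrative function with components of order 0, 1 and 2 only."""
    return x[0] * x[1] + 2 * x[2]


parts = walsh_components(example, 3)
```

In this sketch `parts[0]` is the constant average of the function, the components add up to the original function at every point, and `parts[3]` is identically zero because the example has no order-3 interaction.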




•  If f is elementary, the average of f in any sphere and ball of any size around x is a linear expression of f(x)!!!

Spheres around a Solution

[Figure: spheres of radius H=1, H=2 and H=3 around a solution x, with Σ f(y’) = λ1 f(x), Σ f(y’’) = λ2 f(x) and Σ f(y’’’) = λ3 f(x). Annotation: n+1 possible values: 0, 2, 4, …, 2n. Credits: Sutton, Whitley, Langdon.]
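For the bit-flip neighbourhood the sphere property can be checked on a single Walsh function, which is an elementary function with zero average. The sketch assumes the standard Krawtchouk-polynomial formula for the factor that the slide calls λ (the transcript does not give its definition); all names and the chosen mask and point are illustrative.

```python
import itertools
from math import comb


def krawtchouk(n, r, p):
    """K^(n)_{r,p} = sum_j (-1)^j C(p, j) C(n - p, r - j)."""
    return sum((-1) ** j * comb(p, j) * comb(n - p, r - j)
               for j in range(min(p, r) + 1))


def sphere_sum(g, x, r):
    """Sum of g over all y at Hamming distance exactly r from x."""
    total = 0
    for flips in itertools.combinations(range(len(x)), r):
        y = list(x)
        for i in flips:
            y[i] = 1 - y[i]
        total += g(tuple(y))
    return total


def walsh(w):
    """Order-|w| Walsh function psi_w(x) = (-1)^(w . x)."""
    return lambda x: -1 if sum(wi & xi for wi, xi in zip(w, x)) % 2 else 1


n = 5
g = walsh((1, 0, 1, 1, 0))  # an order-3 elementary function (illustrative)
x = (0, 1, 1, 0, 1)
# Pairs (sum over the sphere of radius r, Krawtchouk factor times g(x)).
checks = [(sphere_sum(g, x, r), krawtchouk(n, r, 3) * g(x)) for r in range(n + 1)]
```

Each pair in `checks` should contain two equal integers, which is the linear relationship between the sphere sum and the value at the centre.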





•  We can use this property to compute the covariance between f(x) and d(x)

Computing the Covariance

[Figure: spheres of radius H=1, H=2 and H=3 around a solution x, with Σ f(y’) = λ1 f(x), Σ f(y’’) = λ2 f(x) and Σ f(y’’’) = λ3 f(x).]

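of $d(x)$ over the search space as:

\[ \alpha_1 = \bar{d} = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k} k = \frac{n 2^{n-1}}{2^n} = \frac{n}{2}, \]

\[ \alpha_2 = \overline{d^2} = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k} k^2 = \frac{n(n+1) 2^{n-2}}{2^n} = \frac{n(n+1)}{4}. \]

Using these moments we can compute the standard deviation as $\sqrt{\alpha_2 - \alpha_1^2}$, which yields:

\[ \sigma_d = \sqrt{\frac{n(n+1)}{4} - \frac{n^2}{4}} = \sqrt{\frac{n}{4}} = \frac{\sqrt{n}}{2}. \tag{13} \]

□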
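The two raw moments and the resulting standard deviation are easy to confirm numerically from the binomial sums. A small sketch, with `distance_moments` as an illustrative name:

```python
from math import comb, sqrt


def distance_moments(n):
    """Mean, second raw moment and std of H(x, x*) over all x in {0,1}^n."""
    size = 2 ** n
    # comb(n, k) solutions lie at Hamming distance k from x*.
    alpha1 = sum(comb(n, k) * k for k in range(n + 1)) / size
    alpha2 = sum(comb(n, k) * k * k for k in range(n + 1)) / size
    sigma_d = sqrt(alpha2 - alpha1 ** 2)
    return alpha1, alpha2, sigma_d


alpha1, alpha2, sigma_d = distance_moments(10)
```

For n = 10 the closed forms give n/2 = 5, n(n+1)/4 = 27.5 and √n/2 ≈ 1.581, which the sums should reproduce.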

Now we are ready to prove the main result of this work.

Theorem 1. Let $f$ be an objective function whose elementary landscape decomposition is $f = \sum_{p=0}^{n} f_{[p]}$, where $f_{[0]}$ is the constant function $f_{[0]}(x) = \bar{f}$ and each $f_{[p]}$ with $p > 0$ is an order-$p$ elementary function with zero offset. If there exists only one global optimum in the search space $x^*$, the FDC can be exactly computed as:

\[ r = \frac{-f_{[1]}(x^*)}{\sigma_f \sqrt{n}}. \tag{14} \]

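Proof. Let us expand the covariance as

\[
\begin{aligned}
\mathrm{Cov}_{fd} &= \frac{1}{2^n} \sum_{x \in \mathbb{B}^n} f(x) d(x) - \bar{f}\,\bar{d}
 = \frac{1}{2^n} \sum_{k=0}^{n} k \sum_{\substack{x \in \mathbb{B}^n \\ H(x,x^*)=k}} f(x) - \bar{f}\,\frac{n}{2} \\
&= \frac{1}{2^n} \sum_{k=0}^{n} k \sum_{\substack{x \in \mathbb{B}^n \\ H(x,x^*)=k}} \sum_{p=0}^{n} f_{[p]}(x) - f_{[0]}\frac{n}{2}
 = \frac{1}{2^n} \sum_{k=0}^{n} k \sum_{p=0}^{n} K^{(n)}_{k,p} f_{[p]}(x^*) - f_{[0]}\frac{n}{2} \\
&= \sum_{p=0}^{n} \left( \frac{1}{2^n} \sum_{k=0}^{n} k K^{(n)}_{k,p} \right) f_{[p]}(x^*) - f_{[0]}\frac{n}{2}
 = \frac{n}{2} f_{[0]} - \frac{1}{2} f_{[1]}(x^*) - f_{[0]}\frac{n}{2} \\
&= -\frac{1}{2} f_{[1]}(x^*),
\end{aligned}
\tag{15}
\]

where we used the result in Proposition 1. Substituting in (10) we obtain (14). □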
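Equation (14) can be compared against the brute-force definition on a small random function. The sketch below is an illustration under stated assumptions: the order-1 component is computed from the single-bit Walsh coefficients (the source only says the decomposition is known, not how it is obtained), maximization is assumed, and the random table and names are hypothetical.

```python
import itertools
import math
import random


def fdc_two_ways(f, n):
    """Return (brute-force FDC, FDC from Eq. (14)) for a maximisation problem.

    Assumes f has exactly one global maximum over {0,1}^n.
    """
    space = list(itertools.product((0, 1), repeat=n))
    size = len(space)
    values = [f(x) for x in space]
    f_mean = sum(values) / size
    sigma_f = math.sqrt(sum((v - f_mean) ** 2 for v in values) / size)

    best = max(values)
    optima = [x for x, v in zip(space, values) if v == best]
    if len(optima) != 1:
        raise ValueError("Theorem 1 requires a single global optimum")
    x_star = optima[0]

    # Brute force, following Definition 2.
    dists = [sum(a != b for a, b in zip(x, x_star)) for x in space]
    d_mean = sum(dists) / size
    cov = sum((v - f_mean) * (d - d_mean) for v, d in zip(values, dists)) / size
    sigma_d = math.sqrt(sum((d - d_mean) ** 2 for d in dists) / size)
    brute = cov / (sigma_f * sigma_d)

    # Order-1 component at x*: sum of the single-bit Walsh terms.
    f1_at_opt = 0.0
    for i in range(n):
        coeff = sum(v * (-1) ** x[i] for x, v in zip(space, values)) / size
        f1_at_opt += coeff * (-1) ** x_star[i]
    closed = -f1_at_opt / (sigma_f * math.sqrt(n))
    return brute, closed


rng = random.Random(0)
table = {x: rng.random() for x in itertools.product((0, 1), repeat=6)}
brute, closed = fdc_two_ways(table.__getitem__, 6)
```

Both values should agree up to floating-point error, even though the random table has components of every order.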

The previous theorem shows that the only thing we need to know about the global optimum is the value of the first elementary component. With this information we can exactly compute the FDC. Some problems for which we know the elementary landscape decomposition based on the numerical data defining a problem instance are MAX-SAT, 0-1 Unconstrained Quadratic Optimization





Σ i λi f(x)





Fitness-Distance Correlation Formulas

•  Using the previous facts we get for the elementary landscapes…

\[ r = \begin{cases} \dfrac{-f_{[p]}(x^*)}{\sigma_f \sqrt{n}} & \text{if } p = 1 \\[1ex] 0 & \text{if } p > 1 \end{cases} \]

•  In general, for an arbitrary function…

\[ f(x) = f_{[0]}(x) + f_{[1]}(x) + f_{[2]}(x) + \ldots + f_{[n]}(x) \]

… the only component contributing to r is f[1](x)




Rugged components are not considered by FDC





•  Fitness-Distance Correlation for order-1 (linear) elementary landscapes (assume max.)

•  We can define a linear elementary landscape with the desired FDC ρ (with |ρ| greater than 0)

•  “Difficult” problems (|r| < 0.15) can be obtained starting at n=45

Implications for Linear Functions

The previous corollary states that only elementary landscapes with order $p = 1$ have a nonzero FDC. Furthermore, the FDC does depend on the value of the objective function in the global optimum $f(x^*)$ and the average value $\bar{f}$, but not on the solution $x^*$ itself. We can also observe that if we are maximizing, then $f(x^*) > \bar{f}$ and the FDC is negative, while if we are minimizing $f(x^*) < \bar{f}$ and the FDC is positive.

Interestingly, the order-1 elementary landscapes can always be written as linear functions and they can be optimized in polynomial time. That is, if $f$ is an order-1 elementary function then it can be written in the following way:

\[ f(x) = \sum_{i=1}^{n} a_i x_i + b, \tag{17} \]

where $a_i$ and $b$ are real values. The following proposition provides the average and the standard deviation for this family of functions.

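Proposition 2. Let $f$ be an order-1 elementary function, which can be written as (17). Then, the average and the standard deviation of the function values in the whole search space are:

\[ \bar{f} = b + \frac{1}{2} \sum_{i=1}^{n} a_i, \qquad \sigma_f = \frac{1}{2} \sqrt{\sum_{i=1}^{n} a_i^2}. \tag{18} \]

Proof. Using the linearity property of the average we can write: $\bar{f} = \sum_{i=1}^{n} a_i \overline{x_i} + b$, and $\bar{f}$ in (18) follows from the fact that $\overline{x_i} = 1/2$. Now we can compute the variance of $f$ as:

\[
\begin{aligned}
\mathrm{Var}[f] &= \overline{(f(x) - \bar{f})^2}
 = \overline{\left( \sum_{i=1}^{n} a_i x_i - \frac{1}{2} \sum_{i=1}^{n} a_i \right)^2}
 = \overline{\left( \sum_{i=1}^{n} a_i \left( x_i - \frac{1}{2} \right) \right)^2} \\
&= \sum_{i,j=1}^{n} a_i a_j \overline{\left( x_i - \frac{1}{2} \right)\left( x_j - \frac{1}{2} \right)}
 = \sum_{i,j=1}^{n} a_i a_j \overline{\left( x_i x_j - \frac{1}{2} x_i - \frac{1}{2} x_j + \frac{1}{4} \right)} \\
&= \sum_{i,j=1}^{n} a_i a_j \left( \overline{x_i x_j} - \frac{1}{4} \right)
 = \sum_{i,j=1}^{n} a_i a_j \left( \delta_i^j \frac{1}{4} + \frac{1}{4} - \frac{1}{4} \right)
 = \frac{1}{4} \sum_{i=1}^{n} a_i^2,
\end{aligned}
\tag{19}
\]

where we used again $\overline{x_i} = \overline{x_j} = 1/2$ and $\overline{x_i x_j} = 1/4(\delta_i^j + 1)$, being $\delta_i^j$ the Kronecker delta. The expression for $\sigma_f$ in (18) follows from (19). □

Using Proposition 2 we can compute the FDC for the order-1 elementary landscapes.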

Proposition 3. Let $f$ be an order-1 elementary function written as (17) such that all $a_i \neq 0$. Then, it has one only global optimum and its FDC (assuming maximization) is:

\[ r = \frac{-\sum_{i=1}^{n} |a_i|}{\sqrt{n \sum_{i=1}^{n} a_i^2}}, \tag{20} \]

which is always in the interval $-1 \le r < 0$.







Proof. The global optimum $x^*$ has 1 in all the positions $i$ such that $a_i > 0$ and the maximum fitness value is:

\[ f(x^*) = b + \sum_{\substack{i=1 \\ a_i > 0}}^{n} a_i. \tag{21} \]

Using Proposition 2 we can write:

\[ \bar{f} - f(x^*) = \left( b + \frac{1}{2} \sum_{i=1}^{n} a_i \right) - \left( b + \sum_{\substack{i=1 \\ a_i > 0}}^{n} a_i \right) = -\frac{1}{2} \sum_{i=1}^{n} |a_i|. \tag{22} \]

Replacing the previous expression and $\sigma_f$ in (16) we prove the claimed result. □
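The closed forms (18) and (20) can be checked against full enumeration for a small linear function. This is a sketch under the proposition's assumptions (all coefficients nonzero, maximization); the coefficient values and function names are illustrative.

```python
import itertools
import math


def linear_stats_closed_form(a, b):
    """Mean and std from Eq. (18), and FDC from Eq. (20) (maximisation)."""
    n = len(a)
    mean = b + sum(a) / 2
    sigma_f = math.sqrt(sum(ai * ai for ai in a)) / 2
    r = -sum(abs(ai) for ai in a) / math.sqrt(n * sum(ai * ai for ai in a))
    return mean, sigma_f, r


def linear_stats_enumerated(a, b):
    """Same three quantities obtained by enumerating {0,1}^n."""
    n = len(a)
    space = list(itertools.product((0, 1), repeat=n))
    values = [sum(ai * xi for ai, xi in zip(a, x)) + b for x in space]
    size = len(space)
    mean = sum(values) / size
    sigma_f = math.sqrt(sum((v - mean) ** 2 for v in values) / size)
    # Unique maximum: x*_i = 1 exactly where a_i > 0 (all a_i nonzero).
    x_star = tuple(1 if ai > 0 else 0 for ai in a)
    dists = [sum(p != q for p, q in zip(x, x_star)) for x in space]
    d_mean = sum(dists) / size
    cov = sum((v - mean) * (d - d_mean) for v, d in zip(values, dists)) / size
    sigma_d = math.sqrt(sum((d - d_mean) ** 2 for d in dists) / size)
    return mean, sigma_f, cov / (sigma_f * sigma_d)


a, b = [2.0, -1.0, 0.5, 3.0, -4.0], 1.5  # illustrative coefficients
closed = linear_stats_closed_form(a, b)
enumerated = linear_stats_enumerated(a, b)
```

The two triples should match component by component, and the FDC should fall in the interval [-1, 0).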

When all the values of $a_i$ are the same, the FDC computed with (20) is $-1$. This happens in particular for the Onemax problem. But if there exist different values for $a_i$, then we can reach any arbitrary value in $[-1, 0)$ for $r$. The following theorem provides a way to do it.

Theorem 2. Let $\rho$ be an arbitrary real value in the interval $[-1, 0)$, then any linear function $f(x)$ given by (17) where $n > 1/\rho^2$, $a_2 = a_3 = \ldots = a_n = 1$ and $a_1$ is

\[ a_1 = \frac{(n-1) + n|\rho| \sqrt{(1-\rho^2)(n-1)}}{n\rho^2 - 1} \tag{23} \]

has exactly FDC $r = \rho$.

Proof. The expression for $a_1$ is well-defined since $n\rho^2 > 1$. Replacing all the $a_i$ in (20) we get $r = \rho$. □

Theorem 2 provides a solid argument against the use of FDC as a measure of the difficulty of a problem. In effect, we can always build an optimization problem based on a linear function, which can be solved in polynomial time, with an FDC as near as desired to 0 (but not zero), that is, as “difficult” as desired according to the FDC. However, we have to highlight here that for a given FDC value $\rho$ we need at least $n > 1/\rho^2$ variables. Thus, an FDC nearer to 0 requires more variables.
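Theorem 2 is constructive, so it can be exercised directly: compute a1 from (23) and plug the coefficients into (20). The sketch below uses n = 45 and ρ = -0.15, which matches the slide example a1 = 7061.43; the function names are illustrative.

```python
import math


def a1_for_target_fdc(n, rho):
    """Coefficient a1 from Eq. (23); requires -1 <= rho < 0 and n > 1/rho^2."""
    if not (-1 <= rho < 0) or n * rho * rho <= 1:
        raise ValueError("need -1 <= rho < 0 and n > 1/rho^2")
    num = (n - 1) + n * abs(rho) * math.sqrt((1 - rho * rho) * (n - 1))
    return num / (n * rho * rho - 1)


def fdc_linear(a):
    """FDC of f(x) = sum_i a_i x_i + b under maximisation, Eq. (20)."""
    n = len(a)
    return -sum(abs(ai) for ai in a) / math.sqrt(n * sum(ai * ai for ai in a))


n, rho = 45, -0.15
a1 = a1_for_target_fdc(n, rho)
# a2 = a3 = ... = a_n = 1, as in the statement of Theorem 2.
r = fdc_linear([a1] + [1.0] * (n - 1))
```

With these inputs `a1` comes out close to 7061.43 and `r` reproduces the target ρ up to rounding, while n = 44 is rejected because 44 · 0.15² < 1.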

4 FDC, Autocorrelation Length and Local Optima

The autocorrelation length ℓ [8] has also been used as a measure of the difficulty of a problem. Chicano and Alba [4] found a negative correlation between ℓ and the number of local optima in the 0-1 Unconstrained Quadratic Optimization problem (0-1 UQO), an NP-hard problem [9]. Kinnear [11] also studied the use of the autocorrelation measures as problem difficulty, but the results were inconclusive. In this section we investigate which of the two measures, ℓ or the








[Slide example: a1 = 7061.43, a2 = a3 = … = a45 = 1]




•  Autocorrelation length can also be exactly computed with landscape theory

•  The expression only depends on the instance data (not the global optimum)

Comparison with Autocorrelation Length

absolute value of FDC, is more correlated to the number of local optima for some random instances of the 0-1 UQO. In particular, we have randomly generated 1650 UQO instances using the Palubeckis instance generator [13]. The size of the instances varies between n = 10 and n = 20 and the density (percentage of nonzero elements in the coefficients matrix) varies from 10 to 90 in steps of 20. For each n and density, 30 random instances were generated by randomly selecting the nonzero elements of the coefficients matrix from the interval [−100, 100]. For all the instances we computed the autocorrelation length ℓ, the absolute value of the FDC |r| and the number of local optima (minima) by complete enumeration of the search space. In Table 1 we show the Spearman rank correlation coefficient between the number of local optima and ℓ and |r|. The correlations are computed using all the instances with the same size n.

Table 1. Spearman correlation coefficient for the number of local optima against the autocorrelation length (ℓ) and the absolute value of the FDC (|r|).

n      10        11        12        13        14        15
ℓ      −0.5467   −0.5545   −0.5896   −0.4796   −0.4725   −0.5511
|r|    −0.1407   −0.1843   −0.0787   −0.1203   −0.1944   −0.0538

n      16        17        18        19        20
ℓ      −0.4959   −0.5740   −0.5872   −0.5249   −0.4829
|r|    −0.1251   −0.1791   −0.1339   −0.3310   −0.0338

We can observe a high inverse correlation (around −0.5) between the number of local optima and the autocorrelation length, supporting the autocorrelation length conjecture. However, the correlation between the number of local optima and FDC is low, again supporting the hypothesis that FDC is not an appropriate measure of the difficulty of a problem (this time, from an experimental point of view).
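The local-optima counts in this experiment come from complete enumeration, which can be sketched for small instances as follows. Several things here are assumptions rather than the authors' setup: the objective is taken to be x^T Q x, a local minimum is taken to be a solution with no strictly better one-bit-flip neighbour, and `random_instance` is a toy generator that only mimics the density and coefficient range described above (it is not the Palubeckis generator).

```python
import itertools
import random


def uqo_value(q, x):
    """Objective x^T Q x of a 0-1 UQO instance with coefficient matrix q."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def count_local_minima(q):
    """Count solutions with no strictly better one-bit-flip neighbour."""
    n = len(q)
    count = 0
    for x in itertools.product((0, 1), repeat=n):
        fx = uqo_value(q, x)
        if all(uqo_value(q, x[:i] + (1 - x[i],) + x[i + 1:]) >= fx
               for i in range(n)):
            count += 1
    return count


def random_instance(n, density, rng):
    """Toy generator (NOT the Palubeckis generator used in the source)."""
    return [[rng.uniform(-100, 100) if rng.random() < density else 0.0
             for _ in range(n)] for _ in range(n)]


rng = random.Random(1)
instance = random_instance(8, 0.5, rng)
n_minima = count_local_minima(instance)
```

Every instance has at least one local minimum (the global one), so `n_minima` is always positive; repeating this over many instances gives the counts that would then be rank-correlated with ℓ and |r|.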

5 Conclusion

We have applied landscape theory to exactly compute the Fitness-Distance Correlation of combinatorial optimization problems defined over sets of binary strings. The result is valid in the case in which one single global optimum exists in the landscape. We defer to future work the analysis of the general case.

The expression for the FDC takes only into account the order-1 elementary component of the objective function, while previous work suggests that the components making a problem difficult are the higher order elementary components. This fact questions the use of FDC as a measure of difficulty of the problem. We prove that there exist polynomial time solvable problems with an FDC arbitrarily near to zero. An experimental study over random instances of the 0-1 UQO shows a low correlation between FDC and the number of local optima, supporting the hypothesis that FDC fails to capture the problem difficulty.

Spearman rank correlation coefficients

Correlations with the number of local optima

0-1 Unconstrained Quadratic Optimization

(1650 random instances considered)

Higher correlation with autocorrelation length

Tomassini

Bierwirth








R(A, f) = �(f)| {z }problem

⌦ ⇤(A)| {z }algorithm

(1)

1




Frequently Asked Questions

How can we use the result?

•  Can be used to improve the computation of FDC

But, we need the elementary landscape decomposition of the problem

•  Only the order-1 elementary component is required

How can we obtain this?

•  Some problems have been decomposed. If yours is not there, you can decompose it (if possible)





Frequently Asked Questions

We also need to know the global optimum, but this is usually not known

•  Right, if the optimum is not known the result will be an approximation

The empirical computation of FDC is also an approximation, what is the advantage of the proposed formula?

•  First, it requires less computational resources (memory and CPU)

•  Second, it gives a value which considers all the solutions in the search space (against the relatively small number of sampled solutions using the empirical approach)

What happens if more than one global optimum exist?

•  Touché. We are working on it.






Conclusions

•  We provide a closed-form formula for FDC

•  FDC only takes into consideration the order-1 elementary component of the function when only one global optimum exists

•  Linear functions can be defined to have a value of FDC as low as desired (greater than 0)

•  Autocorrelation length seems to be more correlated with the number of local optima than FDC for UQO

Future Work

•  Find an expression for the case in which more than one global optimum exists

•  Check how good is FDC in “interesting” problems


Thanks for your attention !!!

Exact Computation of the Fitness-Distance Correlation for Pseudoboolean Functions with One Global Optimum
