Global Optimization

The Problem

    minimize   f(x)
    subject to g_i(x) > b_i,  i = 1, …, m
               h_j(x) = c_j,  j = 1, …, n

When x is discrete we call this combinatorial optimization, i.e. an optimization problem with a finite number of feasible solutions.

Note that when the objective function and/or constraints cannot be expressed analytically, solution techniques used in combinatorial problems can be used to solve this problem.


P vs. NP

P: Optimization problems for which there exists an algorithm that solves them with polynomial time complexity. This means that the time it takes to solve the problem can be expressed as a polynomial that is a function of the dimension of the problem.

NP: Stands for 'non-deterministic polynomial': problems whose candidate solutions can be verified in polynomial time. (Note that NP does not simply mean 'not P'; P is contained in NP.)

NP hard: If a problem Q is such that every problem in NP is polynomially transformable to Q, we say that Q is NP hard.

NP problems and their existence give justification for heuristic methods for solving optimization problems.


Assignment Problem
A set of n people is available to carry out n tasks. If person i does task j, it costs c_ij units. Find an assignment {x1, …, xn} that minimizes

    Σ_{i=1}^{n} c_{i,x_i}

The solution is represented by the permutation {x1, …, xn} of the numbers {1, …, n}.

Example (costs c_ij; person i, task j):

                task 1   task 2   task 3
    person 1:      3        2        5
    person 2:      9        7        1
    person 3:      4        5        8

Solution: x1 does task 2, x2 does task 3, x3 does task 1 (cost = 2 + 1 + 4 = 7).
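For a problem this small, the minimizing permutation can simply be enumerated. A minimal brute-force sketch in Python (only practical for tiny n, since there are n! permutations):

```python
from itertools import permutations

# Cost matrix from the example above: cost[i][j] = cost of person i doing task j
# (0-indexed).
cost = [[3, 2, 5],
        [9, 7, 1],
        [4, 5, 8]]

n = len(cost)
# Try every permutation (x1, ..., xn): person i is assigned task x[i].
best = min(permutations(range(n)),
           key=lambda x: sum(cost[i][x[i]] for i in range(n)))
print(best, sum(cost[i][best[i]] for i in range(n)))  # (1, 2, 0), cost 7
```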


Knapsack Problem

A set of n items is available to be packed into a knapsack with capacity C units. Item i has value v_i and uses up c_i units of capacity. Determine the subset I of items which should be packed in order to

    minimize   Σ_{i∈I} v_i
    subject to Σ_{i∈I} c_i ≥ C

Here the solution is represented by the subset I of the set {1, …, n}.

Example:

    item   value   capacity
     1      2.7      C/2
     2      3.2      C/4
     3      1.1      C/2

Solution: I = {1, 3}.
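A brute-force sketch of this example, taking C = 1 for concreteness. It implements the formulation above: minimize the packed value subject to the selected items covering the capacity:

```python
from itertools import combinations

C = 1.0                       # capacity (take C = 1 for concreteness)
value    = [2.7, 3.2, 1.1]    # v_i
capacity = [C/2, C/4, C/2]    # c_i

best_I, best_v = None, float("inf")
# Enumerate every nonempty subset I of {0, 1, 2}.
for r in range(1, len(value) + 1):
    for I in combinations(range(len(value)), r):
        if sum(capacity[i] for i in I) >= C:   # feasible: covers the capacity
            v = sum(value[i] for i in I)
            if v < best_v:
                best_I, best_v = I, v
print(best_I, best_v)   # (0, 2) -> items {1, 3}, value 3.8
```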


Traveling Salesman Problem (TSP)
A salesperson must visit n cities once before returning home. The distance from city i to city j is d_ij. What ordering of the cities minimizes the distance the salesperson must travel before returning home?

SETUP:

    minimize   Σ_{i,j=1}^{n} d_ij x_ij
    subject to Σ_{i=1}^{n} x_ij = 1 (for each j),   Σ_{j=1}^{n} x_ij = 1 (for each i)

where x_ij = 1 if we go from city i to city j, and x_ij = 0 otherwise.

Note that this is an integer programming problem and there are (n-1)! possible solutions to this problem.
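A brute-force sketch that fixes city 0 as home and enumerates the (n-1)! orderings of the remaining cities; the distance matrix here is made up for illustration:

```python
from itertools import permutations

# A made-up symmetric distance matrix for n = 4 cities.
d = [[0,  2, 9, 10],
     [2,  0, 6,  4],
     [9,  6, 0,  3],
     [10, 4, 3,  0]]
n = len(d)

def tour_length(order):
    tour = (0,) + order + (0,)            # fix city 0 as home
    return sum(d[tour[k]][tour[k + 1]] for k in range(n))

# Enumerate the (n-1)! orderings of the remaining cities.
best = min(permutations(range(1, n)), key=tour_length)
print((0,) + best, tour_length(best))     # e.g. (0, 1, 3, 2) with length 18
```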


Integer Programming
Integer problems involve large numbers of variables and constraints and quickly become very large problems.

Finding a Solution
If the function is piecewise linear, the problem can be solved exactly with a mixed integer programming method that uses branch and bound (later).

Otherwise, heuristic methods ('finding' methods) can be used to find approximate solutions.

What are heuristic methods?
Definition: A heuristic is a technique which seeks good (i.e. near-optimal) solutions at a reasonable computational cost without being able to guarantee either feasibility or optimality, or even, in many cases, to state how close to optimality a particular feasible solution is.


Branch and Bound (In General)
Branch and bound is a general search method used to find the minimum of a function, f(x), where x is restricted to some feasible region.

Bounds on each region at each step (L = lower bound, U = upper bound):

    Step 0:  region 0: L = 2, U = 9
    Step 1:  region 1: L = 2, U = 7   region 2: L = 4, U = 9
    Step 2:  region 1: L = 2, U = 7   region 3: L = 4, U = 4   region 4: L = 5, U = 9

In step 2 the lower bound equals the upper bound in region 3, so 4 is an optimal solution for region 3. Region 4 can be removed from consideration since it has a lower bound of 5, which is greater than 4. Continue branching until the location of the global optimum is found.

The difficulty with this method is in determining the lower and upper bounds on each of the regions.
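A minimal sketch of the idea on a one-dimensional problem. The lower bounds here come from an assumed Lipschitz constant L (one illustrative way to get bounds; as noted above, obtaining good bounds is the hard part), and the upper bounds come from evaluating each region's midpoint:

```python
import heapq
import math

def branch_and_bound(f, a, b, L, tol=1e-6):
    """Minimize f on [a, b], assuming |f(x) - f(y)| <= L|x - y|."""
    def bounds(lo, hi):
        mid = 0.5 * (lo + hi)
        fm = f(mid)
        # Lipschitz lower bound for the region, upper bound from the midpoint.
        return fm - L * (hi - lo) / 2, fm, mid

    lb0, ub, best_x = bounds(a, b)
    heap = [(lb0, a, b)]                       # regions ordered by lower bound
    while heap:
        lb, lo, hi = heapq.heappop(heap)
        if lb > ub - tol:                      # no region can beat the incumbent
            break
        mid = 0.5 * (lo + hi)
        for c, d in ((lo, mid), (mid, hi)):    # branch into two subregions
            l, u, x = bounds(c, d)
            if u < ub:                         # better upper bound: new incumbent
                ub, best_x = u, x
            if l < ub - tol:                   # region still promising: keep it
                heapq.heappush(heap, (l, c, d))
    return best_x, ub

print(branch_and_bound(lambda x: math.cos(5 * x) - x, 0.0, 2.0, L=6.0))
```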


Branch and Bound applied to Mixed Integer Programming

A Mixed Integer Program
The solution to the optimization problem includes elements that are integers. So minimize f(x) where x = (x1, x2, …, xn) and x_i is an integer for at least one i.

Branch and Bound: Suppose x1 and x2 must be integers.

[Figure: three plots in the (x1, x2) plane. Left: the global minimum (*) of the relaxed problem falls at a non-integer point (x1 ≈ 5.1, x2 ≈ 3.5). Middle: branch on x1 (x1 ≤ 5 and x1 ≥ 6) and find minima for the subproblems I and II. Right: branch on x2 (x2 ≤ 3 and x2 ≥ 4) and find minima for the subproblems III, IV and V.]
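The same branching scheme can be sketched for an integer linear program, using SciPy's linprog for the relaxed problems (SciPy is an assumed choice here; the slide's scheme is solver-agnostic). Each branch tightens one variable's bounds, e.g. x1 <= 5 on one side and x1 >= 6 on the other:

```python
import math
from scipy.optimize import linprog

def bb_integer_lp(c, A_ub, b_ub, bounds):
    """Minimize c @ x subject to A_ub @ x <= b_ub with all x integer."""
    best_x, best_val = None, math.inf
    stack = [list(bounds)]
    while stack:
        bnds = stack.pop()
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bnds)  # relaxed problem
        if not res.success or res.fun >= best_val:  # infeasible or dominated: prune
            continue
        frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
        if not frac:                                # all integer: new incumbent
            best_x, best_val = [round(v) for v in res.x], res.fun
            continue
        i, v = frac[0], res.x[frac[0]]
        lo, hi = bnds[i]
        left, right = list(bnds), list(bnds)        # branch on the fractional x_i:
        left[i] = (lo, math.floor(v))               #   x_i <= floor(v)
        right[i] = (math.ceil(v), hi)               #   x_i >= ceil(v)
        stack += [left, right]
    return best_x, best_val

# Maximize 3*x1 + 2*x2 (minimize the negative) with x1 + x2 <= 4.5, x integer.
print(bb_integer_lp([-3, -2], [[1, 1]], [4.5], [(0, 10), (0, 10)]))
```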


Clustering Methods
Clustering methods are an improvement to multistart methods.

Multistart methods: These are methods of optimization that determine the global optimum by comparing the local optima attained from a large number of different starting points. These methods are inefficient because many starting points may lead to the same local minimum.

Clustering methods: A form of a multistart method, with one major difference: neighborhoods of starting points that lead to the same local minimum are estimated. This decreases redundancies in the local minimum values that are found.

[Figure: three (x1, x2) panels of sampled points (*). Step 0: sample points. Step 1: create groups. Step 2: continue sampling.]

The challenge is how to identify the groups.
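A rough sketch of the contrast, assuming SciPy's local minimizer. The neighborhood of each minimum found is crudely estimated as a fixed-radius ball, and sampled points that fall inside one are not used as new starting points (actual clustering methods estimate these regions from the sample itself):

```python
import numpy as np
from scipy.optimize import minimize

def clustered_multistart(f, dim=2, n_samples=200, radius=0.5, seed=0):
    rng = np.random.default_rng(seed)
    minima = []                                    # (location, value) pairs
    for _ in range(n_samples):
        x0 = rng.uniform(-5, 5, size=dim)          # Step 0: sample a point
        # Steps 1-2: skip starts inside the estimated group of a known minimum.
        if any(np.linalg.norm(x0 - m) < radius for m, _ in minima):
            continue
        res = minimize(f, x0)                      # local search
        if not any(np.linalg.norm(res.x - m) < 1e-3 for m, _ in minima):
            minima.append((res.x, res.fun))
    return min(minima, key=lambda p: p[1])         # best local minimum found

f = lambda x: np.sum(x**2) + 3 * np.sum(np.sin(3 * x))
print(clustered_multistart(f))
```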


Simulated Annealing
A method that is useful in solving combinatorial optimization problems. At the heart of this method is the annealing process studied in thermodynamics.

[Figure: particle configurations at high temperature vs. low temperature; thermal mobility is lost as the temperature decreases.]

    Thermodynamic Structure    Combinatorial Optimization
    System states              Feasible solutions
    Energy                     Cost
    Change of state            Neighboring solution
    Temperature                Control parameter
    Frozen state               Heuristic solution


Simulated Annealing
The Boltzmann probability distribution: Prob(E) ~ exp(-E/kT). Here k is Boltzmann's constant, which relates temperature to energy.

A system in thermal equilibrium at temperature T has its energy probabilistically distributed among all different energy states E.

Even if the temperature is low, there is a chance that the energy state will be high. (This is a small chance, but it is a chance nonetheless.) There is thus a chance for the system to get out of local energy minima in search of the global minimum energy state.

The general scheme of always taking downhill steps while sometimes taking an uphill step has come to be known as the Metropolis algorithm, named after Metropolis, who first incorporated simulated annealing ideas in an optimization problem in 1953.


Simulated Annealing
Algorithm to minimize the cost function f(s):
1) Select an initial solution s0 and an initial temperature t0, and set iter = 0.
2) Select a temperature reduction function, α, and a maximum number of iterations, nrep.
3) Randomly select a new point s in the neighborhood of s0; set iter = iter + 1.
4) If f(s) < f(s0), then set s0 = s and go to step 3 until iter > nrep.
5) Generate a random x between 0 and 1.
6) If x < exp((f(s0) - f(s))/t), then set s0 = s and go to step 3 until iter > nrep.
7) Otherwise let s0 remain the same and go to step 3 until iter > nrep.
8) Set t = α(t); repeat until the stopping criteria are met.
9) The approximation to the optimal solution is s0.

All of the following parameters affect the speed and quality of the solution:
- t0: a high value allows free exchange.
- N(s0): the neighborhood, generated by swap, mutation, random perturbation, etc.
- α: cooling should be gradual (slow t decay).
- nrep: related to the dimension; can vary with t (higher with low t).
- Stopping criteria: minimum temperature; total number of iterations exceeded; proportion of accepted moves.
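A direct transcription of steps 1-9 as a Python sketch; the neighborhood function, cooling schedule, and parameters are placeholders to be chosen per problem:

```python
import math
import random

def simulated_annealing(f, neighbor, s0, t0=100.0, alpha=lambda t: 0.9 * t,
                        nrep=100, t_min=1e-3):
    """Steps 1-9: minimize f(s) starting from s0."""
    t, s_best = t0, s0
    while t > t_min:                     # stopping criterion: minimum temperature
        for _ in range(nrep):            # steps 3-7, nrep times at temperature t
            s = neighbor(s0)             # random point in the neighborhood of s0
            delta = f(s) - f(s0)
            # Step 4: accept improvements; step 6: accept uphill moves
            # with probability exp((f(s0) - f(s))/t) = exp(-delta/t).
            if delta < 0 or random.random() < math.exp(-delta / t):
                s0 = s
                if f(s0) < f(s_best):
                    s_best = s0
        t = alpha(t)                     # step 8: t = alpha(t)
    return s_best                        # step 9

# Example neighborhood for permutation problems: swap two random positions.
def swap_neighbor(perm):
    i, j = random.sample(range(len(perm)), 2)
    s = list(perm)
    s[i], s[j] = s[j], s[i]
    return s

# Demo: minimize a made-up weighted-position cost over permutations.
cost = lambda s: sum((i + 1) * v for i, v in enumerate(s))
print(simulated_annealing(cost, swap_neighbor, [2, 1, 4, 3, 5]))
```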


Hybrid Methods

MINLP: Mixed Integer Nonlinear Programming

Branch and Bound (Talked about this earlier.)
1) Relax the integer constraints, forming a nonlinear problem.
2) Fix the integer values found to be closest to the solution found in step 1.
3) Solve the new nonlinear programming problems for the fixed integer values until all of the integer parameters are determined.
**Requires a large number of NLP problems to be solved.


Hybrid Methods

Tree Annealing: Simulated annealing applied to continuous functions

Algorithm
1) Randomly choose an initial point, x, over a search interval, S0.
2) Randomly travel down the tree to an arbitrary terminal node i, and generate a candidate point, y, over the subspace defined by Si.
3) If f(y) < f(x), then replace x with y and go to step 5.
4) Compute P = exp(-(f(y) - f(x))/T). If P > R, where R is a random number uniformly distributed between 0 and 1, then replace x with y.
5) If y replaced x, then decrease T slightly and update the tree.
6) Set i = i + 1 and go to step 2 until T < Tmin.
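A heavily simplified sketch of these steps, under stated assumptions: the tree is flattened to a list of leaf boxes, 'traveling down the tree' is a uniform choice among leaves, and updating the tree splits the accepting leaf along its widest dimension so that node density grows where points are accepted:

```python
import math
import random

def tree_annealing(f, lo, hi, T=1.0, T_min=1e-3, cool=0.99, max_iter=50000):
    """Simplified tree annealing: the tree is kept as a flat list of leaf boxes."""
    leaves = [(list(lo), list(hi))]               # root search interval S0
    x = [random.uniform(a, b) for a, b in zip(lo, hi)]
    fx = f(x)
    for _ in range(max_iter):                     # guard against slow acceptance
        if T < T_min:
            break
        blo, bhi = random.choice(leaves)          # travel down to a terminal node i
        y = [random.uniform(a, b) for a, b in zip(blo, bhi)]  # candidate over S_i
        fy = f(y)
        # Steps 3-4: accept improvements, or uphill moves with prob exp(-(fy-fx)/T).
        if fy < fx or random.random() < math.exp(-(fy - fx) / T):
            x, fx = y, fy
            T *= cool                             # step 5: decrease T slightly and
            d = max(range(len(blo)), key=lambda k: bhi[k] - blo[k])
            mid = 0.5 * (blo[d] + bhi[d])         # update the tree: split this node
            hi1, lo2 = list(bhi), list(blo)       # along its widest dimension, so
            hi1[d] = lo2[d] = mid                 # node density grows where points
            leaves.remove((blo, bhi))             # are accepted
            leaves += [(blo, hi1), (lo2, bhi)]
    return x, fx

print(tree_annealing(lambda p: (p[0] - 1) ** 2 + p[1] ** 2, [-5, -5], [5, 5]))
```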


Hybrid Methods
Differences between tree annealing and simulated annealing:
1) The points x and y are sampled from a continuous space.
2) A minimum value is indicated by an increasing density of nodes in a given region. The subspace over which candidate points are chosen decreases in size as a minimum value is approached.
3) The probability of accepting y is now governed by a modified criterion:

    P = g(x) p(y) / (g(y) p(x))

where p(y) = exp(-f(y)/T) and g(x) = (1/V_x) q_i. Here g depends upon the volume V_x of the node associated with the subspace defined by x, as well as on the path q_i from the root node to the current node.

Tree annealing is not guaranteed to converge, and convergence is often very slow. Use tree annealing as a first step, then use a gradient method to attain the minimum.


Statistical Global Optimization

A statistical model of the objective function is used to bias the selection of new sample points. This is a Bayesian argument: we use the information gathered about the objective function to make decisions about where to sample new points.

Problems
1) The statistical model used may not truly represent the objective function. If the statistical model is determined prematurely, the optimization algorithm may be biased and lead to unreliable solutions.
2) Determining a statistical model can be mathematically intense.
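As one concrete (assumed) instance of this idea: a Gaussian-process surrogate from scikit-learn as the statistical model, with a lower-confidence-bound rule to bias where the next sample is taken:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def statistical_sampling(f, lo, hi, n_init=5, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(lo, hi, size=(n_init, 1))       # initial random design
    y = np.array([f(x[0]) for x in X])
    for _ in range(n_iter):
        # Statistical model of the objective, refit to all data gathered so far.
        gp = GaussianProcessRegressor(kernel=RBF(), normalize_y=True).fit(X, y)
        cand = rng.uniform(lo, hi, size=(500, 1))
        mu, sd = gp.predict(cand, return_std=True)
        x_next = cand[np.argmin(mu - 2.0 * sd)]     # bias sampling toward low mean
        X = np.vstack([X, x_next.reshape(1, 1)])    # or high uncertainty
        y = np.append(y, f(x_next[0]))
    return X[np.argmin(y)][0], y.min()

print(statistical_sampling(lambda x: np.cos(5 * x) - x, 0.0, 2.0))
```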


Tabu Search
The tabu search is designed to cross boundaries of feasibility or local optimality normally treated as barriers, and to systematically impose and release constraints to permit exploration of otherwise forbidden regions.

Example (a code sketch follows the list):
1) Assume an initial solution: 2 1 4 3 5.
2) Define a neighborhood by some type of operation applied to this solution, such as a swap exchange (e.g. swapping the values 1 and 5 gives 2 5 4 3 1). The 10 adjacent solutions attained by such swaps form a neighborhood.
3) For each swap, define a move value: the change in fitness value.
4) Classify a subset of the moves in a neighborhood as forbidden or tabu, e.g. pairs that have been swapped cannot be swapped again for 3 iterations. Call this the tabu classification.
5) Define an aspiration criterion that allows us to override the tabu classification, such as when the move results in a new global minimum.
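A sketch of steps 1-5 for a permutation problem. Here moves swap two positions (the slide swaps values; the bookkeeping is analogous), and the fitness function f is a placeholder:

```python
from itertools import combinations

def tabu_search(f, s, n_iter=30, tenure=3):
    """Steps 1-5: permutation s, swap-exchange neighborhood, tabu tenure of 3."""
    best, best_val = s[:], f(s)
    tabu = {}                                     # pair -> iterations still tabu
    for _ in range(n_iter):
        moves = []
        for i, j in combinations(range(len(s)), 2):   # the 10 swaps for n = 5
            t = s[:]
            t[i], t[j] = t[j], t[i]
            moves.append((f(t) - f(s), (i, j), t))    # move value = change in fitness
        moves.sort(key=lambda m: m[0])                # most improving first
        for move_val, pair, t in moves:
            aspiration = f(s) + move_val < best_val   # would set a new global minimum
            if tabu.get(pair, 0) == 0 or aspiration:  # skip tabu moves unless aspired
                s = t
                tabu = {p: k - 1 for p, k in tabu.items() if k > 1}
                tabu[pair] = tenure               # this pair is tabu for 3 iterations
                break
        if f(s) < best_val:
            best, best_val = s[:], f(s)
    return best, best_val

# Demo fitness: weighted-position cost for a permutation (made up for illustration).
f = lambda s: sum((i + 1) * v for i, v in enumerate(s))
print(tabu_search(f, [2, 1, 4, 3, 5]))
```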


Tabu Search Example

Iteration 0: current solution 2 1 4 3 5, value = 10.

Tabu structure (remaining tabu tenure for each pair; rows 1-4, columns 2-5): all entries 0.
(0: free to swap; >0: tabu.)

    Swap   Move value
    5,4      -6    *
    3,4      -2
    1,2       0
    2,3       2

The most improving swap, (5,4), is chosen (*).

Iteration 1: current solution 2 1 5 3 4, value = 4.

Tabu structure: (4,5) = 3; all other entries 0.

    Swap   Move value
    3,1      -2    *
    2,3      -1
    1,5       1
    3,4       2


Tabu Search Example (cont.)

Iteration 2: current solution 2 3 5 1 4, value = 2.

Tabu structure: (1,3) = 3, (4,5) = 2.

    Swap   Move value
    1,3       2    T
    3,4       2    *
    4,5       3
    2,3       4

Choose a move that does not improve the solution: every available move worsens the value, and the best swap, (1,3), is tabu (T), so the best non-tabu move, (3,4), is taken (*).

Iteration 3: current solution 2 4 5 1 3, value = 4.

Tabu structure: (3,4) = 3, (1,3) = 2, (4,5) = 1.

    Swap   Move value
    4,5      -3    T*
    2,4      -1
    1,3       1    T
    3,5       2

Although (4,5) is tabu, the resulting value obtained from this move is less than the lowest value attained thus far (4 - 3 = 1 < 2), so the aspiration criterion overrides the tabu classification (T*).


Tabu Search

Some of the factors that affect the efficiency and the quality of the tabu search include:
• The number of swaps per iteration.
• The number of moves that become tabu.
• The tenure of the tabu.
• Tabu restrictions (one can restrict any swap that includes one member of a tabu pair).
• Swap frequencies: one can take into account the frequency of swaps and penalize move values between those pairs that have high swap frequencies.


Nested Partitions Method
(Leyuan Shi, Operations Research, May 2000)

This method systematically partitions the feasible region into subregions, and then concentrates the computational effort in the most promising region.

1) Assume that we have a most promising region, σ.
2) Partition this region into M subregions and aggregate the entire surrounding region into one region.
3) Sample each of these M + 1 regions and determine the promising index for each.
4) Set σ equal to the most promising region and go to step 1. If the surrounding region is found to be the best, the algorithm backtracks to a larger region that contains the old most promising region.


Nested Partitions Example
Set M = 2. (Here a higher promising index is more promising.)

σ0 = {1,2,3,4,5,6,7,8}

Partition:
    σ1 = {1,2,3,4} --> prom. index = 5
    σ2 = {5,6,7,8} --> prom. index = 4
Set σ = σ1.

Partition:
    σ3 = {1,2} --> prom. index = 5
    σ4 = {3,4} --> prom. index = 3
    σ2 = {5,6,7,8} --> prom. index = 4
Set σ = σ3.

Partition:
    σ5 = {1} --> prom. index = 2
    σ6 = {2} --> prom. index = 3
    σ0\(σ5 ∪ σ6) = {3,4,5,6,7,8} --> prom. index = 4
The surrounding region is best: backtrack to σ = σ1.

New partition:
    σ7 = {1,2,3} --> prom. index = 5
    σ8 = {4} --> prom. index = 2
    σ2 = {5,6,7,8} --> prom. index = 4

Continue until a minimum is found.
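A sketch of the method over a small finite solution set like the example above. One assumption to note: since we minimize f, the promising index below is the best sampled objective value and lower is better, whereas the slide's example treats a higher index as more promising:

```python
import random

def nested_partitions(f, solutions, M=2, n_samples=5, max_iter=100, seed=0):
    """Sketch over a finite solution list (minimizing, so a LOWER promising
    index is better here)."""
    rng = random.Random(seed)
    history = [list(solutions)]                 # stack of nested promising regions

    def prom_index(region):                     # promising index: best sampled value
        return min(f(x) for x in rng.sample(region, min(n_samples, len(region))))

    for _ in range(max_iter):
        sigma = history[-1]
        if len(sigma) == 1:
            return sigma[0]                     # singleton region: done
        k = -(-len(sigma) // M)                 # ceiling division
        subregions = [sigma[i:i + k] for i in range(0, len(sigma), k)]
        surrounding = [x for x in solutions if x not in sigma]
        regions = subregions + ([surrounding] if surrounding else [])
        best = min(regions, key=prom_index)     # sample each of the M+1 regions
        if best is surrounding:
            history.pop()                       # backtrack to the parent region
        else:
            history.append(best)
    return min(history[-1], key=f)

print(nested_partitions(lambda x: (x - 6) ** 2, list(range(1, 9))))
```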


The Tunneling Method

The Tunneling Function

    T(q) = (f(q) - f*) / [(q - q*)'(q - q*)]^n

where
    f(q) = the original objective function
    n = the pole strength
    f* = the local minimum determined in the minimization phase
    q* = the pumping rates at the local minimum determined in the minimization phase

Cycle through the minimization and tunneling phases until no roots can be determined for the tunneling function.
Minimization Phase: A local minimum value is determined.
Tunneling Phase: The objective function is transformed into a tunneling function, whose roots become the starting points for the next minimization phase.


Tunneling Method Illustrated

[Figure: a one-dimensional objective f(x) (the residue suggests a cos(5x) term). Starting point: x = 0. The minimization phase finds the local minimum x* = 0.67; the tunneling function T(x) = (f(x) - f(.67)) / (x - .67)² is then formed, and its first root is found, seeding the next minimization phase.]
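A sketch of one minimization-tunneling cycle in one dimension. The slide's objective is garbled in the source, so f(x) = cos(5x) - x is used here as a stand-in; it has a local minimum near x ≈ 0.67, matching the illustration. Pole strength n = 1:

```python
import numpy as np
from scipy.optimize import minimize_scalar, brentq

f = lambda x: np.cos(5 * x) - x          # stand-in objective (slide's f is garbled)

# Minimization phase: find a local minimum starting near x = 0.
res = minimize_scalar(f, bounds=(0.0, 1.0), method="bounded")
x_star, f_star = res.x, res.fun          # x* ~ 0.67

def T(x, n=1):
    """Tunneling function: (f(x) - f*) / ((x - x*)^2)^n, with a pole at x*."""
    return (f(x) - f_star) / ((x - x_star) ** 2) ** n

# Tunneling phase: bracket and find the first root of T to the right of x*;
# it becomes the starting point of the next minimization phase.
a = x_star + 1e-3
b = next(x for x in np.linspace(a, 3.0, 300) if T(x) < 0)
x_next = brentq(T, a, b)
print(x_star, x_next)
```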