Local search and Optimisation
- Introduction: global versus local
- Study of the key local search techniques
- Concluding Remarks

Introduction: Global versus Local search

Global versus Local search
Global search:
- interest: find a path to a goal
- properties: search through partial paths in a systematic way (consider all paths – completeness); opportunity to detect loops
Global versus Local search (2)
Local search:
- interest: find a goal, or a state that maximizes/minimizes some objective function
4-queens example:
- interest in the solutions, not in the way we find them
Global versus Local search (3)
Local search:
- interest: find a goal, or a state that maximizes/minimizes some objective function
rostering:
- objective function: estimates the quality of a roster
- optimize the objective function
Is the path relevant or not?
- The 8-puzzle: path relevant
- Chess: path relevant
- Water jugs: path relevant
- Traveling sales person: path relevant *
- Symbolic integrals: could be both
- Blocks planning: path relevant
- q-queens puzzle: not relevant
- rostering: not relevant
Traveling sales person:
Local search!
- the path is encoded in every state
- just find a good/optimal state
- representation is a potential solution: (New York, Boston, Miami, SanFran, Dallas, New York)
Global search!
- representation is a partial sequence: (New York, Boston)
General observations on Local Search:
- Applicable if the path to the solution is not important (but see the comment on TSP)
- Keeps only 1 (or a fixed k) state(s); k for local beam search and genetic algorithms
- Most often, does not systematically investigate all possibilities, and as a result may be incomplete or suboptimal
- Does not avoid loops, unless explicitly designed to (e.g. Tabu search) or loop avoidance is included in the state representation
- Is often used for optimization of an objective function
Local Search Algorithms:
- Hill Climbing (3) (local version)
- Simulated Annealing
- Local k-Beam Search
- Genetic Algorithms
- Tabu Search
- Heuristics and Metaheuristics
Hill-Climbing (3), or Greedy local search
The really "local-search" variant of Hill Climbing
Hill Climbing (3) algorithm:
Let h be the objective function (or minimize it; see 8-queens).

  State := S;
  STOP := False;
  WHILE not STOP DO
    Neighbors := successors(State);
    IF max(h(Neighbors)) > h(State)
      State := maximal_h_neighbor;
    Else STOP := True;
  Return State

This is Hill Climbing 2, but without paths.
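As a sketch, the same loop in Python (assuming problem-specific `successors` and `h` functions are given):

```python
def hill_climb(start, successors, h):
    """Greedy local search: repeatedly move to the best neighbor.

    Stops as soon as no neighbor strictly improves h (a local maximum)."""
    state = start
    while True:
        best = max(successors(state), key=h)
        if h(best) <= h(state):  # no strictly better neighbor: STOP
            return state
        state = best
```

For example, maximizing h(x) = -(x - 3)^2 over the integers with neighbors x ± 1 climbs straight to x = 3.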
The problems:
- Foothills: local maximum
- Plateaus
- Ridges
More properties:
Termination? If h is bounded (in the relevant direction) and there is a minimal step in the h-function.
Completeness? No!
Case study: 8-queens
h = the number of pairs of queens attacking each other in the state
State: (n1, n2, n3, n4, n5, n6, n7, n8)
Example board: h = 17
Minimization!
8-queens (cont.)
Neighbors of (n1, n2, …, n8): obtained by changing only 1 ni
56 neighbors; a local minimum: h = 1
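For illustration, the objective in this column representation can be computed directly (a sketch; `state[i]` is the row of the queen in column i):

```python
def attacking_pairs(state):
    """h for n-queens (to be minimized): number of attacking pairs.

    With one queen per column, only same-row and same-diagonal
    conflicts are possible."""
    n = len(state)
    return sum(1
               for i in range(n) for j in range(i + 1, n)
               if state[i] == state[j] or abs(state[i] - state[j]) == j - i)
```

A state with `attacking_pairs(state) == 0` is a solution; the 56 neighbors come from giving one of the 8 columns one of its 7 other row values.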
How well does it work?
For 8-queens: only about 14% of runs succeed (p = 0.14; see the RRHC analysis below); the rest get stuck.
But how to improve the success rate?
Plateaus: sideway moves
At a plateau: allow a move to an equal-h neighbor.
Danger: non-termination! Allow only a maximum number of consecutive sideway moves (say: 100).
Result: success rate: 94% success, 6% local minimum
Variants on HC (3)
Stochastic Hill Climbing:
- Move to a random neighbor with a better h
- Gets more solutions, but is slower
First-choice Hill Climbing:
- Move to the first-found neighbor with a better h
- Useful if there are VERY many neighbors
Guaranteed completeness: Random-Restart Hill Climbing
IF HC terminates without producing a solution, THEN restart HC with a random new initial state.
If there are only finitely many states, and each HC run terminates, then: RRHC is complete (with probability 1).
Analysis of RRHC:
Pure RRHC:
If HC has a probability p of reaching success, then we need on average 1/p restarts in RRHC, of which 1/p − 1 fail.
For 8-queens: p = 0.14, so 1/p ≈ 7 iterations.
Cost (average)? (6 failures × 3 steps) + (1 success × 4 steps) = 22 steps.
With sideway moves added:
For 8-queens: p = 0.94, so 1/p ≈ 1.06 iterations, of which (1 − p)/p = 0.06/0.94 fail.
Cost (average)? (0.06/0.94 failures × 64 steps) + (1 success × 21 steps) ≈ 25 steps.
Conclusion?
Local Search is continuously replacing other solvers in many domains, including optimization problems in ML.
Simulated Annealing (Kirkpatrick et al. 1983)
Simulate the process of annealing from metallurgy.
Motivations:
1) HC (3): best moves – fast, but stuck in local optima. Stochastic HC: random moves – slow, but complete. Combine!
2) RRHC: restart after failure – but why wait until failure? Include 'jumps' during the process!
3) At high 'temperature': frequent big jumps. At low 'temperature': few smaller ones. Get a ping-pong ball to the deepest hole by rolling the ball and shaking the surface.
The algorithm:

  State := S;
  FOR Time = 1 to ∞ DO
    Temp := DecreaseFunction(Time);
    IF Temp = 0 Then Return State;
    Else
      Next := random_neighbor(State);
      Δh := h(Next) – h(State);
      IF Δh > 0 Then State := Next;
      Else State := Next with probability e^(Δh/Temp);
  End_FOR

For a slowly decreasing temperature, this will reach the global optimum (with probability 1).
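The same algorithm as a Python sketch (for maximization; `random_neighbor`, `h`, and the cooling `schedule` are assumed given, and the schedule must eventually reach 0):

```python
import math
import random

def simulated_annealing(start, random_neighbor, h, schedule):
    """Accept every uphill move; accept a downhill move with
    probability e^(delta_h / Temp), so big drops become rare as Temp falls."""
    state = start
    time = 0
    while True:
        time += 1
        temp = schedule(time)
        if temp <= 0:
            return state
        nxt = random_neighbor(state)
        delta = h(nxt) - h(state)
        if delta > 0 or random.random() < math.exp(delta / temp):
            state = nxt
```

A linear schedule like `lambda t: 1.0 - t / 1000.0` gives a fixed budget of moves; slower schedules trade time for a better chance of reaching the global optimum.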
Local k-Beam Search
Beam search, without keeping partial paths.
≠ k parallel HC(3) searches: the k new states are the k best of ALL the neighbors.
Variant: Stochastic Beam Search.
Genetic Algorithms (Holland 1975)
Search inspired by evolution theory.
General Context:
- Similar to stochastic k-beam search: keeps track of k states
- Different: generation of new states is "sexual" (crossover)
- In addition has: selection and mutation
- States must be represented as strings over some alphabet, e.g. 0/1 bits or decimal numbers
- The objective function is called the fitness function
Crossover
8-queens example:
- State representation: 8-string of numbers [1,8]
- Population: set of k states – here: k = 4
8-queens (cont.)
Step 1: Selection:
- Fitness function applied to the population
- Probability of being selected: proportional to fitness
- Select k/2 pairs of states
8-queens (cont.)
Step 2: Crossover:
- Select a random crossover point – here: 3 for pair one, 5 for pair two
- Crossover applied to the strings
8-queens (cont.)
Step 3: Mutation:
- With a small probability: change a string member to a random value
The algorithm:
Given: Fit (a fitness function)

  Pop := the set of k initial states;
  REPEAT
    New_Pop := {};
    FOR i = 1 to k Do
      x := RandomSelect(Pop, Fit);
      y := RandomSelect(Pop, Fit);
      child := crossover(x, y);
      IF (small_random_probability) Then child := mutate(child);
      New_Pop := New_Pop U {child};
    End_For
    Pop := New_Pop;
  Until a member of Pop is fit enough, or time is up

Note: different from the example, where both crossovers were used.
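The pseudocode above might look as follows in Python (a minimal sketch: fitness-proportional selection via `random.choices`, single-point crossover, rare point mutation; fitness values are assumed positive):

```python
import random

def genetic_algorithm(pop, fit, alphabet, generations, p_mut=0.05):
    """States are lists over `alphabet`; each generation is rebuilt
    from fitness-proportionally selected parents."""
    for _ in range(generations):
        weights = [fit(s) for s in pop]   # selection probabilities
        new_pop = []
        for _ in range(len(pop)):
            x, y = random.choices(pop, weights=weights, k=2)
            cut = random.randrange(1, len(x))        # crossover point
            child = x[:cut] + y[cut:]
            if random.random() < p_mut:              # mutation
                i = random.randrange(len(child))
                child[i] = random.choice(alphabet)
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fit)
```

For 8-queens, `alphabet` would be range(1, 9) and `fit` e.g. the number of non-attacking pairs.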
Comments on GA
- Very many variants – this is only one instance! (keep part of Pop, different types of crossover, …)
- What is the added value? If the encoding is well-constructed, substrings may represent useful building blocks; then crossover may produce more useful states!
  - Ex: (246*****) is a useful pattern for 8-queens
  - Ex, circuit design: some substring may represent a useful subcircuit
- In general: the advantages of GA are not well understood.
Interpretation of crossover: if we change our representation from 8 decimal numbers to 24 binary digits, how does the interpretation change?
Tabu Search (Glover 1986)
Another way to get HC out of local minima. Tabu = forbidden.
In order to get HC out of a local maximum:
- Naïve idea: allow one/some moves downhill. Problem: when switching back to HC, it will just move back!
- The Tabu search idea: keep a list TabuList with information on which new states are not allowed – where you are forbidden to go next.
Example: queen n3 is moved from row 2 to row 6.
Add to TabuList:
- (n3, 6, 2): don't make the opposite move, or
- (n3, 2): don't place n3 back on 2, or
- (n3): don't move n3
Hoped effect: visualized
- TabuList is kept short: only recent history determines it.
- TabuList determines an area where NOT to move back.
The algorithm:
Given: Fit (a fitness function)

  State := S;
  Best := S;
  TabuList := {};
  WHILE not(StopCondition) Do
    Candidates := {};
    FOR every Child in Neighbors(State) Do
      IF not_forbidden(Child, TabuList) then
        Candidates := Candidates U {Child};
    Succ := Maximal_Fit(Candidates);
    State := Succ;
    If Fit(Succ) > Fit(Best) then
      TabuList := TabuList U {ExcludeCondition(Succ, Best)};
      Best := Succ;
    Eliminate_old(TabuList);
  Return Best;
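A Python sketch of this loop (the `is_forbidden` test and `exclude_condition` constructor encode the domain-specific tabu attributes and are assumed given):

```python
def tabu_search(start, neighbors, fit, is_forbidden, exclude_condition,
                max_iters=100, tenure=7):
    """Always move to the best non-tabu neighbor, even downhill;
    record an exclusion when a new best is found; age out old entries."""
    state = best = start
    tabu = []
    for _ in range(max_iters):
        candidates = [c for c in neighbors(state)
                      if not is_forbidden(c, tabu)]
        if not candidates:        # everything tabu: stop
            break
        state = max(candidates, key=fit)
        if fit(state) > fit(best):
            tabu.append(exclude_condition(state, best))
            best = state
        del tabu[:-tenure]        # keep only the recent history
    return best
```

The tenure (how long an entry stays tabu) controls how large an area the search is pushed away from.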
Example: Personnel Rostering
- Initialise to satisfy the required amounts.
- Allow only vertical swaps (neighbors).
- If a swap has influenced a certain region of the timetable, do not allow any other swap to influence this region for a specified number of moves (Tabu list, Tabu attributes).
Shift 1 2 3 4 5 7 1 2 3 C
Pjotr A A A T T F 1
Ludwig C C R T T T 0
Clara T T R R F T T 1
Hildegard A A A T T T 0
Johann A C C T T F 1
Wolfgang R T T T C T T T 0
Guiseppe R F T T 1
Antonio R R R C F T T 1
Arranger 2 2 2 1 0 0
Tonesetter 1 2 1 1 0 0
Composer 1 1 1 1 2 0
Reader 3 1 1 1 2 0
Heuristics and Meta-heuristics
- More differences with Global search
- Examples of problems and heuristics
- Meta-heuristics
Heuristics
In Global search: h: States → N.
In Local search:
- How to represent a State?
- How to define the Neighbors?
- How to define the objective or Fit function?
All of these are heuristic choices that influence the search VERY much.
Finding routes:
- Given a weighted graph (V, E) and two vertices, 'source' and 'destination', find the path from source to destination with the smallest accumulated weight.
- Dijkstra: O(|V|² + |E|) (for sparse graphs O(|E| log |V|))
The objective function: in general, many different functions are possible.

Stock cutting:
- Cut a shape out × 10000, from rectangular sheets with fixed dimensions.
- NP-complete, even in one dimension (pipe-cutting).
Objective function? Minimize waste!
Personnel rostering
Constraints:
• Shifts have start times and end times
• Employees have a qualification
• A required capacity per shift is given per qualification
• Employees can work subject to specific regulations
• …
Shift 1 2 3 4 5 7 1 2 3 C
Pjotr A A A C T T F 1
Ludwig C C R R T T T 0
Clara T T C R R F T T 1
Hildegard A A A T T T 0
Johann C C T T F 1
Wolfgang C T T C T T T 0
Guiseppe R R A A F T T 1
Antonio R R C C F T T 1
Arranger 2 2 2 1 0 0
Tonesetter 1 2 1 1 0 0
Composer 1 1 1 1 2 0
Reader 3 1 1 1 2 0
Consists of assignments of employees to working shifts while satisfying all constraints. Constraints:
• Shifts have start times and end times
• Employees have a qualification
• A required capacity per shift is given per qualification
• Employees can work subject to specific regulations
• …
Objective function?
- Just solve the problem (CP)
- Number of constraints violated
- Weighted number of constraints violated
- Amount under assignment
- Amount over assignment
This may lead to the definition of a goal function (representing a lot of domain information).
Neighbors for Rostering
One can easily think of:
- Swaps
- Removals
- Insertions
- 'Large Swaps'
These 'easy' options do depend on the domain. They define 'steps' in a 'solution space' with an associated change in the goal function. One obvious heuristic is a hill-climber based on a selection of these possible steps.
Traveling Sales Person: Neighbors:
- n cities → n! routes
- 2-change connects them all!
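For illustration, one common reading of a 2-change move (also called 2-opt): delete two edges of the tour and reconnect by reversing the segment in between. A sketch:

```python
def two_change(tour, i, j):
    """Reverse the segment tour[i:j]; together with the two
    reconnected edges this yields a 2-change neighbor of the tour."""
    return tour[:i] + tour[i:j][::-1] + tour[j:]

def two_change_neighbors(tour):
    """All 2-change neighbors (keeping the starting city fixed)."""
    n = len(tour)
    return [two_change(tour, i, j)
            for i in range(1, n - 1)
            for j in range(i + 2, n + 1)]
```

Repeatedly moving to an improving 2-change neighbor is the classic 2-opt heuristic for TSP.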
Meta-heuristics
- All the methods (HC, RRHC, Sim. Ann., GA, Tabu Search, …) are meta-heuristics.
- They provide frameworks in which the user can plug in heuristics.
- At a higher level, meta-heuristics can be combined: use algorithm 1 until condition 1 holds; then use algorithm 2 until condition 2 holds; ….
- Specific combinations are known to work well for certain types of problems.
- ML is used to 'learn' which algorithms work better on which problems.
Concluding remarks
- Local Search in Continuous Spaces
- Variable Neighborhood Search
- Relation to BDA: some pointers
Continuous Search Spaces: some basic ideas
All problems studied so far were for discrete search spaces! How is HC different for continuous spaces?
In 1 dimension: the derivative.
At a point x1 where dh/dx(x1) > 0 (e.g. 3): the vector points in the ascending direction.
At a point x2 where dh/dx(x2) < 0 (e.g. −5): the vector still points in the ascending direction!
So let HC move in the direction of dh/dx; for instance: x := x + a · dh/dx.
At a point x3 where dh/dx(x3) = 0: eventually we get to a (local) maximum.
In n dimensions: the gradient.
The direction of the strongest ascent is the gradient: ∇h = (dh/dx1, dh/dx2, …, dh/dxn).
This gives gradient ascent / gradient descent approaches.
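A minimal gradient-ascent sketch (the step size `a` and the gradient function are assumed given; a fixed small step is the simplest choice):

```python
def gradient_ascent(x, grad, a=0.1, steps=1000):
    """Repeatedly move in the direction of the gradient:
    x := x + a * grad_h(x)."""
    for _ in range(steps):
        x = [xi + a * gi for xi, gi in zip(x, grad(x))]
    return x
```

With the gradient negated this is gradient descent, the workhorse of continuous optimization in ML.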
Example: airport placement
Place an airport nearest to n given cities C1, C2, …, Cn:
  h(x, y) = Σ_{i=1..n} (x − xCi)² + Σ_{i=1..n} (y − yCi)²
  dh/dx = 2 Σ (x − xCi), dh/dy = 2 Σ (y − yCi)
Solve: Σ (x − xCi) = 0 and Σ (y − yCi) = 0.
Iterative method: Newton-Raphson converges to the roots.
Solution: x = Σ xCi / n, y = Σ yCi / n.
Obviously correct: the center of the x- and y-coordinates.
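The closed-form solution is easy to check numerically (a sketch; the gradient components 2Σ(x − xCi) and 2Σ(y − yCi) should vanish at the centroid):

```python
def airport_location(cities):
    """Minimizer of h(x, y) = sum (x - xCi)^2 + sum (y - yCi)^2:
    the centroid of the cities."""
    n = len(cities)
    return (sum(x for x, _ in cities) / n,
            sum(y for _, y in cities) / n)

def gradient(cities, x, y):
    """(dh/dx, dh/dy) at (x, y)."""
    return (2 * sum(x - cx for cx, _ in cities),
            2 * sum(y - cy for _, cy in cities))
```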
Variable Neighborhood Search (Mladenovic and Hansen 1997)
Exploit various different ways of defining neighborhoods to move out of local optima.
Facts:
- A local minimum with respect to one neighborhood structure is not necessarily a local minimum for another.
- A global minimum is a local minimum with respect to all neighborhood structures.
- For many problems, local minima with respect to one or several neighbourhoods are relatively close to each other.
Idea: use different neighborhoods
- Define a number of different neighborhoods: different ways to compute successors.
- By moving to a different neighborhood, you may get out of the local optimum!
- If you cannot get out of a local optimum in one neighborhood, try the next one.
Algorithm
Select a set of neighbourhood structures Nl (l = 1 to lmax)

  State := S;
  l := 1;
  Repeat until termination condition met
    Exploration: find the best neighbour Succ of State in Nl(State);
    Acceptance: If h(Succ) > h(State) then State := Succ; l := 1;
                Else l := l + 1; If l > lmax then l := 1;
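A sketch of this scheme in Python, for maximization (`neighborhoods` is a list of successor functions N_1 … N_lmax; a fixed iteration budget stands in for the termination condition):

```python
def vns(start, neighborhoods, h, max_rounds=100):
    """Best-improvement search in neighborhood N_l; on failure switch
    to the next neighborhood structure, wrapping around to the first."""
    state, l = start, 0
    for _ in range(max_rounds):
        succ = max(neighborhoods[l](state), key=h)
        if h(succ) > h(state):
            state, l = succ, 0       # accept, and return to N_1
        else:
            l = (l + 1) % len(neighborhoods)
    return state
```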
Many variants exist! A possible topic for your presentation: find a paper on a variable neighborhood search technique or application.
Relation to BDA: some pointers
- Examples
- Sub-modularity
Discrete example:
Given A, a set of 30 possible features to diagnose the flu (Temp > 38, Diarrhea, Wife had the flu, Coughs, …).
Given h, a function from 2^A → N, giving how well a subset allows us to discriminate flu versus not-flu (e.g. h = 42% precision for some subset).
Find the best discriminating subset with 5 elements.
Discrete example (cont.):
- Start with the empty subset.
- Add the one element that increases h the most.
- Then add the next element that increases h the most, etc.
This is Hill Climbing!!
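The greedy scheme above, sketched for a coverage-style h (a classic submodular objective; the feature sets in the example are toy stand-ins):

```python
def greedy_select(elements, h, k):
    """Hill climbing on subsets: start empty, then repeatedly add
    the element with the largest marginal gain in h."""
    chosen = set()
    for _ in range(k):
        best = max((e for e in elements if e not in chosen),
                   key=lambda e: h(chosen | {e}))
        chosen.add(best)
    return chosen
```

With a coverage objective (how many cases a feature set distinguishes), this is exactly the loop on the slide.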
Continuous example:
Find the vector/line that best discriminates the + and − points.
E.g.: maximize the minimal distance to the points – a continuous optimization problem, solved e.g. by continuous local search.
In the discrete case: Submodularity
The property of Diminishing Returns:
If A ⊆ B and s ∈ S, then
  h(A ∪ {s}) − h(A) ≥ h(B ∪ {s}) − h(B)
This is the definition of submodularity: adding s to the smaller set A gains at least as much as adding it to the larger set B.
Submodularity holds for MANY objectives in ML!!
Relevance of HC for ML:
In ML, if h is a submodular function:
Theorem: If Greedy Local search returns A_greedy, then
  h(A_greedy) ≥ (1 − 1/e) · max_{A ⊆ S} h(A)    (1 − 1/e ≈ 63%)
IF P ≠ NP: this is the very best one can hope for (in polynomial time).
VERY many problems in ML are submodular!! Local search is very relevant for ML.
Reading assignment and presentations:
- Applications of Local Search
- Other Local Search methods, or variants of the studied methods
- Applications of Local Search in ML
- Further aspects of Submodularity
For the coming SAT-solving:
- MAX-SAT solving
- Mini-Sat
- Further relations between SAT and Local Search
Start with Google and Wiki, study at least one "real"/scientific source, and provide the references for your sources.