
Instance Spaces for Objective Assessment of Algorithms and Benchmark Test Suites

Kate Smith-Miles

School of Mathematics and Statistics, University of Melbourne


Acknowledgements

This research is funded by ARC Discovery Project grant DP120103678 and ARC Australian Laureate Fellowship FL140100012.

The instance space and evolving instances methodology is joint work with Dr. Jano van Hemert (University of Edinburgh), Dr. Davaa Baatar, Dr. Mario Andrés Muñoz Acosta, and students Simon Bowly and Thomas Tan.

The generalisation to machine learning is joint work with Dr. Laura Villanova, Dr. Mario Andrés Muñoz Acosta, and Dr. Davaa Baatar.


The Importance of Test Instances

Standard practice: use benchmark instances to report algorithm strengths (but rarely weaknesses!)

The No Free Lunch (NFL) Theorem (Wolpert & Macready, 1997) warns against expecting an algorithm to perform well on all instances, regardless of their structure and characteristics.

The properties (or measurable features) of instances may provide explanations for an algorithm's behaviour across a range of instances → predictions, insights.

Requires the right kinds of test instances (diverse, challenging, real-world-like, etc.) and suitable features.

Reference

Smith-Miles, K. & Lopes, L., "Measuring Instance Difficulty for Combinatorial Optimization Problems", Comp. & Oper. Res., vol. 39(5), pp. 875-889, 2012.


Travelling Salesman Problem (TSP) Example

[Figure: two example city layouts, one an easy TSP instance and one a hard TSP instance]


What makes the TSP easy or hard?

A TSP Formulation (not the only one)

Let X_{i,j} = 1 if city i is followed by city j in the tour; 0 otherwise.

minimise  ∑_{i=1}^{N} ∑_{j=1}^{N} D_{i,j} X_{i,j}

subject to  ∑_i X_{i,j} = 1  ∀ j
            ∑_j X_{i,j} = 1  ∀ i
            ∑_{i∈S} ∑_{j∈S} X_{i,j} ≤ |S| - 1  ∀ S ⊂ {1, 2, ..., N}, S ≠ ∅  (subtour elimination)

TSP is NP-hard, but some instances are easy depending on properties of the inter-city distance matrix D.

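To make the last point concrete, here is a minimal Python sketch of summary properties of a distance matrix D that could serve as instance features. The statistics chosen here are illustrative only, not the talk's actual feature set:

```python
import numpy as np

def tsp_distance_features(D):
    """Example summary statistics of a symmetric inter-city distance matrix D.

    Illustrative only: the idea is that such properties of D may correlate
    with how easy or hard the instance is for a given algorithm.
    """
    n = D.shape[0]
    off_diag = D[~np.eye(n, dtype=bool)]   # all inter-city distances
    return {
        "n_cities": n,
        "mean_distance": off_diag.mean(),
        "coeff_variation": off_diag.std() / off_diag.mean(),
        "frac_distinct": np.unique(np.round(off_diag, 6)).size / off_diag.size,
    }

# Example: random uniform cities in the unit square
rng = np.random.default_rng(0)
pts = rng.random((20, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
print(tsp_distance_features(D))
```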


Questions

How do instance features help us understand the strengths and weaknesses of algorithms?

How can we infer and visualise algorithm performance across a huge "instance space"?

How easy or hard are the benchmark instances in the literature? How diverse are existing instances?

How can we objectively measure the relative performance of algorithms?

How can we generate new test instances to gain insights into algorithmic power?


Aims

Develop a new methodology to:
- visualise the "instance space" based on instance features
- visualise algorithm performance across the instance space
- define where algorithm performance is expected to be "good" (called the "algorithm footprint")
- measure the relative size of an algorithm's footprint
- evolve new instances at target locations in instance space

Enable objective assessment of algorithmic power.

Enable useful test instances to be generated with controllable characteristics to drive insights.

Understand and report the boundary of good performance of an algorithm: essential for good research practice, and to avoid deployment disasters.


Algorithm Selection Problem, Rice (1976)

[Figure: Rice's (1976) algorithm selection framework]


Applications of Rice's Framework

Rice and colleagues used this approach to predict the performance of the many methods (A) for numerical solution of elliptic partial differential equations (PDEs).

Reference

Weerawarana, S., Rice, J.R., et al., "PYTHIA: a knowledge-based system to select scientific algorithms", ACM Trans. on Math. Software, vol. 22(4), pp. 447-468, 1996.

It has also been used for pre-conditioners for linear system solvers, and extensively for machine learning (meta-learning).

Reference

Smith-Miles, K. A., "Cross-disciplinary perspectives on meta-learning for algorithm selection", ACM Computing Surveys, vol. 41(1), 2008.


Applications to Optimisation

Represents a relatively new direction for the optimisation community (combinatorial, continuous, black-box, etc.)

Much needed, given:
- a huge range of algorithms
- frequent statements like "currently there is still a strong lack of ... understanding of how exactly the relative performance of different meta-heuristics depends on instance characteristics."

Can also resolve the longstanding debate about how instance choice affects the evaluation of algorithm performance.

Reference

Hooker, J.N., "Testing heuristics: We have it all wrong", Journal of Heuristics, vol. 1, pp. 33-42, 1995.


Extending Rice's Framework


{I,F,Y,A} is the meta-data from which we learn


STEP 1: Collect meta-data {I,F,Y,A}

What makes the problem hard?

What features capture the difficulty of instances?

Which instances show sufficient diversity in features as well as algorithm performance?

Which algorithms will show sufficient diversity of performance that we can learn something about the effectiveness of their underlying mechanism?

Which performance metric(s) are most relevant?

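As a concrete picture of what STEP 1 produces, here is a minimal sketch, assuming the meta-data {I, F, Y, A} is held as plain arrays. The names, sizes, and "good" threshold are illustrative, not from the talk:

```python
import numpy as np

instances = ["inst_%03d" % i for i in range(200)]   # I: instance labels
algorithms = ["alg_A", "alg_B", "alg_C"]            # A: algorithm portfolio

rng = np.random.default_rng(1)
F = rng.random((len(instances), 10))                # F: one feature vector per instance
Y = rng.random((len(instances), len(algorithms)))   # Y: performance per (instance, algorithm)

# e.g. "good" performance defined by a threshold on the chosen metric
good = Y >= 0.8
print("Algorithm 'good' rates:", dict(zip(algorithms, good.mean(axis=0).round(2))))
```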


STEP 2: Create instance space

Which dimension-reduction method should be used to lose minimal information and create a visualisation that separates easy and hard instances in interpretable ways?

Which features should be selected?

Can the selected features accurately predict algorithm performance?

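One common dimension-reduction choice is PCA; the sketch below uses it purely as a stand-in (the methodology may well use a different, purpose-built projection), with a placeholder feature matrix F:

```python
# Sketch: project the feature matrix F to a 2-D "instance space".
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
F = rng.random((200, 10))    # placeholder feature matrix (instances x features)

# Standardise features, then keep the two directions of maximum variance
Z = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(F))
print(Z.shape)               # (200, 2): one 2-D coordinate per instance
```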


STEP 3: Measure algorithm footprints and gain insights into strengths and weaknesses

In which parts of the space is an algorithm expected to perform well or poorly?

How large is its footprint, relative to other algorithms?

Does its footprint overlap real-world instances?

Is it unique anywhere?

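A crude way to quantify a footprint is sketched below, under strong assumptions none of which are mandated by the talk: a 2-D instance space Z, a boolean "good performance" vector, and the convex hull of the good instances as the footprint boundary:

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(3)
Z = rng.random((200, 2))           # 2-D instance-space coordinates (illustrative)
good = Z[:, 0] + Z[:, 1] > 1.0     # where the algorithm performs "well" (illustrative)

# In 2-D, ConvexHull.volume is the enclosed area
hull = ConvexHull(Z[good])
total = ConvexHull(Z)
print("relative footprint area: %.2f" % (hull.volume / total.volume))
```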


STEP 4: Generate new test instances to fill gaps in the instance space

Is there a theoretical boundary beyond which instances can't exist?

Where are the benchmark instances located?

How diverse and challenging are they?

How can we set target points in the instance space and evolve new instances?

Which target points could provide important new information to influence our assessment?

Return to STEP 1 to revisit whether the features distinguish the new instances.

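A bare-bones illustration of evolving an instance towards a target point: the sketch below mutates a parameter vector encoding an instance and keeps improvements, with a trivial stand-in for the feature projection. All names here are hypothetical scaffolding:

```python
import numpy as np

rng = np.random.default_rng(4)

def project(x):
    # Stand-in for: compute the instance's features, then map them to 2-D space
    return x[:2]

target = np.array([0.9, 0.1])          # desired location in instance space
x = rng.random(5)                      # parameters encoding an instance
best = np.linalg.norm(project(x) - target)

for _ in range(2000):
    cand = x + rng.normal(scale=0.05, size=x.size)   # mutate
    d = np.linalg.norm(project(cand) - target)
    if d < best:                                     # keep improvements
        x, best = cand, d

print("distance to target:", round(best, 4))
```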


Graph Colouring


Given an undirected graph G(V, E) with |V| = n, colour the vertices such that no two vertices connected by an edge share the same colour.

Try to find the minimum number of colours needed to colour the graph (the chromatic number).

NP-hard problem → numerous heuristics for large n.

Many applications, such as timetabling, where edges represent conflicts between events.
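For illustration, one such heuristic is greedy colouring, available in NetworkX; this is an example heuristic, not necessarily one of those evaluated in the talk:

```python
import networkx as nx

# A random test graph (parameters are arbitrary)
G = nx.erdos_renyi_graph(n=50, p=0.2, seed=5)

# Greedy colouring gives an upper bound on the chromatic number
colouring = nx.coloring.greedy_color(G, strategy="largest_first")
print("colours used:", max(colouring.values()) + 1)
```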


What makes graph colouring hard?

In total we have 18 features that describe a graph instance G(V, E).

5 features relating to the nodes and edges (a computation sketch follows this list):
- The number of nodes or vertices in a graph: n = |V|
- The number of edges in a graph: m = |E|
- The density of a graph: the ratio of the number of edges to the number of possible edges
- Mean node degree: the degree of a node is the number of connections a node has to other nodes
- SD of node degree: the average node degree and its standard deviation can give us an idea of how connected a graph is

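A minimal sketch of computing these five node/edge features with NetworkX and NumPy; the mapping from feature names to library calls is my assumption:

```python
import numpy as np
import networkx as nx

G = nx.erdos_renyi_graph(n=50, p=0.2, seed=6)      # arbitrary test graph
degrees = np.array([d for _, d in G.degree()])

features = {
    "n": G.number_of_nodes(),
    "m": G.number_of_edges(),
    "density": nx.density(G),      # edges / possible edges
    "mean_degree": degrees.mean(),
    "sd_degree": degrees.std(),
}
print(features)
```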


Graph features (continued)

8 features related to cycles and paths on the graph
- The diameter of a graph: the maximum shortest-path distance between any two nodes.
- Average path length: the average length of the shortest paths over all node pairs.
- The girth of a graph: the length of the shortest cycle.
- The clustering coefficient: a measure of node clustering.
- Mean betweenness centrality: the average fraction of all shortest paths connecting all pairs of nodes that pass through a given node.
- SD of betweenness centrality: with the mean, the SD gives a measure of how central the nodes are in a graph.
- Szeged index / revised Szeged index: a generalisation of the Wiener number to cyclic graphs (correlates with bipartivity).
- Beta: the proportion of even closed walks to all closed walks (correlates with bipartivity).
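A rough sketch of the standard ones among these (the Szeged indices and Beta need bespoke code and are omitted; diameter and average path length assume a connected graph):

```python
# A hedged sketch using networkx; assumes G is connected.
import statistics
import networkx as nx

def path_cycle_features(G: nx.Graph) -> dict:
    bc = list(nx.betweenness_centrality(G).values())
    # girth: a minimum cycle basis always contains a shortest cycle
    cycles = nx.minimum_cycle_basis(G)
    return {
        "diameter": nx.diameter(G),
        "avg_path_length": nx.average_shortest_path_length(G),
        "girth": min((len(c) for c in cycles), default=float("inf")),
        "clustering_coefficient": nx.average_clustering(G),
        "mean_betweenness": statistics.mean(bc),
        "sd_betweenness": statistics.stdev(bc),
    }
```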


Graph features (continued)

5 features related to the adjacency and Laplacian matrices
- Mean eigenvector centrality: the Perron-Frobenius eigenvector of the adjacency matrix, averaged across all components.
- SD of eigenvector centrality: together with the mean, the standard deviation of eigenvector centrality gives us a measure of the importance of a node inside a graph.
- Mean spectrum: the mean of the absolute values of the eigenvalues of the adjacency matrix (a.k.a. the "energy" of the graph).
- SD of the set of absolute values of the eigenvalues of the adjacency matrix.
- Algebraic connectivity: the 2nd smallest eigenvalue of the Laplacian matrix, reflecting how well connected a graph is. Cheeger's constant, another important graph property, is bounded below by half the algebraic connectivity.
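A sketch of the spectral features with numpy and networkx (function and key names are illustrative):

```python
# A hedged sketch, assuming an undirected networkx graph G.
import numpy as np
import networkx as nx

def spectral_features(G: nx.Graph) -> dict:
    A = nx.to_numpy_array(G)
    spectrum = np.abs(np.linalg.eigvalsh(A))         # |eigenvalues| of adjacency
    L = nx.laplacian_matrix(G).toarray().astype(float)
    lap_eigs = np.sort(np.linalg.eigvalsh(L))
    ec = np.array(list(nx.eigenvector_centrality_numpy(G).values()))
    return {
        "mean_eig_centrality": ec.mean(),
        "sd_eig_centrality": ec.std(),
        "mean_spectrum": spectrum.mean(),
        "sd_spectrum": spectrum.std(),
        "algebraic_connectivity": lap_eigs[1],       # 2nd smallest Laplacian eigenvalue
    }
```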


Graph Colouring Instances

We use a set of 6788 instances from a variety of well-studied sources, and others we have generated to explore bipartivity

DataSet   # instances   Description
B         1000          Bipartivity controlled
C1        1000          Culberson: cycle-driven
C2         932          Culberson: geometric
C3        1000          Culberson: girth and degree inhibited
C4        1000          Culberson: IID edge probabilities
C5        1000          Culberson: weight-biased
D          743          DIMACS instances
E           20          Social network graphs
F           80          Sports scheduling
G           13          Exam timetabling


Graph Colouring Algorithms

We use the same 8 algorithms considered by Lewis et al.
- DSATUR: Brélaz's greedy algorithm (exact for bipartite graphs)
- RandGr: simple greedy first-fit colouring of random permutations of the nodes
- Bktr: a backtracking version of DSATUR (Culberson)
- HillClimb: a hill-climbing improvement on an initial DSATUR solution
- HEA: hybrid evolutionary algorithm
- TabuCol: tabu search algorithm
- PartCol: like TabuCol, but does not restrict the search to the feasible space
- AntCol: ant colony meta-heuristic

Reference

Lewis, R. et al. "A wide-ranging computational comparison of high-performance graph colouring algorithms". Computers & Operations Research 39(9), pp. 1933-1950, 2012.


HEA was reported as the best algorithm overall.
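For concreteness, here is a minimal sketch of the simplest of these, DSATUR (illustrative, not Lewis et al.'s implementation): repeatedly colour the uncoloured node whose neighbours already use the most distinct colours (its "saturation"), breaking ties by degree.

```python
import networkx as nx

def dsatur(G: nx.Graph) -> dict:
    colour = {}
    while len(colour) < G.number_of_nodes():
        def saturation(v):
            # number of distinct colours already used by v's neighbours
            return len({colour[u] for u in G[v] if u in colour})
        v = max((v for v in G if v not in colour),
                key=lambda u: (saturation(u), G.degree(u)))
        used = {colour[u] for u in G[v] if u in colour}
        colour[v] = next(c for c in range(len(G)) if c not in used)  # first free colour
    return colour
```

networkx ships the same strategy as nx.coloring.greedy_color(G, strategy="saturation_largest_first").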


Creating the Instance Space: Process

- Examine correlations to eliminate useless features
- Label instances as easy or hard based on the algorithm portfolio
- Project instances from the R^m feature space to 2-d
- Use a GA to select the optimal subset of m features (for 2 ≤ m ≤ 18) that best separates easy and hard instances

98% of the variation is explained by the top 2 axes
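As a sketch of the projection step (scikit-learn assumed; features are standardised first, and the GA-driven feature-subset search is omitted):

```python
# A hedged sketch: project an (instances × features) matrix X to the 2-d
# instance space with PCA and report the variance the two axes explain.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def project_to_2d(X: np.ndarray):
    Z = StandardScaler().fit_transform(X)                # standardise each feature
    pca = PCA(n_components=2).fit(Z)
    coords = pca.transform(Z)                            # 2-d instance coordinates
    return coords, pca.explained_variance_ratio_.sum()   # cf. the ~98% above
```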


Visualising the instance space


Defining goodness of algorithm performance

Acknowledging the arbitrariness of this definition, here we define an algorithm's performance to be "good" if the gap between the number of colours it needs to colour the graph and the number needed by the portfolio's winner is less than ε% within a fixed computational budget of 5×10^10 constraint checks.

We consider the cases ε = 0 (the algorithm is best) and ε = 0.05 (within 5% of the best).
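A small sketch of this labelling rule (the data structure colours[a][i], holding the number of colours algorithm a needed on instance i within the budget, is an illustrative assumption):

```python
# Hedged sketch: an algorithm is "good" on an instance if its relative gap to
# the portfolio winner is within eps (eps = 0 means it matches the winner).
def is_good(colours: dict, algo: str, instance: int, eps: float) -> bool:
    best = min(colours[a][instance] for a in colours)   # portfolio's winner
    gap = (colours[algo][instance] - best) / best       # relative colour gap
    return gap <= eps
```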


Footprints with ε = 0 (blue is good)


Defining difficulty of instances

If fewer than a given fraction β of the 8 algorithms find an instance easy, then we label the instance as hard for the portfolio of algorithms
- e.g. if β = 0.5 then an instance is labelled hard if fewer than half (only 1, 2 or 3 of the eight algorithms) find it easy

It is important that we understand where good algorithm performance is uninteresting (if all algorithms find the instances easy) or interesting (if other algorithms struggle)
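The portfolio-level label as a sketch (easy[a][i] is assumed to hold the per-algorithm "good" labels from the previous slide):

```python
# Hedged sketch: an instance is hard for the portfolio if fewer than a
# fraction beta of the algorithms find it easy.
def is_hard(easy: dict, instance: int, beta: float = 0.5) -> bool:
    n_easy = sum(bool(easy[a][instance]) for a in easy)
    return n_easy < beta * len(easy)    # e.g. fewer than 4 of 8 when beta = 0.5
```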


How many algorithms find an instance hard? (ε = 0)


Defining the Boundary of Algorithm Footprints

For a given algorithm, we consider the points labelled as good, and
- remove outliers through clustering,
- calculate the convex hull to define a generalised area of expected good performance,
- remove the convex hull of contradicting points,
- validate the accuracy of the remaining "footprint" through out-of-sample testing.
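A hedged sketch of the first two steps (the clustering method and its parameters are assumptions; the slides do not name them):

```python
import numpy as np
from scipy.spatial import ConvexHull
from sklearn.cluster import DBSCAN

def footprint_hull(good_pts: np.ndarray) -> ConvexHull:
    # Drop outliers: DBSCAN labels noise points as -1
    labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(good_pts)
    core = good_pts[labels != -1]
    return ConvexHull(core)   # generalised area of expected good performance
```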


Measuring the Area of Algorithm Footprints

Now we need only calculate the area defined by the footprint
- our metric of the power of an algorithm is the ratio of this area to the total area of the instance space

Area of Algorithm Footprint

Let H(S) be the convex hull of the region defined by a set of points S = {(x_i, y_i), i = 1, ..., η}. Then

Area(H(S)) = ½ | ∑_{j=1}^{k−1} (x_j y_{j+1} − y_j x_{j+1}) + (x_k y_1 − y_k x_1) |

with the subset {(x_j, y_j), j = 1, ..., k}, k ≤ η, defining the extreme points of H(S), taken in order around the hull (the shoelace formula).
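A direct translation of the formula (hull vertices assumed ordered around the hull, as scipy's ConvexHull returns them counter-clockwise in 2-d):

```python
import numpy as np

def shoelace_area(verts: np.ndarray) -> float:
    x, y = verts[:, 0], verts[:, 1]
    xn, yn = np.roll(x, -1), np.roll(y, -1)   # x_{j+1}, y_{j+1}, wrapping k -> 1
    return 0.5 * abs(np.sum(x * yn - y * xn))
```

The footprint power of an algorithm is then shoelace_area(pts[hull.vertices]) divided by the total area of the instance space.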


Algorithm Footprint Areas (% of instance space)


Learning to predict easy or hard instances for a given ε, β

A naive Bayes classifier in R^2 is 85% accurate
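A sketch of such a classifier on the 2-d coordinates (scikit-learn assumed; the 85% figure is the slide's reported result, not something this sketch guarantees):

```python
# Hedged sketch: cross-validated accuracy of naive Bayes predicting the
# easy/hard label from the 2-d instance-space coordinates.
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def easy_hard_accuracy(coords, labels) -> float:
    return cross_val_score(GaussianNB(), coords, labels, cv=10).mean()
```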


Recommending algorithms

Each SVM is 75-90% accurate but fails to identify the winner in some regions
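A sketch of the recommendation setup (one binary SVM per algorithm; the kernel choice is an assumption):

```python
# Hedged sketch: good_labels[a] is a boolean vector saying whether algorithm a
# performed well on each instance; each SVM learns that algorithm's footprint.
from sklearn.svm import SVC

def train_recommenders(coords, good_labels: dict) -> dict:
    return {a: SVC(kernel="rbf").fit(coords, y) for a, y in good_labels.items()}
```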


On which instance classes is each algorithm best suited?


Characterising algorithm suitability based on features

- Enables us to see which properties (not instance class labels) explain algorithm performance.
- The representation of the instance space (the location of instances) depends on the feature set.
- We have used a GA to select the optimal feature subset to maximise separability (reduce contradictions) in footprints, enabling a cleaner calculation of footprint areas.
- Considering all 18 features again, some interesting feature distributions clearly show the properties of instances that make them easy or hard for each algorithm.


Feature Distributions in Instance Space

Instance Spaces for Performance Evaluation 35 / 89

[Slides 36-40 / 89: feature-distribution figures across the instance space; the images are not preserved in this transcript.]


Reference

Pisanski, T., & Randi¢, M. �Use of the Szeged index and the revised Szeged index formeasuring network bipartivity�. Disc. Appl. Math, vol. 158, pp. 1936-1944, 2010.

Instance Spaces for Performance Evaluation 41 / 89


Reference

Estrada, E., & Rodríguez-Velázquez, J. A. �Spectral measures of bipartivity in complexnetworks�. Physical Review E, vol. 72(4), 046105, 2005.

Instance Spaces for Performance Evaluation 42 / 89


References

Balakrishnan, R., "The energy of a graph", Linear Algebra and its Applications, vol. 387, pp. 287-295, 2004.

Instance Spaces for Performance Evaluation 43 / 89


HEA is not best everywhere (no free lunch) ... why not?

References

Smith-Miles, K. A., Baatar, D., Wreford, B. and Lewis, R., "Towards Objective Measures of Algorithm Performance across Instance Space", Computers & Operations Research, vol. 45, pp. 12-24, 2014.

Instance Spaces for Performance Evaluation 44 / 89


Where instances are, and are not, and why?

The instances are projected into the 2-d instance space by the linear transformation

\[
\begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
=
\begin{bmatrix} 0.559 & 0.614 & 0.557 \\ -0.702 & -0.007 & 0.712 \end{bmatrix}
\begin{bmatrix} \text{density} \\ \text{algebraic connectivity} \\ \text{energy} \end{bmatrix}
\]
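As a minimal illustration (a Python sketch; the original analysis was done in MATLAB, and the feature vector below is a hypothetical, already-normalised placeholder), the projection is a single matrix product:

```python
import numpy as np

# Projection matrix from the slide: each 2-d coordinate is a linear
# combination of (density, algebraic connectivity, energy).
P = np.array([[ 0.559,  0.614, 0.557],
              [-0.702, -0.007, 0.712]])

def project(features):
    """Map a normalised 3-d feature vector to 2-d instance-space coordinates."""
    return P @ np.asarray(features)

v1, v2 = project([0.42, 0.31, 0.55])  # hypothetical normalised feature vector
```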

The upper and lower bounds on the features give us a bounding region in the instance space in which a valid instance could lie

We can select target points within this valid instance space, and use a GA to evolve random graphs so that we minimise their distance to the target point when projected (a sketch follows this slide)

This is a new method for instance generation, enabling non-trivial features to be controlled

Instance Spaces for Performance Evaluation 45 / 89
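The evolution step can be sketched as follows (Python with networkx; a simplified (1+1)-style loop rather than the GA used in the paper, and it computes raw rather than normalised features, so the scales are illustrative only):

```python
import numpy as np
import networkx as nx

P = np.array([[ 0.559,  0.614, 0.557],
              [-0.702, -0.007, 0.712]])

def features(G):
    """Density, algebraic connectivity and graph energy (raw values;
    the paper first normalises these by fixed feature bounds)."""
    eigvals = np.linalg.eigvalsh(nx.to_numpy_array(G))
    return np.array([nx.density(G),
                     nx.algebraic_connectivity(G),
                     np.abs(eigvals).sum()])

def fitness(G, target):
    """Distance between the projected graph and the target point."""
    return float(np.linalg.norm(P @ features(G) - target))

def mutate(G, rng):
    """Flip one random (non-)edge: the simplest possible variation operator."""
    H = G.copy()
    u, v = rng.choice(H.number_of_nodes(), size=2, replace=False)
    if H.has_edge(u, v):
        H.remove_edge(u, v)
    else:
        H.add_edge(u, v)
    return H

rng = np.random.default_rng(0)
target = np.array([0.8, -0.1])             # hypothetical target point
G = nx.gnp_random_graph(30, 0.2, seed=0)
while not nx.is_connected(G):
    G = nx.gnp_random_graph(30, 0.2)
best = fitness(G, target)
for _ in range(2000):
    H = mutate(G, rng)
    if nx.is_connected(H):                 # algebraic connectivity needs connectivity
        f = fitness(H, target)
        if f <= best:
            G, best = H, f
```

A full GA would maintain a population and recombine graphs, but the objective is the same: minimise the projected distance to the target point.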



Evolving new instances at target points (n=100)

References

Smith-Miles, K. A. and Bowly, S., "Generating new test instances by evolving in instance space", Computers & Operations Research, vol. 63, pp. 102-113, 2015.

Instance Spaces for Performance Evaluation 46 / 89


Summary

How do instance features help us understand the strengths and weaknesses of optimisation algorithms?
- Provided we have the right feature set, we can create a topology-preserving instance space
- The boundary between good and bad performance can be seen
- Feature selection methods may improve topology preservation

How can we infer and visualise algorithm performance across a huge "instance space"?
- PCA has been used to visualise instances in 2-d (or 3-d); see the sketch after this slide
- More than 90% of the variation in the data was preserved, but some important information (as well as noise) is naturally lost
- If the 4th largest eigenvalue is still large, then we lose too much detail, and other dimension reduction methods are needed

Instance Spaces for Performance Evaluation 47 / 89
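The eigenvalue criterion in the last bullet can be checked directly (a Python sketch with scikit-learn on a hypothetical, standardised feature matrix):

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical meta-data: 1000 instances x 18 standardised features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 18))

pca = PCA().fit(X)
ratios = pca.explained_variance_ratio_
print("variance captured by first 2 PCs:", ratios[:2].sum())

# Rule of thumb from the slide: if the 4th component still carries a lot of
# variance (the 0.10 threshold is an arbitrary illustration), a 2-d or 3-d
# linear projection loses too much detail.
if ratios[3] > 0.10:
    print("consider other dimension-reduction methods")

Z = PCA(n_components=2).fit_transform(X)  # 2-d coordinates for plotting
```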



Summary, continued

How can we objectively measure algorithm performance?
- relative size of the area of algorithm footprints (a hull-area sketch follows this slide)
- Convex or concave hulls can be used depending on generalisation comfort (out-of-sample testing can help)
- The area of the footprint depends on the definition of "good"

How easy or hard are the benchmark instances?
- Randomly generated instances tend to be in the middle (average features), and are usually not discriminating
- Discriminating instances can be generated intentionally using a GA (fitness is algorithm performance, but this blows up for harder instances)
- Diversity of instances is critical for a meaningful instance space

Alternatively, can we generate new test instances at target points in the instance space? (more scalable)

Instance Spaces for Performance Evaluation 48 / 89
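A footprint area can be computed in a few lines (a Python sketch with SciPy; the coordinates and the "good" labels are hypothetical stand-ins for real meta-data):

```python
import numpy as np
from scipy.spatial import ConvexHull

rng = np.random.default_rng(1)
Z = rng.uniform(-2, 2, size=(500, 2))   # 2-d instance-space coordinates
good = Z[:, 0] + Z[:, 1] > 0            # hypothetical "good performance" labels

# For 2-d points, ConvexHull.volume is the enclosed area (.area is perimeter).
relative_footprint = ConvexHull(Z[good]).volume / ConvexHull(Z).volume
```

A convex hull is the most generous generalisation; a concave hull shrinks the footprint towards the observed instances, which is why out-of-sample testing helps decide how much to trust the enclosed area.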



Black Box Optimisation

We are given only a sample of points from the continuous decision (input) space, and known objective function values (output space)

We have no analytical expression of the objective function

We need to find the best point in the decision space to minimise the objective function with minimal function evaluations (a random-search sketch follows this slide):
- Input space, $\mathcal{X} \subset \mathbb{R}^D$
- Output space, $\mathcal{Y} \subset \mathbb{R}$
- Problem dimensionality, $D \in \mathbb{Z}^+$
- Candidate solutions, $\mathbf{x} \in \mathcal{X}$
- Candidate cost, $y \in \mathcal{Y}$
- Target solution, $\mathbf{x}_t \in \mathcal{X}$
- Target cost, $y_t \in \mathcal{Y}$

Instance Spaces for Performance Evaluation 49 / 89
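A minimal baseline makes the black-box restriction concrete (a Python sketch; the sphere objective is a stand-in, since in genuine BBO no expression is available):

```python
import numpy as np

def random_search(f, D, budget, lo=-5.0, hi=5.0, seed=0):
    """Treat f as a black box: we may evaluate it, never inspect it."""
    rng = np.random.default_rng(seed)
    best_x, best_y = None, np.inf
    for _ in range(budget):
        x = rng.uniform(lo, hi, size=D)
        y = f(x)                        # one function evaluation
        if y < best_y:
            best_x, best_y = x, y
    return best_x, best_y

f = lambda x: float(np.sum(x ** 2))     # stand-in objective (unknown in practice)
x_best, y_best = random_search(f, D=2, budget=10_000)
```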



What makes BBO hard?

We depend on a sample to provide knowledge of the landscape

Algorithms perform differently and can struggle with certain landscape characteristics
- multimodality, poor conditioning, deceptiveness, etc.

We use sample-based Exploratory Landscape Analysis (ELA) metrics to learn what makes BBO hard

These features will also form our instance space, enabling algorithm footprints to be seen and new test instances to be generated

Instance Spaces for Performance Evaluation 50 / 89



BBO meta-data: instances

The noiseless COCO benchmark set is used: 24 basis functions defined within $\mathcal{X} = [-5, 5]^D$

The functions are divided into five categories:
- Separable (f1-f5)
- Low or moderately conditioned (f6-f9)
- Unimodal with high conditioning (f10-f14)
- Multimodal with adequate global structure (f15-f19)
- Multimodal with weak global structure (f20-f24)

New instances are generated by scaling and transforming the basis functions (translations, rotations, oscillations); a sketch follows this slide
- We generated instances [1, ..., 15] at D = 2, 5, 10, 20, resulting in 1440 problem instances.

Instance Spaces for Performance Evaluation 51 / 89
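The instance-generation idea can be sketched as composing a basis function with simple transformations (Python; the rotation/translation scheme below is illustrative, not COCO's exact construction):

```python
import numpy as np

def make_instance(f, D, seed):
    """Derive an instance of basis function f via a random rotation and shift."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(D, D)))   # random orthogonal matrix
    x_opt = rng.uniform(-4, 4, size=D)             # shifted optimum location
    f_opt = rng.uniform(-100, 100)                 # shifted optimum value
    return lambda x: f(Q @ (np.asarray(x) - x_opt)) + f_opt

sphere = lambda z: float(np.sum(z ** 2))
g = make_instance(sphere, D=5, seed=3)   # one of many instances of the sphere
```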



BBO meta-data: features

Sample $\mathbf{X} \subset \mathcal{X}$ of size $D \times 10^3$, generated by Latin Hypercube Design (LHD)

Feature selection was applied to 18 candidate features (9 were chosen) to maximise performance-prediction accuracy using an SVM; the selected features are tabulated below, and an illustrative computation follows the table.

Method              Feature   Description                               Transformations
------------------  --------  ----------------------------------------  ---------------
Surrogate models    R̄²_LI     Fit of linear regression model            Unit scaling
                    R̄²_Q      Fit of quadratic regression model         Unit scaling
                    CN        Ratio of min to max quadratic coeff.      Unit scaling
Significance        ξ(D)      Significance of D-th order                z-score, tanh
                    ξ(1)      Significance of first order               z-score, tanh
Cost distribution   γ(Y)      Skewness of the cost distribution         z-score, tanh
                    κ(Y)      Kurtosis of the cost distribution         log10, z-score
                    H(Y)      Entropy of the cost distribution          log10, z-score
Fitness sequences   Hmax      Maximum information content with          z-score
                              nearest-neighbour sorting

Instance Spaces for Performance Evaluation 52 / 89
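The surrogate-model features, for example, are just goodness-of-fit statistics on the sample (a Python sketch; the adjusted-R² formula is standard, the sample size follows the slide, and the objective is a stand-in):

```python
import numpy as np
from scipy.stats import qmc
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

def adjusted_r2(model, X, y):
    n, p = X.shape
    return 1 - (1 - model.score(X, y)) * (n - 1) / (n - p - 1)

D = 5
f = lambda x: np.sum(x ** 2, axis=1)    # stand-in black-box objective

# LHD sample of size D x 10^3 over [-5, 5]^D, as on the slide.
X = qmc.scale(qmc.LatinHypercube(d=D, seed=0).random(D * 1000),
              [-5] * D, [5] * D)
y = f(X)

lin = LinearRegression().fit(X, y)
Xq = PolynomialFeatures(degree=2).fit_transform(X)
quad = LinearRegression().fit(Xq, y)

r2_li = adjusted_r2(lin, X, y)          # feature R̄²_LI
r2_q = adjusted_r2(quad, Xq, y)         # feature R̄²_Q
```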


BBO Algorithms

We consider a variety of algorithms, selected using ICARUS to avoid overlapping performance:

Reference

Muñoz, M. (2013). Decision support systems for the automatic selection of algorithms for continuous optimization problems. PhD thesis, The University of Melbourne.

Instance Spaces for Performance Evaluation 53 / 89


Visualising the instance space

Instance Spaces for Performance Evaluation 54 / 89


Algorithm Footprints

Instance Spaces for Performance Evaluation 55 / 89

Solved if at least 1 of 15 runs comes within $10^{-8}$ of $y_t$ within a budget of $10^4 \times D$ function evaluations
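Written out, the criterion looks like this (a Python sketch; run_algorithm is a hypothetical stand-in for whichever solver is being benchmarked, returning the best cost it found):

```python
def is_solved(run_algorithm, f, D, y_target, runs=15, tol=1e-8):
    """Solved if at least one of `runs` independent runs gets within tol of
    the target cost inside a budget of 1e4 * D function evaluations."""
    budget = int(1e4) * D
    for seed in range(runs):
        y_best = run_algorithm(f, D, budget, seed)  # hypothetical solver API
        if y_best - y_target <= tol:
            return True
    return False
```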


Recommended algorithms

Instance Spaces for Performance Evaluation 56 / 89


Feature Distributions in Instance Space

Instance Spaces for Performance Evaluation 57 / 89


Methodology - Evolving New Instances

We focus on 2-d functions for ease of visualisation
We generate 720 instances ([1, ..., 30] at D = 2, of the 24 basis functions)
Sample $\mathbf{X} \subset \mathcal{X}$ of size $2 \times 10^4$ using LHD
Each function is summarised as a 9-d feature vector, then projected to 2-d using PCA

Instance Spaces for Performance Evaluation 58 / 89



Methodology - Evolving New Instances

We use Genetic Programming to evolve a program (function), represented as a binary tree
- leaves are variables or constants
- nodes are operations {×, +, −, (·)², sin, cos, tanh, exp}

Used GPTIPS v1.0 in MATLAB (GP for symbolic regression); a representation sketch follows this slide
- Population size: 400
- Number of generations: 100
- Tournament size: 7
- Elite fraction: 0.1
- Target cost: $\sqrt{\varepsilon}$, where $\varepsilon$ is the machine precision
- Number of inputs: D = 2
- Max tree depth: 10
- Constant range: [−1000, 1000]
- Tournament selection: lexicographic

Instance Spaces for Performance Evaluation 59 / 89
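The representation can be sketched without GPTIPS (a Python stand-in, not the MATLAB toolbox itself): a random expression tree over the slide's operation set that can be evaluated anywhere in the 2-d decision space.

```python
import numpy as np

UNARY = {"sq": lambda a: a * a, "sin": np.sin, "cos": np.cos,
         "tanh": np.tanh, "exp": lambda a: np.exp(np.clip(a, -50, 50))}
BINARY = {"add": np.add, "sub": np.subtract, "mul": np.multiply}

def random_tree(rng, depth=0, max_depth=10):
    """Leaves are variables x0/x1 or constants in [-1000, 1000]; internal
    nodes are drawn from the slide's operation set."""
    if depth == max_depth or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ("x", int(rng.integers(2)))
        return ("c", float(rng.uniform(-1000, 1000)))
    if rng.random() < 0.5:
        return ("u", str(rng.choice(list(UNARY))),
                random_tree(rng, depth + 1, max_depth))
    return ("b", str(rng.choice(list(BINARY))),
            random_tree(rng, depth + 1, max_depth),
            random_tree(rng, depth + 1, max_depth))

def evaluate(tree, x):
    kind = tree[0]
    if kind == "x":
        return x[tree[1]]
    if kind == "c":
        return tree[1]
    if kind == "u":
        return UNARY[tree[1]](evaluate(tree[2], x))
    return BINARY[tree[1]](evaluate(tree[2], x), evaluate(tree[3], x))

rng = np.random.default_rng(0)
f = random_tree(rng)                      # one candidate test function
y = evaluate(f, np.array([0.5, -1.2]))    # a point in [-5, 5]^2
```

GPTIPS evolves such trees with tournament selection; here the fitness would be the distance between the tree's feature vector and the target point in instance space.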



Recreating Existing Functions (S1)

We attempt to generate a known function from COCO by selecting a target point coinciding with that function
We perform 5 iterations for each of 50 randomly selected target instances
A few examples ...

Instance Spaces for Performance Evaluation 60 / 89


Recreating Existing Functions - Sphere


Sphere - unimodal

Recreating Existing Functions - Discus


Discus - poor conditioning

Recreating Existing Functions - Katsuura


Katsuura - highly multimodal with periodic structure

Generating Functions across the Instance Space (S2)

- rugged instances in the top left corner
- conditioning worsens from left to right
- large plateaus at the bottom of the space

New Test Functions - Examples


How hard are these new test functions?

Comparing BIPOP-CMA-ES on COCO, evolved COCO-like (S1), and evolved diverse (S2) functions

The probability of solving within the budget of function evaluations is
- 0.94 for COCO
- 0.67 for S1
- 0.61 for S2

solid line - FEs to reach the experimental optimum
dashed line - FEs to reach within 10⁻⁸ of the experimental optimum

Returning to Machine Learning

The UCI repository needs to be re-evaluated
- does it support insights into algorithm performance?
- where are the really challenging (not just large) instances that stress the best algorithms?
- data quality has also been questioned

References

N. Macià and E. Bernadó-Mansilla (2014). "Towards UCI+: A mindful repository design", Information Sciences, vol. 261, pp. 237–262.

S. L. Salzberg (1997). "On comparing classifiers: Pitfalls to avoid and a recommended approach", Data Mining and Knowledge Discovery, vol. 1, no. 3, pp. 317–328.

Problem Instances I

We use a total of 236 classification instances (binary and multiclass) comprising
- 211 UCI instances (University of California Irvine)
- 19 KEEL instances (Knowledge Extraction Evolutionary Learning)
- 6 DCol instances (Data Complexity Library)

Instances contain up to 11,055 observations and 1,558 attributes
- larger ones have been excluded from this study due to the computational budget

Instances with missing values are retained, and also duplicated with the missing values estimated by the mean for the class.
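A small sketch of that per-class mean imputation step (pandas assumed; the label column name "y" is a placeholder):

```python
import pandas as pd

def impute_by_class_mean(df: pd.DataFrame, label: str = "y") -> pd.DataFrame:
    """Return a copy where each missing value is replaced by the mean of its
    feature within the same class."""
    filled = df.copy()
    feats = [c for c in df.columns if c != label]
    filled[feats] = (df.groupby(label)[feats]
                       .transform(lambda col: col.fillna(col.mean())))
    return filled
```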

Algorithms A

We consider 10 supervised learners:
- Naive Bayes (NB)
- Linear Discriminant (LD)
- Quadratic Discriminant (QD)
- Classification and Regression Trees (CART)
- J48 Decision Tree (J48)
- k-Nearest Neighbor (kNN)
- Support Vector Machines with linear (L-SVM), polynomial (poly-SVM) and radial basis (RB-SVM) kernels
- Random Forests (RF)

R packages used were e1071, MASS, rpart, RWeka, and kknn, with default parameters
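For illustration, rough scikit-learn stand-ins for these ten learners might look as follows (the study itself used the R packages above with default parameters, so these Python analogues are assumptions):

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

ALGORITHMS = {
    "NB": GaussianNB(),
    "LD": LinearDiscriminantAnalysis(),
    "QD": QuadraticDiscriminantAnalysis(),
    "CART": DecisionTreeClassifier(),                     # CART-style tree
    "J48": DecisionTreeClassifier(criterion="entropy"),   # closest analogue to J48/C4.5
    "kNN": KNeighborsClassifier(),
    "L-SVM": SVC(kernel="linear"),
    "poly-SVM": SVC(kernel="poly"),
    "RB-SVM": SVC(kernel="rbf"),
    "RF": RandomForestClassifier(),
}
```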

Performance Metric Y

For each algorithm running on each instance, we record:
- error rate (1 − classification accuracy)
- precision
- recall
- F-measure
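A sketch of recording these four metrics per (algorithm, instance) pair, using scikit-learn; macro averaging for the multiclass case is an assumption:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score)

def performance(y_true, y_pred):
    """The four metrics recorded for one algorithm on one instance."""
    return {
        "error_rate": 1.0 - accuracy_score(y_true, y_pred),
        "precision":  precision_score(y_true, y_pred, average="macro",
                                      zero_division=0),
        "recall":     recall_score(y_true, y_pred, average="macro",
                                   zero_division=0),
        "f_measure":  f1_score(y_true, y_pred, average="macro",
                               zero_division=0),
    }
```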

Possible Features

We generate a set of 509 candidate features from 8 categories:
- simple (dimensionality, types of attributes, missing values, outliers, class attributes)
- statistical (descriptive statistics and canonical correlations, PCA, etc.)
- information theoretic (entropy, mutual information, etc.)
- landmarking (performance of simple landmarkers such as NB or single-node trees)
- model-based (properties of decision trees such as shape and size of tree, width and depth)
- concept characterization (measures of sparsity of input space and irregularity in input-output distributions)
- complexity (separability, geometry, topology and density of manifolds)
- itemsets & association rules (attribute & class relationships)
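Toy examples of candidate meta-features from three of the eight categories (illustrative stand-ins, not the exact 509 definitions):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def simple_features(X, y):
    """'Simple' category: dimensionality, class count, missingness."""
    n, p = X.shape
    return {"n_obs": n, "n_attr": p, "n_classes": len(np.unique(y)),
            "pct_missing": float(np.isnan(X).mean())}

def class_entropy(y):
    """'Information theoretic' category: entropy of the class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def nb_landmarker(X, y):
    """'Landmarking' category: cross-validated error of a Naive Bayes model."""
    return 1.0 - cross_val_score(GaussianNB(), X, y, cv=5).mean()
```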

What makes classification hard?

Sensitivity Analysis and Feature Selection

We construct perturbed datasets that intentionally increase or decrease the presence of each challenge

For each instance, 6108 statistical significance tests were conducted (509 × 12) with Bonferroni correction
- settings give a 99% chance of correctly discarding a feature, and a 90% chance of correctly selecting a feature with a cause-effect relationship to the challenge

Repeat this procedure for 6 small instances (balloons, blogger, breast, breast with 2 attributes, iris, iris with 2 attributes)

For each challenge, we select the features that consistently captured the challenge across the 6 instances

Correlations between features (> 0.7) and between features and algorithm performance (< 0.3) were used to eliminate features
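A sketch of this screening logic with the slide's thresholds; the two-sample t-test is a stand-in for the actual significance tests:

```python
import numpy as np
from scipy import stats

def significant_features(F_base, F_pert, alpha=0.01):
    """Bonferroni-corrected tests: keep features whose distribution shifts
    when the challenge is perturbed."""
    m = F_base.shape[1]
    pvals = np.array([stats.ttest_ind(F_base[:, j], F_pert[:, j]).pvalue
                      for j in range(m)])
    return np.flatnonzero(pvals < alpha / m)

def filter_by_correlation(F, perf, r_feat=0.7, r_perf=0.3):
    """Drop features weakly related to performance (|r| < 0.3) or redundant
    with an already-kept feature (|r| > 0.7)."""
    to_perf = np.array([abs(stats.pearsonr(F[:, j], perf)[0])
                        for j in range(F.shape[1])])
    kept = []
    for j in np.argsort(-to_perf):               # strongest predictors first
        if to_perf[j] < r_perf:
            break                                # all remaining are weaker
        if all(abs(stats.pearsonr(F[:, j], F[:, k])[0]) < r_feat for k in kept):
            kept.append(j)
    return kept
```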

Selected Features F

The final set of 10 features is:

Performance Prediction using F

Regression predicts the error rate of each algorithm

Classification labels each instance as easy or hard for the algorithm (easy if ER < 0.2, else hard)

SVM used, with parameters optimised via 10-fold cross-validated grid search
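A scikit-learn sketch of the labelling and tuning step (the grid values are assumptions):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def fit_hardness_classifier(F, error_rate):
    """Label instances easy (ER < 0.2) or hard, then tune an SVM by
    10-fold cross-validated grid search over the features F."""
    y = (error_rate >= 0.2).astype(int)          # 0 = easy, 1 = hard
    grid = {"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]}
    search = GridSearchCV(SVC(kernel="rbf"), grid, cv=10)
    return search.fit(F, y)
```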

A new projection algorithm

PCA maximises the variance retained, but this isn't exactly what we need to support insights through visualisation

We want a projection that creates linear trends (interpretable) in both the feature distribution and algorithm performance

We solve numerically using BIPOP-CMA-ES (note: PCA gives a locally optimal solution only)
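One way to sketch such an objective: score a candidate 2×d projection matrix by how well linear models in the projected coordinates explain each feature and each algorithm's performance, then minimise numerically. A generic optimiser stands in below for BIPOP-CMA-ES, and the exact loss is an assumption:

```python
import numpy as np
from scipy.optimize import minimize

def projection_loss(a_flat, F, Y):
    """Total squared residual of linear fits of each column of [F | Y]
    on the projected coordinates Z = F A^T."""
    A = a_flat.reshape(2, F.shape[1])
    Z = F @ A.T                                  # n x 2 instance coordinates
    Z1 = np.c_[np.ones(len(Z)), Z]               # add an intercept
    targets = np.c_[F, Y]
    beta, *_ = np.linalg.lstsq(Z1, targets, rcond=None)
    return float(((targets - Z1 @ beta) ** 2).sum())

def fit_projection(F, Y, seed=0):
    """Numerically search for the 2 x d projection (generic optimiser here)."""
    rng = np.random.default_rng(seed)
    x0 = rng.normal(size=2 * F.shape[1])
    res = minimize(projection_loss, x0, args=(F, Y), method="Nelder-Mead")
    return res.x.reshape(2, F.shape[1])
```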

Instance Space (feature distribution)

Instance Space (performance distribution)

Size features

Algorithm Footprints (good is ER<20%)

Footprint Area Calculations

Other views: who is best, where are easy/hard instances?

The need for new test instances

The current instances don't enable us to see much difference in algorithm footprints, despite fundamentally different algorithm mechanisms (e.g. kNN, RF, RB-SVM)

There are areas of the instance space that are unexplored, or very sparse
- e.g. at [0.744, 2.833] there is only one instance in the area for which J48 was the only algorithm with ER < 20%. More data is needed to support conclusions about strengths and weaknesses

The boundary of possible instances in the space can be estimated using projections of the min and max features (either theoretical or observed)


A procedure to generate new instances at target points

- We use a Gaussian Mixture Model (GMM) to generate a dataset with κ classes on q attributes

- The probability of an observation x being sampled from the GMM is:

  $$\mathrm{pr}(x) = \sum_{k=1}^{\kappa} \phi_k \, \mathcal{N}(\mu_k, \Sigma_k), \qquad \phi_k \in \mathbb{R},\ \mu_k \in \mathbb{R}^q,\ \Sigma_k \in \mathbb{R}^{q \times q}$$

- We tune the parameter vector of the GMM so that the distance of its feature vector to the target feature vector is minimised

- Tuning is a continuous black-box optimisation problem, and we use BIPOP-CMA-ES to optimise the parameters (see the sketch below)
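A minimal sketch of this loop, under stated assumptions: diagonal covariances, a toy `dataset_features` standing in for the real meta-feature computation, and scipy's differential evolution standing in for BIPOP-CMA-ES (the `cma` package provides the actual algorithm). The dimensions deliberately match Iris (150 observations, q = 4 attributes, κ = 3 classes) for the experiments on the next slide.

```python
# A minimal sketch, not the authors' implementation. A parameter vector theta
# encodes a kappa-component GMM with diagonal covariances; we sample a dataset
# from it, compute a (stand-in) feature vector, and minimise its distance to a
# target feature vector with a black-box optimiser.
import numpy as np
from scipy.optimize import differential_evolution

KAPPA, Q, N = 3, 4, 150          # classes, attributes, observations (as Iris)
rng = np.random.default_rng(0)

def sample_dataset(theta):
    """Draw N labelled observations from the GMM encoded by theta."""
    theta = np.asarray(theta).reshape(KAPPA, 2 * Q + 1)
    phi = np.abs(theta[:, 0]) + 1e-9
    phi /= phi.sum()                         # mixing proportions phi_k
    mu = theta[:, 1:Q + 1]                   # component means mu_k
    sd = np.abs(theta[:, Q + 1:]) + 1e-3     # diagonal std deviations
    labels = rng.choice(KAPPA, size=N, p=phi)
    X = mu[labels] + sd[labels] * rng.standard_normal((N, Q))
    return X, labels

def dataset_features(X, y):
    """Hypothetical stand-in for the real meta-feature computation."""
    return np.array([X.std(), np.abs(np.corrcoef(X.T)).mean()])

def objective(theta, target):
    X, y = sample_dataset(theta)             # noisy objective: a real
    return np.linalg.norm(dataset_features(X, y) - target)  # implementation
                                             # would average several samples

dim = KAPPA * (2 * Q + 1)                    # 27 GMM parameters
target = np.array([1.0, 0.4])                # hypothetical target features
result = differential_evolution(objective, [(-3.0, 3.0)] * dim,
                                args=(target,), maxiter=50, seed=0)
print(f"distance to target after tuning: {result.fun:.4f}")
```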


Two initial experiments

- Reproduce a dataset that lives at the location of Iris (Iris size and features)?
- Generate datasets elsewhere (Iris size, different features)?
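Continuing the GMM sketch above (reusing `objective`, `dataset_features`, and `dim`), the first experiment might look as follows; this is hypothetical usage, with scikit-learn's `load_iris` supplying the 150 × 4, 3-class dataset whose feature vector serves as the target.

```python
# Hypothetical usage of the sketch above: aim the generator at the point in
# (stand-in) feature space where Iris itself lives.
from scipy.optimize import differential_evolution
from sklearn.datasets import load_iris

iris = load_iris()
target = dataset_features(iris.data, iris.target)   # Iris's feature vector
result = differential_evolution(objective, [(-3.0, 3.0)] * dim,
                                args=(target,), maxiter=50, seed=1)
print(f"distance to Iris's features: {result.fun:.4f}")
```

The second experiment simply replaces `target` with a feature vector from an empty region of the space.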


Discussion

- Computational efficiency issues (is there a better encoding of a problem instance than a GMM?)

- The boundary of all instances is not the same as the boundary of instances of a given size (since size can affect feature ranges)

- We need some theoretical work on these boundaries, like the graph-theory results we have drawn upon in other work

- There is much value in generating challenging smaller instances, to understand how structural properties, and not just size, affect complexity

- The instance space depends on the chosen features, which were selected based on current instances, so iteration is required as we generate new instances


Conclusions

The proposed methodology is a first step towards providing researchers with a tool to:

- report the strengths and weaknesses of their algorithms
- show the relative power of an algorithm either across the entire instance space, or in a particular region of interest (e.g. real-world problems)
- evaluate the suitability of existing benchmark instances
- evolve new interesting and challenging test instances


Next Steps

- We are currently developing the key components of the methodology (evolved instances, feature sets) for a number of broad classes of optimisation problems, as well as machine learning, time series forecasting, etc.

- We are planning a web resource where researchers can download instances that span the instance space, upload their algorithm performance results, and download footprint metrics and visualisations to support their analysis

- The approach also generalises to parameter selection within algorithms, and to choice of formulation

- We hope to be providing a free lunch for researchers soon!


Further Reading

Methodology
- K. Smith-Miles and S. Bowly, "Generating new test instances by evolving in instance space", Comp. & Oper. Res., vol. 63, pp. 102-113, 2015.
- K. Smith-Miles et al., "Towards Objective Measures of Algorithm Performance across Instance Space", Comp. & Oper. Res., vol. 45, pp. 12-24, 2014.
- L. Lopes and K. Smith-Miles, "Generating Applicable Synthetic Instances for Branch Problems", Operations Research, vol. 61, no. 3, pp. 563-577, 2013.
- K. Smith-Miles and L. Lopes, "Measuring Instance Difficulty for Combinatorial Optimization Problems", Comp. & Oper. Res., vol. 39, no. 5, pp. 875-889, 2012.
- K. Smith-Miles, "Cross-disciplinary perspectives on meta-learning for algorithm selection", ACM Computing Surveys, vol. 41, no. 1, article 6, 2008.

Applications
- Machine Learning: L. Villanova, M. A. Muñoz, D. Baatar, and K. Smith-Miles, "Instance Spaces for Machine Learning Classification", Machine Learning, vol. 107, no. 1, pp. 109-147, 2018.
- Time Series Forecasting: Y. Kang, R. Hyndman, and K. Smith-Miles, "Visualising Forecasting Algorithm Performance using Time Series Instance Spaces", International Journal of Forecasting, vol. 33, no. 2, pp. 345-358, 2017.
- Continuous Optimisation: M. A. Muñoz and K. Smith-Miles, "Performance analysis of continuous black-box optimization algorithms via footprints in instance space", Evolutionary Computation, vol. 25, no. 4, pp. 529-554, 2017.
- Travelling Salesman Problem: K. Smith-Miles and J. van Hemert, "Discovering the Suitability of Optimisation Algorithms by Learning from Evolved Instances", Annals of Mathematics and Artificial Intelligence, vol. 61, no. 2, pp. 87-104, 2011.
- and others on Quadratic Assignment Problem, Job Shop Scheduling, Timetabling, Graph Colouring: see kate.smithmiles.wixsite.com/home
