Page 1:

Directed Exploration of Complex Systems

International Conference on Machine Learning, Montreal, Canada

2009/06/16

Michael C. Burl*, Esther Wang^

Collaborators: Brian Enke+, William J. Merline+

*Jet Propulsion Laboratory

+Southwest Research Institute

^California Institute of Technology

Page 2:

Overview

Motivation: Physics-based simulations are widely used to model complex systems

+ high-fidelity representation of actual system behavior
-- cumbersome and computationally expensive


Goal: Build a simplified predictive model of system behavior and use this model to understand which input parameters produce a desired output (+1) from the simulator.

[Figure: schematic of the simulator plus grading script mapping input parameters θ = (θ1, θ2, ...) to a graded outcome q(θ) in {+1, -1}]

Assumptions: the simulator is deterministic (and non-chaotic) and the learner is only informed of a binary-valued outcome.

Really want a function q̂ : Θ → {-1, +1} such that q̂ agrees with q over most of the domain.

Page 3:

Example: Asteroid Collisions

Movie courtesy B. Enke, D. Durda, SwRI.

Currently working with asteroid collision simulator:

SPH (smoothed-particle hydrodynamics) + N-body gravity code

Various science questions can be studied with this simulator:

Under what conditions is a collision “catastrophic”?

How are asteroid satellites (e.g., Ida-Dactyl) generated?

How were particular asteroid families (e.g., Karin) formed?

But … the input space is 5-dimensional:
• Target body size (km)
• Impactor size (km)
• Impactor velocity (km/s)
• Impactor angle (degrees)
• Target composition (rubble/solid)

And … Each trial takes 1 CPU-day!

Grid sampling scales as ~k^d, where k is the number of steps along each dimension and d is the number of input dimensions (e.g., with k = 10 steps in each of the 5 dimensions, a full grid would require 10^5 = 100,000 trials, i.e., 100,000 CPU-days).

Page 4:

Directed Exploration Approach

Grey: points with unknown labels
White: points with known +1 labels
Black: points with known -1 labels

1. Learn a model from current knowledge.

2. Decide which unlabeled point would be most valuable to label.

3. Get the label for the selected point from the oracle (by running the simulator/grading script) and add that point and its label into “current knowledge”.

4. Repeat until we have a good understanding of the oracle.

[Figure: “Current Knowledge” panel showing the grid of labeled and unlabeled points]
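To make the four-step loop above concrete, here is a minimal sketch (hypothetical names, not the authors' code): the simulator is a callable returning +1 or -1, the candidate points are a fixed discretization of the input space, an off-the-shelf scikit-learn Gaussian Process classifier stands in for the learners discussed on the next slide, and point selection uses the Most-Confused-Point rule described later.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

def directed_exploration(simulator, candidates, n_rounds=400, seed=0):
    """Illustrative sketch of the directed-exploration loop.
    `simulator` maps a parameter vector theta to a label in {+1, -1};
    `candidates` is an (N, d) array discretizing the input space."""
    rng = np.random.default_rng(seed)
    # Seed "current knowledge" with a few randomly chosen labeled points.
    labeled = list(rng.choice(len(candidates), size=5, replace=False))
    unlabeled = [i for i in range(len(candidates)) if i not in labeled]
    y = [simulator(candidates[i]) for i in labeled]
    model = None

    for _ in range(n_rounds):
        if len(set(y)) < 2:
            # Not enough label diversity to fit a classifier yet; explore randomly.
            pick = int(rng.choice(unlabeled))
        else:
            # 1. Learn a model from current knowledge.
            model = GaussianProcessClassifier().fit(candidates[labeled], y)
            # 2. Most-Confused Point (MCP): the unlabeled point whose label is most uncertain.
            probs = model.predict_proba(candidates[unlabeled])[:, 1]
            pick = unlabeled[int(np.argmin(np.abs(probs - 0.5)))]
        # 3. Query the oracle by running the simulator / grading script.
        labeled.append(pick)
        y.append(simulator(candidates[pick]))
        unlabeled.remove(pick)
        # 4. Repeat with the augmented "current knowledge".
    return model
```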

Page 5:

Two Key Steps

1. Learn a model from current knowledge.
• Support Vector Machines (SVM)
• Kernel Density Estimation (KDE)
• Gaussian Process Classifier (GP)

2. Decide which unlabeled point would be most valuable to label.
• Passive – randomly select an unlabeled point.
• Most-Confused Point (MCP) – select the unlabeled point whose label is most uncertain.
• Most-Informative Point (MIP) – select the unlabeled point such that the expected information gain about the entire set of unlabeled points is maximized. [Holub et al., 2008]
• Meta-strategy – treat the base active learner’s valuation as expressing a preference over points; choose randomly while honoring the strength of the preference. (Similar to the epsilon-greedy approach used in RL to balance exploration and exploitation.) [Burl et al., 2006]
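As an illustration (not the authors' implementation), the Passive, MCP, and Meta strategies can be sketched as follows, assuming the current model supplies p_plus[i], an estimate of the probability that unlabeled point i is labeled +1. The preference weighting for the meta-strategy is one plausible choice, not necessarily the one used in [Burl et al., 2006]; MIP is sketched separately after the information-gain slide.

```python
import numpy as np

def select_point(p_plus, strategy="MCP", rng=None):
    """Toy selection rules over the unlabeled points, given p_plus[i] = estimated
    probability that unlabeled point i has label +1. Illustrative only."""
    rng = rng or np.random.default_rng()
    # Uncertainty is 1 when p = 0.5 (most confused) and 0 when p is 0 or 1.
    uncertainty = 1.0 - np.abs(2.0 * np.asarray(p_plus) - 1.0)
    if strategy == "PAS":    # Passive: uniform random choice
        return int(rng.integers(len(p_plus)))
    if strategy == "MCP":    # Most-Confused Point: maximal uncertainty
        return int(np.argmax(uncertainty))
    if strategy == "META":   # Meta-strategy: random choice weighted by preference
        weights = uncertainty + 1e-12
        return int(rng.choice(len(p_plus), p=weights / weights.sum()))
    raise ValueError(f"unknown strategy: {strategy}")
```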

Page 6:

Active Learning

• Determine which new point(s) in input space, if labeled, would be most informative for refining the boundaries of the solution region

• Imagine a discretized input space:
  L(n) = set of labeled instances at end of trial n (red or green)
  U(n) = set of unlabeled instances at end of trial n (black)

• Ideally, choose θ(n+1) from U(n) such that the classifier learned from L(n+1) = L(n) ∪ {θ(n+1)} will bring q̂ closer to q.

Page 7:

A defect with MCP

• MCP would choose Point A, but knowing the label of this point would not reveal much about the labels of the other unlabeled points.

• On the other hand, knowing the label of Point B would probably reveal the labels of the other nearby unlabeled points.

[Figure from [Holub et al., 2008], with candidate points A and B marked]

Page 8:

Most Informative Point

[Figure panels: “Current Knowledge”; “Prediction given Current Knowledge”; “Probability of labels being +1 (assuming y_h = +1)”; “Probability of labels being +1 (assuming y_h = -1)”; “Expected Information Gain”. Grey: points with unknown labels; White: points with known +1 labels; Black: points with known -1 labels.]

Expected information gain from hypothetically labeling a candidate point h:

I_gain = H_0 - (p_+ H_+ + p_- H_-)

H(Y) = -Σ_i p(y_i) log p(y_i)

where H(Y) is the entropy (a measure of uncertainty) over the predicted labels of the unlabeled points, p(y_i) is the probability that point i is labeled +1, H_0 is the entropy under current knowledge, H_+ and H_- are the entropies after assuming y_h = +1 or y_h = -1, and p_+ and p_- are the current estimates of the probabilities of those two outcomes.
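A sketch of the expected-information-gain computation for a single candidate point h follows. The fit_predict callback is an assumption for illustration (it retrains the model with the hypothetical label and returns P(y = +1) for the remaining unlabeled points), and the entropy used here is the full binary entropy rather than the single-term form shown on the slide.

```python
import numpy as np

def entropy(p):
    """Sum of binary entropies over points; p[i] = P(y_i = +1)."""
    p = np.clip(np.asarray(p), 1e-12, 1 - 1e-12)
    return float(-np.sum(p * np.log(p) + (1 - p) * np.log(1 - p)))

def expected_information_gain(fit_predict, X_lab, y_lab, X_unl, h):
    """I_gain for hypothetically labeling unlabeled point h.
    fit_predict(X, y, X_query) -> P(+1) at each query point (illustrative
    callback, not an API from the paper)."""
    p0 = fit_predict(X_lab, y_lab, X_unl)       # predictions given current knowledge
    H0 = entropy(np.delete(p0, h))              # current uncertainty over the other points
    p_plus = p0[h]                              # P(y_h = +1) under the current model
    rest = np.delete(X_unl, h, axis=0)
    expected_H = 0.0
    for y_h, p_h in ((+1, p_plus), (-1, 1.0 - p_plus)):
        # Retrain assuming label y_h, measure remaining uncertainty, weight by p_h.
        p = fit_predict(np.vstack([X_lab, X_unl[h]]), np.append(y_lab, y_h), rest)
        expected_H += p_h * entropy(p)
    return H0 - expected_H                      # I_gain = H_0 - (p_+ H_+ + p_- H_-)
```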

Page 9:

Experimental Setup

Oracles (13):
• LevelSet 2D Oracles (9)
• LevelSet 3D Oracles (3)
• Inverted Pendulum on Cart (1)

Algorithms (9):

        PAS   MCP   MIP   META
SVM     X     X     ---   X
KDE     X     X     X     ---
GP      X     X     X     ---

SVM = Support Vector Machine; KDE = Kernel Density Estimation; GP = Gaussian Process; PAS = Passive Learning (random); MCP = Most Confused Point; MIP = Most Informative Point; META = Randomize based on MCP

• Difficult to evaluate effectiveness with the real simulator since the true q() function is unknown.
• Use synthetic oracles instead.
• Added benefit: allow other researchers to replicate or improve upon results.

Page 10:

Set of Synthetic Oracles

[Figure: thumbnail images of the 13 synthetic oracles, numbered 1–13]

Page 11:

Results

[Figure: GP-MCP ROC curves for Oracle 7]

Run each algorithm-oracle combination for 400 rounds of active learning.

Repeat seven times to get a handle on the variance.

Use the classifier learned after r rounds to make a probabilistic prediction at each point in a fine discretization of the domain.

Threshold the probabilities and compare against ground truth to generate ROC curves.
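A minimal sketch of this evaluation step, assuming the learned classifier exposes a scikit-learn-style predict_proba and that ground-truth labels are known on the evaluation grid (true for the synthetic oracles but not for the real simulator):

```python
import numpy as np
from sklearn.metrics import roc_curve

def evaluate_round(model, grid_points, true_labels):
    """Score the classifier learned after r rounds on a fine discretization of
    the domain. Returns the ROC curve and Pd at Pfa = 0.05. Illustrative only."""
    scores = model.predict_proba(grid_points)[:, 1]  # P(label = +1) at each grid point
    fpr, tpr, _ = roc_curve(true_labels, scores)     # sweep the probability threshold
    pd_at_pfa_05 = float(np.interp(0.05, fpr, tpr))  # detection prob. at 5% false-alarm rate
    return fpr, tpr, pd_at_pfa_05
```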

Page 12:

[Figure: Pd @ Pfa = 0.05 vs. active-learning round on Oracle 9, with active and passive learning curves and the SVM-MCP curve marked]

• Active outperforms passive
• Exception: SVM-MCP

• KDE-MCP reaches 90% at round 100
• KDE-PAS doesn't until after round 300
=> Speedup of over 3x

Page 13:

Head-to-Head: KDE-MIP vs KDE-PAS @ Round 50

• Active wins on 9 oracles
• Passive wins on 2 oracles (#11, #12)
• Toss-up on 2 oracles (#9, #13)

Page 14:

Head-to-Head: KDE-MIP vs KDE-PAS @ Round 100

• Active is equal or better on almost all oracles

Page 15:

Oracle   KDE-PAS          KDE-MCP          KDE-MIP
1        0.880 +/- 0.030  0.988 +/- 0.006  0.962 +/- 0.021
2        0.865 +/- 0.045  0.972 +/- 0.009  0.981 +/- 0.005
3        0.772 +/- 0.043  0.947 +/- 0.018  0.929 +/- 0.019
4        0.713 +/- 0.035  0.813 +/- 0.048  0.822 +/- 0.015
5        0.642 +/- 0.047  0.795 +/- 0.040  0.832 +/- 0.033
6        0.763 +/- 0.051  0.890 +/- 0.015  0.885 +/- 0.017
7        0.588 +/- 0.052  0.731 +/- 0.023  0.634 +/- 0.028
8        0.674 +/- 0.055  0.793 +/- 0.023  0.815 +/- 0.015
9        0.622 +/- 0.038  0.725 +/- 0.035  0.615 +/- 0.035
10       0.603 +/- 0.079  0.781 +/- 0.050  0.881 +/- 0.029
11       0.264 +/- 0.039  0.235 +/- 0.057  0.204 +/- 0.026
12       0.332 +/- 0.071  0.513 +/- 0.036  0.167 +/- 0.028
13       0.246 +/- 0.079  0.140 +/- 0.054  0.235 +/- 0.114

MCP vs. MIP?

• MCP appears to be better than the more sophisticated MIP
• Similar result holds for GP-MCP vs GP-MIP
• MIP more sensitive to errors in estimated probabilities?

Page 16:

Head-to-Head: GP-MCP vs SVM-META @ Round 50

• GP-MCP and SVM-META perform similarly.

Page 17:

Decision Surface

Solution region obtained for a 3D version of the problem in prior work [Burl et al., 2006] using SVM-META.

Page 18:

Conclusion

• Directed exploration shows promise for efficiently understanding the behavior of complex systems modeled by numerical simulations.
  • Use simulator as an oracle to sequentially generate labeled training data.
  • Learn predictive model from currently available training data.
  • Use active learning to choose which simulation trials to run next.
  • Get new labeled example(s) and repeat.

• Performance was systematically evaluated over synthetic oracles.

• Active learning yielded significant improvement over passive learning.
  • In some cases, a 3x reduction in the number of trials to reach 90% PD at PFA = 0.05.

• Exploiting the time savings:
  • The same final result can be obtained with fewer simulation trials.
  • More simulation trials can be conducted in a given amount of time.
  • Higher-fidelity (e.g., finer spatio-temporal resolution) simulation trials can be used.
  • Trials can be concentrated at the boundary between interesting and non-interesting regions.

• There was no clear-cut winner among the different supervised learning techniques and active learning algorithms.

• The MCP method or SVM-META may be preferable to MIP due to lower computation.
  • Further study needed.

Page 19:

Future Work
• Kernel contraction
  • Shrink kernel bandwidth parameter as more data is acquired
• Hill-climbing for point selection
• Maximize throughput on computer cluster
  • Choose new point while other runs are in progress
• Scalability of MIP

Acknowledgements
W.J. Merline, B. Enke, D. Nesvornoy, D. Durda
A. Holub, P. Perona, D. Decoste, D. Mazzoni, L. Scharenbroich
Erika C. Vote SURF Fellowship, The Housner Fund
NASA Applied Information Systems Research Program, J. Bedekamp