Scientific Domain-Informed Machine Learning

24
www.anl.gov Scientific Domain-Informed Machine Learning ALCF Simulation, Data, and Learning Workshop October 3, 2018 Prasanna Balaprakash Computer Scientist Mathematics and Computer Science Division and Leadership Computing Facility Argonne National Laboratory Joint work with P. Malakar, V. Vishwanath, K. Kumaran, V. Morozov, J. Wang, R. Kotamarthi

Transcript of Scientific Domain-Informed Machine Learning

Page 1: Scientific Domain-Informed Machine Learning

www.anl.gov

Scientific Domain-Informed Machine Learning

ALCF Simulation, Data, and Learning Workshop October 3, 2018

Prasanna Balaprakash Computer Scientist

Mathematics and Computer Science Division and Leadership Computing Facility Argonne National Laboratory

Joint work with P. Malakar, V. Vishwanath, K. Kumaran, V. Morozov, J. Wang, R. Kotamarthi

Page 2: Scientific Domain-Informed Machine Learning

DOEscien*ficdataformachinelearning

Regulardomain

Spa*al Temporal

Irregulardomain

Tabular Graphs Manifolds Pointclouds

2

Page 3: Scientific Domain-Informed Machine Learning

Challengesforirregulardomains

3

MachineLearning

Irregulardomains

Domainknowledge

Automa*on

Scalability

Page 4: Scientific Domain-Informed Machine Learning

Casestudies•  Scien*ficapplica*onperformancemodeling•  Surrogatemodelinginweathersimula*on

4

Page 5: Scientific Domain-Informed Machine Learning

Predic*vemodelsinHPCapplica*ons

•  Performance(run*me)predic*ons*llchallenging•  ML-basedperformancemodelingtobridgethegap•  Insightsonimportantknobsthatimpactsperformance•  Helpprunelargesearchspacesinperformancetuning

[H.Hoffmann,WorldChangingIdeas,SA2009] [S.Williamsetal.,ACM2009]

5

Page 6: Scientific Domain-Informed Machine Learning

Biasvariancetradeoff

6

•  Nofreelunch:nosinglemethodwillworkwellonalldataset•  Allsupervisedlearningalgorithmsseektoreducebiasandvarianceina

differentway

Fortmann-Roe,2012

Deeplearningsummerschoollecture,CIFAR,2016

Page 7: Scientific Domain-Informed Machine Learning

Supervisedlearningmethods

Classicalmethods

Regulariza*on

Instance-based

Recursivepar**oning

Kernel-based

Bagging

Boos*ng

7

Page 8: Scientific Domain-Informed Machine Learning

Deepneuralnetworks

x0

x1

x2

xn

h10

h11

h12

h13

h14

h15

h16

h17

h18

h20

h21

h22

h23

h24

h25

O0

8

Page 9: Scientific Domain-Informed Machine Learning

Applica*onsandpla`orms

•  Miniapps(#noofdatapoints):•  miniMD(<2K);O(1024)nodes•  miniAMR(<1K);O(4096)nodes•  miniFE(6Kto15K);O(8192)nodes•  LAMMPS(<1K);O(1024)nodes

9

Page 10: Scientific Domain-Informed Machine Learning

Impactoffeatureengineering

•  NoFeatureEngineering(No-FE)•  applica*oninputparameters

•  FeatureEngineering(FE)•  applica*oninputparameters•  ra*ooftheapplica*onproblemsizeandthe

numberofnumberofprocesses•  inverseofthenumberofprocesses•  binarylogarithmofnumberofprocesses

10X20:80crossvalida*on

10

Featureengineeringhasasignificantimpactontheaccuracy

Page 11: Scientific Domain-Informed Machine Learning

Impactofhardwarepla`orms

Algorithmiccomplexityhasmoreimpactthanhardwarepla`orms

11

Page 12: Scientific Domain-Informed Machine Learning

Impactoftrainingdatasizeonaccuracy

Nonlinearmethodsleveragelargetrainingdatasize 12

Page 13: Scientific Domain-Informed Machine Learning

Transferlearning

Pla`orm1 Pla`orm2

Freezeweights

Retrainweights

13

Page 14: Scientific Domain-Informed Machine Learning

Transferlearning

Transferlearningsignificantlyimprovespredic*onaccuracy 14

Page 15: Scientific Domain-Informed Machine Learning

Extrapola*onfromsmalltolargeproblemsizes

Incorpora*ngdomainknowledgehelpsinexplora*on 15

Page 16: Scientific Domain-Informed Machine Learning

Casestudies•  Scien*ficapplica*onperformancemodeling•  Surrogatemodelinginweathersimula*on

16

Page 17: Scientific Domain-Informed Machine Learning

Planetaryboundarylayer•  Planetaryboundarylayer(PBL)

–  lowestpartoftheatmosphere–  directlyinfluencedbyitscontactwitha

planetarysurface–  respondstochangesinsurfaceradia*ve

forcing–  flowvelocity,temperature,moisture,

etc.,displayrapidfluctua*ons–  computa*onallyexpensiveinweather

researchandforecas*ngmodel

17

hhps://en.wikipedia.org/wiki/Planetary_boundary_layer

Page 18: Scientific Domain-Informed Machine Learning

Planetaryboundarylayer

Input:Surfaceproper*es,fluxes,andgroundtemperature

Output:profilesofwind,temperature,moisture

18

Page 19: Scientific Domain-Informed Machine Learning

Surrogateneuralnetwork

Input:Surfaceproper*es,fluxes,andgroundtemperature

Output:profilesofwind,temperature,moisture 19

Page 20: Scientific Domain-Informed Machine Learning

Predic*onresults(level1:closetosurface) (level16:~2km)

20

Page 21: Scientific Domain-Informed Machine Learning

Predic*onresults

21

meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)

Page 22: Scientific Domain-Informed Machine Learning

Predic*onresults

22

meridionalwind(south-northwind) ver*calvelocity(upanddownmo*ons)

Page 23: Scientific Domain-Informed Machine Learning

Challengesforirregulardomains

23

MachineLearning

Irregulardomains

Domainknowledge

Automa*on

Scalability

Page 24: Scientific Domain-Informed Machine Learning

Acknowledgements•  U.S.DOE

– ANLLDRD– ALCF– ASCR,EarlyCareerResearchProgram

24