Using Small-Scale History Data to Predict Large-Scale ...

Using Small-Scale History Data to Predict Large-Scale Performance of HPC Application

Wenju Zhou

University of Science and Technology of China

2020.05.12

Outline

ustc 2

Background

Two-level model

Experiment results and analysis

Conclusions

Background

High Performance Computing (HPC)

ustc 4

- architecture:

computing nodes,

interconnect, ...

- usage:

physical simulations,

molecular modeling, ...

- effect:

provide computing power,

reduce experimental risk, ...fig 1. architecture of HPC

Performance Modeling

ustc 5

∙ tuning

∙ scheduling

∙ energy saving

···

features

· system settings

· application input

· execution options

· compiler flags

···

performance

· execution time

· memory

· throughput

· energy

···

fig 2. architecture of HPC

Performance Modeling

ustc 6

categories desciption advantaes limitations

simulative methods simulating execution

by simulators

concept machine overhead

analytical methods analysis of code and

system by experts

accuracy &

Interpretability

overhead

empirical methods statistical analysis of

historical data

automation cold start

∙ system complexity

∙ application complexity

∙ interaction between

system and application

attention on

empirical methods

Machine Learning in Performance Modeling

ustc 7

A technology to learn

knowledge and experience

from historical data.

A powerful approach for

empirical modeling

methods.

D. N. Hieu, T. T. Minh, T. Van Quang, B. X. Giang, and T. Van Hoai, “A machine learning-

based approach for predicting the execution time of cfd applications on cloud computing

environment,” in International Conference on Future Data and Security Engineering, pp. 40–52,

Springer, 2016

P. Malakar, P. Balaprakash, V. Vishwanath, V. Morozov, and K. Kumaran, “Benchmarking

machine learning methods for performance modeling of scientific applications,” in 2018

IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance

Computer Systems (PMBS), pp. 33–44, IEEE, 2018.

Machine Learning

Extrapolation Issue of Machine Learning

ustc 8

- Extrapolation:

testset feature subspace outside trainset feature

subspace.

- Issue

high-accuracy interpolations but low-accuracy

extrapolations —— machine learning can achieve

high-accuracy predictions in trainset subspace, but

prediction accuracy reduces drastically outside

trainset subspace.

Two-level Model

Problem Description

ustc 10

),,...,,,(: X 321 pxxxx n

Features Label

execution time,

is known

from historical

logs, is to

be predicted.

input parameters,

Identically and

Independent

Distributed in

.and testtrain XX

number of

processors,

,X [c,d] in p

,X [a,b] in p

trainy

Machine Learning Mechanism

ustc 11

Approximate model:

Target model:

Independent and

Identically Distributed

(IID) hypothesis

Motivation of Two-Level Model

ustc 12

Issues of one-level model

- Simple algorithms:

· own extrapolation ability in

some way

· cannot learn complex

relations between input

parameters and performance

- Sophisticated algorithms:

· learn accurate relations

between input parameters and

performance

· overfitting on small-scale data

Overview of two-level model

- Interpolation level:

learn accurate interpolation

model and predict small-scale

performance under input

parameters in

- Extrapolation level:

construct model owning

extrapolation ability and predict

with corresponding small-

scale performance predictions

Workflow of Two-Level Model

ustc 13

fig 3. workflow of two-level model

Prediction Specification

ustc 14

𝑋𝑡𝑒𝑠𝑡𝑖 ∶ (𝑥1𝑖 , 𝑥2𝑖 , 𝑥3𝑖 , . . . , 𝑥𝑛𝑖, 𝑝𝑖)

𝑋𝑡𝑒𝑠𝑡𝑖1 ∶ 𝑥1𝑖 , 𝑥2𝑖 , 𝑥3𝑖 , . . . , 𝑥𝑛𝑖, 𝑝 − 𝑙𝑖𝑠𝑡1

𝑋𝑡𝑒𝑠𝑡𝑖2 ∶ 𝑥1𝑖 , 𝑥2𝑖 , 𝑥3𝑖 , . . . , 𝑥𝑛𝑖, 𝑝 − 𝑙𝑖𝑠𝑡2

𝑋𝑡𝑒𝑠𝑡𝑖𝑡 ∶ 𝑥1𝑖 , 𝑥2𝑖 , 𝑥3𝑖 , . . . , 𝑥𝑛𝑖, 𝑝 − 𝑙𝑖𝑠𝑡𝑡

𝑌𝑡𝑒𝑠𝑡𝑖1

𝑌𝑡𝑒𝑠𝑡𝑖2...

𝑌𝑡𝑒𝑠𝑡𝑖𝑡

𝑦𝑡𝑒𝑠𝑡𝑖

interpolation

extrapolation𝑝 − 𝑙𝑖𝑠𝑡

Analysis of Interpolation Level

ustc 15

- Task:

small-scale performance predictions as extrapolation

level training data

- Requirement:

· high accuracy

· random error distribution

fig 4. influence of error distribution

Interpoaltion Level Model

ustc 16

· feature randomness

· sample randomness

· high accuracy

· random error distribution

fig 5. random forest

Analysis of Extrapolation Level

ustc 17

- Task:

· predict large-scale performance with small-scale

predictions

- Chanllenge:

· extrapolation (scalability) —— scalability may change

with input parameters

· interpolation error —— only several data points for

every input parameter combinations, easy to overfit

causes error amplification

Extrapolation Level Model

ustc 18

Performance Modeling Normal Form(PMNF):

Interpolation Error

Scalability

Multi-Task Lasso:

Experiment Results & Analysis

Experiment

ustc 20

Platform NodeMonte Carlo Benchmark (MCB) Features

Kripke Features

Evaluation

ustc 21

Baseline· Random Forest

· Multi-Layer Perceptron

· EPMNF

· Log Rgression

Metrics· MAE

· MAPE

· RMSE

Result: Comparison of Different Methods

ustc 22

· extrapolations are harder than interpolations

· the performance of the same method in diffrent applications

varies greatly

· two-level model perform better than one-level model

Result: Comparison of Different Methods

ustc 23

fig 6. case study of different methods

Result: Single-Task v.s. Multi-Task

ustc 24

Multi-Task Learning

· data amplification

· feature selection

fig 7. residuals of random forest

Result: Single-Task v.s. Multi-Task

ustc 25

fig 8. case study of multi-task learning

Result: Clustering or Not

ustc 26

K-means Cluster- distance:

- effect:

partition tasks into cluster by distance (relatedness) to

learn high-related tasks jointly.

reasons for insignificance· input sensitivity

· experimental feature values range

· clustering algorithm

Result: Clustering or Not

ustc 27

fig 9. case study of clustering

Conclusions

ustc 29

- analyze the extrapolation problem and issues of one-

level model

- propose a two-level model to predict large-scale

performance with only small-scale historical data

- conduct experiment to validate the effectiveness of

two-level model

Future Work

ustc 30

- improve two-level model by choosing more fitting

clustering and multi-task learing algorithms

- improve scalability models with considering system

information to model cross-platform performance

- research whether two-level model works for

extrapoaltion problem caused by input parameter

Thanks.Welcome further communication.

Wenju Zhou

zhouwenj@mail.ustc.edu.cn

Using Small-Scale History Data to Predict Large-Scale ...

Documents

Transcript of Using Small-Scale History Data to Predict Large-Scale ...

Understanding large-scale instabilities of …gershwin.ens.fr/.../beamer-lectures_cambridge_2012.pdflayered models V. Zeitlin Unstable large-scale ﬂows Modeling large-scale processes

Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.

LEVERAGING LARGE-SCALE HUMAN HISTORY TO PREDICT THE ECONOMIC FUTURE AND HEALTH OF THE HUMAN RACE [INBOUND 2014]

The inﬂ uence of large offshore wind farms on the North ... · projections predict a capacity of 40 GW by the year 2020. Despite expectations for the construction of large-scale

Large Scale PHP

RoSS.pFTU Large-Scale,RoSS.pFTU Large-Scale

WARM-UP QUESTIONS WHAT IS MAP SCALE?(DOK 1) PREDICT- WHAT ARE LARGE AND SMALL SCALE MAPS?(DOK 2) HOW DO YOU MEASURE DISTANCE ON A MAP?(DOK 2) Map.

Large Scale Artwork

A Multi-scale Computational Model to predict the ...

Large-Scale Image Classification using High Performance ...grids.ucs.indiana.edu/ptliupages/publications/Large-Scale Image... · Large-Scale Image Classification using High Performance

Scaling Up Accurate Electronic Structure Calculations by ... · developed here, with the aim to predict electronic structure of large-scale (strongly) correlated materials completely

Propagation Models Large scale models predict behavior averaged over distances >> Function of distance & significant environmental features, roughly frequency.

Human Genome Sciences’ Large Scale Manufacturing · PDF fileHuman Genome Sciences’ Large Scale Manufacturing Facility ... HUMAN GENOME SCIENCES LARGE SCALE MANUFACTURING ... Above

Development of a Scale to Predict Patterns of Cognitive ...

Damage modelling of large and small scale composite panels … · 2019-12-05 · envisages full-scale modeling capabilities of its vessels and the ability to predict damage and survivability

SysFera and ROMEO Make Large-Scale CFD Simulations Only … · 2017-12-23 · SysFera and ROMEO Make Large-Scale CFD Simulations Only 3 Clicks Away ... Numerical methods to predict

Large Scale

PREDIcT: Towards Predicting the Runtime of Large Scale Iterative Analytics · 2019-07-12 · PREDIcT: Towards Predicting the Runtime of Large Scale Iterative Analytics Adrian Daniel

1 large scale vs small scale

Can Large Scale Climate Controls Help Predict Seasonal ... Perrin-High... · Can Large Scale Climate Controls Help Predict Seasonal Tornado Activity? Brittany Perrin University of