Presented by Parallel Discrete Event Simulation (PDES) at ORNL Kalyan S. Perumalla, Ph.D. Modeling & Simulation Group Computational Sciences & Engineering


Presented by

Parallel Discrete Event Simulation (PDES) at ORNL

Kalyan S. Perumalla, Ph.D.

Modeling & Simulation Group

Computational Sciences & Engineering


2 Perumalla_PDES_SC07

PDES: Selected application areas

• Network simulation: Internet protocols, security, P2P designs, …
• Traffic simulation: emergency planning/response, environmental policy analysis, urban planning, …
• Social dynamics simulation: operations planning, foreign policy, marketing, …
• Sensor simulations: wide area monitoring, situational awareness, border surveillance, …
• Organization simulations: command and control, business processes, …

Emergencies

Current and future defense systems

Protection and awareness systems

Global and local events


High-performance PDES kernel requirements

Global time synchronization
• Total time-stamped ordering of events
• Paramount for accuracy

Fast synchronization
• Scalable, application-independent, time-advance mechanisms
• Critical for real-time and as-fast-as-possible execution

Support for fine-grained events
• Minimal overhead relative to event processing times
• Application computation is typically only 5 µs to 50 µs per event

Conservative, optimistic, and mixed modes
• Need support for the principal synchronization approaches
• Useful to choose mode on per-entity basis at initialization
• Desirable to vary mode dynamically during simulation

General-purpose API
• Reusable across multiple applications
• Accommodates multiple techniques: lookahead, state saving, reverse computation, multicast, etc.
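To make the "total time-stamped ordering" requirement concrete, here is a minimal sketch of a future event list that pops events in a reproducible total order. The class name, the tie-breaking key (time stamp, then destination LP id, then insertion order), and the sample events are all assumptions for illustration; they are not taken from µsik.

```python
import heapq
import itertools


class FutureEventList:
    """Priority queue that pops events in a total, reproducible order."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # insertion counter breaks any remaining ties

    def schedule(self, timestamp, lp_id, payload):
        # Assumed ordering key: (time stamp, destination LP id, insertion order).
        heapq.heappush(self._heap, (timestamp, lp_id, next(self._seq), payload))

    def pop(self):
        timestamp, lp_id, _, payload = heapq.heappop(self._heap)
        return timestamp, lp_id, payload

    def __len__(self):
        return len(self._heap)


def run(fel):
    """Process every event in time-stamp order and return the processing trace."""
    trace = []
    while fel:
        trace.append(fel.pop())
    return trace


fel = FutureEventList()
fel.schedule(2.0, 1, "b")
fel.schedule(1.0, 7, "a")
fel.schedule(2.0, 0, "c")  # same time stamp as "b"; the LP id decides the order
trace = run(fel)
print(trace)
```

Breaking ties deterministically is one way to keep repeated runs identical when several events carry the same time stamp.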


µsik—unique PDES “micro-kernel”

Some recent results of fine-grained PDES benchmark (phold)

Among the largest/fastest scalability results in parallel discrete event simulation

Plot legend: LP = 1 thousand, MSG = 1 million; LP = 1 million, MSG = 100 million; LP = 1 million, MSG = 1 billion

Unique mixed-mode kernel
• The only scalable mixed-mode kernel in the world
• Supports conservative, optimistic, and mixed modes in a single kernel

Used in a variety of applications
• DES-based vehicular traffic models
• DES-based plasma physics models
• DES-based neurological models
• Largest Internet simulations

[Figure: two plots for the three benchmark configurations. Aggregate events/s (millions, 0 to 70) versus number of processors (128 to 1,024); µs per event (0 to 35) versus number of processors (0 to 1,024).]


µsik scaled to more than 10^4 processors

Some recent results of fine-grained PDES benchmark
• On Blue Gene Watson (BGW) at IBM TJ Watson Research Center
• Well-known PHOLD benchmark, with 1 million logical processes, 10 million pucks

The largest and fastest scalability results in PDES recorded to date
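For readers unfamiliar with PHOLD, the following is a minimal sequential sketch of a PHOLD-style workload: a fixed population of messages circulates among logical processes, and each processed event forwards exactly one new event to a randomly chosen LP. The function name, the exponential delay distribution, the lookahead value, and the small problem size are assumptions for illustration. This is not the benchmark configuration or code used for the results above.

```python
import heapq
import random


def phold(num_lps, num_msgs, end_time, lookahead=1.0, mean_delay=1.0, seed=0):
    """Run a sequential PHOLD-style workload.

    Returns (events processed, events per LP, messages still in flight).
    """
    rng = random.Random(seed)
    fel = []  # future event list of (time stamp, sequence number, destination LP)
    seq = 0
    # Seed the fixed message population at random LPs and start times.
    for _ in range(num_msgs):
        start = rng.expovariate(1.0 / mean_delay)
        heapq.heappush(fel, (start, seq, rng.randrange(num_lps)))
        seq += 1

    per_lp = [0] * num_lps
    processed = 0
    while fel and fel[0][0] <= end_time:
        now, _, lp = heapq.heappop(fel)
        per_lp[lp] += 1
        processed += 1
        # Each processed event forwards one new event, so the population stays constant.
        dest = rng.randrange(num_lps)
        delay = lookahead + rng.expovariate(1.0 / mean_delay)
        heapq.heappush(fel, (now + delay, seq, dest))
        seq += 1
    return processed, per_lp, len(fel)


processed, per_lp, in_flight = phold(num_lps=100, num_msgs=1000, end_time=50.0)
print(processed, in_flight)
```

Because each event does almost no computation, a workload of this shape mostly measures kernel and synchronization overhead, which is why it suits fine-grained benchmarking.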

[Figure: four plots for conservative, mixed, and optimistic modes. Events/s (millions, 0 to 600) versus number of processors (2,048 to 16,384); speedup (1,000 to 17,000), µs per event (0 to 50), and parallel efficiency (0 to 1.4), each versus number of processors (0 to 18 thousand).]


µsik micro-kernel internals

[Diagram: µsik micro-kernel internals. Labels: User LPs, Kernel LPs (KP), and the micro-kernel; an LP with its processed event list (PEL), future event list (FEL), and local virtual time (LVT); three kernel queues: EPTS Q (processable, Pp), ECTS Q (committable, Pc), and EETS Q (emittable, Pe).]

When are the kernel queues updated?
• New LP added or deleted
• LP executes an event
• LP receives an event

Abbreviations
• LP = Logical process
• KP = Kernel process
• ECTS = Earliest committable time stamp
• EPTS = Earliest processable time stamp
• EETS = Earliest emittable time stamp
• PEL = Processed event list
• FEL = Future event list
• LVT = Local virtual time
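The queue structure in the diagram can be sketched as follows. This is a toy model, not µsik's code: the class and method names are hypothetical, stale queue entries are discarded lazily using a version counter, and the rule EETS = EPTS + lookahead is an assumption made only so the example has something to compute. The three update triggers listed on the slide (LP added or deleted, LP executes an event, LP receives an event) are where the queues get re-keyed.

```python
import heapq

INF = float("inf")


class LP:
    """Toy logical process holding FEL, PEL, and LVT (time stamps only)."""

    def __init__(self, name, lookahead=0.0):
        self.name = name
        self.lookahead = lookahead
        self.fel = []   # future event list: heap of unprocessed time stamps
        self.pel = []   # processed event list: processed, not yet committed
        self.lvt = 0.0  # local virtual time

    def epts(self):
        return self.fel[0] if self.fel else INF

    def ects(self):
        return min(self.pel) if self.pel else INF

    def eets(self):
        # Assumption for this sketch only: EETS = EPTS + lookahead.
        return self.epts() + self.lookahead


class MicroKernel:
    """Keeps one queue per key (EPTS, ECTS, EETS) over all LPs."""

    def __init__(self):
        self.lps = {}
        self.version = {}
        self.queues = {"EPTS": [], "ECTS": [], "EETS": []}

    def _refresh(self, lp):
        # Re-key the LP in all three queues; its older entries become stale.
        v = self.version[lp.name] = self.version.get(lp.name, 0) + 1
        keys = {"EPTS": lp.epts(), "ECTS": lp.ects(), "EETS": lp.eets()}
        for qname, key in keys.items():
            heapq.heappush(self.queues[qname], (key, lp.name, v))

    def earliest(self, qname):
        q = self.queues[qname]
        # Drop entries for deleted LPs or superseded versions.
        while q and (q[0][1] not in self.lps or q[0][2] != self.version[q[0][1]]):
            heapq.heappop(q)
        return (q[0][0], q[0][1]) if q else (INF, None)

    def add_lp(self, lp):            # trigger: new LP added
        self.lps[lp.name] = lp
        self._refresh(lp)

    def delete_lp(self, name):       # trigger: LP deleted
        del self.lps[name]

    def receive(self, name, ts):     # trigger: LP receives an event
        lp = self.lps[name]
        heapq.heappush(lp.fel, ts)
        self._refresh(lp)

    def execute_next(self):          # trigger: LP executes an event
        ts, name = self.earliest("EPTS")
        if name is None or ts == INF:
            return None
        lp = self.lps[name]
        heapq.heappop(lp.fel)
        lp.pel.append(ts)
        lp.lvt = ts
        self._refresh(lp)
        return name, ts


kernel = MicroKernel()
kernel.add_lp(LP("a", lookahead=1.0))
kernel.add_lp(LP("b", lookahead=2.0))
kernel.receive("a", 5.0)
kernel.receive("b", 3.0)
first = kernel.execute_next()
print(first, kernel.earliest("EPTS"), kernel.earliest("ECTS"), kernel.earliest("EETS"))
```

Keeping the queues keyed per LP, rather than per event, means the kernel only touches its queues when an LP's earliest time stamps can actually change.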


libSynk: µsik’s synchronization core

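[Diagram: µsik and libSynk software stack. Labels: µsikProcess (several), µsik, libSynk, OS/Hardware, and Network. libSynk module labels: TM (with TM Red and TM Null), RM (with RM Bar), and FM (with FM Myr, FM TCP, FM ShM, and FM MPI). Legend: an arrow from X to Y implies X uses Y.]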


µsik micro-kernel capabilities

µsik is currently able to support the following:
• Lookahead-based conservative and/or optimistic execution
• Reverse computation-based optimistic execution
• Checkpointing-based optimistic execution
• Resilient optimistic execution (zero rollbacks)
• Constrained, out-of-order execution
• Preemptive event processing
• Any combination of the above
• Automated, network-throttled flow control
• User-level event retraction
• Process-specific limits to optimism
• Dynamic process addition/deletion
• Shared and/or distributed memory execution
• Process-oriented views

It accommodates addition of the following:
• Synchronized multicast
• Optimistic dynamic memory allocation
• Automated load-balancing
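The difference between reverse computation-based and checkpointing-based optimistic execution can be illustrated with a toy logical process. The sketch below is generic and does not use µsik's API; the class, the function names, and the event values are hypothetical. Reverse computation undoes each event with an inverse handler, while checkpointing restores a state copy saved before the event was processed.

```python
import copy


class CounterLP:
    """Toy LP whose state is a running total and an event count."""

    def __init__(self):
        self.total = 0
        self.count = 0

    def forward(self, amount):
        self.total += amount
        self.count += 1

    def reverse(self, amount):
        # Exact inverse of forward(), applied in the opposite order.
        self.count -= 1
        self.total -= amount


def rollback_by_reverse(lp, processed, to_index):
    """Undo events processed[to_index:], newest first, by reverse computation."""
    while len(processed) > to_index:
        lp.reverse(processed.pop())


def rollback_by_checkpoint(lp, snapshots, processed, to_index):
    """Restore the state saved just before event number to_index was processed."""
    lp.__dict__.update(copy.deepcopy(snapshots[to_index]))
    del snapshots[to_index:]
    del processed[to_index:]


events = [5, 7, 11]

# Reverse computation: no state copies, only the processed events are kept.
a, done_a = CounterLP(), []
for e in events:
    a.forward(e)
    done_a.append(e)
rollback_by_reverse(a, done_a, 1)

# Checkpointing: one state copy is saved before every event.
b, done_b, snaps = CounterLP(), [], []
for e in events:
    snaps.append(copy.deepcopy(b.__dict__))
    b.forward(e)
    done_b.append(e)
rollback_by_checkpoint(b, snaps, done_b, 1)

print((a.total, a.count), (b.total, b.count))
```

Both paths end in the same state. The trade-off shown here is memory for state copies versus the need to write an inverse for every event handler.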


SensorNet: Parallel simulation/immersive test-bed

Seamless integrated testbed to incorporate a variety of important simulations, stimulations, and live devices

Achieves unified capabilities and significant fidelity for test and evaluation of CB sensor device-based designs, concepts, and operations

[Diagram: CB Sensor Network Testbed Framework, current and envisioned. Labeled components: TOSSIM, SimDriver, SimComm, TinyViz, Script Interpreter, External Event Buffer, Tython, Reflected Simulation, Script, Python, Environmental Model Proxy, SCIPUFF, Weather Model, Plume Models, Comm. Models, Sensor Net Simulator, Weather Simulator, Operations Simulator, and Live Devices.]


SensorNet: Simulation-based analysis for plume tracking

Environmental phenomenon exhibits high variability.

Phenomenon drives the sensor network’s computation and communication.

Trace gathered at base station of sensed phenomenon reflects high variability.

Communication effects induce unpredictable gaps in series.

Accurate, integrated simulation of phenomenon and communication captures complex interdependencies.

[Figure: sensor network layout with nodes 0 to 25, a base station, and a source release point. Three maps of surface dosage (CHEM at T = 17.0 h, latitude versus longitude, scale 1.00E-08 to 1.00E-02 Kgs/cm3) under turbulent winds, light winds, and no winds. Two plots of mean concentration (10^-6 kgs/m3) versus time (s), with series labeled No Winds, Winds, and Turbulent.]


SCATTER: Ultra-scale PDES-based mobility simulations

Scalable tool for transportation and energy/event/emergency research
• Regional scale: multiple states, 10^6–10^7 intersections
• Current tool capabilities: at most 10^4 intersections
• Faster than real time is very useful

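[Figure: fidelity (lower accuracy to higher accuracy) versus speed (slower to faster) for traffic simulation approaches: network flow methods, OREMS, CORSIM, MITSIM, TRANSIMS, and SCATTER, with a region marked "Desirable". Annotations: "Good for emissions estimates, traffic analysis, . . ." and "Good for rough estimates of evacuation delay, . . ."]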

Our approach: SCATTER
• DES models vs time-stepped
• Parallel execution vs sequential
• Scalability to high-performance computing: 10^2–10^3 CPUs
• Important behaviors: kinetic + non-kinetic
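The "DES models vs time-stepped" contrast can be illustrated with a single vehicle crossing a chain of road links. This is a deliberately simplified sketch, not SCATTER's vehicle model: the function names, link lengths, constant speed, and step size are hypothetical. The time-stepped version updates the vehicle on every tick, while the event-driven version schedules one arrival event per intersection.

```python
import heapq


def time_stepped(link_lengths, speed, dt):
    """Advance one vehicle in fixed steps; return (arrival time, number of updates)."""
    total = sum(link_lengths)
    pos, steps = 0.0, 0
    while pos < total:
        pos += speed * dt
        steps += 1
    return steps * dt, steps


def event_driven(link_lengths, speed):
    """Schedule one arrival event per intersection; return (arrival time, number of events)."""
    fel = [(link_lengths[0] / speed, 0)]  # (arrival time, index of link just traversed)
    now, events = 0.0, 0
    while fel:
        now, link = heapq.heappop(fel)
        events += 1
        nxt = link + 1
        if nxt < len(link_lengths):
            # The vehicle enters the next link; schedule its arrival at the far end.
            heapq.heappush(fel, (now + link_lengths[nxt] / speed, nxt))
    return now, events


links = [300.0, 450.0, 250.0]  # link lengths in metres (hypothetical)
t_step, n_step = time_stepped(links, speed=10.0, dt=1.0)
t_event, n_event = event_driven(links, speed=10.0)
print(t_step, n_step, t_event, n_event)
```

In this toy case both versions report the same arrival time, but the event-driven version does work only when the vehicle reaches an intersection.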


SCATTER: Benchmark performance

[Figure: three plots. Event processing speed: µs/event (1 to 1,000) versus number of vehicles (10,000 to 1,000,000), for Nodes = 9 and Nodes = 1089. Speedup over real-time: real time/simulated time (1 to 10,000) versus number of vehicles (10,000 to 1,000,000), for Nodes = 9 and Nodes = 1089. Parallel speedup: speedup over sequential run (0 to 3.5) versus number of processors (1 to 4).]

Significant speedup with parallel execution!

Significantly faster than real time with 1 million vehicles!

Very low event processing overhead (µs)!


Contact

Kalyan S. Perumalla
Modeling & Simulation Group
Computational Sciences & Engineering
(865) [email protected]