Secrets of supercomputing

Secrets of Supercomputing: The Conservation Laws. Supercomputing Challenge Kickoff, October 21-23, 2007. I. Background to Supercomputing. II. Get Wet! With the Shallow Water Equations. Bob Robey - Los Alamos National Laboratory; Randy Roberts - Los Alamos National Laboratory; Cleve Moler - Mathworks. LA-UR-07-6793. Approved for public release; distribution is unlimited.

Transcript of Secrets of supercomputing

Page 1: Secrets of supercomputing

Secrets of Supercomputing
The Conservation Laws

Supercomputing Challenge Kickoff
October 21-23, 2007

I. Background to Supercomputing
II. Get Wet! With the Shallow Water Equations

Bob Robey - Los Alamos National Laboratory
Randy Roberts - Los Alamos National Laboratory
Cleve Moler - Mathworks

LA-UR-07-6793. Approved for public release; distribution is unlimited.

Page 2: Secrets of supercomputing

Introductions

• Bob Robey -- Los Alamos National Lab, X Division
  – [email protected], 665-9052 or home: [email protected], 662-2018
  – 3D hydrocodes and parallel numerical software
  – Helped found the UNM and Maui High Performance Computing Centers and Supercomputing Tutorials
• Randy Roberts -- Los Alamos National Lab, D Division
  – Java, C++, numerical and agent-based modeling
  – [email protected]
• Cleve Moler
  – Matlab founder
  – Former UNM CS Dept Chair
  – SIAM President
  – Author of “Numerical Computing with Matlab” and “Experiments with Matlab”

Page 3: Secrets of supercomputing

Conservation Laws

• Formulated as a conserved quantity
  – mass
  – momentum
  – energy
• Good references are Leveque’s books and his freely available software package CLAWPACK (Fortran/MPI), along with a 2D shallow water version, Tsunamiclaw

[Equation graphic not preserved; its labels were “Conserved variable” and “Change”.]

References:
Leveque, Randall, Numerical Methods for Conservation Laws
Leveque, Randall, Finite Volume Methods for Hyperbolic Problems
CLAWPACK http://www.amath.washington.edu/~claw/
Tsunamiclaw http://www.math.utah.edu/~george/tsunamiclaw.html

Page 4: Secrets of supercomputing

I. Intro to Supercomputing

• Classical definition of supercomputing
  – Harnessing lots of processors to do lots of small calculations
• There are many other definitions, which usually include any computing beyond the norm
  – Includes new techniques in modeling, visualization, and higher level languages
• Question for thought: With greater CPU resources, is it better to save programmer work or to make the computer do bigger problems?

Page 5: Secrets of supercomputing

II. Calculus Quickstart

Decoding the Language of Wizards

Page 6: Secrets of supercomputing

Calculus Quickstart Goals

• Calculus is a language of mathematical wizards. It is a convenient shorthand, but not easy to understand until you learn the secrets to the code.
• Our goal is for you to be able to READ calculus and TALK calculus.
• The goal is not to ANALYTICALLY SOLVE calculus problems using traditional methods. In supercomputing we generally solve problems by brute force.

Page 7: Secrets of supercomputing

Calculus Terminology

• Two branches of calculus
  – Integral calculus
  – Derivative (differential) calculus
• P = f(x, y, t)
  – Population is a function of x, y, and t
• ∫f(x)dx – integral: the area under the curve, or a summation (a definite integral once limits are given)
• dP/dx – derivative, instantaneous rate of change, or slope of a function
• ∂P/∂x – partial derivative, implying that P is a function of more than one variable
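In the brute-force spirit of the previous slide, a derivative can be estimated numerically instead of solved analytically. The following is a minimal Python sketch, assuming a made-up population function and step size (neither comes from the slides); it estimates ∂P/∂x and ∂P/∂t with centered differences.

```python
def population(x, y, t):
    """Hypothetical population field P = f(x, y, t), invented for illustration."""
    return 3.0 * x * x + 2.0 * y + t


def partial_x(f, x, y, t, dx=1.0e-4):
    """Centered-difference estimate of the partial derivative of f with respect to x."""
    return (f(x + dx, y, t) - f(x - dx, y, t)) / (2.0 * dx)


def partial_t(f, x, y, t, dt=1.0e-4):
    """Centered-difference estimate of the partial derivative of f with respect to t."""
    return (f(x, y, t + dt) - f(x, y, t - dt)) / (2.0 * dt)


# For this P the exact answers are dP/dx = 6x and dP/dt = 1.
slope_x = partial_x(population, 2.0, 5.0, 1.0)
slope_t = partial_t(population, 2.0, 5.0, 1.0)
```

Holding y and t fixed while nudging x is exactly what the ∂ notation expresses.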

Page 8: Secrets of supercomputing

Matrix Notation

\[
\begin{bmatrix} a \\ b \end{bmatrix}_t + \begin{bmatrix} c \\ d \end{bmatrix}_x = 0
\]

The first set of terms are the state variables at time t, usually called U. The second set of terms are the flux variables in space x, usually referred to as F.

\[
U_t + F_x = 0
\]

This is just a system of equations:

\[
a_t + c_x = 0, \qquad b_t + d_x = 0
\]

Page 9: Secrets of supercomputing

Patterns for Parallel Programming

Parallel Algorithms
• Data Parallel – most common with MPI
• Master/Worker – one process hands out the work to the other processes – great load balance, good with threads
• Pipeline – bucket brigade

Implementation Patterns
• Message Passing
• Threads
• Shared Memory
• Distributed Arrays, Global Arrays

Reference: Patterns for Parallel Programming, Mattson, Sanders, and Massingill, 2005
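As an illustration of the Master/Worker pattern, here is a small Python sketch using the standard library thread pool. The task (summing squares over a range) and the names `work` and `master` are hypothetical, chosen only to show the hand-out-and-collect structure.

```python
from concurrent.futures import ThreadPoolExecutor


def work(task):
    """Hypothetical unit of work: sum the squares of the integers in [start, stop)."""
    start, stop = task
    return sum(n * n for n in range(start, stop))


def master(tasks, num_workers=4):
    """Hand tasks to idle workers and collect the results in task order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(work, tasks))


tasks = [(0, 250), (250, 500), (500, 750), (750, 1000)]
partial_sums = master(tasks)
total = sum(partial_sums)
```

The pool gives good load balance because a worker picks up the next task as soon as it finishes one. In standard CPython, threads usually do not speed up pure-Python arithmetic, so treat this as a sketch of the pattern rather than a performance demonstration.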

Page 10: Secrets of supercomputing

Writing a Program: Data Parallel Model

P(400) -- distributed
Ptot -- replicated

Proc 1: P(1-100), Ptot
Proc 2: P(101-200), Ptot
Proc 3: P(201-300), Ptot
Proc 4: P(301-400), Ptot

Serial operations are done on every processor so that replicated data is the same on every processor.

This may seem like a waste of work, but it is easier than synchronizing data values.

Sections of distributed data are “owned” by each processor. This is where the parallel speedups occur.

Ghost cells around each processor’s data are often used to handle communication.
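The ownership idea can be sketched in a few lines of serial Python that only simulates four processes. The helper names are hypothetical, and `simulated_allreduce_sum` is a stand-in for a real reduction call, not an MPI binding. Indices are zero-based here, so P(1-100) on the slide corresponds to the range [0, 100).

```python
def owned_range(rank, nprocs, n):
    """Half-open index range [lo, hi) of a length-n array owned by process `rank`."""
    base, extra = divmod(n, nprocs)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def simulated_allreduce_sum(local_values):
    """Stand-in for a sum reduction: every process ends up with the same total."""
    total = sum(local_values)
    return [total for _ in local_values]


N, NPROCS = 400, 4
P = [1.0] * N  # hypothetical population data, one unit per cell
ranges = [owned_range(rank, NPROCS, N) for rank in range(NPROCS)]
local_sums = [sum(P[lo:hi]) for lo, hi in ranges]  # each process sums only what it owns
Ptot = simulated_allreduce_sum(local_sums)         # replicated on every process
```

Each process touches only its own 100 entries of P, while Ptot ends up identical everywhere.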

Page 11: Secrets of supercomputing

2007-2008 Sample Supercomputing Project

• Evaluation Criteria – Expo (Report slightly different). Use these to evaluate the following project.
  – 15% Problem Statement
  – 25% Mathematical/Algorithmic Model
  – 25% Computational Model
  – 15% Results and Conclusions
  – 10% Code
  – 10% Display

Evaluate Us!!

Page 12: Secrets of supercomputing

Get Wet! With the Shallow Water Equations

• The shallow water model for wave motion is important for water flow, seashore waves, and flooding
• Goal of this project is to model the wave motion in the shallow water tank
• With slight modifications this model can be applied to:
  – ocean or lake currents
  – weather
  – glacial movement

Page 13: Secrets of supercomputing

Output from a shallow water equation model of water in a bathtub.

The water experiences 5 splashes which generate surface gravity waves that propagate away from the splash locations and reflect off of the bathtub walls. Wikipedia Commons, author Dan Copsey.

Go to shallow water movie: http://en.wikipedia.org/wiki/Image:Shallow_water_waves.gif

Page 14: Secrets of supercomputing

Mathematical Equations: Mathematical Model

Shallow Water Equations

Conservation of Mass:
\[ h_t + (hu)_x = 0 \]

Conservation of Momentum:
\[ (hu)_t + \left( hu^2 + \tfrac{1}{2} g h^2 \right)_x = 0 \]

h -> height
u -> velocity
g -> gravity

Notes: mass equals height because width, depth and density are all constant.

Note: Force term, pressure P = ½gh²

Reference: Leveque, Randall, Finite Volume Methods for Hyperbolic Problems, p. 254

Page 15: Secrets of supercomputing

Shallow Water Equations: Matrix Notation

\[
\begin{bmatrix} h \\ hu \end{bmatrix}_t + \begin{bmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \end{bmatrix}_x = 0
\]

\[ \text{wave speed} = \sqrt{gh} \]

The maximum time step is calculated so as to keep a wave from completely crossing a cell.
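The time step rule can be sketched as follows. This is a minimal Python example, assuming the wave speed is the square root of g times h, that the fluid velocity adds to it, and that a safety factor `sigma` below 1 is applied; the safety factor, the value of g, and the function names are assumptions, not taken from the slides.

```python
import math

G = 9.8  # gravitational acceleration; value and units are an assumption


def wave_speed(h, g=G):
    """Gravity wave speed for water of height h."""
    return math.sqrt(g * h)


def max_time_step(heights, velocities, dx, sigma=0.9, g=G):
    """Time step for which the fastest wave crosses less than one cell of width dx."""
    fastest = max(abs(u) + wave_speed(h, g) for h, u in zip(heights, velocities))
    return sigma * dx / fastest


# Water at rest with heights between 2 and 10, cell width 1.
dt = max_time_step([2.0, 6.0, 10.0], [0.0, 0.0, 0.0], dx=1.0)
distance_per_step = dt * wave_speed(10.0)
```

Because the deepest water carries the fastest wave, it sets the time step for the whole mesh.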

Page 16: Secrets of supercomputing

Numerical Model

• Lax-Wendroff two-step, a predictor-corrector method
  – Predictor step estimates the values at the zone boundaries at half a time step advanced in time
  – Corrector step fluxes the variables using the predictor step values
• Mathematical notes for next slide:
  – U is a state variable such as mass or height.
  – F is a flux term – the velocity times the state variable at the interface
  – superscripts are time
  – subscripts are space

Page 17: Secrets of supercomputing

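The Lax-Wendroff Method

Half Step:
\[
U_{i+\frac{1}{2}}^{n+\frac{1}{2}} = 0.5\,(U_{i+1}^{n} + U_{i}^{n}) - \frac{\Delta t}{2\,\Delta x}\,(F_{i+1}^{n} - F_{i}^{n})
\]

Whole Step:
\[
U_{i}^{n+1} = U_{i}^{n} - \frac{\Delta t}{\Delta x}\,(F_{i+\frac{1}{2}}^{n+\frac{1}{2}} - F_{i-\frac{1}{2}}^{n+\frac{1}{2}})
\]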

Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge
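The half step and whole step can be sketched for the 1D shallow water equations as below. This is an illustrative Python version under stated assumptions (one ghost cell at each end, reflective walls, a made-up smooth initial bump, g = 9.8); it is not the sample program from the lab.

```python
import math

G = 9.8  # gravitational acceleration; value and units are an assumption


def flux(h, hu, g=G):
    """Flux F(U) for the state U = (h, hu): mass flux and momentum flux."""
    return hu, hu * hu / h + 0.5 * g * h * h


def lax_wendroff_step(h, hu, dt, dx, g=G):
    """Advance (h, hu) by one time step; the first and last entries are ghost cells."""
    n = len(h)
    # Reflective walls: copy height into the ghost cells and flip the momentum sign.
    h[0], hu[0] = h[1], -hu[1]
    h[-1], hu[-1] = h[-2], -hu[-2]
    f = [flux(h[i], hu[i], g) for i in range(n)]

    # Half step: entry i of the half-step arrays stores the interface i+1/2.
    h_half = [0.0] * (n - 1)
    hu_half = [0.0] * (n - 1)
    for i in range(n - 1):
        h_half[i] = 0.5 * (h[i + 1] + h[i]) - dt / (2.0 * dx) * (f[i + 1][0] - f[i][0])
        hu_half[i] = 0.5 * (hu[i + 1] + hu[i]) - dt / (2.0 * dx) * (f[i + 1][1] - f[i][1])
    f_half = [flux(h_half[i], hu_half[i], g) for i in range(n - 1)]

    # Whole step: interface i-1/2 is stored at index i-1, interface i+1/2 at index i.
    new_h = list(h)
    new_hu = list(hu)
    for i in range(1, n - 1):
        new_h[i] = h[i] - dt / dx * (f_half[i][0] - f_half[i - 1][0])
        new_hu[i] = hu[i] - dt / dx * (f_half[i][1] - f_half[i - 1][1])
    return new_h, new_hu


# Demo: a smooth bump of water in a closed tank (hypothetical setup).
cells = 50
h = [2.0 + 0.5 * math.exp(-((i - 25.5) / 5.0) ** 2) for i in range(cells + 2)]
hu = [0.0] * (cells + 2)
mass_before = sum(h[1:-1])
for _ in range(50):
    h, hu = lax_wendroff_step(h, hu, dt=0.05, dx=1.0)
mass_after = sum(h[1:-1])
```

Comparing `mass_before` with `mass_after` is a simple conservation check: with reflective walls no mass should leave the tank.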

Page 18: Secrets of supercomputing

Explanation of Lax-Wendroff Model

[Figure: physical model of the original, half-step, and full-step values along the space index, with a ghost cell at the boundary. Original values sit at (t, i), half-step values at (t+.5, i+.5), and full-step values at (t+1, i). Data is assumed to be at the center of each cell.]

Explanation graphic courtesy of Jon Robey and Dov Shlacter, 2006-2007 Supercomputing Challenge. See appendix for 2D index explanation.

Page 19: Secrets of supercomputing

Extension to 2D

• The extension of the shallow water equations to 2D is shown in the following slides.
  – First slide shows the matrix form of the 2D shallow water equations
  – Second slide shows the 2D form of the Lax-Wendroff numerical method

Page 20: Secrets of supercomputing

2D Shallow Water Equations

\[
\underbrace{\begin{bmatrix} h \\ hu \\ hv \end{bmatrix}_t}_{U}
+ \underbrace{\begin{bmatrix} hu \\ hu^2 + \tfrac{1}{2} g h^2 \\ huv \end{bmatrix}_x}_{F}
+ \underbrace{\begin{bmatrix} hv \\ huv \\ hv^2 + \tfrac{1}{2} g h^2 \end{bmatrix}_y}_{G}
= 0
\]

Note the addition of fluxes in the y direction and a flux cross term in the momentum equation. U, F, and G are shorthand for the numerical equations on the next slide. The U terms are the state variables. F and G are the flux terms in x and y.

Page 21: Secrets of supercomputing

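The Lax-Wendroff Method

Half Step:
\[
U_{i+\frac{1}{2},j}^{n+\frac{1}{2}} = 0.5\,(U_{i+1,j}^{n} + U_{i,j}^{n}) - \frac{\Delta t}{2\,\Delta x}\,(F_{i+1,j}^{n} - F_{i,j}^{n})
\]
\[
U_{i,j+\frac{1}{2}}^{n+\frac{1}{2}} = 0.5\,(U_{i,j+1}^{n} + U_{i,j}^{n}) - \frac{\Delta t}{2\,\Delta y}\,(G_{i,j+1}^{n} - G_{i,j}^{n})
\]

Whole Step:
\[
U_{i,j}^{n+1} = U_{i,j}^{n} - \frac{\Delta t}{\Delta x}\,(F_{i+\frac{1}{2},j}^{n+\frac{1}{2}} - F_{i-\frac{1}{2},j}^{n+\frac{1}{2}}) - \frac{\Delta t}{\Delta y}\,(G_{i,j+\frac{1}{2}}^{n+\frac{1}{2}} - G_{i,j-\frac{1}{2}}^{n+\frac{1}{2}})
\]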

Page 22: Secrets of supercomputing

2D Shallow Water Equations: Transformed for Programming

Letting H = h, U = hu and V = hv, so that our main variables are the state variables in the first column, gives the following set of equations.

\[
\begin{bmatrix} H \\ U \\ V \end{bmatrix}_t
+ \begin{bmatrix} U \\ U^2/H + \tfrac{1}{2} g H^2 \\ UV/H \end{bmatrix}_x
+ \begin{bmatrix} V \\ UV/H \\ V^2/H + \tfrac{1}{2} g H^2 \end{bmatrix}_y
= 0
\]

H is height (same as mass for constant width, depth and density)
U is x momentum (x velocity times mass)
V is y momentum (y velocity times mass)
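In code, the transformed variables make the two flux columns short to write. The sketch below is a minimal Python version; the function names `flux_x` and `flux_y` and the value of g are assumptions made for illustration.

```python
G = 9.8  # gravitational acceleration; value and units are an assumption


def flux_x(H, U, V, g=G):
    """x-direction flux column F for the state (H, U, V)."""
    return (U, U * U / H + 0.5 * g * H * H, U * V / H)


def flux_y(H, U, V, g=G):
    """y-direction flux column G for the state (H, U, V)."""
    return (V, U * V / H, V * V / H + 0.5 * g * H * H)


# Example state: height 2, x velocity 2, y velocity 3, so U = 4 and V = 6.
F = flux_x(2.0, 4.0, 6.0, g=10.0)
Gy = flux_y(2.0, 4.0, 6.0, g=10.0)
```

Working in H, U and V means the arrays a program stores are exactly the quantities being conserved. The divisions by H are the price for that, so H must stay positive.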

Page 23: Secrets of supercomputing

Sample Programs

• The numerical method was extracted from the McCurdy team’s model (team 62) from last year and reprogrammed from serial Fortran to C/MPI, using the programming style from the Los Alamos team’s project (team 51), with permission from both teams.
• Additional versions of the program were made in Java/Threads and Matlab.

Page 24: Secrets of supercomputing

Programming Tools: Three options

1. Matlab
  – Computation and graphics integrated into the Matlab desktop
2. Java/Threads
  – Eclipse or Netbeans workbench
  – Graphics via Java 2D and Java Free Chart
3. C/MPI
  – Eclipse workbench -- an open-source programmer’s workbench, http://www.eclipse.org
  – PTP (parallel tools plug-in) – adds MPI support to Eclipse (developed partly at LANL)
  – OpenMPI – an MPI implementation (developed partly at LANL)
  – MPE -- graphics calls that come with MPICH. Graphics calls are done in parallel from each processor!

Page 25: Secrets of supercomputing

Initial Conditions and Boundary Conditions

• Initial conditions
  – velocity (u and v) is 0 throughout the mesh
  – height is 2, with a ramp to a height of 10 at the right-hand boundary starting at the mid-point in the x dimension
• Boundary conditions are reflective, slip
  – h_bound = h_interior; u_xbound = 0; v_xbound = v_interior
  – h_bound = h_interior; u_ybound = u_interior; v_ybound = 0
  – If using ghost cells, force zero velocity at the boundary by setting U_xghost = -U_interior
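The conditions above can be sketched with plain Python lists. The layout (rows indexed [j][i] with one ghost layer on every side), the linear shape of the ramp, and the function names are assumptions made for illustration.

```python
def initial_conditions(nx, ny, h_low=2.0, h_high=10.0):
    """Build H, U, V as lists of rows indexed [j][i], with one ghost layer on every side."""
    H = [[h_low] * (nx + 2) for _ in range(ny + 2)]
    U = [[0.0] * (nx + 2) for _ in range(ny + 2)]
    V = [[0.0] * (nx + 2) for _ in range(ny + 2)]
    mid = nx // 2
    for j in range(ny + 2):
        for i in range(mid + 1, nx + 1):
            # Linear ramp from h_low at the mid-point to h_high at the right-hand boundary.
            H[j][i] = h_low + (h_high - h_low) * (i - mid) / (nx - mid)
    return H, U, V


def apply_reflective_boundaries(H, U, V):
    """Fill the ghost cells so that the normal momentum is zero at every wall."""
    ny, nx = len(H) - 2, len(H[0]) - 2
    for j in range(ny + 2):
        # x boundaries: copy height and y momentum, flip x momentum.
        H[j][0], U[j][0], V[j][0] = H[j][1], -U[j][1], V[j][1]
        H[j][nx + 1], U[j][nx + 1], V[j][nx + 1] = H[j][nx], -U[j][nx], V[j][nx]
    for i in range(nx + 2):
        # y boundaries: copy height and x momentum, flip y momentum.
        H[0][i], U[0][i], V[0][i] = H[1][i], U[1][i], -V[1][i]
        H[ny + 1][i], U[ny + 1][i], V[ny + 1][i] = H[ny][i], U[ny][i], -V[ny][i]


H, U, V = initial_conditions(nx=8, ny=4)
apply_reflective_boundaries(H, U, V)
```

Flipping the sign in the ghost cell makes the average of the ghost and interior momentum zero, which is how the wall value is forced to zero.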

Page 26: Secrets of supercomputing

Results/Conclusions

• The Lax-Wendroff model accurately models the experimental wave tank
  – matches wave speed across the tank
• Some of the oscillations in the simulation are an artifact of the numerical model
  – OK as long as the initial wave is not too steep
  – a numerical damping technique could be added but is beyond the scope of this effort

Page 27: Secrets of supercomputing

Acknowledgements

Work used by permission:
• Awash: Modeling Wave Movement in a Ripple Tank, Team 62, McCurdy High School, 2006-2007 Supercomputing Challenge
• A Lot of Hot Air: Modeling Compressible Fluid Dynamics, Team 51, Los Alamos High School, 2006-2007 Supercomputing Challenge

We all have bugs, and thanks to those who found mine:
• Randy Roberts and Jon Robey for finding and fixing a bug in the second pass
• Randy Leveque for finding a missing square in the gravity forcing term

Page 28: Secrets of supercomputing

Lab Exercises

• TsunamiClaw
• Matlab
• Experimental demonstration
• Java Serial
• Java Parallel
• C/MPI

Page 29: Secrets of supercomputing

Java Wave Structure

• Wave class does most of the work
  – main(String[] args) calls start()
  – start() creates a WaveProblemSetup
  – start() calls methods to do initialization and boundary conditions
  – start() calls methods to iterate and update the display

Page 30: Secrets of supercomputing

Java Wave Structure (continued)

• WaveProblemSetup stores the new and old arrays
• It swaps the new and old arrays when asked to by Wave

Page 31: Secrets of supercomputing

Java Wave Program Flow

• Create arrays for new, old, and temporary data

• Initialize data

• Set boundary data to represent correct boundary conditions

• Iterate for the given number of iterations

Page 32: Secrets of supercomputing

Java Wave Iteration Flow

• Update physics into new arrays from data in old arrays

• Set boundary data to represent correct boundary conditions with updated arrays

• Update display

• Swap new arrays with old arrays
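The iteration flow can be sketched in a few lines. This Python version is only an outline of the loop structure: `update_physics` is a placeholder (a simple neighbor average, not the shallow water update), and all function names are hypothetical rather than taken from the Java program.

```python
def update_physics(old, new):
    """Placeholder physics: weighted neighbor average written into the new array."""
    for i in range(1, len(old) - 1):
        new[i] = 0.25 * old[i - 1] + 0.5 * old[i] + 0.25 * old[i + 1]


def set_boundaries(data):
    """Copy the nearest interior value into the ghost cell at each end."""
    data[0] = data[1]
    data[-1] = data[-2]


def run(initial, iterations, display=None):
    old = list(initial)
    new = list(initial)
    set_boundaries(old)
    for _ in range(iterations):
        update_physics(old, new)   # new arrays are computed only from old arrays
        set_boundaries(new)        # boundaries use the updated arrays
        if display is not None:
            display(list(new))     # update the display
        old, new = new, old        # swap references instead of copying data
    return old


frames = []
result = run([0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0], iterations=3, display=frames.append)
```

Swapping the two references is cheap, and it guarantees that an update never reads a value that was already overwritten in the same iteration.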

Page 33: Secrets of supercomputing

Java Threads

• How do you take advantage of new Multi-Core processors?

• Run parts of the problem on different cores at the same time!

Page 34: Secrets of supercomputing

Java Threads (continued)

• WaveThreaded program
  – partitions the problem into domains using SubWaveProblemSetup objects
  – runs calculations on each domain in separate threads using WaveWorker objects
  – adds complexity with synchronization of the threads’ access to data
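The partition-and-synchronize idea can be sketched with Python threads and a barrier. This is not the WaveThreaded program: the function name, the placeholder averaging update, and the use of a barrier action for the swap are all assumptions chosen to keep the sketch short.

```python
import threading


def threaded_smooth(initial, iterations, num_threads=2):
    """Run a placeholder averaging update with each thread owning one sub-domain."""
    n = len(initial)
    arrays = {"old": list(initial), "new": list(initial)}
    arrays["old"][0], arrays["old"][-1] = arrays["old"][1], arrays["old"][-2]
    # Split the interior cells 1..n-2 into contiguous sub-domains, one per thread.
    edges = [1 + ((n - 2) * k) // num_threads for k in range(num_threads + 1)]

    def finish_iteration():
        # Runs in exactly one thread after all threads reach the barrier.
        new = arrays["new"]
        new[0], new[-1] = new[1], new[-2]
        arrays["old"], arrays["new"] = arrays["new"], arrays["old"]

    barrier = threading.Barrier(num_threads, action=finish_iteration)

    def worker(lo, hi):
        for _ in range(iterations):
            old, new = arrays["old"], arrays["new"]
            for i in range(lo, hi):
                new[i] = 0.25 * old[i - 1] + 0.5 * old[i] + 0.25 * old[i + 1]
            barrier.wait()  # nobody starts the next iteration until everyone is done

    threads = [threading.Thread(target=worker, args=(edges[k], edges[k + 1]))
               for k in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return arrays["old"]


threaded_result = threaded_smooth([0.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0, 0.0], iterations=3)
```

Each thread reads its neighbors' cells from the old array but writes only its own cells in the new array, so the barrier is the only synchronization needed.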

Page 35: Secrets of supercomputing

C/MPI Program Diagram

1. Allocate memory
2. Set Initial Conditions
3. Initial Display
4. Repeat:
  – Update Boundary Cells (MPI Communication, External Boundaries)
  – First Pass (x half step, y half step)
  – Second Pass
  – Swap new/old
  – Graphics Output
  – Conservation Check
5. Calculate Runtime
6. Close Display, MPI & exit

Page 36: Secrets of supercomputing

MPI Quick Start

• #include <mpi.h>
• MPI_Init(&argc, &argv)
• MPI_Comm_size(Comm, &nprocs) // get number of processors
• MPI_Comm_rank(Comm, &myrank) // get processor rank 0 to nprocs-1
• // Broadcast from source processor to all processors
• MPI_Bcast(buffer, count, MPI_type, source, Comm)
• // Used to update ghost cells
• MPI_Isend(buffer, count, MPI_type, dest, tag, Comm, req)
• MPI_Irecv(buffer, count, MPI_type, source, tag, Comm, req+1)
• MPI_Waitall(num, req, status)
• // Used for sum, max, and min such as total mass or minimum timestep
• MPI_Allreduce(&num_local, &num_global, count, MPI_type, MPI_op, Comm)
• MPI_Finalize()
• Web pages for MPI and MPE at Argonne National Lab (ANL) -- http://www-unix.mcs.anl.gov/mpi/www/

Page 37: Secrets of supercomputing

Setup

• The software is already set up on the computers
• For setup on home computers, there are two parts. First download the files from the Supercomputing Challenge website for the lab in C/MPI if you haven’t already done that.
• Untar the lab files with “tar -xzvf Wave_Lab.tgz”

Page 38: Secrets of supercomputing

Setting up Software: Instructions in the README file

Setting up System Software
• Need Java, OpenMPI and the MPE package from MPICH
• Download and install according to instructions in openmpi_setup.sh
• Can install in user’s directory with some modifications

Setting up User’s workspace
• Download eclipse software including eclipse, PTP and PLDT
• Install according to instructions in eclipse_setup.sh
• Import wave source files and set up eclipse according to instructions in eclipse_setup.sh

Page 39: Secrets of supercomputing

Lab Exercises

• Try modifying the sample program (Java and/or C versions)
  – Change initial distribution. How sharp can it be before it goes unstable?
  – Change number of cells
  – Change graphics output
  – Try running 1, 2, or 4 processes and time the runs. Note that you can run 4 processes even if you are on a one processor system.
  – Switch to PTP debug or Java debug perspective and try stepping through the program
• Comparing to data is critical
  – Are there other unrealistic behaviors of the model?
  – Design an experiment to isolate variable effects. This can greatly improve your model.

Page 40: Secrets of supercomputing

Appendix A. Calculus and Supercomputing

• Calculus and Supercomputing are intertwined. Why?

• Here is a simple problem – Add up the volume of earth above sea-level for an island 500 ft high by half a mile wide and twenty miles long.

• Typical science homework problem using simple algebra. Can be done by hand. Not appropriate for supercomputing. Not enough complexity.

Page 41: Secrets of supercomputing

Add Complexity

• The island profile is a jagged mountainous terrain cut by deep canyons. How do we add up the volume?
• Calculus – language of complexity
  – Addition – summing numbers
  – Multiplication – summing numbers with a constant magnitude
  – Integration – summing numbers with an irregular magnitude

Page 42: Secrets of supercomputing

Divide and Conquer

• In discrete form
• Divide the island into small pieces and sum up the volume of each piece.
• For a jagged profile, the sum approaches the solution as the size of the intervals grows smaller.

∑ -- summation symbol

∆ -- delta symbol, or x2 - x1

Page 43: Secrets of supercomputing

Divide and Conquer

• In Continuous Form – Integration

• Think of the integral symbols as describing a shape that is continuously varying

• The accuracy of the solution can be improved by summing over smaller increments

• Lots of arithmetic operations – now you have a “computing” problem. Add more work and you have a “supercomputing” problem.
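The island problem makes a compact worked example of summing over smaller increments. In the Python sketch below, the flat profile matches the simple algebra problem (500 ft high, half a mile wide, twenty miles long), while the jagged profile is invented purely for illustration; the function names are hypothetical.

```python
import math

FEET_PER_MILE = 5280.0
LENGTH_FT = 20.0 * FEET_PER_MILE
WIDTH_FT = 0.5 * FEET_PER_MILE


def flat_height(x):
    """Constant 500 ft profile: the simple algebra version of the problem."""
    return 500.0


def jagged_height(x):
    """Hypothetical jagged ridge profile in feet, invented for illustration."""
    return 250.0 + 250.0 * abs(math.sin(40.0 * math.pi * x / LENGTH_FT))


def island_volume(height, pieces):
    """Sum the volume of `pieces` slices, sampling the height at each slice's midpoint."""
    dx = LENGTH_FT / pieces
    return sum(height((k + 0.5) * dx) * dx * WIDTH_FT for k in range(pieces))


flat_volume = island_volume(flat_height, 1000)    # same as 500 * width * length
coarse_volume = island_volume(jagged_height, 10)  # too few slices to see the ridges
fine_volume = island_volume(jagged_height, 10000)
```

For the flat island, the number of slices does not matter. For the jagged one, ten slices happen to sample only the low points between ridges, while ten thousand slices resolve every ridge, at the cost of a thousand times more arithmetic.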

Page 44: Secrets of supercomputing

Derivative Calculus: Describing Change

• Derivatives describe the change in a variable (numerator or top variable) relative to another variable (denominator or bottom). These three derivatives describe the change in population versus time, x-direction and y-direction.

\[
\frac{\partial P}{\partial t}, \quad \frac{\partial P}{\partial x}, \quad \text{and} \quad \frac{\partial P}{\partial y}
\]

Page 45: Secrets of supercomputing

Appendix B. Computational Methods

1. Eulerian and Lagrangian
2. Explicit and Implicit

Page 46: Secrets of supercomputing

Two Main Approaches to Divide up Problem

• Eulerian – divide up by spatial coordinates
  – Track populations in a location
  – Observer frame of reference
• Lagrangian – divide up by objects
  – Object frame of reference
  – Easier to track attributes of population since they travel with the objects
  – Agent based modeling of Star Logo uses this approach
  – Can tangle mesh in 2 and 3 dimensions

Page 47: Secrets of supercomputing

Eulerian

Eulerian – The area stays fixed and has a population per area. We observe the change in population across the boundaries of the area.

Lagrangian – The population stays constant. The population moves with velocity vx and vy and we move with it. The size of the area will change if the four vertices of the rectangle move at different velocities. Changes in area will result in different densities.

[Figure: two panels. Eulerian: population moves out of cell. Lagrangian: population moves and so does region.]

Page 48: Secrets of supercomputing

Explicit versus Implicit

• Explicit – In mathematical shorthand, U^{n+1} = f(U^n). This means that the next timestep values can be expressed entirely in terms of the previous timestep values.
• Implicit – U^{n+1} = f(U^{n+1}, U^n). The next timestep values appear on both sides, so they must be solved for, often with a matrix or iterative solver.
• We will stick with explicit methods here. You need more math to attempt implicit methods.
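The difference shows up even in a one-variable problem. The Python sketch below uses the decay equation du/dt = -k·u, which is not from the slides and was chosen because its implicit step can be solved by hand with a single division; larger problems need the matrix or iterative solvers mentioned above.

```python
def explicit_step(u, k, dt):
    """Forward Euler: the new value is computed directly from the old value."""
    return u + dt * (-k * u)


def implicit_step(u, k, dt):
    """Backward Euler: u_new = u + dt * (-k * u_new), rearranged to solve for u_new."""
    return u / (1.0 + k * dt)


def march(step, u0, k, dt, steps):
    """Apply `step` repeatedly, starting from u0."""
    u = u0
    for _ in range(steps):
        u = step(u, k, dt)
    return u


# A deliberately large time step (k * dt = 5) for a quantity that should decay to zero.
explicit_result = march(explicit_step, 1.0, k=10.0, dt=0.5, steps=5)
implicit_result = march(implicit_step, 1.0, k=10.0, dt=0.5, steps=5)
```

With this time step the explicit result grows and flips sign each step, while the implicit result decays as the true solution does. This is one reason explicit methods need a limit on the time step.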

Page 49: Secrets of supercomputing

Appendix C

Index Explanation for 2D Lax Wendroff

Page 50: Secrets of supercomputing

Programming

• Most difficult part of programming this method is to keep track of indices – half step grid indices cannot be represented by ½ in the code so they have to be offset one way or the other.

• Errors are very difficult to find so it is important to be very methodical in the coding.

• Next two slides show the different sizes of the staggered half-step grid and the relationships between the indices in the calculation (courtesy Jon Robey).

Page 51: Secrets of supercomputing

[Figure, 1st Pass: a main grid with indices j and i running from 0 to 4, overlaid with the staggered X step and Y step half-step grids. Index relationships shown:]

X step grid -- Main grid
j,i -- j+1,i | j+1,i+1
0,0 -- 1,0 | 1,1

Y step grid -- Main grid
j,i -- j,i+1 | j+1,i+1
0,0 -- 0,1 | 1,1

Page 52: Secrets of supercomputing

[Figure, 2nd Pass: the same main grid (indices j and i from 0 to 4) and staggered half-step grids. Index relationships shown:]

Main grid -- X step grid
j,i -- j-1,i-1 | j-1,i
1,1 -- 0,0 | 0,1

Main grid -- Y step grid
j,i -- j-1,i-1 | j,i-1
1,1 -- 0,0 | 1,0
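One way to be methodical about the offsets is to write the index relationships as small functions and check them against each other. The Python sketch below encodes one reading of the two index figures; the pairing of the second-pass formulas with the X and Y step grids is an interpretation, and all function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def x_step_sources(j, i):
    """1st pass: main-grid cells combined to form X step entry (j, i)."""
    return (j + 1, i), (j + 1, i + 1)


def y_step_sources(j, i):
    """1st pass: main-grid cells combined to form Y step entry (j, i)."""
    return (j, i + 1), (j + 1, i + 1)


def main_from_x_step(j, i):
    """2nd pass: X step entries on the two x faces of main cell (j, i)."""
    return (j - 1, i - 1), (j - 1, i)


def main_from_y_step(j, i):
    """2nd pass: Y step entries on the two y faces of main cell (j, i)."""
    return (j - 1, i - 1), (j, i - 1)


def midpoint(a, b):
    """Position halfway between two main-grid index pairs, in main-grid units."""
    return ((a[0] + b[0]) * HALF, (a[1] + b[1]) * HALF)


# Faces of main cell (2, 3), located through the half-step entries that feed it.
x_faces = [midpoint(*x_step_sources(*entry)) for entry in main_from_x_step(2, 3)]
y_faces = [midpoint(*y_step_sources(*entry)) for entry in main_from_y_step(2, 3)]
```

If the offsets are consistent, the x faces land half a cell to either side of the main cell in i, and the y faces half a cell to either side in j. A check like this is cheap to run before debugging the physics.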