Introduction to Parallel Computing


Transcript of Introduction to Parallel Computing

Page 1: Introduction to Parallel Computing

1

Introduction to Parallel Computing

Page 2: Introduction to Parallel Computing

2

Presentation Outline
• Doing science and engineering using HPC
• Basic concepts of parallel computing
• Discussion of HPC hardware
• Programming approaches (HPC software):
  • Library-based approaches
  • Language-based approaches
• HPC facilities at NIIT

Page 3: Introduction to Parallel Computing

3

High Performance Computing (HPC)
• The prime focus of HPC is performance—the ability to solve the biggest possible problems in the least possible time
• Also called “Parallel Computing”:
  • The use of multiple processors, working in parallel, to solve an application
• Normally such computing is used to solve challenging scientific problems by doing simulations:
  • For this reason, it is also called “Scientific Computing”:
    • Computational science
• HPC is a highly specialized area:
  • Probably our best chance to work for the world’s top research and commercial organizations:
    • NASA, European Space Agency (ESA) …
  • Google is known to have immense computational power—the quantity remains unknown!

Page 4: Introduction to Parallel Computing

4

Doing science and engineering using HPC

• HPC is helping to solve some of the most important problems in science today by pushing software and hardware technology to its limits
• Scientific Computing (or computational science) is the field of study concerned with:
  • Constructing mathematical models and numerical solution techniques
  • Using computers to analyze and solve scientific and engineering problems
• Application areas:
  • Computer-aided engineering
  • Weather forecast simulations
  • Animated movies (Hollywood!)
  • Image processing
  • Cryptography
  • Hurricane forecasts:
    • Path as well as intensity (e.g., Katrina)

Page 5: Introduction to Parallel Computing

5

HPC driving science?
• The Millennium Simulation:
  • Computational Astrophysics
  • Heralded as “the” largest ever model of the Universe
  • Follows the evolution of ten billion “dark matter” particles
  • The simulation ran on a supercomputer for almost a month
• The Blue Brain Project:
  • Computational Neuroscience
  • An effort to simulate the working of a mammalian brain
  • One of the fastest supercomputers in the world is used for the simulations
Arguably, these projects cannot be done without HPC

Page 6: Introduction to Parallel Computing

6

PAM CRASH—A Case Study from the Automobile Industry
• PAM CRASH is a parallel application for studying structural deformation, employed in simulations of automotive crashes and other situations:
  • An effective alternative to physical crash tests, which are expensive and time-consuming
• Modern simulations take into account millions of elements:
  • Such compute-intensive simulations can only be studied on parallel hardware
• Automobile giants including Audi, BMW, Volkswagen, and others conduct crash simulations using PAM CRASH

Page 7: Introduction to Parallel Computing

7

Page 8: Introduction to Parallel Computing

8

Presentation Outline
• Doing science and engineering using HPC
• Basic concepts of parallel computing
• Discussion of HPC hardware
• Programming approaches (HPC software):
  • Library-based approaches
  • Language-based approaches
• HPC facilities at NIIT

Page 9: Introduction to Parallel Computing

9

Serial Computation
• Traditionally, software has been written for serial computation:
  • To be run on a single computer having a single Central Processing Unit (CPU)
  • A problem is broken into a discrete series of instructions

Page 10: Introduction to Parallel Computing

10

Parallel Computation
• Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
  • To be run using multiple CPUs
  • A problem is broken into discrete parts that can be solved concurrently
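To make “discrete parts solved concurrently” concrete, here is a minimal sketch (not part of the original slides) that sums an array by giving each of two POSIX threads half of the work; the array size, thread count, and the helper name sum_part are illustrative assumptions only.

    #include <pthread.h>
    #include <stdio.h>

    #define N 1000000
    static double data[N];      /* the problem: sum N numbers       */
    static double partial[2];   /* one partial result per thread    */

    /* Each thread sums its own half of the array (a discrete part of the problem). */
    static void *sum_part(void *arg) {
        int id = *(int *)arg;
        long lo = (long)id * (N / 2), hi = lo + N / 2;
        double s = 0.0;
        for (long i = lo; i < hi; i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void) {
        for (long i = 0; i < N; i++) data[i] = 1.0;   /* fill with dummy data */

        pthread_t t[2];
        int ids[2] = {0, 1};
        for (int i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, sum_part, &ids[i]);   /* solve the parts concurrently */
        for (int i = 0; i < 2; i++)
            pthread_join(t[i], NULL);                          /* wait for both parts          */

        printf("sum = %f\n", partial[0] + partial[1]);         /* combine the partial results  */
        return 0;
    }

The decomposition (split the data), the concurrent execution (two threads), and the final combination step are exactly the three ingredients the slide describes.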

Page 11: Introduction to Parallel Computing

11

Flynn’s Taxonomy
• There is no authoritative classification of parallel computers!
• Flynn’s taxonomy is one such classification, based on the number of instruction and data streams processed by a parallel computer:
  • Single Instruction Single Data (SISD)
  • Multiple Instruction Single Data (MISD)
  • Single Instruction Multiple Data (SIMD)
  • Multiple Instruction Multiple Data (MIMD)
    • Almost all modern computers fall in this category

Page 12: Introduction to Parallel Computing

12

Flynn’s Taxonomy
• Extensions to Flynn’s taxonomy:
  • Single Program Multiple Data (SPMD)—a programming model
• This classification is largely outdated!

Page 13: Introduction to Parallel Computing

13

Presentation Outline
• Doing science and engineering using HPC
• Basic concepts of parallel computing
• Discussion of HPC hardware
• Programming approaches (HPC software):
  • Library-based approaches
  • Language-based approaches
• HPC facilities at NIIT

Page 14: Introduction to Parallel Computing

14

HPC Hardware
• Traditionally, HPC has adopted expensive parallel hardware:
  • Massively Parallel Processors (MPP)
  • Symmetric Multi-Processors (SMP)
• Cluster Computers:
  • A group of PCs connected through a fast (private) network
• Other classifications:
  • Distributed Memory Machines
  • Shared Memory Machines

Page 15: Introduction to Parallel Computing

15

Massively Parallel Processors (MPP)

• A large parallel processing computer with a shared-nothing approach:
  • The term signifies that each processing node has its own cache and memory
• Examples include the Cray XT3, T3E, T3D, and IBM SP/2

Page 16: Introduction to Parallel Computing

16

Symmetric Multi-Processors (SMP)
• An SMP is a parallel processing system with a shared-everything approach:
  • The term signifies that each processor shares the main memory and possibly the cache
• Typically an SMP can have 2 to 256 processors
• Examples include AMD Athlon, AMD Opteron 200 and 2000 series, Intel Xeon, etc.

Page 17: Introduction to Parallel Computing

17

Cluster Computers
• A group of PCs, workstations, or Macs (called nodes) connected to each other via a fast (and private) interconnect:
  • Each node is an independent computer
• Each cluster has one head-node and multiple compute-nodes:
  • Users log on to the head-node and start parallel jobs on the compute-nodes
• Such clusters can be built from Commodity-Off-The-Shelf (COTS) components:
  • A major breakthrough in HPC was the adoption of commodity clusters:
    • Economics
    • Fast interconnects like Myrinet, Infiniband, and Quadrics
• Two popular cluster classifications:
  • Beowulf Clusters (http://www.beowulf.org)
  • Rocks Clusters (http://www.rocksclusters.org)

Page 18: Introduction to Parallel Computing

18

Cluster Computer
[Figure: a cluster computer. Independent nodes (Proc 0 to Proc 7), each with its own CPU and memory, connected by a LAN (Ethernet, Myrinet, Infiniband, etc.) and exchanging messages.]

Page 19: Introduction to Parallel Computing

19

Beowulf History
• At the most fundamental level, when two or more computers are used together to solve a problem, it is considered a cluster
• In 1993, Donald Becker and Thomas Sterling started sketching the details of a commodity-based cluster system:
  • The aim was to come up with a cost-effective alternative to large supercomputers
• The initial prototype was a cluster computer consisting of 16 DX4 processors connected by channel-bonded Ethernet
• The idea was an instant success!
  • Largely due to economics
  • Open-source software like Linux, the GNU compilers, PVM, and MPI was a major factor

Page 20: Introduction to Parallel Computing

20

Thomas Sterling with Naegling, Caltech's Beowulf Cluster

Page 21: Introduction to Parallel Computing

21

SMP and Multi-core Clusters
• Most modern commodity clusters have SMP and/or multi-core nodes:
  • Processors not only communicate via the interconnect, but shared memory programming is also required
• This trend is likely to continue:
  • Even a new name, “constellations”, has been proposed

Page 22: Introduction to Parallel Computing

22

Distributed Memory
• Each processor has its own local memory
• Processors communicate with each other via an interconnect

Page 23: Introduction to Parallel Computing

23

Shared Memory
• All processors have access to shared memory:
  • Notion of a “Global Address Space”

Page 24: Introduction to Parallel Computing

24

Hybrid
• Modern clusters have a hybrid architecture:
  • Distributed memory for inter-node (between nodes) communication
  • Shared memory for intra-node (within a node) communication
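In programming terms (both approaches are introduced later in these slides), a hybrid machine is commonly driven by one message-passing process per node combined with shared-memory threads inside each node. The sketch below assumes an MPI library and an OpenMP-capable C compiler; it is an illustration of the idea, not a recipe from the slides.

    #include <mpi.h>
    #include <omp.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);                 /* one MPI process per node: distributed memory */

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* identify this node-level process */

        /* Within the node, OpenMP threads share the process' memory. */
        #pragma omp parallel
        {
            printf("node-process %d, thread %d of %d\n",
                   rank, omp_get_thread_num(), omp_get_num_threads());
        }

        MPI_Finalize();
        return 0;
    }

A typical launch would start one process per node (for example with mpirun) and let OpenMP spawn as many threads as the node has cores.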

Page 25: Introduction to Parallel Computing

25

The TOP500
• The TOP500 project was started in 1993:
  • The aim is to provide a reliable basis for tracking and detecting trends in HPC
• Twice a year, a list of the sites operating the 500 most powerful computer systems is assembled and released
• The best performance on the Linpack benchmark is used as the performance measure for ranking the systems
• The latest list was released at Supercomputing 2006, held in Tampa, Florida
• The fastest supercomputer is the IBM Blue Gene/L at Lawrence Livermore National Laboratory (LLNL):
  • Theoretical peak performance: 280.6 TeraFLOPS
  • Number of processors: 131,072
  • Main memory: 32,768 GB

Page 26: Introduction to Parallel Computing

26

Page 27: Introduction to Parallel Computing

27

The Top 5
1. DOE/NNSA/LLNL, United States
   • BlueGene/L - eServer Blue Gene Solution, IBM
2. NNSA/Sandia National Laboratories, United States
   • Red Storm - Sandia/Cray Red Storm, Opteron 2.4 GHz dual core, Cray Inc.
3. IBM Thomas J. Watson Research Center, United States
   • BGW - eServer Blue Gene Solution, IBM
4. DOE/NNSA/LLNL, United States
   • ASC Purple - eServer pSeries p5 575 1.9 GHz, IBM
5. Barcelona Supercomputing Center, Spain
   • MareNostrum - BladeCenter JS21 Cluster, PPC 970, 2.3 GHz, Myrinet, IBM

Page 28: Introduction to Parallel Computing

28

The Top 100 on Google Maps

Page 29: Introduction to Parallel Computing

29

Presentation Outline
• Doing science and engineering using HPC
• Basic concepts of parallel computing
• Discussion of HPC hardware
• Programming approaches (HPC software):
  • Library-based approaches
  • Language-based approaches
• HPC facilities at NIIT

Page 30: Introduction to Parallel Computing

30

Writing Parallel Software
• There are mainly two approaches for writing parallel software, i.e., software that can be executed on parallel hardware to exploit its computational and memory resources:
• The first approach is to use libraries (packages) written in already existing languages like C, Fortran, and Java:
  • Economical
  • These libraries provide primitives (methods) like send() and recv() for communicating data
• The second and more radical approach is to provide new languages:
  • HPC has a history of novel parallel languages
  • These languages provide high-level parallelism constructs:
    • What is a construct?

Page 31: Introduction to Parallel Computing

31

Library-based Approach
• One school of thought is to provide parallelism through message passing between processors
• Such libraries are based on the idea of supporting parallelism in traditional languages like C and Fortran:
  • Obvious social advantages
• Two popular messaging approaches:
  • Parallel Virtual Machine (PVM)
  • Message Passing Interface (MPI)
• Other messaging libraries:
  • Message Passing Toolkit (MPT)
  • SHared MEMory (SHMEM) …
• The Message Passing Interface (MPI) has become the de facto standard for writing HPC applications

Page 32: Introduction to Parallel Computing

32

Message Passing Interface (MPI)
• MPI is a standard (an interface or an API):
  • It defines a set of methods that are used by application developers to write their applications
  • MPI libraries implement these methods
  • MPI itself is not a library—it is a specification document that is followed!
• Reasons for popularity:
  • Software and hardware vendors were involved
  • Significant contribution from academia
  • MPICH served as an early reference implementation
  • MPI compilers are simply wrappers around widely used C and Fortran compilers
• MPI is a success story:
  • It is the most widely adopted programming paradigm on IBM Blue Gene systems
• At least two production-quality MPI libraries:
  • MPICH2 (http://www-unix.mcs.anl.gov/mpi/mpich2/)
  • OpenMPI (http://open-mpi.org)
• There is even a Java library:
  • MPJ Express (http://mpj-express.org)

Page 33: Introduction to Parallel Computing

33

Language-based Approach

• There is a long history of novel parallel programming languages:
  • The central idea is to support parallelism by providing easy-to-use constructs
• Social aspects of HPC languages:
  • Dialect or superset of existing languages
  • Completely new HPC languages - an ambitious approach
    • What happens to legacy code?
• Conceptually, most HPC languages can be categorized as:
  • Shared memory languages:
    • Mainly for programming on shared memory platforms like SMPs
  • Partitioned Global Address Space (PGAS) languages:
    • Mainly for distributed memory HPC platforms
  • Distributed memory languages:
    • Mainly for distributed memory HPC platforms

Page 34: Introduction to Parallel Computing

34

Shared Memory Languages
• Designed to support parallel programming on shared memory platforms:
  • OpenMP:
    • Consists of a set of compiler directives, library routines, and environment variables
    • The runtime uses a fork-join model of parallel execution
  • Cilk:
    • A design goal was to support asynchronous parallelism
    • A set of keywords:
      • cilk, spawn, sync …
  • POSIX Threads (PThreads)
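To illustrate the directive style (a hedged sketch, not taken from the slides), the loop below computes a dot product; a single OpenMP pragma asks the runtime to fork a team of threads, split the iterations among them, and join the partial sums via the reduction clause. The array size and variable names are placeholders.

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N], b[N];
        double sum = 0.0;

        for (long i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }   /* dummy data */

        /* Fork a team of threads; iterations are divided among them,
           and the partial sums are combined by the reduction clause. */
        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < N; i++)
            sum += a[i] * b[i];

        printf("dot product = %f (using up to %d threads)\n", sum, omp_get_max_threads());
        return 0;
    }

Compiled without OpenMP support the pragma is simply ignored and the code runs serially, which illustrates why directives are considered a low-intrusion way to add parallelism.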

Page 35: Introduction to Parallel Computing

35

Partitioned Global Address Space (PGAS) Languages

• A PGAS is an abstraction that logically divides a process’ address space into two halves:
  • Private
  • Shared
• PGAS languages follow the so-called Distributed Shared Memory (DSM) model
• Unified Parallel C (UPC):
  • We discuss it in detail later
• Titanium:
  • A Java dialect
• Co-Array Fortran:
  • Support for co-arrays

Page 36: Introduction to Parallel Computing

36

Distributed Memory Languages

• These purely DM languages support HPC on distributed memory platforms
• High Performance Fortran (HPF):
  • Data parallelism
  • An effort to standardize a family of data-parallel Fortran languages
• Fortran M:
  • Ensured deterministic execution
  • Added message passing extensions to Fortran 77
• HPJava:
  • Motivated by HPF

Page 37: Introduction to Parallel Computing

37

[Figure: a taxonomy of HPC programming approaches ("A Different Aspect"). Library-based approaches (MPI, PVM, SHMEM, GPMEM) built on C, Fortran, and Java; directive-based languages (OpenMP, HPF); languages based on a Global Address Space (UPC, Co-Array Fortran, Titanium); and the HPCS-driven languages (X10, Chapel, Fortress), spanning the library, language-extension, and runtime levels.]
Credit: Hong Ong, Oak Ridge National Laboratory

Page 38: Introduction to Parallel Computing

38

US High Productivity Computing Systems (HPCS)
• Aims:
  • To produce systems that double in productivity and value every 18 months
  • Decrease time-to-solution:
    • Development time
    • Execution time
• Research:
  • In SW and HW technology:
    • New programming languages
    • Quantifying productivity
• Funding stages:
  • Three vendors are involved: Sun, IBM, and Cray
• Three new programming languages:
  • X10, Chapel, and Fortress