1 Introduction to Parallel Computing. 2 Presentation Outline Doing science and engineering using HPC...

  • date post

    19-Dec-2015


Introduction to Parallel Computing


Presentation Outline

• Doing science and engineering using HPC
• Basic concepts of parallel computing
• Discussion of HPC hardware
• Programming approaches (HPC software):
  • Library-based approaches
  • Language-based approaches
• HPC facilities at NIIT


High Performance Computing (HPC)

• The prime focus of HPC is performance: the ability to solve the biggest possible problems in the least possible time
• Also called “Parallel Computing”:
  • The use of multiple processors, working in parallel, to solve a single application problem
• Normally such computing is used to solve challenging scientific problems by running simulations:
  • For this reason, it is also called “Scientific Computing” or computational science
• HPC is a highly specialized area:
  • Probably our best chance to work for the world’s top research and commercial organizations:
    • NASA, European Space Agency (ESA) …
  • Google is known to have immense computational power, although the exact quantity remains unknown!


Doing science and engineering using HPC

• HPC is helping to solve some of the most important problems in science today by pushing software and hardware technology to its limits
• Scientific Computing (or computational science) is the field of study concerned with:
  • Constructing mathematical models and numerical solution techniques
  • Using computers to analyze and solve scientific and engineering problems
• Application areas:
  • Computer-aided engineering
  • Weather forecast simulations
  • Animated movies (Hollywood!)
  • Image processing
  • Cryptography
  • Hurricane forecasts:
    • Path as well as intensity (Katrina)


HPC driving science?

• The Millennium Simulation:
  • Computational astrophysics
  • Heralded as “the” largest ever model of the Universe
  • Follows the evolution of ten billion “dark matter” particles
  • The simulation ran on a supercomputer for almost a month
• The Blue Brain Project:
  • Computational neuroscience
  • An effort to simulate the working of a mammalian brain
  • One of the fastest supercomputers in the world is used for the simulations

Arguably, these projects could not be done without HPC


PAM CRASH: A Case Study from the Automobile Industry

• PAM CRASH is a parallel application for studying structural deformation, employed in simulations of automotive crashes and other situations:
  • An effective alternative to physical crash tests, which are expensive and time-consuming
• Modern simulations take into account millions of elements:
  • Such compute-intensive simulations can only be run on parallel hardware
• Automobile giants including Audi, BMW, Volkswagen and others conduct crash simulations using PAM CRASH


Serial Computation

• Traditionally, software has been written for serial computation:
  • To be run on a single computer having a single Central Processing Unit (CPU)
  • A problem is broken into a discrete series of instructions


Parallel Computation

• Parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:
  • To be run using multiple CPUs
  • A problem is broken into discrete parts that can be solved concurrently
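The idea of breaking a problem into discrete parts that are solved concurrently can be sketched in a few lines of Python. This is only an illustration, not code from the presentation: the names split_into_parts and parallel_sum are hypothetical, and a thread pool stands in for multiple CPUs. In standard CPython, threads will not make this arithmetic faster; the sketch shows the decomposition and the final combination step only.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(data, num_parts):
    """Break the problem (a list) into num_parts contiguous chunks."""
    size, extra = divmod(len(data), num_parts)
    parts, start = [], 0
    for i in range(num_parts):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def parallel_sum(data, num_workers=4):
    """Sum each part concurrently, then combine the partial results."""
    parts = split_into_parts(data, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_sums = list(pool.map(sum, parts))
    return sum(partial_sums)


if __name__ == "__main__":
    numbers = list(range(1, 101))
    print(parallel_sum(numbers) == sum(numbers))
```

The serial version is simply sum(numbers); the parallel version adds two extra steps, partitioning the data and combining partial results, which is the typical overhead of a parallel decomposition.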


Flynn’s Taxonomy

• There is no single authoritative classification of parallel computers!
• Flynn’s taxonomy is one such classification, based on the number of instruction and data streams processed by a parallel computer:
  • Single Instruction Single Data (SISD)
  • Multiple Instruction Single Data (MISD)
  • Single Instruction Multiple Data (SIMD)
  • Multiple Instruction Multiple Data (MIMD):
    • Almost all modern computers fall in the MIMD category


Flynn’s Taxonomy

• Extensions to Flynn’s taxonomy:
  • Single Program Multiple Data (SPMD), a programming model
• This classification is largely outdated!
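Because Flynn’s four categories depend only on whether there is one or more than one instruction stream and data stream, the classification can be written as a tiny lookup. The function name flynn_class is hypothetical and the sketch covers only the four original categories, not the SPMD extension.

```python
def flynn_class(instruction_streams, data_streams):
    """Return the Flynn category for the given number of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    instr = "S" if instruction_streams == 1 else "M"
    data = "S" if data_streams == 1 else "M"
    return f"{instr}I{data}D"


if __name__ == "__main__":
    for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
        print(streams, flynn_class(*streams))
```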


HPC Hardware

• Traditionally HPC has adopted expensive parallel hardware:
  • Massively Parallel Processors (MPP)
  • Symmetric Multi-Processors (SMP)
• Cluster Computers:
  • A group of PCs connected through a fast (private) network
• Other classifications:
  • Distributed Memory Machines
  • Shared Memory Machines


Massively Parallel Processors (MPP)

• A large parallel processing computer with a shared-nothing approach:
  • The term signifies that each computer has its own cache and memory
• Examples include Cray XT3, T3E, T3D, IBM SP/2


Symmetric Multi-Processors (SMP)

• An SMP is a parallel processing system with a shared-everything approach:
  • The term signifies that every processor shares the main memory and possibly the cache
• Typically an SMP can have 2 to 256 processors
• Examples include systems built on AMD Athlon, AMD Opteron 200 and 2000 series, Intel Xeon, etc.


Cluster Computers

• A group of PCs, workstations, or Macs (called nodes) connected to each other via a fast (and private) interconnect:
  • Each node is an independent computer
• Each cluster has one head node and multiple compute nodes:
  • Users log on to the head node and start parallel jobs on the compute nodes
• Such clusters can be built with Commodity-Off-The-Shelf (COTS) components:
  • A major breakthrough in HPC was the adoption of commodity clusters:
    • Economics
    • Fast interconnects like Myrinet, InfiniBand, Quadrics
• Two popular cluster classifications:
  • Beowulf Clusters (http://www.beowulf.org)
  • Rocks Clusters (http://www.rocksclusters.org)


[Figure: Cluster Computer. Eight processors (Proc 0 to Proc 7), each shown with a CPU and memory, exchange messages over a LAN (Ethernet, Myrinet, Infiniband, etc.)]


Beowulf History

• At the most fundamental level, when two or more computers are used together to solve a problem, it is considered a cluster
• In 1993, Donald Becker and Thomas Sterling started sketching the details of a commodity-based cluster system:
  • The aim was to come up with a cost-effective alternative to large supercomputers
• The initial prototype was a cluster computer consisting of 16 DX4 processors connected by channel-bonded Ethernet
• The idea was an instant success!
  • Largely due to economics
  • Open-source software like Linux, GNU compilers, PVM, and MPI was a major factor


Thomas Sterling with Naegling, Caltech's Beowulf Cluster


SMP and Multi-core clusters

• Most modern commodity clusters have SMP and/or multi-core nodes:
  • Processors not only communicate via the interconnect; shared memory programming is also required within a node
• This trend is likely to continue:
  • Even a new name, “constellations”, has been proposed


Distributed Memory

• Each processor has its own local memory
• Processors communicate with each other via an interconnect


Shared Memory

• All processors have access to shared memory:
  • Notion of “Global Address Space”


Hybrid

• Modern clusters have a hybrid architecture:
  • Distributed memory for inter-node (between nodes) communications
  • Shared memory for intra-node (within a node) communications


The TOP500

• The TOP500 project was started in 1993:
  • Aim is to provide a reliable basis for tracking and detecting trends in HPC
• Twice a year, a list of the sites operating the 500 most powerful computer systems is assembled and released
• The best performance on the Linpack benchmark is used as the performance measure for ranking the computer systems
• At the time of this presentation, the latest list was the one released at Supercomputing 2006, held in Tampa, Florida
• The fastest supercomputer on that list is IBM Blue Gene/L at Lawrence Livermore National Lab (LLNL):
  • Linpack performance (Rmax): 280.6 TeraFLOPS
  • Number of processors: 131072
  • Main memory: 32768 GB
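To put the Blue Gene/L figures in perspective, the quoted totals can be divided by the processor count. The short calculation below uses only the three numbers from the slide; the variable names are hypothetical, and it assumes 1 GB = 1024 MB and that performance and memory are spread evenly over all processors.

```python
TERA = 10**12
GIGA = 10**9

performance_flops = 280.6 * TERA  # performance figure quoted on the slide
num_processors = 131072           # equals 2**17
main_memory_gb = 32768            # equals 2**15

# Assumption: an even split across processors.
flops_per_processor = performance_flops / num_processors
memory_per_processor_gb = main_memory_gb / num_processors

print(f"{flops_per_processor / GIGA:.2f} GFLOPS per processor")  # 2.14
print(f"{memory_per_processor_gb * 1024:.0f} MB per processor")  # 256
```

Each processor is therefore modest on its own, roughly 2 GFLOPS with 256 MB of memory; the machine’s position on the list comes from the sheer number of processors working in parallel.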


The Top 5

1. DOE/NNSA/LLNL, United States
  • BlueGene/L - eServer Blue Gene Solution, IBM
2. NNSA/Sandia National Laboratories, United States
  • Red Storm - Sandia/Cray Red Storm, Opteron 2.4 GHz dual core, Cray Inc.
3. IBM Thomas J. Watson Research Center, United States
  • BGW - eServer Blue Gene Solution, IBM
4. DOE/NNSA/LLNL, United States
  • ASC Purple - eServer pSeries p5 575 1.9 GHz, IBM
5. Barcelona Supercomputing Center, Spain
  • MareNostrum - BladeCenter JS21 Cluster, PPC 970, 2.3 GHz, Myrinet, IBM


The Top 100 on Google Maps


Writing Parallel Software

• There are mainly two approaches for writing parallel software:
  • Parallel software is software that can be executed on parallel hardware to exploit computational and memory resources
• The first approach is to use libraries (packages) written in already existing languages like C, Fortran, and Java:
  • Economical
  • These libraries provide primitives (methods) like send() and recv() for communicating data
• The second and more radical approach is to provide new languages:
  • HPC has a history of novel parallel languages
  • These languages provide high-level parallelism constructs:
    • What is a construct?


Library-based Approach

• One school of thought is to provide parallelism through message passing between processors
• Such libraries are based on the idea of supporting parallelism in traditional languages like C and Fortran:
  • Obvious social advantages
• Two popular messaging approaches:
  • Parallel Virtual Machine (PVM)
  • Message Passing Interface (MPI)
• Other messaging libraries:
  • Message Passing Toolkit (MPT)
  • SHared MEMory (SHMEM) …
• The Message Passing Interface (MPI) has become a de facto standard for writing HPC applications
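The message passing style can be illustrated without any real messaging library. The sketch below is not MPI or PVM: it uses Python threads inside one process as stand-ins for separate processors, and a hypothetical Channel class whose send() and recv() methods mimic the kind of primitives such libraries provide. Each worker touches only its own local data and communicates its result explicitly.

```python
import queue
import threading


class Channel:
    """Hypothetical stand-in for an interconnect link between two processors."""

    def __init__(self):
        self._mailbox = queue.Queue()

    def send(self, message):
        self._mailbox.put(message)

    def recv(self):
        return self._mailbox.get()  # blocks until a message arrives


def worker(local_data, to_master):
    # The worker sees only its own local data and sends back one result.
    to_master.send(sum(local_data))


def run(parts):
    """Start one worker per part and combine the partial results received."""
    channels = [Channel() for _ in parts]
    threads = [
        threading.Thread(target=worker, args=(part, channel))
        for part, channel in zip(parts, channels)
    ]
    for thread in threads:
        thread.start()
    # The master receives exactly one message from every worker.
    partial_results = [channel.recv() for channel in channels]
    for thread in threads:
        thread.join()
    return sum(partial_results)


if __name__ == "__main__":
    print(run([[1, 2, 3], [4, 5], [6]]))  # prints 21
```

No data is shared implicitly here: the only way a result leaves a worker is through an explicit send(), which is the defining property of the message passing approach.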


Message Passing Interface (MPI)

• MPI is a standard (an interface or an API):
  • It defines a set of methods that application developers use to write their applications
  • MPI libraries implement these methods
  • MPI itself is not a library; it is a specification document that implementations follow!
• Reasons for popularity:
  • Software and hardware vendors were involved
  • Significant contribution from academia
  • MPICH served as an early reference implementation
  • MPI compilers are simply wrappers around widely used C and Fortran compilers
• MPI is a success story:
  • It is the most widely adopted programming paradigm on IBM Blue Gene systems
• At least two production-quality MPI libraries:
  • MPICH2 (http://www-unix.mcs.anl.gov/mpi/mpich2/)
  • Open MPI (http://open-mpi.org)
• There’s even a Java library:
  • MPJ Express (http://mpj-express.org)


Language-based Approach

• There is a long history of novel parallel programming languages:
  • The central idea is to support parallelism by providing easy-to-use constructs
• Social aspects of HPC languages:
  • Dialect or superset of existing languages
  • Completely new HPC languages: an ambitious approach
    • What happens to legacy code?
• Conceptually, most HPC languages can be categorized as:
  • Shared memory languages:
    • Mainly for programming on shared memory platforms like SMP
  • Partitioned Global Address Space (PGAS) languages:
    • Mainly for distributed memory HPC platforms
  • Distributed memory languages:
    • Mainly for distributed memory HPC platforms


Shared Memory Languages

• Designed to support parallel programming on shared memory platforms:
  • OpenMP:
    • Consists of a set of compiler directives, library routines, and environment variables
    • The runtime uses the fork-join model of parallel execution
  • Cilk:
    • A design goal was to support asynchronous parallelism
    • A set of keywords:
      • cilk, spawn, sync …
  • POSIX Threads (PThreads)
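The fork-join model mentioned for OpenMP can be approximated with plain threads. The sketch below is Python, not OpenMP, Cilk, or PThreads, and the function name fork_join_square is hypothetical. A main thread forks a team of workers that all write into one shared list, then joins them before continuing. Each worker writes to a disjoint set of indices, so no locking is needed in this particular case.

```python
import threading


def fork_join_square(values, num_threads=4):
    """Square every element using a fork-join pattern over shared memory."""
    results = [None] * len(values)  # one list shared by all threads

    def work(thread_id):
        # Each thread handles indices thread_id, thread_id + num_threads, ...
        for i in range(thread_id, len(values), num_threads):
            results[i] = values[i] ** 2

    # Fork: the main thread creates and starts a team of worker threads.
    team = [threading.Thread(target=work, args=(t,)) for t in range(num_threads)]
    for thread in team:
        thread.start()
    # Join: the main thread waits until every worker has finished.
    for thread in team:
        thread.join()
    return results


if __name__ == "__main__":
    print(fork_join_square([1, 2, 3, 4, 5]))  # prints [1, 4, 9, 16, 25]
```

In contrast to message passing, the workers never send anything: they communicate only by reading and writing the same memory.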


Partitioned Global Address Space (PGAS) Languages

• A PGAS is an abstraction that logically divides a process’ address space into two parts:
  • Private
  • Shared
• PGAS languages follow the so-called Distributed Shared Memory (DSM) model
• Unified Parallel C (UPC):
  • We discuss it in detail later
• Titanium:
  • A Java dialect
• Co-Array Fortran:
  • Support for co-arrays
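The private/shared split of a PGAS can be modelled in a few lines. This is a conceptual sketch in Python, not UPC, Titanium, or Co-Array Fortran; the classes PgasThread and GlobalArray and the block layout are assumptions made for illustration. Every thread keeps a private region plus one slice of the shared space, and any global index can be read or written while still having a well-defined owner.

```python
class PgasThread:
    """One hypothetical PGAS thread with a private and a shared region."""

    def __init__(self, thread_id, shared_size):
        self.thread_id = thread_id
        self.private = {}                     # visible to this thread only
        self.shared_part = [0] * shared_size  # this thread's slice of the global space


class GlobalArray:
    """A shared array partitioned in equal blocks across all threads."""

    def __init__(self, num_threads, block_size):
        self.block_size = block_size
        self.threads = [PgasThread(t, block_size) for t in range(num_threads)]

    def owner(self, index):
        """Thread whose shared region physically holds the element."""
        return index // self.block_size

    def read(self, index):
        part = self.threads[self.owner(index)].shared_part
        return part[index % self.block_size]

    def write(self, index, value):
        part = self.threads[self.owner(index)].shared_part
        part[index % self.block_size] = value


if __name__ == "__main__":
    array = GlobalArray(num_threads=4, block_size=3)
    array.write(7, 42)
    print(array.owner(7), array.read(7))  # prints 2 42
```

The point of the abstraction is that the program addresses element 7 globally, while the layout still records which thread is closest to that element.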


Distributed Memory Languages

• These purely distributed memory (DM) languages support HPC on distributed memory platforms
• High Performance Fortran (HPF):
  • Data parallelism
  • An effort to standardize a family of data parallel Fortran languages
• Fortran M:
  • Ensured deterministic execution
  • Added message passing extensions to Fortran 77
• HPJava:
  • Motivated by HPF
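Data parallelism in the HPF style rests on distributing array elements over processors and letting each processor work on the elements it owns. The slide does not go into detail, so the following Python sketch is an assumption-laden illustration rather than HPF code: block_owner and cyclic_owner imitate the BLOCK and CYCLIC distribution formats with 0-based indices, and scale_owned_elements is a hypothetical data-parallel step.

```python
def block_owner(index, array_size, num_procs):
    """Owner of an element when the array is split into contiguous blocks."""
    block = -(-array_size // num_procs)  # ceiling division
    return index // block


def cyclic_owner(index, num_procs):
    """Owner of an element when elements are dealt out round-robin."""
    return index % num_procs


def scale_owned_elements(data, factor, proc, num_procs):
    """Data-parallel step: a processor updates only the elements it owns."""
    return {
        i: data[i] * factor
        for i in range(len(data))
        if block_owner(i, len(data), num_procs) == proc
    }


if __name__ == "__main__":
    print([block_owner(i, 8, 4) for i in range(8)])
    print([cyclic_owner(i, 4) for i in range(8)])
    print(scale_owned_elements([1, 2, 3, 4, 5, 6, 7, 8], 10, 1, 4))
```

With 8 elements on 4 processors, the block layout gives each processor two neighbouring elements, while the cyclic layout interleaves them; which one performs better depends on the access pattern of the computation.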


A Different Aspect

[Figure: classification of HPC programming approaches. Labels in the figure: libraries, language extension, runtime level; languages based on library; languages based on directives; languages based on Global Address Space; languages driven by HPCS. Items shown: C, Fortran, Java, MPI, SHMEM, GPMEM, PVM, OpenMP, HPF, UPC, CoArray Fortran, Titanium, X10, Fortress, Chapel.]

Credit: Hong Ong, Oak Ridge National Laboratory


US High Productivity Computing Systems

• Aims:
  • To produce systems that double in productivity and value every 18 months
  • Decrease time-to-solution:
    • Development time
    • Execution time
• Research:
  • In SW and HW technology:
    • New programming languages
  • Quantifying productivity
• Funding stages:
  • Three vendors are involved: Sun, IBM, and Cray
• Three new programming languages:
  • X10, Chapel, and Fortress
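The stated aim of doubling productivity every 18 months implies exponential growth, which is easy to quantify. The helper names below are hypothetical, and treating time-to-solution as the simple sum of development time and execution time is an assumption; the slide only lists the two as its components.

```python
def productivity_factor(months, doubling_period=18):
    """Growth factor after `months` if productivity doubles every period."""
    return 2 ** (months / doubling_period)


def time_to_solution(development_time, execution_time):
    """Assumed model: time-to-solution is development plus execution time."""
    return development_time + execution_time


if __name__ == "__main__":
    print(productivity_factor(36))  # two doubling periods
    print(time_to_solution(development_time=30, execution_time=5))
```

Under this model, a language that shortens development time reduces time-to-solution just as effectively as hardware that shortens execution time, which is the motivation for funding new programming languages alongside new systems.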