Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

9
Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016 Eijkhout: OMP intro 1

Transcript of Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Page 1: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Tutorial on OpenMP programmingVictor EijkhoutSSiASC 2016

Eijkhout: OMP intro 1

Page 2: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Justification

OpenMP is a flexible tool for incrementally parallelizing a sharedmemory-based code. This course introduces the main concepts throughlecturing and exercises.

Eijkhout: OMP intro 2

Page 3: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Part I

The Fork-Join model

Eijkhout: OMP intro 3

Page 4: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Computer architecture terminology

One cluster node:

A node will have 1 or 2 or (sometimes) 4 ‘sockets’: processor chips.There may be a co-processor attached.

Eijkhout: OMP intro 4

Page 5: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Structure of a socket

Eight cores per socket, making 16 per node.They all access the same data.

Eijkhout: OMP intro 5

Page 6: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

Threads

Process: stream of instructionsThread: process can duplicate itself, same code, access to same data

The OS will place threads on different cores: parallel performance.Note: threads are software. More threads than cores or fewer is allowed.

Eijkhout: OMP intro 6

Page 7: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

To write an OpenMP program

#include "omp.h"

in C, and

use omp_lib

or

#include "omp_lib.h"

for Fortran.

Eijkhout: OMP intro 7

Page 8: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

To compile an OpenMP program

# gccgcc -o foo foo.c -fopenmp# Intel compilericc -o foo foo.c -openmp

Eijkhout: OMP intro 8

Page 9: Tutorial on OpenMP programming Victor Eijkhout SSiASC 2016

To run an OpenMP program

export OMP_NUM_THREADS=8./my_omp_program

Stampede has 16 cores; more than 16 threads does not make sense.

Quick experiment:

for t in 1 2 4 8 12 16; doOMP_NUM_THREADS=$t ./my_omp_program

done

Eijkhout: OMP intro 9