OpenMP


Page 1: OpenMP

A Standard for Shared Memory Parallel Programming

Seyong Lee

Purdue University

School of Electrical and Computer Engineering

Page 2: OpenMP

Overview

I. Introduction to OpenMP

II. OpenMP Programming Model

III. OpenMP Directives

IV. Run-Time Library Routines

V. Environment Variables

VI. Summary

Page 3: OpenMP

What is OpenMP?

- Application program interface (API) for shared memory parallel programming
- A specification for a set of compiler directives, library routines, and environment variables
- Makes it easy to create multi-threaded (MT) programs in Fortran, C, and C++
- Portable / multi-platform, including Unix platforms and Windows NT platforms
- Jointly defined and endorsed by a group of major computer hardware and software vendors

I. Introduction to OpenMP

Page 4: OpenMP

OpenMP is not….

- Not automatic parallelization: the user explicitly specifies parallel execution, and the compiler does not ignore user directives even if they are wrong
- Not just loop-level parallelism: includes functionality to enable coarse-grained parallelism
- Not meant for distributed-memory parallel systems
- Not necessarily implemented identically by all vendors
- Not guaranteed to make the most efficient use of shared memory

Page 5: OpenMP

Why OpenMP?

- Parallel programming before OpenMP: there was a standard way to program distributed-memory computers (MPI and PVM), but no standard API for shared-memory programming
- Several vendors had directive-based APIs for shared-memory programming, but they were all different and vendor-proprietary
- Commercial users and high-end software vendors have a big investment in existing code and are not eager to rewrite it in a new language
- Portability was possible only through MPI: library based, with good performance and scalability, but sacrificing the built-in shared-memory advantage of the hardware

Page 6: OpenMP

Goals of OpenMP

- Standardization: provide a standard among a variety of shared-memory architectures/platforms
- Lean and mean: establish a simple and limited set of directives for programming shared-memory machines
- Ease of use: provide the capability to incrementally parallelize a serial program, and to implement both coarse-grain and fine-grain parallelism
- Portability: support Fortran (77, 90, and 95), C, and C++

Page 7: OpenMP

I. Introduction to OpenMP

II. OpenMP Programming Model

III. OpenMP Directives

IV. Run-Time Library Routines

V. Environment Variables

VI. Summary

Page 8: OpenMP

II. OpenMP Programming Model

- Thread-based parallelism
- Explicit parallelism
- Fork-join model
- Compiler-directive based
- Nested parallelism support
- Dynamic threads

Page 9: OpenMP

User Interface Model

- Compiler directives: most of the API (control constructs, data attribute constructs); extends the base language (f77, f90, C, C++). Example: C$OMP PARALLEL DO
- Library: a small set of functions to control threads and to implement unstructured locks. Example: call omp_set_num_threads(128)
- Environment variables: for end users to control run-time execution. Example: setenv OMP_NUM_THREADS 8

Page 10: OpenMP

Execution Model

Page 11: OpenMP

I. Introduction to OpenMP

II. OpenMP Programming Model

III. OpenMP Directives

IV. Run-Time Library Routines

V. Environment Variables

VI. Summary

Page 12: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 13: OpenMP

OpenMP Directives Format

Fortran directives format:

C$OMP construct [clause [clause]…]
!$OMP construct [clause [clause]…]
*$OMP construct [clause [clause]…]

Example:

!$OMP PARALLEL DEFAULT (SHARED) PRIVATE (a, b)
[structured block of code]
!$OMP END PARALLEL

cf. conditional compilation: lines starting with the !$ sentinel are compiled only when OpenMP is enabled, e.g. !$ a = OMP_get_thread_num()

C/C++ directives format:

#pragma omp construct [clause [clause]…]

Example:

#pragma omp parallel default (shared) private(a,b)
{
[structured block of code]
} /* all threads join master thread and terminate */

Page 14: OpenMP

Structured blocks

Most OpenMP constructs apply to a structured block. A structured block is a block of code with one point of entry at the top and one point of exit at the bottom; the only other branches allowed are STOP statements in Fortran and exit() in C/C++.

A structured block:

C$OMP PARALLEL
10    wrk(id) = garbage(id)
      res(id) = wrk(id) ** 2
      if (conv(res(id))) goto 10
C$OMP END PARALLEL
      print *, id

Not a structured block (branches into and out of the region):

C$OMP PARALLEL
10    wrk(id) = garbage(id)
30    res(id) = wrk(id) ** 2
      if (conv(res(id))) goto 20
      go to 10
C$OMP END PARALLEL
      if (not_DONE) goto 30
20    print *, id

Page 15: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 16: OpenMP

Directive Scoping

- Static (lexical) extent: the block of code directly placed between the two directives !$OMP PARALLEL and !$OMP END PARALLEL
- Dynamic extent: the code included in the lexical extent plus all the code called from inside the lexical extent
- Orphaned directive: an OpenMP directive that appears independently from another enclosing directive; it exists outside of another directive's static extent

Example:

      PROGRAM TEST
      …
!$OMP PARALLEL
      …
!$OMP DO
      DO I = …
         …
         CALL SUB1
         …
      ENDDO
!$OMP END DO
      …
      CALL SUB2
      …
!$OMP END PARALLEL
      END

      SUBROUTINE SUB1
      …
!$OMP CRITICAL
      …
!$OMP END CRITICAL
      END

      SUBROUTINE SUB2
      …
!$OMP SECTIONS
      …
!$OMP END SECTIONS
      …
      END

Here the DO directive lies in the static extent of the PARALLEL region, while the CRITICAL and SECTIONS directives in SUB1 and SUB2 are orphaned directives (they are reached only through the PARALLEL region's dynamic extent).

Page 17: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 18: OpenMP

A block of code that will be executed by multiple threads.

Properties:
- Fork-join model
- The number of threads won't change inside a parallel region
- SPMD execution within the region
- The enclosed block of code must be structured; no branching into or out of the block

Format:

!$OMP PARALLEL clause1 clause2 …
…
!$OMP END PARALLEL

PARALLEL Region Construct

Page 19: OpenMP

PARALLEL Region Construct

Example code:

!$OMP PARALLEL
      write (*,*) “Hello”
!$OMP END PARALLEL

Page 20: OpenMP

How many threads?
1. Use of the omp_set_num_threads() library function
2. Setting of the OMP_NUM_THREADS environment variable
3. Implementation default

Dynamic threads:
- By default, the same number of threads is used to execute each parallel region
- Two methods for enabling dynamic threads:
  1. Use of the omp_set_dynamic() library function
  2. Setting of the OMP_DYNAMIC environment variable

PARALLEL Region Construct

Page 21: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 22: OpenMP

Work-Sharing Constructs

Divide the execution of the enclosed code region among the members of the team that encounter it.

- Do not launch new threads
- Must be enclosed within a parallel region
- No implied barrier upon entry to a work-sharing construct
- An implied barrier at the end of a work-sharing construct

Types of work-sharing constructs:
- DO directive: !$OMP DO / !$OMP END DO
- SECTIONS directive: !$OMP SECTIONS / !$OMP END SECTIONS
- SINGLE directive: !$OMP SINGLE / !$OMP END SINGLE

Page 23: OpenMP

Work-Sharing Constructs

DO Directive format:

!$OMP DO clause1 clause2 …
[do loop]
!$OMP END DO end_clause

Page 24: OpenMP

Work-Sharing Constructs

How are the iterations of the loop divided among threads?

=> Use the SCHEDULE (type, chunk) clause, with type one of:
- STATIC
- DYNAMIC
- GUIDED

Page 25: OpenMP

SECTIONS Directive
- Non-iterative work-sharing
- Each section is executed once by a thread
- Potential MIMD execution (each section can contain different code)

Format:

!$OMP SECTIONS clause1, clause2 …
!$OMP SECTION
[block1]
!$OMP SECTION
[block2]
…
!$OMP END SECTIONS end_clause

Work-Sharing Constructs

Page 26: OpenMP

Work-Sharing Constructs

SECTIONS Directive example code:

!$OMP SECTIONS
!$OMP SECTION
      write(*,*) “Hello”
!$OMP SECTION
      write(*,*) “Hi”
!$OMP SECTION
      write(*,*) “Bye”
!$OMP END SECTIONS

Page 27: OpenMP

SINGLE Directive
- Encloses code to be executed by only one thread in the team
- Threads in the team that do not execute the SINGLE directive wait at the end of the enclosed code block, unless a NOWAIT clause is specified

Format:

!$OMP SINGLE clause1 clause2 …
…
!$OMP END SINGLE end_clause

Work-Sharing Constructs

Page 28: OpenMP

Work-Sharing Constructs

SINGLE Directive

Example code

!$OMP SINGLE

write(*,*) “Hello”

!$OMP END SINGLE

Page 29: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 30: OpenMP

OpenMP has the following constructs to support synchronization:

- MASTER directive
- CRITICAL directive
- BARRIER directive
- ATOMIC directive
- ORDERED directive
- FLUSH directive

Synchronization Constructs

Page 31: OpenMP

Synchronization Constructs

MASTER Directive

- Executed only by the master thread of the team

- No implied barrier associated with this directive

Format

!$OMP MASTER

!$OMP END MASTER

Page 32: OpenMP

Synchronization Constructs

CRITICAL Directive
- Specifies a region of code that must be executed by only one thread at a time
- The optional name enables multiple different CRITICAL regions to exist

Format:

!$OMP CRITICAL name
…
!$OMP END CRITICAL name

Page 33: OpenMP

Synchronization Constructs

BARRIER Directive

- Synchronize all threads in the team

- When encountered, each thread waits until all the other threads have reached this point

- Must be encountered by all threads in a team or by none at all; otherwise, deadlock

Format

!$OMP BARRIER

Page 34: OpenMP

FLUSH Directive
- Explicit synchronization point at which the implementation is required to provide a consistent view of memory
- Thread-visible variables are written back to memory at this point

Format:

!$OMP FLUSH (variable1, variable2, …)

The FLUSH directive is implied by the directives below (but not when a NOWAIT clause is present):

BARRIER
CRITICAL and END CRITICAL
END DO
END PARALLEL
END SECTIONS
END SINGLE
ORDERED and END ORDERED

Synchronization Constructs

Page 35: OpenMP

III. OpenMP Directives

- OpenMP Directives Format
- Directive Scoping
- PARALLEL Region Construct
- Work-Sharing Constructs
- Synchronization Constructs
- Data Environment Constructs

Page 36: OpenMP

Define how and which data variables in the serial section of the program are transferred to the parallel sections of the program (and back), and which variables will be visible to all threads versus private to each thread.

Include:
- Directive: THREADPRIVATE
- Data scope attribute clauses: PRIVATE, SHARED, FIRSTPRIVATE, LASTPRIVATE, DEFAULT, COPYIN, REDUCTION

Data Environment Constructs

Page 37: OpenMP

THREADPRIVATE Directive
- Makes global file-scope variables or common blocks local and persistent to a thread
- Use the COPYIN clause to initialize data in THREADPRIVATE variables and common blocks

Format:

!$OMP THREADPRIVATE (a, b, …)

Data scope attribute clauses:
- PRIVATE clause: !$OMP PARALLEL PRIVATE (a,b)
- SHARED clause: !$OMP PARALLEL SHARED (c,d)

Data Environment Constructs

Page 38: OpenMP

Data scope attribute clauses (continued):
- FIRSTPRIVATE clause: PRIVATE with automatic initialization from the original object
  !$OMP PARALLEL FIRSTPRIVATE (a,b)
- LASTPRIVATE clause: PRIVATE with a copy from the last loop iteration or section back to the original variable object
  !$OMP PARALLEL LASTPRIVATE (a,b)
- DEFAULT clause: specifies a default PRIVATE, SHARED, or NONE scope for all variables in the lexical extent of any parallel region
  !$OMP PARALLEL DEFAULT (PRIVATE | SHARED | NONE)
- COPYIN clause: assigns the same value to THREADPRIVATE variables for all threads in the team
  !$OMP PARALLEL COPYIN (a)

Data Environment Constructs

Page 39: OpenMP

I. Introduction to OpenMP

II. OpenMP Programming Model

III. OpenMP Directives

IV. Run-Time Library Routines

V. Environment Variables

VI. Summary

Page 40: OpenMP

IV. Run-Time Library Routines

API library calls that perform a variety of functions
- For C/C++: include "omp.h"
- For Fortran 95: use the "omp_lib" module

Runtime environment routines:
- Modify/check the number of threads:
  OMP_SET_NUM_THREADS(), OMP_GET_NUM_THREADS(), OMP_GET_MAX_THREADS(), OMP_GET_THREAD_NUM()
- Turn nesting and dynamic mode on/off:
  OMP_SET_NESTED(), OMP_GET_NESTED(), OMP_SET_DYNAMIC(), OMP_GET_DYNAMIC()
- Are we in a parallel region? OMP_IN_PARALLEL()
- How many processors in the system? OMP_GET_NUM_PROCS()

Page 41: OpenMP

Lock routines
- Lock: a flag which can be set or unset
- Ownership of the lock: the thread that sets a given lock gains exclusive access to whatever the lock protects
- Locks differ from the other synchronization directives: only the threads that use a given lock are affected by its status
- Related functions: OMP_INIT_LOCK(), OMP_SET_LOCK(), OMP_UNSET_LOCK(), OMP_DESTROY_LOCK(), OMP_TEST_LOCK()

Run-Time Library Routines

Page 42: OpenMP

I. Introduction to OpenMP

II. OpenMP Programming Model

III. OpenMP Directives

IV. Run-Time Library Routines

V. Environment Variables

VI. Summary

Page 43: OpenMP

V. Environment Variables

Control how “OMP DO SCHEDULE(RUNTIME)” loop iterations are scheduled:
- OMP_SCHEDULE “schedule,chunk_size”

Set the default number of threads to use:
- OMP_NUM_THREADS int_literal

Can the program use a different number of threads in each parallel region?
- OMP_DYNAMIC TRUE | FALSE

Will nested parallel regions create new teams of threads?
- OMP_NESTED TRUE | FALSE
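Putting the four variables together, a typical job setup might look like this in a POSIX shell (a sketch, not from the original slides, which use csh-style setenv; all values and the program name are illustrative):

```shell
# Illustrative OpenMP environment setup; the values are examples only.
export OMP_NUM_THREADS=8         # default team size for parallel regions
export OMP_SCHEDULE="dynamic,4"  # consulted by SCHEDULE(RUNTIME) loops
export OMP_DYNAMIC=FALSE         # keep the team size fixed per region
export OMP_NESTED=TRUE           # let nested regions fork new teams
# ./my_openmp_program            # then run the program (hypothetical name)
```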

Page 44: OpenMP

VI. Summary

OpenMP is a directive-based shared-memory programming model

The OpenMP API is a general-purpose parallel programming API with emphasis on the ability to parallelize existing programs

Scalable parallel programs can be written by using parallel regions

Work-sharing constructs enable efficient parallelization of computationally intensive portions of a program