Post on 26-Dec-2015
Parallel Programming With OpenMP
Contents
- Overview of Parallel Programming & OpenMP
- Difference between OpenMP & MPI
- OpenMP Programming Model
- OpenMP Environment Variables
- OpenMP Clauses
- OpenMP Runtime Routines
- General Code Structure & Sample Examples
- Pros & Cons of OpenMP
- Performance of one program (Serial vs Parallel)
Parallel Programming
- Decomposes an algorithm or data into parts, which are processed by multiple processors simultaneously.
- Coordinates work and communication between those processors.
- Threaded applications are ideal for multi-core processors.

OpenMP
- Open specifications for Multi Processing, based on a thread paradigm.
- Three primary components: Compiler Directives, Runtime Library Routines, Environment Variables.
- Extensions for Fortran, C, and C++.
OpenMP vs MPI

OpenMP:
- Shared Memory Model
- Directive Based
- Easier to program & debug
- Supported by gcc 4.2 & higher

MPI:
- Distributed Memory Model
- Message Passing Style
- More flexible & scalable
- Supported by the MPICH2 library
OpenMP Programming Model
- Shared Memory, Thread-Based Parallelism.
- Explicit Parallelism.
- Fork-Join Model:
  – Execution starts with one thread: the master thread.
  – Parallel regions fork off new threads on entry: the team threads.
  – Threads join back together at the end of the region: only the master thread continues.
OpenMP Environment Variables
- OMP_SCHEDULE
- OMP_NUM_THREADS
- OMP_DYNAMIC
- OMP_NESTED
- OMP_THREAD_LIMIT
- OMP_STACKSIZE
OpenMP Clauses
- Data Scoping Clauses (shared, private, default)
- Initialization Clauses (firstprivate, lastprivate, threadprivate)
- Data Copying Clauses (copyin, copyprivate)
- Worksharing Clauses (do/for directive, sections directive, single directive, parallel do/for, parallel sections)
- Scheduling Clauses (static, dynamic, guided)
- Synchronization Clauses (master, critical, atomic, ordered, barrier, nowait, flush)
- Reduction Clause (operator: list)
OpenMP Runtime Routines
- To set & get the number of threads:
  – OMP_SET_NUM_THREADS
  – OMP_GET_NUM_THREADS
- To get the thread number of a thread within a team:
  – OMP_GET_THREAD_NUM
- To get the number of processors available to the program:
  – OMP_GET_NUM_PROCS
- To determine whether execution is inside a parallel region:
  – OMP_IN_PARALLEL
- To enable or disable dynamic adjustment of the number of threads:
  – OMP_SET_DYNAMIC
OpenMP Runtime Routines Cont.
- To determine whether dynamic thread adjustment is enabled:
  – OMP_GET_DYNAMIC
- To initialize and destroy a lock associated with a lock variable:
  – OMP_INIT_LOCK
  – OMP_DESTROY_LOCK
- To acquire and release a lock:
  – OMP_SET_LOCK
  – OMP_UNSET_LOCK
- Clock timing routine:
  – OMP_GET_WTICK
General Code Structure

#include <omp.h>

int main() {
    int var1, var2, var3;
    /* Serial code */

    /* Beginning of parallel section: specify variable scoping */
    #pragma omp parallel private(var1, var2) shared(var3)
    {
        /* Parallel section executed by all threads */
    }
    /* All threads join the master thread and disband; serial code resumes */
    return 0;
}

The omp keyword distinguishes the pragma as an OpenMP pragma, so it is processed by OpenMP-aware compilers (e.g. gcc -fopenmp).
Parallel Region Example

#include <stdio.h>
#include <omp.h>

int main() {
    int nthreads, tid;
    /* Fork a team of threads */
    #pragma omp parallel private(tid)
    {
        tid = omp_get_thread_num();   /* Obtain thread id */
        printf("Hello World from thread = %d\n", tid);
        if (tid == 0) {               /* Only master thread does this */
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    } /* All threads join master thread and terminate */
    return 0;
}
“for” Directive Example

#include <omp.h>
#define CHUNKSIZE 10
#define N 100

int main() {
    int i, chunk;
    float a[N], b[N], c[N];

    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0;
    chunk = CHUNKSIZE;

    #pragma omp parallel shared(a, b, c, chunk) private(i)
    {
        #pragma omp for schedule(dynamic, chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    } /* end of parallel section */
    return 0;
}
“sections” Directive Example

#include <omp.h>
#define N 1000

int main() {
    int i;
    float a[N], b[N], c[N], d[N];

    for (i = 0; i < N; i++) {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }

    #pragma omp parallel shared(a, b, c, d) private(i)
    {
        #pragma omp sections nowait
        {
            #pragma omp section
            for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];

            #pragma omp section
            for (i = 0; i < N; i++)
                d[i] = a[i] * b[i];
        } /* end of sections */
    } /* end of parallel section */
    return 0;
}
“critical” Directive Example

#include <omp.h>

int main() {
    int x = 0;
    #pragma omp parallel shared(x)
    {
        #pragma omp critical
        x = x + 1;
    } /* end of parallel section */
    return 0;
}
“threadprivate” Directive Example

#include <stdio.h>
#include <omp.h>

int a, b, i, tid;
float x;
#pragma omp threadprivate(a, x)

int main() {
    /* Explicitly turn off dynamic threads */
    omp_set_dynamic(0);

    printf("1st Parallel Region:\n");
    #pragma omp parallel private(b, tid)
    {
        tid = omp_get_thread_num();
        a = tid;
        b = tid;
        x = 1.1 * tid + 1.0;
        printf("Thread %d: a,b,x= %d %d %f\n", tid, a, b, x);
    } /* end of parallel section */

    printf("Master thread doing serial work here\n");

    printf("2nd Parallel Region:\n");
    #pragma omp parallel private(tid)
    {
        tid = omp_get_thread_num();
        /* threadprivate a and x keep their per-thread values */
        printf("Thread %d: a,b,x= %d %d %f\n", tid, a, b, x);
    } /* end of parallel section */
    return 0;
}
“reduction” Clause Example

#include <stdio.h>
#include <omp.h>

int main() {
    int i, n, chunk;
    float a[100], b[100], result;

    n = 100; chunk = 10; result = 0.0;
    for (i = 0; i < n; i++) {
        a[i] = i * 1.0;
        b[i] = i * 2.0;
    }

    #pragma omp parallel for default(shared) private(i) \
        schedule(static, chunk) reduction(+:result)
    for (i = 0; i < n; i++)
        result = result + (a[i] * b[i]);

    printf("Final result= %f\n", result);
    return 0;
}
OpenMP - Pros and Cons

Pros:
- Simple, incremental parallelism.
- Decomposition is handled automatically.
- Unified code for both serial and parallel applications.

Cons:
- Runs only on shared-memory multiprocessors.
- Scalability is limited by the memory architecture.
- Reliable error handling is missing.
Performance of “arrayUpdate.c”
Test done on a 2 GHz Intel Core 2 Duo with 1 GB 667 MHz DDR2 SDRAM

Array Size     Serial (sec)    Parallel (sec)
1000           0.000221        0.000389
5000           0.001060        0.000999
10000          0.002201        0.001323
50000          0.011266        0.005892
100000         0.22638         0.011715
500000         0.114033        0.068110
1000000        0.227713        0.123106
5000000        1.134773        0.579176
10000000       2.307644        1.151099
50000000       12.536466       5.772921
100000000      194.245929      58.532328
arrayUpdate.c Cont.
[Chart: Serial (sec) and Parallel (sec) time versus Array Size (in 1000s), same 2 GHz Intel Core 2 Duo with 1 GB 667 MHz DDR2 SDRAM test machine]
References
- http://www.openmp.org/
- Parallel Programming in OpenMP, Morgan Kaufmann Publishers.
Thank You