
Parallel System Performance

Jan 6, 2020

Course outline (Pacheco; GGKK; Quinn)

● Motivation (1; 1; 1)
● How to quantify performance improvement (2.6; 5; 7)
● Parallel hardware architecture (2.2-2.3; 2, 4; 2)
● Parallel programming frameworks
  ○ Pthreads for shared memory (4; 7; -)
  ○ OpenMP for shared memory (5; 7.10; 17)
  ○ MPI for distributed memory (3; 6; 4)
  ○ CUDA/OpenCL for GPUs
  ○ Hadoop/Spark/MapReduce for distributed systems
● Parallel program verification
● Parallel algorithm design
● Some case studies

Why is performance analysis important?

● Being able to accurately predict the performance of a parallel algorithm
  ○ can help decide whether to actually go to the trouble of coding and debugging it.
● Being able to analyze the execution time exhibited by a parallel program
  ○ can help understand barriers to higher performance.
  ○ can help predict how much improvement can be realized by increasing the number of processors.

Well-known performance prediction formulas

● Amdahl’s Law
  ○ Helps decide whether a program merits parallelization.
● Gustafson-Barsis’s Law
  ○ A way to evaluate the performance of a parallel program.
● Karp-Flatt metric
  ○ Helps decide whether the principal barrier to speedup is the amount of inherently sequential code or parallel overhead.
● Iso-efficiency metric
  ○ A way to evaluate the scalability of a parallel algorithm executing on a parallel computer. Helps choose the design that will achieve higher performance when the number of processors increases.

Operations performed by a parallel algorithm

● Computations that must be performed sequentially: σ(n)
● Computations that can be performed in parallel: φ(n)
● Parallel overhead (communication operations and redundant computations): κ(n,p)

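These three terms combine into a general speedup bound (following Quinn’s treatment, with ψ denoting speedup): the sequential algorithm takes σ(n) + φ(n) time, while p processors need at least σ(n) + φ(n)/p + κ(n,p), so

ψ(n,p) ≤ (σ(n) + φ(n)) / (σ(n) + φ(n)/p + κ(n,p))

The bound is an inequality because the parallel portion rarely divides perfectly among the p processors.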

Speedup and efficiency

We design and implement parallel programs in the hope that they will run faster than their sequential counterparts.

● Speedup = (Sequential execution time) / (Parallel execution time)
● Efficiency = Speedup / (Processors used)

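A quick sketch of these definitions in code (the timings below are made up for illustration, not taken from the lecture):

def speedup(t_seq, t_par):
    # Speedup = sequential execution time / parallel execution time
    return t_seq / t_par

def efficiency(t_seq, t_par, p):
    # Efficiency = speedup / number of processors used
    return speedup(t_seq, t_par) / p

# Hypothetical run: 60 s sequentially, 10 s on 8 processors
print(speedup(60.0, 10.0))        # 6.0
print(efficiency(60.0, 10.0, 8))  # 0.75

An efficiency of 0.75 says each processor does useful work only about 75% of the time; the rest is lost to sequential sections and parallel overhead.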

Amdahl’s Law

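In the notation above, and ignoring the parallel overhead κ(n,p), Amdahl’s Law bounds the speedup achievable on p processors (the standard statement, as in Quinn):

ψ ≤ (σ(n) + φ(n)) / (σ(n) + φ(n)/p)

Equivalently, with f = σ(n) / (σ(n) + φ(n)) denoting the inherently sequential fraction of the computation:

ψ ≤ 1 / (f + (1 - f)/p)

As p → ∞ the bound approaches 1/f, so the sequential fraction alone caps the achievable speedup.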

Numerical Examples

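A representative worked example (numbers chosen here for illustration): suppose 95% of a computation can be parallelized, so f = 0.05. On 8 processors,

ψ ≤ 1 / (0.05 + 0.95/8) = 1 / 0.16875 ≈ 5.9

and even with unlimited processors the speedup can never exceed 1/0.05 = 20.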

Limitations of Amdahl’s Law

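Two limitations are usually raised at this point (both discussed in Quinn): the law ignores the overhead term κ(n,p), so it overestimates the achievable speedup; and it treats the problem size n as fixed while p grows, which makes it unduly pessimistic about large problems that only become feasible on parallel machines.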

Amdahl Effect
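The Amdahl effect (as described in Quinn): for typical problems, κ(n,p) grows more slowly with n than φ(n)/p does, so for a fixed number of processors the speedup improves as the problem size increases.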

Gustafson-Barsis’s Law

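Where Amdahl’s Law starts from a fixed problem and asks how much faster it can run, Gustafson-Barsis’s Law starts from a fixed parallel execution time. Let s be the fraction of the parallel execution time spent on inherently sequential operations; the scaled speedup is then (again following Quinn):

ψ ≤ p + (1 - p)s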

Numerical Examples
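An illustrative example (numbers invented for this note): an application running on 64 processors spends 5% of its time in serial code, so s = 0.05 and the scaled speedup is

ψ ≤ 64 + (1 - 64)(0.05) = 64 - 3.15 = 60.85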

The Karp-Flatt Metric

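Given a measured speedup ψ on p processors, the Karp-Flatt metric is the experimentally determined serial fraction

e = (1/ψ - 1/p) / (1 - 1/p)

Watching how e behaves as p grows is what separates the two barriers mentioned earlier: if e stays roughly constant, inherently sequential code is the bottleneck; if e grows with p, parallel overhead is.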

Numerical Examples

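A small sketch of the computation (the measured speedup here is hypothetical):

def karp_flatt(psi, p):
    # Experimentally determined serial fraction from measured speedup psi on p processors
    return (1.0 / psi - 1.0 / p) / (1.0 - 1.0 / p)

# Hypothetical measurement: speedup of 4.71 on 8 processors
print(karp_flatt(4.71, 8))  # ~0.10, i.e. about 10% effectively serial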

Iso-efficiency metric

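The metric itself (following Quinn’s derivation): let T(n,1) = σ(n) + φ(n) be the sequential time and T₀(n,p) = (p - 1)σ(n) + pκ(n,p) the total time all processes spend doing work not done by the sequential algorithm. To maintain a desired efficiency ε, the problem size must grow with p so that

T(n,1) ≥ C · T₀(n,p), where C = ε / (1 - ε)

The faster n must grow to keep this relation satisfied, the less scalable the algorithm.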

Course outline (Pacheco; GGKK; Quinn)

● Motivation (1; 1; 1)
● How to quantify performance improvement (2.6; 5; 7)
● Parallel hardware architecture (2.2-2.3; 2, 4; 2)
● Parallel programming frameworks
  ○ Pthreads for shared memory (4; 7; -)
  ○ OpenMP for shared memory (5; 7.10; 17)
  ○ MPI for distributed memory (3; 6; 4)
  ○ CUDA/OpenCL for GPUs
  ○ Hadoop/Spark/MapReduce for distributed systems
● Parallel program verification
● Parallel algorithm design
● Some case studies