PDC Lesson2
8/12/2019 PDC Lesson2
Parallel and Distributed Computing
Dr. Haroon Mahmood
Assistant Professor
University of Central Punjab, Lahore
Date: 04-04-2014
-
Lecture Outline
Evaluation criteria
Overview of previous lecture
Amdahl's law
Multiprocessor system classification by Flynn's taxonomy
-
Marks Distribution
Quizzes/Assignments 15%
Presentation 15%
Midterm 20%
50%
-
Motivation for Parallel and Distributed Computing
Uniprocessors are fast, but some problems:
require too much computation
use too much data
have too many parameters to explore
Such problems call for parallel and distributed systems
-
Parallel and distributed systems
Parallel and distributed computing is going to be more and more important
Dual and quad core processors are very common
Up to six and eight cores for each CPU
Multithreading is growing
Hardware structure or architecture is important for understanding how much it is possible to speed up beyond a single CPU
The capability of compilers to generate efficient code is also very important
It is always difficult to distinguish between HW and SW influences
-
Amdahl's law (1)
Speedup = 1 / ((1 - p) + p/n)
p is the fraction of the total execution time spent in parallelizable code, from 0 to 1
n is the number of processors the code can use
If there are 4 processors and only 10% of the code is parallelizable:
Speedup = 1 / (0.9 + (0.1/4))
= 1.081
That is a speedup of only about 8% with 4 processors!
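The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch; the function name `amdahl_speedup` is chosen here for illustration and does not come from the lecture.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Speedup predicted by Amdahl's law.

    p: fraction of execution time that is parallelizable (0 to 1)
    n: number of processors the parallel part can use
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The serial part (1 - p) is unchanged; the parallel part p is divided by n.
    return 1.0 / ((1.0 - p) + p / n)


# The slide's example: 10% parallelizable code on 4 processors.
print(round(amdahl_speedup(0.1, 4), 3))  # 1.081
```

As n grows, the p/n term vanishes and the speedup approaches 1 / (1 - p), so with p = 0.1 no number of processors can push the speedup past roughly 1.11.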
-
Flynn's Classical Taxonomy
The most widely used classification of parallel computers
Distinguishes multiprocessor computers according to the dimensions of instruction and data
SISD: Single instruction stream, single data stream
SIMD: Single instruction stream, multiple data stream
MISD: Multiple instruction stream, single data stream
MIMD: Multiple instruction stream, multiple data stream
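The four classes follow mechanically from two yes/no questions: is there more than one instruction stream, and is there more than one data stream? A small illustrative sketch (the function `flynn_class` is a hypothetical helper, not part of the lecture):

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class for the given number of streams."""
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    # "S" for a single stream, "M" for multiple streams, on each dimension.
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


for streams in [(1, 1), (1, 8), (4, 1), (4, 4)]:
    print(streams, flynn_class(*streams))
```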
-
Processor organizations
Single instruction, single data stream (SISD)
  Uniprocessor
Single instruction, multiple data stream (SIMD)
  Vector processor
  Array processor
Multiple instruction, single data stream (MISD)
Multiple instruction, multiple data stream (MIMD)
  Shared memory (tightly coupled)
    Symmetric multiprocessor (SMP)
    Nonuniform memory access (NUMA)
  Distributed memory (loosely coupled)
    Clusters
-
SISD (1)
A serial (non-parallel) computer
Single instruction: one instruction per cycle
Single data: only one data stream per cycle
Easy and deterministic execution
Example:
Single-CPU workstations
Most workstations from HP, IBM and SGI are SISD machines
Load A
Load B
C = A + B
Store C
-
SISD (2)
Performance of a processor can be measured with:
MIPS rate = f x IPC, where f is the clock frequency in MHz and IPC is the number of instructions completed per cycle
How to increase performance:
increasing clock frequency
increasing number of instructions completed during a processor cycle (multiple pipelines in a superscalar architecture and/or out-of-order execution)
multithreading
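To make the MIPS formula concrete, here is a minimal sketch that takes the clock frequency in Hz and divides by one million to get millions of instructions per second. The function name and the sample figures (a 2 GHz clock, 1.5 instructions per cycle) are assumptions chosen for illustration, not measurements of a real processor.

```python
def mips_rate(clock_hz: float, ipc: float) -> float:
    """MIPS rate = clock frequency x instructions per cycle, in millions."""
    if clock_hz <= 0 or ipc <= 0:
        raise ValueError("clock frequency and IPC must be positive")
    return clock_hz * ipc / 1e6


# Hypothetical processor: 2 GHz clock, 1.5 instructions completed per cycle.
base = mips_rate(2e9, 1.5)
print(base)  # 3000.0

# Both levers from the slide raise the rate: a faster clock or a higher IPC.
print(mips_rate(3e9, 1.5) > base, mips_rate(2e9, 2.0) > base)
```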
-
SISD (3): Multithreading
Implicit multithreading
concurrent execution of multiple threads extracted from a single sequential program
Managed by processor hardware
Improves individual application performance
Explicit multithreading
concurrent execution of instructions from different explicit threads, either by interleaving instructions from different threads or by parallel execution on parallel pipelines
-
SISD: Explicit Multithreading
Four approaches for explicit multithreading
Interleaved multithreading (fine-grained): switching can occur at each clock cycle. With few active threads, performance degrades
Blocked multithreading (coarse-grained): events like a cache miss trigger a switch
Simultaneous multithreading (SMT): execution units of a superscalar processor receive instructions from multiple threads
Chip multiprocessing: e.g. dual core (not SISD)
Architectures like IA-64 Very Long Instruction Word (VLIW) allow multiple instructions (to be executed in parallel) in a single word
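The difference between the interleaved and blocked approaches can be seen in the order in which instructions are issued. The following toy sketch is not a hardware model: it ignores timing and pipelines, the thread contents are made up, and the "MISS" suffix is an assumed marker standing in for a stall event such as a cache miss.

```python
from itertools import zip_longest


def interleaved(threads):
    """Fine-grained: issue one instruction from each thread in turn."""
    order = []
    for slot in zip_longest(*threads):
        # Threads that have run out of instructions contribute None; skip them.
        order.extend(instr for instr in slot if instr is not None)
    return order


def blocked(threads, stall="MISS"):
    """Coarse-grained: run one thread until it stalls, then switch."""
    queues = [list(t) for t in threads]
    order = []
    current = 0
    while any(queues):
        queue = queues[current]
        while queue:
            instr = queue.pop(0)
            order.append(instr)
            if instr.endswith(stall):
                break  # stall event: hand the processor to the next thread
        current = (current + 1) % len(queues)
    return order


# Two hypothetical threads; thread A stalls on its second instruction.
thread_a = ["A1", "A2-MISS", "A3"]
thread_b = ["B1", "B2"]
print(interleaved([thread_a, thread_b]))
print(blocked([thread_a, thread_b]))
```

In the interleaved order the two threads alternate on every issue slot, while in the blocked order thread A keeps the processor until its stall and thread B then runs to completion before A resumes.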
-
Intel's Hyper-Threading Technology
A single physical processor appears as two logical processors by applying a two-threaded SMT approach
Each logical processor maintains a complete set of architecture state (general-purpose registers, control registers, etc.)
Logical processors share nearly all other resources such as caches, execution units, branch predictors, control logic and buses
Partitioned resources are recombined when only one thread is active
Adds less than 5% to the relative chip size
Improves performance by 16% to 28%