PP16 Lec6 Arch5.Problems

1.1 Parallel Processing sp2016 lec#6 Dr M Shamim Baig 


8/16/2019 PP16 Lec6 Arch5.Problems

http://slidepdf.com/reader/full/pp16-lec6-arch5problems 1/9


Problems:

Explicit Parallel Architectures


Example Problem 1: Bus-based SM-Multiprocessor, Limit of Parallelism

Consider an SM-multiprocessor using 32-bit RISC processors running at 150 MHz, each carrying out one instruction per clock cycle. Assume 15% data-load & 10% data-store instructions, using a shared data bus having 2 GB/sec BW. Compute the max number of processors possible to connect on the above bus for the following parallel configurations:

Example Problem 1: Bus-based SM-Multiprocessor, Limit of Parallelism (cont’d)

a) SMP without cache memory

b) SMP with cache memory having hit-ratio of 95% & memory write-through policy

c) NUMA with program locality factor = 80%
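One possible way to set up the arithmetic for configurations a) to c) is sketched below. It is not the lecturer's solution. It assumes (none of this is stated on the slides) that each load or store moves one 32-bit word (4 bytes) over the bus, that instruction fetches do not use the shared data bus, that 1 GB = 10^9 bytes, that with write-through every store goes to the bus while only load misses do, and that the NUMA locality factor is the fraction of data accesses served by local memory. The names `max_processors`, `WORD_BYTES` and so on are illustrative.

```python
# Sketch of the bus-saturation arithmetic for Example Problem 1.
# All modelling choices below are assumptions, not given on the slides.

CLOCK_HZ = 150e6        # 150 MHz, one instruction per clock cycle
LOAD_FRAC = 0.15        # fraction of instructions that are data loads
STORE_FRAC = 0.10       # fraction of instructions that are data stores
BUS_BW = 2e9            # 2 GB/sec, assuming 1 GB = 10**9 bytes
WORD_BYTES = 4          # assumption: one 32-bit word per load/store


def max_processors(bus_accesses_per_instr):
    """Largest processor count whose combined traffic fits in BUS_BW."""
    bytes_per_sec_per_cpu = CLOCK_HZ * bus_accesses_per_instr * WORD_BYTES
    return int(BUS_BW // bytes_per_sec_per_cpu)


# a) No cache: every load and store crosses the bus.
smp_no_cache = max_processors(LOAD_FRAC + STORE_FRAC)

# b) Cache with 95% hit ratio, write-through: load misses plus all stores.
HIT_RATIO = 0.95
smp_cache = max_processors(LOAD_FRAC * (1 - HIT_RATIO) + STORE_FRAC)

# c) NUMA with 80% locality: only non-local accesses cross the bus.
LOCALITY = 0.80
numa = max_processors((LOAD_FRAC + STORE_FRAC) * (1 - LOCALITY))

if __name__ == "__main__":
    print("a) SMP without cache:", smp_no_cache)
    print("b) SMP with cache   :", smp_cache)
    print("c) NUMA             :", numa)
```

Under these assumptions the sketch gives 13, 31 and 66 processors for a), b) and c). A different cache-line size or write policy would change the per-processor bus traffic and therefore the limits.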

Figure: Bus-based interconnects (a) with no local caches; (b) with local memory/caches.

Since much of the data accessed by processors is local to the processor, a local memory can improve the performance of bus-based machines.

Example: SMP (SM & Shared Bus IN)

UMA & NUMA Arch Block Diagrams

Figure: Typical shared-address-space architectures: (a) Uniform-memory-access shared-address-space computer; (b) Uniform-memory-access shared-address-space computer with caches and memories; (c) Non-uniform-memory-access shared-address-space computer with local memory only.

UMA (CSM) vs NUMA (DSM): both are SM-multiprocessors, differing in memory access delay format.

Homework:

Self-assessed problems

Please mark your solution & note the marks you achieved.

Example Problem: Message Passing Multicomputer, Local vs Remote memory data access delays

Consider a 64-node multicomputer; each node comprises a 32-bit RISC processor having a 250 MHz clock rate & 8 MB local memory. The local memory access requires 4 clock cycles, the remote comm initiate (setup) overhead is 15 clock cycles & the Interconnection Network BW is 80 MB/sec. The total number of instructions executed is 200,000. If memory data load & store are 15% & 10% respectively of the instructions, compute:

a) Load/store time if all accesses are to local nodes

b) Load/store time if 20% of accesses are to remote nodes

Note: Assume packet lengths are variable (depend on addr & data bytes & communication protocol given (S0!!!)).

(Note: the size of message packet fields is in multiples of bytes.)
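The local vs remote delay computation in parts a) and b) can be sketched as follows. The packet format is not recoverable from the slides, so the sketch makes loud assumptions: a hypothetical 8-byte packet (4 address bytes + 4 data bytes) per remote access, 1 MB = 10^6 bytes, and a remote access costing setup overhead + packet transfer time + one memory access at the owning node. The function name `load_store_time` and the parameter `packet_bytes` are illustrative only.

```python
# Sketch of the load/store time arithmetic for the multicomputer problem.
# The packet size and the remote-access cost model are assumptions.

CLOCK_HZ = 250e6            # 250 MHz processor clock
LOCAL_ACCESS_CYCLES = 4     # local memory access
REMOTE_SETUP_CYCLES = 15    # remote comm initiate (setup) overhead
NETWORK_BW = 80e6           # 80 MB/sec, assuming 1 MB = 10**6 bytes
INSTRUCTIONS = 200_000
LOAD_PCT, STORE_PCT = 15, 10


def load_store_time(remote_pct, packet_bytes=8):
    """Total time in seconds spent on data loads and stores.

    remote_pct   -- percentage of accesses that go to remote nodes
    packet_bytes -- hypothetical packet length (address + data bytes)
    """
    cycle = 1.0 / CLOCK_HZ
    accesses = INSTRUCTIONS * (LOAD_PCT + STORE_PCT) // 100
    remote = accesses * remote_pct // 100
    local = accesses - remote

    local_t = LOCAL_ACCESS_CYCLES * cycle
    # Assumed model: setup + one-way packet transfer + memory access at owner.
    remote_t = REMOTE_SETUP_CYCLES * cycle + packet_bytes / NETWORK_BW + local_t
    return local * local_t + remote * remote_t


if __name__ == "__main__":
    print("a) all local  : %.3f ms" % (load_store_time(0) * 1e3))
    print("b) 20%% remote : %.3f ms" % (load_store_time(20) * 1e3))
```

With these assumptions part a) comes to about 0.8 ms and part b) to about 2.4 ms. The result for b) is dominated by the network transfer term, so it depends strongly on the packet length actually prescribed by the protocol.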

Example Problem (cont’d): Message Passing Multicomputer, Local vs Remote memory data access delays

Figure: message-passing multicomputer (labels: Computers, Processor, Local memory, Messages, Interconnection network).