CIS017-6 Distributed and Parallel Architectures
Revision (Parallel Architectures)
Jon Hitchcock
Exam questions
• Past papers are available from the university library catalogue
  – Search for the unit code: CIS017-6
• Questions are selected from a pool
  – Each exam paper does not cover every topic
  – Each year questions are updated
  – New questions are added
• Newer questions tend to be broader
  – Less focus on remembering lists of facts
  – More opportunity to show what you have learnt and that you have a wide understanding of the subject
• This year there is no choice of questions
  – You should answer all the questions on the paper
Possible wording of questions
• Compare THIS and THAT and critically discuss their advantages and disadvantages.
• Give an example of SOMETHING.
• Illustrate your answer with a diagram.
• Consider a SITUATION and suggest what can be done about it.
Types of parallelism
• Parallelism in applications (see the sketch after this list)
  – Data parallelism
  – Task parallelism
• Parallelism in hardware
  – Bit-level parallelism
  – Instruction-level parallelism
  – Vector architectures and GPUs
  – Thread-level parallelism
  – Request-level parallelism
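To make the application-level distinction concrete, here is a minimal C++ sketch (the thread counts and data are illustrative, not from the slides): data parallelism applies the same operation to different parts of one data set, while task parallelism runs different operations at the same time.

    #include <algorithm>
    #include <numeric>
    #include <thread>
    #include <vector>

    int main() {
        std::vector<int> v(1000, 2);

        // Data parallelism: the SAME operation (doubling) applied to
        // DIFFERENT halves of the data by two threads.
        auto double_range = [&v](std::size_t lo, std::size_t hi) {
            for (std::size_t i = lo; i < hi; ++i) v[i] *= 2;
        };
        std::thread d1(double_range, 0, v.size() / 2);
        std::thread d2(double_range, v.size() / 2, v.size());
        d1.join(); d2.join();

        // Task parallelism: DIFFERENT operations (a sum and a maximum)
        // run concurrently on the same data.
        long sum = 0; int mx = 0;
        std::thread t1([&] { sum = std::accumulate(v.begin(), v.end(), 0L); });
        std::thread t2([&] { mx = *std::max_element(v.begin(), v.end()); });
        t1.join(); t2.join();
    }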
Flynn’s taxonomy
• SISD: Single Instruction Single Data
• SIMD: Single Instruction Multiple Data
• MIMD: Multiple Instruction Multiple Data
Parallel Computer Implementation
• Parallel computers can be roughly classified according to the level at which the hardware supports parallelism:
  – Multicore processor
  – Shared memory multiprocessor
  – Cluster
• These classes are not mutually exclusive
• Clusters of multicore processors are common
• Low-level implementations:
  – Vector processor
  – Graphics processing unit
  – Spatial computing
• GPUs are often used for general-purpose parallel computation
Memory Organisation
• Shared memory (see the sketch after this list)
  – Multicore processor
  – Shared memory multiprocessor
  – Parallel Random Access Machine (PRAM) model
    • Concurrent read and/or write access
    • Exclusive access
• Distributed memory
  – Cluster
  – Warehouse-scale computer
  – Passes messages to transfer data
  – Bulk Synchronous Parallel (BSP) model
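As a shared-memory illustration (a minimal sketch; the thread and iteration counts are arbitrary): every thread sees the same address space, so concurrent writes to one location must be made exclusive somehow, here by an atomic counter.

    #include <atomic>
    #include <thread>
    #include <vector>

    int main() {
        // Shared memory: all threads read and write the same address space.
        std::atomic<long> counter{0};

        std::vector<std::thread> workers;
        for (int t = 0; t < 4; ++t)
            workers.emplace_back([&counter] {
                for (int i = 0; i < 100000; ++i)
                    counter.fetch_add(1);  // serialised (exclusive) update
            });
        for (auto& w : workers) w.join();

        // counter is exactly 400000; with a plain long in place of the
        // atomic, these concurrent writes would be a data race.
    }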
Parallel Programming
• Shared-state concurrency
  – Java
    • Threads
  – C++
    • Threads
    • Task-based concurrency
  – OpenMP
  – GPU programming
• Message-passing concurrency (see the sketch after this list)
  – Erlang
  – MPI
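To contrast the two styles in one language, here is a toy channel in C++ (the Channel class is illustrative, not a standard API): the threads share no mutable state directly and communicate only by passing values, in the spirit of Erlang mailboxes or MPI messages.

    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    // A toy message queue: senders and receivers communicate by
    // passing values rather than by sharing mutable state.
    template <typename T>
    class Channel {
        std::queue<T> q;
        std::mutex m;
        std::condition_variable cv;
    public:
        void send(T v) {
            { std::lock_guard<std::mutex> lk(m); q.push(std::move(v)); }
            cv.notify_one();
        }
        T recv() {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [this] { return !q.empty(); });
            T v = std::move(q.front());
            q.pop();
            return v;
        }
    };

    int main() {
        Channel<int> ch;
        std::thread producer([&] { for (int i = 0; i < 3; ++i) ch.send(i); });
        std::thread consumer([&] {
            for (int i = 0; i < 3; ++i) std::cout << ch.recv() << '\n';
        });
        producer.join();
        consumer.join();
    }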
Performance Analysis of Parallel Systems
• Scalability
  – Speedup
    • Linear
    • Superlinear
  – Efficiency
• Amdahl’s law
• Gustafson’s law (both laws are given as formulas below)
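For revision, the two laws in their standard forms, with p the parallelisable fraction of the work and n the number of processors:

    Amdahl:     S(n) = 1 / ((1 - p) + p/n)

For example, with p = 0.9 and n = 10 the speedup is 1 / (0.1 + 0.09) ≈ 5.3, and no number of processors can push it past 1/(1 - p) = 10. Gustafson’s law instead scales the problem size with the machine:

    Gustafson:  S(n) = (1 - p) + p·n

which grows linearly with n. Efficiency is speedup per processor, E(n) = S(n)/n.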
OpenMP
• Parallel computing on shared memory systems
• Directives
  – Control structures
    • parallel
  – Work sharing
    • sections
    • for
  – Synchronisation
    • barrier
• Regions and loops (see the sketch below)
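A minimal sketch putting the three directive groups together (the array size and messages are arbitrary; compile with an OpenMP flag such as -fopenmp):

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        double a[1000];

        #pragma omp parallel        // control structure: start a team of threads
        {
            #pragma omp for         // work sharing: loop iterations are
            for (int i = 0; i < 1000; ++i)  // divided among the threads
                a[i] = i * 0.5;

            #pragma omp barrier     // synchronisation: no thread continues until
                                    // all arrive (omp for already ends with an
                                    // implicit barrier; shown for illustration)

            #pragma omp sections    // work sharing: each section runs once,
            {                       // on some thread of the team
                #pragma omp section
                printf("section 1 on thread %d\n", omp_get_thread_num());
                #pragma omp section
                printf("section 2 on thread %d\n", omp_get_thread_num());
            }
        }
        printf("a[999] = %f\n", a[999]);
        return 0;
    }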
MPI
• Message passing (a distributed computing style of communication) used for parallel computing
• Collective communications
  – One-to-many (broadcast, scatter)
  – Many-to-one (reduce, gather)
  – Many-to-many (prefix sum, total exchange, circular shift)
• Synchronous message passing
  – Three-way signalling process
    • Request to send
    • Ready-to-accept acknowledgement
    • Message transfer
  – “Blocking” functions in MPI (see the sketch below)
    • MPI_Send and MPI_Recv return when it is safe to continue
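A minimal MPI sketch combining a blocking point-to-point exchange with a collective reduction (the message values are arbitrary; run with at least two processes, e.g. mpirun -np 2):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        // Blocking point-to-point: MPI_Send returns once the buffer may be
        // reused; MPI_Recv returns once the message has arrived.
        if (rank == 0 && size > 1) {
            int msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int msg;
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", msg);
        }

        // Collective, many-to-one: every rank contributes a value and
        // rank 0 receives the sum.
        int local = rank, total = 0;
        MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0) printf("sum of ranks = %d\n", total);

        MPI_Finalize();
        return 0;
    }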
See also
• The essential reading from Hennessy and Patterson (2012)
  – in BREO weeks 2, 3 and 4
• The essential reading from Greaves (2015)
  – in BREO week 3
• Tutorial questions
  – in BREO weeks 2 and 4