Transcript of Lecture 4
High Performance Computing
Jawwad Shamsi
Lecture #4
25th January 2010
Recap
• Pipelining
• Superscalar Execution
  – Dependencies
    • Branch
    • Data
    • Resource
• Effect of Latency and Memory
• Effect of Parallelism
Flynn’s Taxonomy
It distinguishes multiprocessor computers along the dimensions of instruction and data.
• SISD: Single Instruction, Single Data (uniprocessor)
• SIMD: Single Instruction, Multiple Data (vector processing)
• MISD: Multiple Instruction, Single Data
• MIMD: Multiple Instruction, Multiple Data (SMP, cluster, NUMA)
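The SISD/SIMD distinction can be sketched in plain Python (the function names are hypothetical; real SIMD applies one hardware instruction across vector lanes, not a software loop):

```python
# Illustrative sketch of Flynn's SISD vs. SIMD execution models.

def sisd_add(a, b):
    """Single Instruction, Single Data: one operation on one data pair."""
    return a + b

def simd_add(xs, ys):
    """Single Instruction, Multiple Data: the same 'add' operation is
    applied to every element of the data vectors in lockstep."""
    return [x + y for x, y in zip(xs, ys)]

print(sisd_add(2, 3))                     # one result from one data pair
print(simd_add([1, 2, 3], [10, 20, 30]))  # one instruction, many data
```

MISD is rarely built in practice, and MIMD corresponds to running independent instruction streams (e.g. separate processes or nodes) over independent data.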
MIMD
– Shared Memory (tightly coupled)
  • SMP (Symmetric Multiprocessing)
  • Non-Uniform Memory Access (NUMA)
– Distributed Memory (loosely coupled)
  • Clusters
Taxonomy of Parallel Processor Architectures
Shared Address Space
• Shared Memory
• Distributed Memory
SMP
• Two or more similar processors
• Same main memory and I/O
• Can perform similar operations
• Share access to I/O devices
Multiprogramming and Multiprocessing
SMP Advantages
• Performance
• Availability
• Incremental growth
• Scaling
Block Diagram of Tightly Coupled Multiprocessor
Cache Coherence
• Multiple caches can hold copies of the same memory location with different data
  – Protocols?
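A minimal write-invalidate sketch shows how stale copies arise and how a protocol avoids them. The class and method names are illustrative, not a real protocol implementation such as MESI:

```python
# Write-invalidate coherence sketch: a write in one cache invalidates
# copies of that line in all other caches (simplified, write-through).

class Cache:
    def __init__(self, name):
        self.name = name
        self.lines = {}              # address -> value held by this cache

    def read(self, addr, memory):
        if addr not in self.lines:   # miss: fetch from main memory
            self.lines[addr] = memory[addr]
        return self.lines[addr]

    def write(self, addr, value, memory, peers):
        self.lines[addr] = value     # update the local copy
        memory[addr] = value         # write-through to main memory
        for peer in peers:           # invalidate every other cached copy
            peer.lines.pop(addr, None)

memory = {0x10: 1}
c0, c1 = Cache("P0"), Cache("P1")
c0.read(0x10, memory)
c1.read(0x10, memory)                # both caches now hold address 0x10
c0.write(0x10, 42, memory, [c1])     # P0 writes; P1's copy is invalidated
print(c1.read(0x10, memory))         # P1 misses, re-fetches, and sees 42
```

Without the invalidation step, P1 would keep returning the stale value 1, which is exactly the coherence problem the protocols address.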
Processor Design: Modes of Parallelism
• Two ways to increase parallelism
  – Superscaling
    • Instruction-level parallelism
  – Threading
    • Thread-level parallelism
    • Concept of multithreaded processors
      » May or may not be different from OS-level multithreading
• Temporal multithreading (also called implicit)
  – Instructions from only one thread issue in a given cycle
• Simultaneous multithreading (explicit)
  – Instructions from more than one thread can be executed in the same cycle
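The difference between the two modes is which threads may occupy the issue slots of a single cycle. A toy model (the function names, issue width, and round-robin policy are hypothetical simplifications):

```python
# Which threads fill the issue slots of one cycle, under temporal vs.
# simultaneous multithreading (toy model, 2-wide issue).

def temporal_issue(threads, cycle, width=2):
    """Temporal (implicit) MT: all slots in a cycle come from ONE thread,
    chosen round-robin by cycle number."""
    t = threads[cycle % len(threads)]
    return [(t, slot) for slot in range(width)]

def smt_issue(threads, cycle, width=2):
    """Simultaneous (explicit) MT: slots in the SAME cycle may come from
    different threads."""
    return [(threads[(cycle + slot) % len(threads)], slot)
            for slot in range(width)]

print(temporal_issue(["T0", "T1"], cycle=0))  # both slots belong to T0
print(smt_issue(["T0", "T1"], cycle=0))       # slots split across T0 and T1
```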
Scalar Processor Approaches
• Single-threaded scalar
  – Simple pipeline
  – No multithreading
• Interleaved multithreaded scalar
  – Easiest multithreading to implement
  – Switch threads at each clock cycle
  – Pipeline stages kept close to fully occupied
  – Hardware needs to switch thread context between cycles
• Blocked multithreaded scalar
  – Thread executes until a latency event occurs that would stall the pipeline
  – Processor then switches to another thread
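The interleaved and blocked policies can be contrasted with a small scheduling simulation. The thread programs and the "MISS" latency marker are hypothetical:

```python
# Toy simulation of interleaved vs. blocked multithreading on a scalar
# pipeline. Threads are lists of instruction names; "MISS" marks a
# latency event (e.g. a cache miss).

def interleaved(threads):
    """Fine-grained: switch threads every clock cycle."""
    trace, pcs, cycle = [], [0] * len(threads), 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        tid = cycle % len(threads)          # round-robin each cycle
        if pcs[tid] < len(threads[tid]):
            trace.append((tid, threads[tid][pcs[tid]]))
            pcs[tid] += 1
        cycle += 1
    return trace

def blocked(threads):
    """Coarse-grained: run one thread until a latency event, then switch."""
    trace, pcs, tid = [], [0] * len(threads), 0
    while any(pc < len(t) for pc, t in zip(pcs, threads)):
        if pcs[tid] >= len(threads[tid]):   # this thread finished; move on
            tid = (tid + 1) % len(threads)
            continue
        instr = threads[tid][pcs[tid]]
        trace.append((tid, instr))
        pcs[tid] += 1
        if instr == "MISS":                 # latency event: switch threads
            tid = (tid + 1) % len(threads)
    return trace

progs = [["A1", "MISS", "A2"], ["B1", "B2"]]
print(interleaved(progs))   # alternates threads every cycle
print(blocked(progs))       # stays on thread 0 until the MISS
```

The interleaved trace alternates thread IDs each cycle, while the blocked trace stays on one thread until its latency event forces a switch.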
Clusters
• Alternative to SMP
• High performance
• High availability
• Server applications
• A group of interconnected whole computers
• Working together as a unified resource
• Illusion of being one machine
• Each computer is called a node
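The "illusion of being one machine" can be sketched as a front-end that hides the individual nodes behind a single submission interface. The class names and round-robin dispatcher below are hypothetical simplifications:

```python
# Sketch: a cluster presented as one unified resource. Each Node stands
# in for a whole computer; the Cluster front-end dispatches work to them.

class Node:
    def __init__(self, name):
        self.name = name

    def run(self, task):
        # Stand-in workload: each node "computes" the square of its input.
        return f"{self.name}:{task * task}"

class Cluster:
    """Single entry point hiding the member nodes from the user."""
    def __init__(self, nodes):
        self.nodes = nodes
        self.next = 0

    def submit(self, task):
        node = self.nodes[self.next % len(self.nodes)]  # round-robin dispatch
        self.next += 1
        return node.run(task)

cluster = Cluster([Node("node0"), Node("node1")])
print([cluster.submit(t) for t in [2, 3, 4]])
# ['node0:4', 'node1:9', 'node0:16']
```

The caller only ever talks to `cluster.submit`, which is the unified-resource view; the node names in the results show the work actually landing on different machines.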
Cluster Benefits
• Absolute scalability
• Incremental scalability
• High availability
• Superior price/performance
Cluster Configurations - Standby Server, No Shared Disk
Cluster vs. SMP
• Both provide multiprocessor support to high-demand applications
• Both are available commercially
  – SMP has been available for longer
• SMP:
  – Easier to manage and control
  – Closer to single-processor systems
    • Scheduling is the main difference
  – Less physical space
  – Lower power consumption
• Clustering:
  – Superior incremental and absolute scalability
  – Superior availability
    • Redundancy