Complete Lessonplan Aca 12 Unki
7/27/2019 Complete Lessonplan Aca 12 Unki
B.L.D.E.A's V.P. Dr. P.G. Halakatti College of Engineering & Technology, Bijapur.
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
Title of the Course: Advanced Computer Architecture
Course Code: 10CS81
Type of the Course: Lecture
Designation: Core
Total Hrs.: 52
Hrs/Week: 04
Exam Hours: 03
Exam Marks: 100
Semester: 08
Course Assessment Methods: Continuous (Three IA Tests & One Main VTU Examination)
Prerequisites:
1. Familiarity with computer organization
2. Basic concepts of cache memory and microprocessor
Syllabus:
PART - A
UNIT - 1
FUNDAMENTALS OF COMPUTER DESIGN: Introduction; Classes of computers;
Defining computer architecture; Trends in Technology, power in Integrated Circuits and cost;
Dependability; Measuring, reporting and summarizing Performance; Quantitative Principles of computer design.
6 hours
UNIT - 2
PIPELINING: Introduction; Pipeline hazards; Implementation of pipeline; What makes
pipelining hard to implement?
6 Hours
UNIT - 3
INSTRUCTION LEVEL PARALLELISM 1: ILP: Concepts and challenges; Basic
Compiler Techniques for exposing ILP; Reducing Branch costs with prediction; OvercomingData hazards with Dynamic scheduling; Hardware-based speculation.
7 Hours
UNIT - 4
INSTRUCTION LEVEL PARALLELISM 2: Exploiting ILP using multiple issue and static scheduling; Exploiting ILP using dynamic scheduling, multiple issue and speculation; Advanced
Techniques for instruction delivery and Speculation; The Intel Pentium 4 as example.
7 Hours
PART - B
UNIT - 5
MULTIPROCESSORS AND THREAD LEVEL PARALLELISM: Introduction; Symmetric shared-memory architectures; Performance of symmetric shared-memory
multiprocessors; Distributed shared memory and directory-based coherence; Basics of
synchronization; Models of Memory Consistency.
7 Hours
UNIT - 6
REVIEW OF MEMORY HIERARCHY: Introduction; Cache performance; Cache Optimizations; Virtual memory.
6 Hours
UNIT - 7
MEMORY HIERARCHY DESIGN: Introduction; Advanced optimizations of Cache performance; Memory technology and optimizations; Protection: Virtual memory and virtual
machines.
6 Hours
UNIT - 8
HARDWARE AND SOFTWARE FOR VLIW AND EPIC: Introduction: Exploiting Instruction-Level Parallelism Statically; Detecting and Enhancing Loop-Level Parallelism;
Scheduling and Structuring Code for Parallelism; Hardware Support for Exposing Parallelism:
Predicated Instructions; Hardware Support for Compiler Speculation; The Intel IA-64
Architecture and Itanium Processor; Conclusions.
7 Hours
TEXT BOOK:
1. John L. Hennessy and David A. Patterson: Computer Architecture: A Quantitative Approach, 4th Edition, Elsevier, 2007.
REFERENCE BOOKS:
1. Kai Hwang: Advanced Computer Architecture: Parallelism, Scalability, Programmability, Tata McGraw-Hill, 2003.
2. David E. Culler, Jaswinder Pal Singh, Anoop Gupta: Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann, 1999.
Course Overview and its relevance to program:
The term architecture in computer literature can be traced to the work of Lyle R. Johnson, Muhammad Usman Khan and Frederick P. Brooks, Jr., members in 1959 of the Machine Organization department in IBM's main research center. Johnson had the opportunity to write a proprietary research communication about Stretch, an IBM-developed supercomputer for Los Alamos Scientific Laboratory. In computer science and computer engineering, computer architecture or digital computer organization is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements and design implementations for the various parts of a computer, focusing largely on the way the central processing unit (CPU) performs internally and accesses addresses in memory. It may also be defined as the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.
Computer technology has made incredible progress in roughly the last 55 years. This rapid rate of improvement has come both from advances in the technology used to build computers and from innovation in computer design.
Advanced computer architecture aims to develop a thorough understanding of high-performance and energy-efficient computers as a basis for informed software performance engineering and as a foundation for advanced work in computer architecture, compiler design, operating systems and parallel processing.
This course covers pipelined CPU architecture: instruction set design and pipeline structure, dynamic scheduling using scoreboarding and Tomasulo's algorithm, register renaming, software instruction scheduling and software pipelining, superscalar and long-instruction-word architectures (VLIW, EPIC and Itanium), and branch prediction and speculative execution.
Cache memory associativity, allocation and replacement policies, multilevel caches and cache performance issues are covered, and uniprocessor cache coherency issues are discussed with examples. Implementations of shared memory, the cache coherency problem, the bus-based 'snooping' protocol, and scalable shared memory using directory-based cache coherency are explained with practical examples.
Applications:
1. To understand various computer architectures currently used in the market
2. To understand parallel programming
3. To design new computer architectures
PART - A
UNIT I
UNIT WISE PLAN
Chapter Number: 1 No of Hours: 06
Unit Title: FUNDAMENTALS OF COMPUTER DESIGN
Learning Objectives:
At the end of this unit, students will understand:
1. Classes of computers, Practical knowledge of computer architecture
2. Trends in Technology, Power in IC and cost
3. Quantitative Principles
4. Performance
5. Real processor examples
Lesson Plan:
L1. Introduction; Classes of computers
L2. Defining computer architecture
L3. Trends in Technology, power in Integrated Circuits and cost
L4. Dependability.
L5. Measuring, reporting and summarizing Performance
L6. Quantitative Principles of computer design
Assignment Questions:
Q1) Explain the growth in processor and computer performance using a graph.
Q2) Explain the different classes of computers.
Q3) Define computer architecture. Discuss the seven dimensions of an ISA.
Q4) Explain the meaning of the following MIPS instructions and explain the instruction formats.
Q5) List the most important functional requirements an architect faces.
Q6) Explain the different trends in technology.
Q7) Write the formulas for the following: (i) Power_dynamic (ii) Energy_dynamic (iii) Power_static. A 20% reduction in voltage may result in a 10% reduction in frequency. What would be the impact on dynamic power?
Q8) Write the formulas for the following:
(i) cost of IC (ii) cost of die (iii) dies per wafer (iv) die yield
Find the die yield for a die that is 2.0 cm on a side, assuming a defect density of 0.3 per cm² and α = 4.
Q9) Explain MTTF and MTTR. Calculate the reliability of a redundant power supply if the MTTF of a power supply is 5×10⁵ hours and it takes on average 48 hours for a human operator to repair the system. Assume two power supplies are available.
Q10) Explain the different desktop and server benchmarks.
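Q7 and Q8 above are plug-in computations. The sketch below works both out in Python; it assumes the standard quantitative-design relations (dynamic power ∝ C·V²·f, and a die-yield model of wafer yield × (1 + defect density × die area / α)^(−α) with α = 4), and the function names are illustrative, not from any library.

```python
def dynamic_power_ratio(voltage_scale, frequency_scale):
    """Dynamic power is proportional to C * V^2 * f, so scaling V and f
    scales power by voltage_scale^2 * frequency_scale."""
    return voltage_scale ** 2 * frequency_scale

def die_yield(wafer_yield, defect_density, die_area, alpha=4.0):
    """Die yield model: wafer_yield * (1 + defect_density * die_area / alpha)^(-alpha)."""
    return wafer_yield * (1 + defect_density * die_area / alpha) ** (-alpha)

# Q7: 20% lower voltage and 10% lower frequency
ratio = dynamic_power_ratio(0.8, 0.9)
print(f"dynamic power falls to {ratio:.1%} of the original")  # 57.6%

# Q8: 2.0 cm on a side -> 4 cm^2 die area, defect density 0.3 per cm^2, alpha = 4
y = die_yield(1.0, 0.3, 2.0 * 2.0)
print(f"die yield = {y:.3f}")  # 0.350
```

So the 20%/10% reductions cut dynamic power by roughly 42%, and about 35% of the dies on the wafer are good.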
UNIT II
UNIT WISE PLAN
Chapter Number: Appendix A No of Hours: 06
Unit Title: PIPELINING
Learning Objectives:
At the end of this unit, students will understand:
1. Pipeline basics, hazards
2. Implementation of pipeline
3. Pipeline to design parallel processors
4. Performance evaluation of pipelined processors
5. Applications of pipeline
Lesson Plan:
L1. Introduction
L2. Pipeline hazards
L3. Pipeline hazards continued
L4. Implementation of pipeline
L5. Implementation of pipeline continued
L6. What makes pipelining hard to implement?
Assignment Questions:
Q1) What is pipelining? Explain the basics of RISC instruction set.
Q2) Explain the simple implementation of a RISC instruction set.
Q3) Explain the classic five-stage pipeline for a RISC processor and explain the use of pipeline registers.
Q4) Assume that an unpipelined processor has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and branches and 5 cycles for memory operations. Assume that the relative frequencies of these operations are 30%, 20% and 50% respectively. Suppose that, due to clock skew and setup, pipelining the processor adds 0.3 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?
Q5) Explain the major hurdles of pipelining (pipeline hazards) in brief.
Q6) Explain in detail the data hazard with an example.
Q7) Discuss branch hazards along with reducing pipeline branch penalties and scheduling the branch delay slot.
Q8) Explain the simple implementation of MIPS with a neat diagram.
Q9) Explain the basic pipeline for MIPS and discuss the implementation of control for MIPS & branches.
Q10) Explain the five categories of exceptions.
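Q4 above can be worked through directly: the unpipelined average instruction time is the clock weighted by the operation mix, and the pipelined machine completes one instruction per (stretched) clock. A minimal sketch using the question's numbers:

```python
def avg_instruction_time_unpipelined(clock_ns, op_mix):
    """op_mix: list of (fraction, cycles) pairs. Average time = clock * sum(f * c)."""
    return clock_ns * sum(f * c for f, c in op_mix)

# Q4's numbers: ALU ops (30%) and branches (20%) take 4 cycles, memory ops (50%)
# take 5 cycles; pipelining stretches the 1 ns clock by 0.3 ns of overhead.
unpiped = avg_instruction_time_unpipelined(1.0, [(0.30, 4), (0.20, 4), (0.50, 5)])
piped = 1.0 + 0.3  # one instruction per clock, ignoring latency impact
print(f"unpipelined avg time = {unpiped} ns")   # 4.5 ns
print(f"speedup = {unpiped / piped:.2f}")       # 3.46
```

The pipeline overhead is why the speedup is about 3.46 rather than the ideal 4.5.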
UNIT III
UNIT WISE PLAN
Chapter Number: 2 No of Hours: 07
Unit Title: INSTRUCTION LEVEL PARALLELISM 1
Learning Objectives:
At the end of this unit, students will understand:
1. Parallel processing using ILP
2. Static and dynamic scheduling
3. Speculation
4. Implementation of scheduling algorithms.
5. Implementation of reducing branch costs
Lesson Plan:
L1. ILP: Concepts and challenges
L2. Basic Compiler Techniques for exposing ILP
L3. Reducing Branch costs with prediction
L4. Reducing Branch costs with prediction -Examples.
L5. Overcoming Data hazards with Dynamic scheduling
L6. Overcoming Data hazards with Dynamic scheduling- Examples
L7. Hardware-based speculation
Assignment Questions:
Q1) What is ILP? What are the ILP concepts and challenges?
Q2) Discuss data dependences and hazards.
Q3) Discuss control dependences with examples.
Q4) Explain the basic compiler techniques for exposing ILP with examples.
Q5) Explain the methods for reducing branch costs with prediction.
Q6) Explain the method for overcoming data hazards with dynamic scheduling.
Q7) Explain the various fields in a reservation station with an example.
Q8) Explain Tomasulo's algorithm using a loop-based example.
Q9) Explain hardware-based speculation and explain the basic structure of an FP unit using Tomasulo's algorithm, extended to handle speculation.
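Q5 concerns reducing branch costs with prediction. A minimal sketch of the classic 2-bit saturating-counter predictor, simulated in Python (the class and the example branch address are illustrative only):

```python
class TwoBitPredictor:
    """One 2-bit saturating counter per branch: states 0-1 predict not-taken,
    states 2-3 predict taken; mispredictions move the counter one step."""
    def __init__(self):
        self.counters = {}  # branch address -> counter value in 0..3

    def predict(self, pc):
        return self.counters.get(pc, 0) >= 2  # True means "predict taken"

    def update(self, pc, taken):
        c = self.counters.get(pc, 0)
        self.counters[pc] = min(3, c + 1) if taken else max(0, c - 1)

# A loop branch taken 9 times then falling through once, repeated twice:
pred = TwoBitPredictor()
outcomes = ([True] * 9 + [False]) * 2
hits = 0
for taken in outcomes:
    if pred.predict(0x400) == taken:
        hits += 1
    pred.update(0x400, taken)
print(f"{hits}/{len(outcomes)} predictions correct")  # 16/20
```

The 2-bit hysteresis is what keeps the single loop-exit misprediction from costing two wrong predictions on the next loop entry, unlike a 1-bit scheme.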
UNIT IV
UNIT WISE PLAN
Chapter Number: 2 No of Hours: 07
Unit Title: INSTRUCTION LEVEL PARALLELISM 2
Learning Objectives:
At the end of this unit, students will understand:
1. ILP: multiple issue and static scheduling
2. Dynamic scheduling
3. Instruction delivery
4. Exploiting ILP
5. Intel Pentium 4 for understanding ILP
Lesson Plan:
L1. Exploiting ILP using multiple issue and static scheduling
L2. Exploiting ILP using dynamic scheduling, multiple issue and speculation
L3. Exploiting ILP using dynamic scheduling, multiple issue and speculation-examples
L4. Advanced Techniques for instruction delivery and Speculation
L5. Advanced Techniques for instruction delivery and Speculation-examples
L6. The Intel Pentium 4 as example.
L7. The Intel Pentium 4 as example-analysis
Assignment Questions:
Q1) List the five primary approaches in use for multiple-issue processors and their primary characteristics.
Q2) Explain the basic VLIW approach for exploiting ILP using an example.
Q3) Explain exploiting ILP using dynamic scheduling, multiple issue and speculation with examples.
Q4) Explain increasing instruction fetch bandwidth for instruction delivery and speculation.
Q5) Explain the Pentium 4 microarchitecture with a neat diagram.
Q6) List the important characteristics of the recent Pentium 4 640.
Q7) Explain the analysis of the performance of the Pentium 4.
PART - B
UNIT V
UNIT WISE PLAN
Chapter Number: 4 No of Hours: 07
Unit Title: MULTIPROCESSORS AND THREAD LEVEL PARALLELISM
Learning Objectives:
At the end of this unit, students will understand:
1. Multiprocessors
2. Shared-memory architectures
3. Distributed shared memory
4. Performance of symmetric shared-memory multiprocessors
5. Synchronization and memory consistency
Lesson Plan:
L1. Introduction to multiprocessors
L2. Symmetric shared-memory architectures
L3. Performance of symmetric shared-memory multiprocessors
L4. Distributed shared memory
L5. Directory-based coherence
L6. Basics of synchronization
L7. Models of Memory Consistency
Assignment Questions:
Q1) Explain the taxonomy of parallel architectures and draw the basic structure of shared-memory and distributed-memory multiprocessors.
Q2) Suppose you want to achieve a speedup of 80 with 100 processors. What fraction of the original computation can be sequential?
Q3) What is multiprocessor cache coherence? Explain with an example.
Q4) What are the basic schemes for enforcing coherence? Explain in brief.
Q5) Explain snooping protocols and basic implementation techniques with an example protocol.
Q6) Explain the performance of symmetric shared-memory multiprocessors for a commercial workload.
Q7) Explain distributed shared memory and directory-based coherence with an example protocol.
Q8) Explain the basics of synchronization.
Q9) Explain models of memory consistency.
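Q2 above is a direct application of Amdahl's law: speedup = 1 / ((1 − f) + f/n), where f is the parallel fraction. Solving for the sequential fraction s = 1 − f with the question's numbers (a small sketch; the function name is illustrative):

```python
def max_speedup(parallel_fraction, n_processors):
    """Amdahl's law: speedup = 1 / ((1 - f) + f / n)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_processors)

# Q2: find the sequential fraction s that still permits speedup 80 on 100 processors.
# From 1/80 = s + (1 - s)/100, rearranging gives s = (1/80 - 1/100) / (1 - 1/100).
s = (1 / 80 - 1 / 100) / (1 - 1 / 100)
print(f"sequential fraction = {s:.4%}")              # 0.2525%
print(f"check: speedup = {max_speedup(1 - s, 100):.1f}")  # 80.0
```

Only about a quarter of one percent of the computation may be sequential, which is why near-linear speedups on large processor counts are so hard to achieve.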
UNIT VI
UNIT WISE PLAN
Chapter Number: Appendix C No of Hours: 06
Unit Title: REVIEW OF MEMORY HIERARCHY
Learning Objectives:
At the end of this unit, students will understand:
1. Cache memory
2. Virtual memory
3. Mathematical and theory aspects of cache
4. Problems based on cache
5. Cache Optimization methods
Lesson Plan:
L1. Introduction
L2. Cache performance
L3. Cache Optimizations
L4. Virtual memory
L5. Numerical Problems-1
L6. Numerical Problems-2
Assignment Questions:
Q1) Assume we have a computer where the CPI is 2.0 when all memory accesses hit in the cache. The only data accesses are loads and stores, and these total 40% of the instructions. If the miss penalty is 35 clock cycles and the miss rate is 3%, how much faster would the computer be if all instructions were cache hits?
Q2) What do you mean by memory stall cycles? List the different formulas for memory stall cycles.
Q3) Explain different block placement methods with neat diagrams.
Q4) Explain the following terms: (i) write through (ii) write back (iii) write stall and write buffer (iv) write allocate (v) no-write allocate
Q5) Explain the organization of the Opteron data cache with a neat diagram.
Q6) Explain multilevel caches to reduce miss penalty. Discuss average memory access time, local miss rate and global miss rate w.r.t. multilevel caches.
Q7) Suppose that in 1000 memory references there are 50 misses in the first-level cache and 30 misses in the second-level cache. What are the various miss rates? Assume the miss penalty from the L2 cache to memory is 250 clock cycles, the hit time of the L2 cache is 15 clock cycles, the hit time of L1 is 1 clock cycle, and there are 1.4 memory references per instruction. What is the average memory access time and average stall cycles per instruction?
Q8) Compare paging and segmentation with neat diagrams.
Q9) List the typical levels in the memory hierarchy with their important features.
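Q7 above can be worked through with the standard local/global miss-rate definitions: the local L2 miss rate is measured against L1 misses, the global L2 miss rate against all references. A sketch with the question's numbers (variable names are illustrative):

```python
refs = 1000
l1_misses, l2_misses = 50, 30
refs_per_instr = 1.4
l1_hit, l2_hit, mem_penalty = 1, 15, 250  # all in clock cycles

l1_miss_rate = l1_misses / refs             # 5% (also the global L1 miss rate)
l2_local_miss_rate = l2_misses / l1_misses  # 60% of L1 misses also miss in L2
l2_global_miss_rate = l2_misses / refs      # 3% of all references go to memory

# AMAT = L1 hit time + L1 miss rate * (L2 hit time + L2 local miss rate * penalty)
amat = l1_hit + l1_miss_rate * (l2_hit + l2_local_miss_rate * mem_penalty)

# Stall cycles per instruction: every L1 miss pays the L2 hit time, every
# global L2 miss pays the memory penalty, scaled by references per instruction.
stalls_per_instr = refs_per_instr * (l1_miss_rate * l2_hit
                                     + l2_global_miss_rate * mem_penalty)
print(f"AMAT = {amat:.2f} cycles")                    # 9.25
print(f"stalls per instruction = {stalls_per_instr:.2f}")  # 11.55
```

Note how the 60% local L2 miss rate, not the 3% global rate, is what enters the AMAT formula: AMAT conditions on having already missed in L1.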
UNIT VII
UNIT WISE PLAN
Chapter Number: No of Hours: 06
Unit Title: MEMORY HIERARCHY DESIGN
Learning Objectives:
At the end of this unit, students will understand:
1. Memory hierarchy and cache optimization
2. Memory technology
3. Virtual machines
4. Cache performance
5. Protection using virtual memory and virtual machines
Lesson Plan:
L1. Introduction to memory hierarchy design
L2. Advanced optimizations of Cache performance
L3. Memory technology and optimizations
L4. Protection: Virtual memory
L5. Virtual machines
L6. Numerical problems
Assignment Questions:
Q1) Explain the following optimization techniques which reduce hit time:
(i) small and simple caches (ii) way prediction (iii) trace caches
Q2) Explain the compiler optimization techniques to reduce miss rate.
Q3) Differentiate between SRAM and DRAM. Draw the internal organization of a 64-Mbit DRAM.
Q4) List the eleven advanced optimizations of cache performance and explain any one.
Q5) Explain optimization techniques for increasing cache bandwidth.
Q6) Explain memory technology and optimizations.
Q7) Explain optimization techniques for reducing miss penalty.
Q8) Explain protection via virtual memory.
Q9) Explain protection via virtual machines.
Q10) Explain the Xen virtual machine.
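For Q2, one of the standard compiler optimizations to reduce miss rate is loop interchange: reorder nested loops so the array is traversed in the order it is laid out in memory, so consecutive accesses fall in the same cache line. The sketch below only illustrates the access-pattern transformation (Python lists hide the real memory layout, so no actual miss-rate change is measured here):

```python
# A small row-major 2-D array.
N = 4
a = [[r * N + c for c in range(N)] for r in range(N)]

# Before interchange: column-major walk over a row-major array.
# Successive accesses are N elements apart, so spatial locality is poor.
before = []
for c in range(N):
    for r in range(N):
        before.append(a[r][c])

# After interchange: row-major walk. Each cache line is fully consumed
# before moving on, which is exactly what the compiler transformation buys.
after = []
for r in range(N):
    for c in range(N):
        after.append(a[r][c])

print(after == sorted(before))  # same elements visited, cache-friendly order
```

The transformation is legal because both loop orders visit the same set of elements; only the visiting order, and hence the cache behaviour, changes.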
UNIT VIII
UNIT WISE PLAN
Chapter Number: Appendix G No of Hours: 07
Unit Title: HARDWARE AND SOFTWARE FOR VLIW AND EPIC
Learning Objectives:
At the end of this unit, students will understand:
1. VLIW
2. EPIC
3. Intel IA-64 Architecture, Itanium Processor
4. Loop-Level Parallelism, Code for Parallelism
5. Hardware Support for Parallelism
Lesson Plan:
L1. Introduction: Exploiting Instruction-Level Parallelism Statically
L2. Detecting and Enhancing Loop-Level Parallelism
L3. Scheduling and Structuring Code for Parallelism
L4. Hardware Support for Exposing Parallelism: Predicated Instructions
L5. Hardware Support for Compiler Speculation
L6. The Intel IA-64 Architecture
L7. Itanium Processor; Conclusions.
Assignment Questions:
Q1) Explain the methods, advantages and disadvantages of exploiting instruction-level parallelism statically.
Q2) Explain the methods for detecting and enhancing loop-level parallelism.
Q3) Explain software pipelining using symbolic loop unrolling.
Q4) Explain global code scheduling.
Q5) Explain hardware support for exposing parallelism using predicated instructions in detail.
Q6) Explain hardware support for compiler speculation.
Q7) Explain superblocks using a flowchart.
Q8) Explain the IA-64 instruction set architecture.
Q9) Explain the Itanium 2 processor in detail.