Complete Lessonplan Aca 12 Unki

  • 7/27/2019 Complete Lessonplan Aca 12 Unki

    B.L.D.E.A's V.P. Dr. P.G.H. College of Engineering & Technology, Bijapur.

    Department of Computer Science & Engineering

    Title of the Course: Advanced Computer Architecture    Course Code: 10CS81

    Type of the Course: Lecture    Designation: Core

    Total Hrs: 52    Hrs/Week: 04

    Exam Hours: 03 Exam Marks: 100

    Semester: 08

    Course Assessment Methods: Continuous (Three IA Tests & One Main VTU Examination)

    Prerequisites:

    1. Familiarity with computer organization

    2. Basic concepts of cache memory and microprocessor

    Syllabus:

    PART - A

    UNIT - 1

    FUNDAMENTALS OF COMPUTER DESIGN: Introduction; Classes of computers;

    Defining computer architecture; Trends in Technology, power in Integrated Circuits and cost;

    Dependability; Measuring, reporting and summarizing Performance; Quantitative Principles of computer design.

    6 Hours

    UNIT - 2

    PIPELINING: Introduction; Pipeline hazards; Implementation of pipeline; What makes

    pipelining hard to implement?

    6 Hours

    UNIT - 3

    INSTRUCTION LEVEL PARALLELISM 1: ILP: Concepts and challenges; Basic

    Compiler Techniques for exposing ILP; Reducing Branch costs with prediction; Overcoming Data hazards with Dynamic scheduling; Hardware-based speculation.

    7 Hours

    UNIT - 4

    INSTRUCTION LEVEL PARALLELISM 2: Exploiting ILP using multiple issue and static scheduling; Exploiting ILP using dynamic scheduling, multiple issue and speculation; Advanced

    Techniques for instruction delivery and Speculation; The Intel Pentium 4 as example.

    7 Hours

    PART - B

    UNIT - 5

    MULTIPROCESSORS AND THREAD LEVEL PARALLELISM: Introduction; Symmetric shared-memory architectures; Performance of symmetric shared-memory multiprocessors; Distributed shared memory and directory-based coherence; Basics of synchronization; Models of Memory Consistency.

    7 Hours

    UNIT - 6

    REVIEW OF MEMORY HIERARCHY: Introduction; Cache performance; Cache Optimizations; Virtual memory.

    6 Hours

    UNIT - 7

    MEMORY HIERARCHY DESIGN: Introduction; Advanced optimizations of Cache performance; Memory technology and optimizations; Protection: Virtual memory and virtual

    machines.

    6 Hours

    UNIT - 8

    HARDWARE AND SOFTWARE FOR VLIW AND EPIC: Introduction: Exploiting Instruction-Level Parallelism Statically; Detecting and Enhancing Loop-Level Parallelism;

    Scheduling and Structuring Code for Parallelism; Hardware Support for Exposing Parallelism:

    Predicated Instructions; Hardware Support for Compiler Speculation; The Intel IA-64

    Architecture and Itanium Processor; Conclusions.

    7 Hours

    TEXT BOOK:

    1. John L. Hennessy and David A. Patterson: Computer Architecture: A Quantitative Approach, 4th Edition, Elsevier, 2007.

    REFERENCE BOOKS:

    1. Kai Hwang: Advanced Computer Architecture: Parallelism, Scalability, Programmability, Tata McGraw-Hill, 2003.

    2. David E. Culler, Jaswinder Pal Singh, Anoop Gupta: Parallel Computer Architecture: A Hardware/Software Approach, Morgan Kaufmann, 1999.

    Course Overview and its relevance to program:

    The term architecture in computer literature can be traced to the work of Lyle R. Johnson and Frederick P. Brooks, Jr., members in 1959 of the Machine Organization department in IBM's main research center. Johnson had the opportunity to write a proprietary research communication about Stretch, an IBM-developed supercomputer for Los Alamos Scientific Laboratory. In computer science and computer engineering, computer architecture or digital computer organization is the conceptual design and fundamental operational structure of a computer system. It is a blueprint and functional description of requirements and design implementations for the various parts of a computer, focusing largely on the way the central processing unit (CPU) performs internally and accesses addresses in memory. It may also be defined as the science and art of selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.

    Computer technology has made incredible progress in roughly the last 55 years. This rapid rate of improvement has come both from advances in the technology used to build computers and from innovation in computer design.

    Advanced computer architecture aims to develop a thorough understanding of high-performance and energy-efficient computers as a basis for informed software performance engineering and as a foundation for advanced work in computer architecture, compiler design, operating systems and parallel processing.

    This course covers pipelined CPU architecture (instruction set design and pipeline structure), dynamic scheduling using scoreboarding and Tomasulo's algorithm, register renaming, software instruction scheduling and software pipelining, superscalar and long-instruction-word architectures (VLIW, EPIC and Itanium), branch prediction and speculative execution.

    Cache memory topics include associativity, allocation and replacement policies, multilevel caches and cache performance issues; uniprocessor cache coherency issues are discussed with examples. Implementations of shared memory, the cache coherency problem, the bus-based 'snooping' protocol and scalable shared memory using directory-based cache coherency are explained with practical examples.

    Applications:

    1. To understand various computer architectures currently used in the market

    2. To understand parallel programming.

    3. To design new computer architectures

    PART - A

    UNIT I

    UNIT WISE PLAN

    Chapter Number: 1 No of Hours: 06

    Unit Title: FUNDAMENTALS OF COMPUTER DESIGN

    Learning Objectives:

    At the end of this unit students will understand:

    1. Classes of computers, Practical knowledge of computer architecture

    2. Trends in Technology, Power in IC and cost

    3. Quantitative Principles

    4. Performance

    5. Real processor examples

    Lesson Plan:

    L1. Introduction; Classes of computers

    L2. Defining computer architecture

    L3. Trends in Technology, power in Integrated Circuits and cost

    L4. Dependability.

    L5. Measuring, reporting and summarizing Performance

    L6. Quantitative Principles of computer design

    Assignment Questions:

    Q1) Explain the growth in processor and computer performance using a graph.

    Q2) Explain the different classes of computers.

    Q3) Define computer architecture. Discuss the 7 dimensions of ISA.

    Q4) Explain the meaning of the following MIPS instructions and explain the instruction formats.

    Q5) List the most important functional requirements an architect faces.

    Q6) Explain the different trends in technology.

    Q7) Write the formulas for the following: (i) Power_dynamic (ii) Energy_dynamic (iii) Power_static. A 20% reduction in voltage may result in a 10% reduction in frequency. What would be the impact on dynamic power?

    Q8) Write the formulas for the following: (i) cost of IC (ii) cost of die (iii) dies per wafer (iv) die yield. Find the die yield for a die that is 2.0 cm on a side, assuming a defect density of 0.3 per cm² and α is 4.

    Q9) Explain MTTF and MTTR. Calculate the reliability of a redundant power supply if the MTTF of a power supply is 5 × 10^5 hours and it takes on average 48 hours for a human operator to repair the system. Assume two power supplies are available.

    Q10) Explain the different desktop and server benchmarks.
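
    The numerical parts of Q7 to Q9 can be checked with a short script. The sketch below is only an illustration, not a model answer. It assumes the usual textbook models: dynamic power proportional to capacitive load × voltage² × switching frequency, the die-yield formula with wafer yield taken as 100%, and MTTF² / (2 × MTTR) for a redundant pair with independent failures. All function names are hypothetical.

```python
# Sketch for the numerical parts of Q7-Q9. Function names are illustrative
# assumptions; the formulas are the usual textbook models.


def dynamic_power_ratio(voltage_scale: float, frequency_scale: float) -> float:
    """New/old dynamic power, from P_dynamic = 1/2 * C * V^2 * f with C unchanged."""
    return voltage_scale ** 2 * frequency_scale


def die_yield(defects_per_cm2: float, die_area_cm2: float,
              alpha: float = 4.0, wafer_yield: float = 1.0) -> float:
    """Die yield = wafer yield * (1 + defect density * die area / alpha)^(-alpha)."""
    return wafer_yield * (1.0 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def mttf_redundant_pair(mttf_hours: float, mttr_hours: float) -> float:
    """MTTF of two redundant units, assuming independent failures:
    (MTTF / 2) / (MTTR / MTTF) = MTTF^2 / (2 * MTTR)."""
    return mttf_hours ** 2 / (2.0 * mttr_hours)


# Q7: voltage falls by 20%, frequency by 10%.
print(f"dynamic power ratio: {dynamic_power_ratio(0.8, 0.9):.3f}")
# Q8: die 2.0 cm on a side (4.0 cm^2), 0.3 defects per cm^2, alpha = 4.
print(f"die yield: {die_yield(0.3, 2.0 * 2.0):.3f}")
# Q9: MTTF = 5e5 hours, MTTR = 48 hours, two power supplies.
print(f"MTTF of the pair: {mttf_redundant_pair(5e5, 48):.3e} hours")
```

    Under these assumptions the dynamic power falls to about 0.58 of its original value, the die yield is about 0.35, and the redundant pair has an MTTF of roughly 2.6 × 10^9 hours.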

    UNIT II

    UNIT WISE PLAN

    Chapter Number: Appendix A No of Hours: 06

    Unit Title: PIPELINING

    Learning Objectives:

    At the end of this unit students will understand:

    1. Pipeline basics, hazards

    2. Implementation of pipeline

    3. Pipeline to design parallel processors

    4. Performance evaluation of pipeline processors

    5. Applications of pipeline

    Lesson Plan:

    L1. Introduction

    L2. Pipeline hazards

    L3. Pipeline hazards continued

    L4. Implementation of pipeline

    L5. Implementation of pipeline continued

    L6. What makes pipelining hard to implement?

    Assignment Questions:

    Q1) What is pipelining? Explain the basics of a RISC instruction set.

    Q2) Explain the simple implementation of a RISC instruction set.

    Q3) Explain the classic five-stage pipeline for a RISC processor and explain the use of pipeline registers.

    Q4) Assume that an unpipelined processor has a 1 ns clock cycle and that it uses 4 cycles for ALU operations and branches and 5 cycles for memory operations. Assume that the relative frequencies of these operations are 30%, 20% and 50% respectively. Suppose that due to clock skew and setup, pipelining the processor adds 0.3 ns of overhead to the clock. Ignoring any latency impact, how much speedup in the instruction execution rate will we gain from a pipeline?

    Q5) Explain in brief the major hurdles of pipelining: pipeline hazards.

    Q6) Explain in detail the data hazard with an example.

    Q7) Discuss branch hazards along with reducing pipeline branch penalties and scheduling the branch delay slot.

    Q8) Explain the simple implementation of MIPS with a neat diagram.

    Q9) Explain the basic pipeline for MIPS and discuss the implementation of control for MIPS and branches.

    Q10) Explain the five categories of exceptions.
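
    The arithmetic behind Q4 can be sketched as follows. This is a minimal illustration under the question's own assumptions: the pipelined processor completes one instruction per (lengthened) clock cycle, and ALU operations, branches and memory operations map to the 30%, 20% and 50% frequencies in that order. The function name and argument layout are hypothetical.

```python
# Sketch for Q4: speedup of a pipelined over an unpipelined processor.
# pipeline_speedup and its arguments are illustrative names, not from the text.


def pipeline_speedup(clock_ns, instruction_mix, overhead_ns):
    """instruction_mix is a list of (relative_frequency, cycles) pairs.

    Unpipelined: average instruction time = clock * weighted average cycles.
    Pipelined (ideal CPI of 1): one instruction per clock + overhead.
    """
    average_unpipelined_ns = clock_ns * sum(f * c for f, c in instruction_mix)
    pipelined_ns = clock_ns + overhead_ns
    return average_unpipelined_ns / pipelined_ns


mix = [
    (0.30, 4),  # ALU operations
    (0.20, 4),  # branches
    (0.50, 5),  # memory operations
]
print(f"speedup: {pipeline_speedup(1.0, mix, 0.3):.2f}")
```

    With these numbers the average unpipelined instruction takes 4.5 ns, the pipelined clock is 1.3 ns, and the speedup is about 3.46.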

    UNIT III

    UNIT WISE PLAN

    Chapter Number: 2 No of Hours: 07

    Unit Title: INSTRUCTION LEVEL PARALLELISM 1

    Learning Objectives:

    At the end of this unit students will understand:

    1. Parallel processing using ILP

    2. Static and Dynamic scheduling

    3. Speculation

    4. Implementation of scheduling algorithms.

    5. Implementation of reducing branch costs

    Lesson Plan:

    L1. ILP: Concepts and challenges

    L2. Basic Compiler Techniques for exposing ILP

    L3. Reducing Branch costs with prediction

    L4. Reducing Branch costs with prediction - Examples

    L5. Overcoming Data hazards with Dynamic scheduling

    L6. Overcoming Data hazards with Dynamic scheduling - Examples

    L7. Hardware-based speculation

    Assignment Questions:

    Q1) What is ILP? What are the ILP concepts and challenges?

    Q2) Discuss data dependences and hazards.

    Q3) Discuss control dependences with examples.

    Q4) Explain the basic compiler techniques for exposing ILP with examples.

    Q5) Explain the methods for reducing branch costs with prediction.

    Q6) Explain the method for overcoming data hazards with dynamic scheduling.

    Q7) Explain the various fields in a reservation station with an example.

    Q8) Explain Tomasulo's algorithm using a loop-based example.

    Q9) Explain hardware-based speculation and explain the basic structure of an FP unit using Tomasulo's algorithm, extended to handle speculation.
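
    As one concrete instance of the prediction schemes asked about in Q5, the sketch below simulates a single 2-bit saturating-counter predictor on a loop branch. It is a simplified illustration: it models one counter for one branch, with no prediction table or aliasing, and the function name and state encoding (0 and 1 predict not taken, 2 and 3 predict taken) are assumptions made for the example.

```python
# Sketch for Q5: one 2-bit saturating counter predicting a single branch.
# States 0-1 predict "not taken", states 2-3 predict "taken" (assumed encoding).


def count_correct_predictions(outcomes, initial_state=0):
    """Return how many branch outcomes (True = taken) are predicted correctly."""
    state = initial_state
    correct = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken == taken:
            correct += 1
        # Move toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return correct


# A loop branch taken 9 times and then not taken, executed twice.
loop_branch = ([True] * 9 + [False]) * 2
print(count_correct_predictions(loop_branch, initial_state=3))  # prints 18
print(count_correct_predictions(loop_branch, initial_state=0))  # prints 16
```

    Starting from the strongly-taken state, the only mispredictions are the two loop exits, which is the behaviour that motivates a 2-bit rather than a 1-bit scheme.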

    UNIT IV

    UNIT WISE PLAN

    Chapter Number: 2 No of Hours: 07

    Unit Title: INSTRUCTION LEVEL PARALLELISM 2

    Learning Objectives:

    At the end of this unit students will understand:

    1. ILP: multiple issue and static scheduling

    2. Dynamic scheduling

    3. Instruction delivery

    4. Exploiting ILP

    5. Intel Pentium 4 for understanding ILP

    Lesson Plan:

    L1. Exploiting ILP using multiple issue and static scheduling

    L2. Exploiting ILP using dynamic scheduling, multiple issue and speculation

    L3. Exploiting ILP using dynamic scheduling, multiple issue and speculation - Examples

    L4. Advanced Techniques for instruction delivery and Speculation

    L5. Advanced Techniques for instruction delivery and Speculation - Examples

    L6. The Intel Pentium 4 as example

    L7. The Intel Pentium 4 as example - Analysis

    Assignment Questions:

    Q1) List the five primary approaches in use for multiple-issue processors and their primary characteristics.

    Q2) Explain the basic VLIW approach for exploiting ILP using an example.

    Q3) Explain exploiting ILP using dynamic scheduling, multiple issue and speculation with examples.

    Q4) Explain increasing instruction fetch bandwidth for instruction delivery and speculation.

    Q5) Explain increasing instruction fetch bandwidth for instruction delivery and speculation.

    Q6) Explain the Pentium 4 microarchitecture with a neat diagram.

    Q7) List the important characteristics of the recent Pentium 4 640.

    Q8) Explain the analysis of the performance of the Pentium 4.

    PART - B

    UNIT V

    UNIT WISE PLAN

    Chapter Number: 4 No of Hours: 07

    Unit Title: MULTIPROCESSORS AND THREAD LEVEL PARALLELISM

    Learning Objectives:

    At the end of this unit students will understand:

    1. Multiprocessors

    2. Shared-memory architectures

    3. Distributed shared memory

    4. Performance of symmetric shared-memory multiprocessors

    5. Synchronization and Memory Consistency

    Lesson Plan:

    L1. Introduction to multiprocessors

    L2. Symmetric shared-memory architectures

    L3. Performance of symmetric shared-memory multiprocessors

    L4. Distributed shared memory

    L5. Directory-based coherence

    L6. Basics of synchronization

    L7. Models of Memory Consistency

    Assignment Questions:

    Q1) Explain the taxonomy of parallel architectures and draw the basic structure of shared-memory and distributed-memory multiprocessors.

    Q2) Suppose you want to achieve a speedup of 80 with 100 processors. What fraction of the original computation can be sequential?

    Q3) What is multiprocessor cache coherence? Explain with an example.

    Q4) What are the basic schemes for enforcing coherence? Explain in brief.

    Q5) Explain snooping protocols and basic implementation techniques with an example protocol.

    Q6) Explain the performance of symmetric shared-memory multiprocessors for a commercial workload.

    Q7) Explain distributed shared memory and directory-based coherence with an example protocol.

    Q8) Explain the basics of synchronization.

    Q9) Explain the models of memory consistency.
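
    Q2 is an application of Amdahl's law. The sketch below is a minimal illustration that assumes the program runs either fully sequentially or perfectly in parallel on all processors, with no communication overhead. The function names are hypothetical.

```python
# Sketch for Q2: Amdahl's law with a sequential fraction s and N processors.
# Assumes the parallel part scales perfectly; names are illustrative.


def amdahl_speedup(sequential_fraction, processors):
    """Speedup = 1 / (s + (1 - s) / N)."""
    return 1.0 / (sequential_fraction + (1.0 - sequential_fraction) / processors)


def max_sequential_fraction(target_speedup, processors):
    """Solve 1/S = s + (1 - s)/N for s, giving s = (N/S - 1) / (N - 1)."""
    return (processors / target_speedup - 1.0) / (processors - 1.0)


s = max_sequential_fraction(80, 100)
print(f"sequential fraction: {s:.6f}")
print(f"check speedup: {amdahl_speedup(s, 100):.1f}")
```

    For a speedup of 80 on 100 processors this gives a sequential fraction of about 0.0025, that is, roughly 0.25% of the original computation.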

    UNIT VI

    UNIT WISE PLAN

    Chapter Number: Appendix C No of Hours: 06

    Unit Title: REVIEW OF MEMORY HIERARCHY

    Learning Objectives:

    At the end of this unit students will understand:

    1. Cache memory

    2. Virtual memory

    3. Mathematical and theory aspects of cache

    4. Problems based on cache

    5. Cache Optimization methods

    Lesson Plan:

    L1. Introduction

    L2. Cache performance

    L3. Cache Optimizations

    L4. Virtual memory

    L5. Numerical Problems-1

    L6. Numerical Problems-2

    Assignment Questions:

    Q1) Assume we have a computer where the CPI is 2.0 when all memory accesses hit in the cache. The only data accesses are loads and stores, and these total 40% of the instructions. If the miss penalty is 35 clock cycles and the miss rate is 3%, how much faster would the computer be if all instructions were cache hits?

    Q2) What do you mean by memory stall cycles? List the different formulae for memory stall cycles.

    Q3) Explain different block placement methods with neat diagrams.

    Q4) Explain the following terms: (i) Write through (ii) Write back (iii) Write stall and write buffer (iv) Write allocate (v) No-write allocate

    Q5) Explain the organization of the Opteron data cache with a neat diagram.

    Q6) Explain multilevel caches to reduce miss penalty. Discuss average memory access time, local miss rate and global miss rate w.r.t. multilevel caches.

    Q7) Suppose that in 1000 memory references there are 50 misses in the first-level cache and 30 misses in the second-level cache. What are the various miss rates? Assume the miss penalty from the L2 cache to memory is 250 clock cycles, the hit time of the L2 cache is 15 clock cycles, the hit time of L1 is 1 clock cycle, and there are 1.4 memory references per instruction. What is the average memory access time and the average stall cycles per instruction?

    Q8) Compare paging and segmentation with neat diagrams.

    Q9) List the typical levels in the memory hierarchy with their important features.
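
    The two numerical questions above (Q1 and Q7) can be worked through with the sketch below. It is an illustration under stated assumptions, not a model answer: for Q1 it assumes every instruction makes one instruction fetch plus its data accesses and that the 3% miss rate applies to both; for Q7 it uses the standard two-level formula, average memory access time = L1 hit time + L1 miss rate × (L2 hit time + L2 local miss rate × L2 miss penalty). Function names are hypothetical.

```python
# Sketch for Q1 and Q7 of this unit. Names are illustrative assumptions.


def cpi_with_misses(base_cpi, data_refs_per_instr, miss_rate, miss_penalty):
    """CPI including memory stalls, assuming one instruction fetch per
    instruction and the same miss rate for instruction and data accesses."""
    refs_per_instr = 1.0 + data_refs_per_instr
    return base_cpi + refs_per_instr * miss_rate * miss_penalty


def two_level_cache(refs, l1_misses, l2_misses,
                    l1_hit, l2_hit, l2_penalty, refs_per_instr):
    """Miss rates, average memory access time and stall cycles per instruction."""
    l1_miss_rate = l1_misses / refs
    l2_local_miss_rate = l2_misses / l1_misses
    l2_global_miss_rate = l2_misses / refs
    amat = l1_hit + l1_miss_rate * (l2_hit + l2_local_miss_rate * l2_penalty)
    stalls_per_instr = (amat - l1_hit) * refs_per_instr
    return {
        "l1_miss_rate": l1_miss_rate,
        "l2_local_miss_rate": l2_local_miss_rate,
        "l2_global_miss_rate": l2_global_miss_rate,
        "amat": amat,
        "stalls_per_instr": stalls_per_instr,
    }


# Q1: base CPI 2.0, 40% loads/stores, 3% miss rate, 35-cycle penalty.
real_cpi = cpi_with_misses(2.0, 0.4, 0.03, 35)
print(f"CPI with misses: {real_cpi:.2f}, slowdown: {real_cpi / 2.0:.3f}")

# Q7: 1000 references, 50 L1 misses, 30 L2 misses.
result = two_level_cache(1000, 50, 30, l1_hit=1, l2_hit=15,
                         l2_penalty=250, refs_per_instr=1.4)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

    Under these assumptions Q1 gives a CPI of 3.47 with misses, so the all-hit machine is about 1.735 times faster. Q7 gives an L1 miss rate of 5%, an L2 local miss rate of 60%, an L2 global miss rate of 3%, an average memory access time of 9.25 cycles and 11.55 stall cycles per instruction.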

    UNIT VII

    UNIT WISE PLAN

    Chapter Number: No of Hours: 06

    Unit Title: MEMORY HIERARCHY DESIGN

    Learning Objectives:

    At the end of this unit students will understand:

    1. Memory hierarchy and cache optimization

    2. Memory technology

    3. Virtual machines

    4. Cache performance

    5. Protection using virtual memory and virtual machines

    Lesson Plan:

    L1. Introduction to memory hierarchy design

    L2. Advanced optimizations of Cache performance

    L3. Memory technology and optimizations

    L4. Protection: Virtual memory

    L5. Virtual machines

    L6. Numerical problems

    Assignment Questions:

    Q1) Explain the following optimization techniques which reduce hit time: (i) Small and simple caches (ii) Way prediction (iii) Trace caches

    Q2) Explain the compiler optimization techniques to reduce miss rate.

    Q3) Differentiate between SRAM and DRAM. Draw the internal organization of a 64M-bit DRAM.

    Q4) List eleven advanced optimizations of cache performance and explain any one.

    Q5) Explain optimization techniques for increasing cache bandwidth.

    Q6) Explain memory technology and optimizations.

    Q7) Explain optimization techniques for reducing miss penalty.

    Q8) Explain protection via virtual memory.

    Q9) Explain protection via virtual machines.

    Q10) Explain the Xen virtual machine.

    UNIT VIII

    UNIT WISE PLAN

    Chapter Number: Appendix G No of Hours: 07

    Unit Title: HARDWARE AND SOFTWARE FOR VLIW AND EPIC

    Learning Objectives:

    At the end of this unit students will understand:

    1. VLIW.

    2. EPIC

    3. Intel IA-64 Architecture, Itanium Processor

    4. Loop-Level Parallelism, Code for Parallelism

    5. Hardware Support for Parallelism

    Lesson Plan:

    L1. Introduction: Exploiting Instruction-Level Parallelism Statically

    L2. Detecting and Enhancing Loop-Level Parallelism

    L3. Scheduling and Structuring Code for Parallelism

    L4. Hardware Support for Exposing Parallelism: Predicated Instructions

    L5. Hardware Support for Compiler Speculation

    L6. The Intel IA-64 Architecture

    L7. Itanium Processor; Conclusions.

    Assignment Questions:

    Q1) Explain the methods, advantages and disadvantages for exploiting instruction-level parallelism statically.

    Q2) Explain the methods for detecting and enhancing loop-level parallelism.

    Q3) Explain software pipelining using symbolic loop unrolling.

    Q4) Explain global code scheduling.

    Q5) Explain hardware support for exposing parallelism using predicated instructions in detail.

    Q6) Explain hardware support for compiler speculation.

    Q7) Explain superblocks using a flowchart.

    Q8) Explain the IA-64 instruction set architecture.

    Q9) Explain the Itanium 2 processor in detail.
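
    For Q2, one commonly taught check for loop-carried dependences is the GCD test on affine array indices. The sketch below is a minimal illustration: it assumes a loop that stores to X[a*i + b] and loads from X[c*i + d], ignores the loop bounds, and therefore can only rule a dependence out, never prove that one exists. The function name is hypothetical.

```python
# Sketch for Q2: GCD test for a store to X[a*i + b] and a load from X[c*i + d].
# If gcd(a, c) does not divide (d - b), no loop-carried dependence is possible.
# The test ignores loop bounds, so a True result only means "may depend".
from math import gcd


def gcd_test_may_depend(a: int, b: int, c: int, d: int) -> bool:
    g = gcd(a, c)
    if g == 0:
        # Both indices are constants: they touch the same element only if equal.
        return b == d
    return (d - b) % g == 0


# for i: X[2*i + 3] = X[2*i] * 5.0
print(gcd_test_may_depend(2, 3, 2, 0))  # prints False
# for i: X[i + 1] = X[i] + 1.0
print(gcd_test_may_depend(1, 1, 1, 0))  # prints True
```

    In the first loop the store touches odd indices and the load even ones, so the iterations are independent; in the second, each iteration reads what the previous one wrote.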