Lecture 14: Course Review Kai Bu




Lecture 02 Fundamentals of Computer Design

Classes of Parallel Architectures
  Classified by Michael Flynn according to the parallelism in the instruction and data streams called for by the instructions at the most constrained component of the multiprocessor: SISD, SIMD, MISD, MIMD

SISD
  Single instruction stream, single data stream
  Uniprocessor
  Can exploit instruction-level parallelism

SIMD
  Single instruction stream, multiple data streams
  The same instruction is executed by multiple processors using different data streams.
  Exploits data-level parallelism
  Each processor has its own data memory, whereas there is a single instruction memory and control processor.

MISD
  Multiple instruction streams, single data stream
  No commercial multiprocessor of this type yet

MIMD
  Multiple instruction streams, multiple data streams
  Each processor fetches its own instructions and operates on its own data.
  Exploits task-level parallelism

Instruction Set Architecture (ISA)
  The actual programmer-visible instruction set
  The boundary between software and hardware
  7 major dimensions

ISA: Class
  Most are general-purpose register architectures with operands of either registers or memory locations
  Two popular versions
    register-memory ISA: e.g., 80x86; many instructions can access memory
    load-store ISA: e.g., ARM, MIPS; only load or store instructions can access memory

ISA: Memory Addressing
  Byte addressing
  Aligned address
    object width: s bytes
    address: A
    aligned if A mod s = 0
  Each misaligned object requires two memory accesses

ISA: Addressing Modes
  Specify the address of a memory object
  Register, Immediate, Displacement

Trends in Cost: Cost of an Integrated Circuit
  Wafer for test; chopped into dies for packaging
  Yield: percentage of manufactured devices that survives the testing procedure
  N: process-complexity factor
  for measuring manufacturing difficulty

Dependability
  Two measures of dependability
    Module reliability
    Module availability

Dependability: Module reliability
  Continuous service accomplishment from a reference initial instant
  MTTF: mean time to failure
  MTTR: mean time to repair
  MTBF: mean time between failures
  MTBF = MTTF + MTTR
  FIT: failures in time, i.e., failures per billion hours
  MTTF of 1,000,000 hours = 10^9/10^6 = 1000 FIT

Dependability: Module availability

Measuring Performance
  Execution time: the time between the start and the completion of an event
  Throughput: the total amount of work done in a given time
  Computer X and Computer Y: X is n times faster than Y

Quantitative Principles
  Parallelism
  Locality
    temporal locality: recently accessed items are likely to be accessed in the near future
    spatial locality: items whose addresses are near one another tend to be referenced close together in time
  Amdahl's Law: two factors
    1. Fraction enhanced: e.g., 20/60 if 20 seconds of a 60-second program can be enhanced
    2. Speedup enhanced: e.g., 5/2 if the enhanced portion takes 2 seconds where it originally took 5 seconds
  The Processor Performance Equation

Lecture 03 Instruction Set Principles

ISA Classification
  Classification basis: the type of internal storage (stack, accumulator, register)
  ISA classes
    stack architecture
    accumulator architecture
    general-purpose register architecture (GPR)

ISA Classes: Stack Architecture
  Implicit operands on the top of the stack
  C = A + B
    Push A
    Push B
    Add
    Pop C
  First operand removed from the stack
  Second operand replaced by the result

ISA Classes: Accumulator Architecture
  One implicit operand: the accumulator
  One explicit operand: a memory location
  C = A + B
    Load A
    Add B
    Store C
  The accumulator is both an implicit input operand and the result

ISA Classes: General-Purpose Register Architecture
  Only explicit operands: registers or memory locations
  Operand access: direct memory access, or loaded into temporary storage first
  Two classes
    register-memory architecture: any instruction can access memory
      C = A + B
        Load R1, A
        Add R3, R1, B
        Store R3, C
    load-store architecture: only load and store instructions can access memory
      C = A + B
        Load R1, A
        Load R2, B
        Add R3, R1, R2
        Store R3, C

GPR Classification
  ALU instruction has 2 or 3 operands?
    2 = 1 result-and-source operand + 1 source operand
    3 = 1 result operand + 2 source operands
  ALU instruction has 0, 1, 2, or 3 memory-address operands?
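
The C = A + B sequences listed above for each ISA class can be traced with a few toy interpreters. This is only a minimal sketch for illustration: the function names, the tuple encoding of instructions, and the `load_store` flag are assumptions made here, not part of the lecture or of any real ISA.

```python
# Toy interpreters for the C = A + B examples. Everything here is a
# hypothetical illustration: memory is a dict of named locations and
# each instruction is a tuple such as ("Load", "R1", "A").

def run_stack(program, memory):
    """Stack architecture: ALU operands are implicit, on top of the stack."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(memory[args[0]])
        elif op == "Add":
            # Both operands come off the stack; the result takes their place.
            second = stack.pop()
            first = stack.pop()
            stack.append(first + second)
        elif op == "Pop":
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown stack instruction: {op}")
    return memory


def run_accumulator(program, memory):
    """Accumulator architecture: one implicit operand (the accumulator)."""
    acc = 0
    for op, addr in program:
        if op == "Load":
            acc = memory[addr]
        elif op == "Add":
            acc = acc + memory[addr]  # accumulator is both input and result
        elif op == "Store":
            memory[addr] = acc
        else:
            raise ValueError(f"unknown accumulator instruction: {op}")
    return memory


def run_gpr(program, memory, load_store=False):
    """GPR architecture: only explicit operands.

    With load_store=True, an ALU instruction may name registers only;
    otherwise (register-memory) a source operand may be a memory location.
    """
    regs = {}

    def read(name):
        if name in regs:
            return regs[name]
        if load_store:
            raise ValueError(f"{name}: ALU operands must be registers")
        return memory[name]

    for op, *args in program:
        if op == "Load":
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "Add":
            rd, src1, src2 = args
            regs[rd] = read(src1) + read(src2)
        elif op == "Store":
            rs, addr = args
            memory[addr] = regs[rs]
        else:
            raise ValueError(f"unknown GPR instruction: {op}")
    return memory


STACK_PROGRAM = [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]
ACC_PROGRAM = [("Load", "A"), ("Add", "B"), ("Store", "C")]
REG_MEM_PROGRAM = [("Load", "R1", "A"), ("Add", "R3", "R1", "B"),
                   ("Store", "R3", "C")]
LOAD_STORE_PROGRAM = [("Load", "R1", "A"), ("Load", "R2", "B"),
                      ("Add", "R3", "R1", "R2"), ("Store", "R3", "C")]

if __name__ == "__main__":
    print(run_stack(STACK_PROGRAM, {"A": 2, "B": 3}))
    print(run_accumulator(ACC_PROGRAM, {"A": 2, "B": 3}))
    print(run_gpr(REG_MEM_PROGRAM, {"A": 2, "B": 3}))
    print(run_gpr(LOAD_STORE_PROGRAM, {"A": 2, "B": 3}, load_store=True))
```

All four programs leave the same value in C; what differs is how many operands each instruction names and which instructions touch memory. Running the register-memory program with `load_store=True` raises an error at the `Add`, which is exactly the restriction that separates the two GPR classes.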

Addressing Modes
  How instructions specify addresses of objects to access
  Types: constant, register, memory location (effective address)
  Frequently used

Lectures Pipelining

Pipelining
  Start executing one instruction before completing the previous one

Pipelined Laundry: Observations
  No speedup for an individual task; e.g., A still takes 90 minutes
  But speedup for average task execution time; e.g., 3.5*60/4 = 52.5 < 90
  [Figure: pipelined laundry; task order A, B, C, D against time in hours]

MIPS Instruction
  At most 5 clock cycles per instruction: IF, ID, EX, MEM, WB
  IF: IR ← Mem[PC]; NPC ← PC + 4;
  ID: A ← Regs[rs]; B ← Regs[rt]; Imm ← sign-extended immediate field of IR (lower 16 bits)
  EX: ALUOutput ← A + Imm; ALUOutput ← A func B; ALUOutput ← A op Imm; ALUOutput ← NPC + (Imm