Oct. 11, 2000 Machine Organization 1
Machine Organization (CS 570)
Lecture 3: Instruction Set Principles and Examples*
Jeremy R. Johnson
Wed. Oct. 11, 2000
*This lecture was derived from material in the text (Chap. 2, Appendices C and D). All figures from Computer Architecture: A Quantitative Approach, Second Edition, by John Hennessy and David Patterson, are copyrighted material (COPYRIGHT 1996 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
Introduction
• Objective: To examine the interface between the hardware and the programmer - Instruction Set Architecture. To present some design alternatives and examples.
• The Instruction Set Architecture (ISA) is the portion of the machine visible to the programmer and compiler writer
• Topics
  – Looking at assembly code
  – Taxonomy and design alternatives
  – Instruction set measurements
  – DLX
Storage in the CPU
• Stack
• Accumulator
• Register
  – register-memory
  – register-register

Code sequence for C = A + B on each class of machine:

Stack     Accumulator   Register-Memory   Register-Register
Push A    Load A        Load R1, A        Load R1, A
Push B    Add B         Add R1, B         Load R2, B
Add       Store C       Store C, R1       Add R3, R1, R2
Pop C                                     Store C, R3
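The four code sequences above can be traced with a small Python sketch. It is only an illustration: the dictionary-based memory and register file and the function names are assumptions made here, not part of the lecture. Each function executes one sequence, stores the sum in C, and returns the number of instructions it used.

```python
def run_stack(mem):
    """Stack machine: operands are implicit, taken from an evaluation stack."""
    stack = []
    stack.append(mem["A"])                   # Push A
    stack.append(mem["B"])                   # Push B
    stack.append(stack.pop() + stack.pop())  # Add
    mem["C"] = stack.pop()                   # Pop C
    return 4                                 # instructions executed


def run_accumulator(mem):
    """Accumulator machine: one implicit register holds every result."""
    acc = mem["A"]                           # Load A
    acc += mem["B"]                          # Add B
    mem["C"] = acc                           # Store C
    return 3


def run_register_memory(mem):
    """Register-memory machine: an ALU operand may come from memory."""
    regs = {}
    regs["R1"] = mem["A"]                    # Load R1, A
    regs["R1"] += mem["B"]                   # Add R1, B
    mem["C"] = regs["R1"]                    # Store C, R1
    return 3


def run_register_register(mem):
    """Register-register (load-store) machine: ALU operands are registers."""
    regs = {}
    regs["R1"] = mem["A"]                    # Load R1, A
    regs["R2"] = mem["B"]                    # Load R2, B
    regs["R3"] = regs["R1"] + regs["R2"]     # Add R3, R1, R2
    mem["C"] = regs["R3"]                    # Store C, R3
    return 4


def compare(a, b):
    """Run all four sequences; return {machine: (value of C, instruction count)}."""
    machines = {
        "stack": run_stack,
        "accumulator": run_accumulator,
        "register-memory": run_register_memory,
        "register-register": run_register_register,
    }
    results = {}
    for name, run in machines.items():
        mem = {"A": a, "B": b, "C": None}
        count = run(mem)
        results[name] = (mem["C"], count)
    return results
```

Calling compare(2, 3) gives the same value of C for every machine class; only the instruction count and the kind of operands each instruction names differ.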
General Purpose Register (GPR) Machine
• Why?
  – Faster than memory access
  – Simplify compiler’s task
• How many registers?
  – Parameter passing, expression evaluation, variables
• How many operands and of what type (register vs. memory), written as (memory operands, total operands)
  – (0,3) register-register
    + Simple fixed-length instruction encoding, similar number of clocks
    - Higher instruction count
  – (1,2) register-memory
    + Data can be accessed without first loading, easy to encode and good density
    - Not symmetric, variable number of clocks, may limit number of registers
  – (3,3) memory-memory
    + Most compact, doesn’t waste registers for temporaries
    - Large variation in instruction size and number of clocks, memory bottleneck
Addressing Modes
• Register: Add R4, R3
• Immediate: Add R4, #3
• Displacement: Add R4, 100(R1)
• Indirect: Add R4, (R1)
• Indexed: Add R3, (R1 + R2)
• Direct: Add R1, (1001)
• Memory indirect: Add R1, @(R3)
• Auto-increment: Add R1, (R2)+
• Auto-decrement: Add R1, -(R2)
• Scaled: Add R1, 100(R2)[R3]
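How each mode in the list above finds its source operand can be sketched in Python. This is a minimal model under stated assumptions: registers and memory are plain dictionaries, the function name operand and its keyword parameters are invented for illustration, and size stands for the operand size in bytes (assumed to be 4) used by the auto-increment, auto-decrement, and scaled modes.

```python
def operand(mode, regs, mem, *, reg=None, index=None, const=0, size=4):
    """Return the source operand selected by one addressing mode.

    regs maps register names to integers, mem maps addresses to integers.
    Auto-increment and auto-decrement also update regs[reg] as a side effect.
    """
    if mode == "register":            # Add R4, R3        -> Regs[R3]
        return regs[reg]
    if mode == "immediate":           # Add R4, #3        -> 3
        return const
    if mode == "displacement":        # Add R4, 100(R1)   -> Mem[100 + Regs[R1]]
        return mem[const + regs[reg]]
    if mode == "indirect":            # Add R4, (R1)      -> Mem[Regs[R1]]
        return mem[regs[reg]]
    if mode == "indexed":             # Add R3, (R1 + R2) -> Mem[Regs[R1] + Regs[R2]]
        return mem[regs[reg] + regs[index]]
    if mode == "direct":              # Add R1, (1001)    -> Mem[1001]
        return mem[const]
    if mode == "memory_indirect":     # Add R1, @(R3)     -> Mem[Mem[Regs[R3]]]
        return mem[mem[regs[reg]]]
    if mode == "auto_increment":      # Add R1, (R2)+     -> use Mem[Regs[R2]], then bump R2
        value = mem[regs[reg]]
        regs[reg] += size
        return value
    if mode == "auto_decrement":      # Add R1, -(R2)     -> drop R2 first, then use Mem[Regs[R2]]
        regs[reg] -= size
        return mem[regs[reg]]
    if mode == "scaled":              # Add R1, 100(R2)[R3] -> Mem[100 + Regs[R2] + Regs[R3] * size]
        return mem[const + regs[reg] + regs[index] * size]
    raise ValueError(f"unknown addressing mode: {mode}")
```

For example, with regs = {"R1": 8} and mem = {108: 5}, operand("displacement", regs, mem, reg="R1", const=100) reads address 108.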
Summary of Use of Addressing Modes
Distribution of Displacement
Percentage Immediate Mode
Distribution Immediate Mode
Instruction Categories
• Arithmetic and Logical
• Data Transfer
• Control
• System
• Floating point
• Decimal
• String
• Graphics
Top Ten Instructions (Intel) SPECint92
• Load 22%
• Conditional branch 20%
• Compare 16%
• Store 12%
• Add 8%
• And 6%
• Sub 5%
• Move reg, reg 4%
• Call 1%
• Return 1%
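A running total makes it easier to see how much of the executed instruction mix these ten instructions cover. The short Python sketch below only adds up the rounded percentages from the list above; the names top_ten and cumulative are chosen here for illustration.

```python
# Rounded frequencies exactly as listed on the slide (percent of executed instructions).
top_ten = [
    ("load", 22),
    ("conditional branch", 20),
    ("compare", 16),
    ("store", 12),
    ("add", 8),
    ("and", 6),
    ("sub", 5),
    ("move reg, reg", 4),
    ("call", 1),
    ("return", 1),
]


def cumulative(freqs):
    """Return (name, running total) pairs in the order given."""
    total = 0
    running = []
    for name, pct in freqs:
        total += pct
        running.append((name, total))
    return running
```

With these figures the first four instructions (load, conditional branch, compare, store) reach 70%, and all ten together sum to 95%.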
Control Transfer
• Conditional Branches
• Jumps
• Procedure calls
• Procedure returns
Implementing Transfer Control
• Condition Code: Special bits are set by ALU operations
  + Sometimes set for free (typically not the case)
  - Extra state, constrains ordering of instructions
• Condition Register: Test arbitrary register with result of comparison
  + Simple
  - Uses up a register
• Compare and Branch: Compare is part of branch (often limited to subset)
  + One instruction rather than two
  - May be too much work for an instruction
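The three schemes above can be contrasted with a small Python sketch that decides a "branch if a < b" in each style. It is a simplified model under assumptions made here: the flag name N, the register names, and the function names are illustrative, and overflow is ignored. Each function returns True when the branch would be taken.

```python
def branch_with_condition_code(a, b):
    """Condition code: a compare sets a flag bit, a later branch reads it."""
    flags = {"N": (a - b) < 0}   # compare (a subtract) sets the negative flag
    # Any flag-setting instruction placed here would overwrite N before the
    # branch sees it, which is the instruction-ordering constraint on the slide.
    return flags["N"]            # branch-if-negative consults the flag


def branch_with_condition_register(a, b):
    """Condition register: the comparison result lands in an ordinary register."""
    regs = {"R1": a, "R2": b}
    regs["R3"] = 1 if regs["R1"] < regs["R2"] else 0   # set-less-than into R3
    return regs["R3"] != 0                              # branch if R3 is nonzero


def compare_and_branch(a, b):
    """Compare and branch: one instruction both compares and decides."""
    return a < b
```

All three produce the same decision; they differ in where the comparison result lives between the compare and the branch. The S__ and BEQZ/BNEZ instructions listed on the DLX slides follow the second style.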
PC Relative Addressing
• Displacement off of PC
  – Branch targets are typically nearby
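A PC-relative target is computed by sign-extending the offset field and adding it to the program counter. The Python sketch below assumes a 16-bit offset, 4-byte instructions, and an offset measured from the address of the next instruction (PC+4), which matches the BEQZ/BNEZ description on the DLX slides; the function names are illustrative.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = (1 << bits) - 1
    value &= mask
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(pc, offset_field, bits=16, instr_bytes=4):
    """Target address = address of the next instruction + signed offset (32-bit wrap)."""
    return (pc + instr_bytes + sign_extend(offset_field, bits)) & 0xFFFFFFFF
```

With pc = 0x1000, an offset field of 0x0008 gives 0x100C (a forward branch), and 0xFFFC (which is -4) gives 0x1000, a branch back to the same instruction. Because the offset is relative, the code can be loaded at any address without changing the encoded field.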
Encoding of Instruction Set
• Variable
• Fixed
• Hybrid
Compiler Optimizations
• High Level
  – Procedure inlining
• Local
  – Common subexpression elimination
  – Constant propagation
  – Stack height reduction
• Global
  – Global common subexpression elimination
  – Copy propagation
  – Code motion
  – Induction variable elimination
• Machine Dependent
  – Strength reduction
  – Pipeline scheduling
  – Branch offset optimization
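Several of the optimizations listed above can be shown by hand on one small loop. The Python sketch below is a source-level illustration only: a real compiler applies these transformations to its intermediate code, and the functions before and after are invented for this example. It applies common subexpression elimination, code motion, constant propagation, and strength reduction.

```python
def before(a, b, n):
    """Unoptimized loop: redundant work on every iteration."""
    out = []
    for i in range(n):
        scale = 4                  # constant reassigned every iteration
        x = (a + b) * i            # a + b computed here ...
        y = (a + b) + scale * i    # ... and again here
        out.append(x + y)
    return out


def after(a, b, n):
    """Same results with the optimizations applied by hand."""
    t = a + b      # common subexpression elimination, hoisted out by code motion
    x = 0          # strength reduction: t * i becomes a running sum
    y = t          # t + 4 * i starts at t when i == 0
    out = []
    for _ in range(n):
        out.append(x + y)
        x += t     # add replaces the multiply by i
        y += 4     # constant propagation (scale == 4) plus strength reduction
    return out
```

Both versions return the same list, for example [8, 20] for a = 3, b = 5, n = 2, while the optimized loop body contains no multiplications.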
Effect of Compiler Optimization
DLX
• Registers
  – 32 32-bit GPRs (R0 - R31), R0 = 0
  – 32 SP FP registers (can be viewed as 16 DP FP registers)
  – FP status register
• Data types
  – 8-bit byte, 16-bit half word, 32-bit word, IEEE SP and DP FP
• Memory
  – byte addressable, big-endian, 32-bit addresses
  – addresses must be aligned
• Addressing Modes
  – immediate
  – displacement
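The memory properties above (byte addressable, big-endian, aligned accesses) can be illustrated with a short Python sketch. Memory is modeled as a bytes object, and a misaligned access is reported as a ValueError; both are modeling assumptions made here, as are the function names.

```python
def load_word(memory, address):
    """Load a 32-bit word: big-endian, address must be a multiple of 4."""
    if address % 4 != 0:
        raise ValueError(f"misaligned word address: {address}")
    return int.from_bytes(memory[address:address + 4], "big")


def load_half(memory, address):
    """Load a 16-bit half word: big-endian, address must be a multiple of 2."""
    if address % 2 != 0:
        raise ValueError(f"misaligned half-word address: {address}")
    return int.from_bytes(memory[address:address + 2], "big")
```

With memory = bytes([0x12, 0x34, 0x56, 0x78]), load_word(memory, 0) returns 0x12345678: in big-endian order the byte at the lowest address is the most significant byte of the word.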
DLX Operations
• Data Transfer
  – LB, LBU, SB
  – LH, LHU, SH
  – LW, SW
  – LF, LD, SF, SD
  – MOVI2S, MOVS2I
  – MOVF, MOVD
  – MOVFP2I, MOVI2FP
• Arithmetic/Logical
  – ADD, ADDI, ADDU, ADDUI
  – SUB, SUBI, SUBU, SUBUI
  – MULT, MULTU, DIV, DIVU
  – AND, ANDI
  – OR, ORI, XOR, XORI
  – LHI
  – SLL, SRL, SRA, SLLI, SRLI, SRAI
  – S__, S__I : “__” = LT, GT, LE, GE, EQ, NE
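Among the data transfer instructions above, LB and LBU both load one byte into a 32-bit register and differ only in how the upper 24 bits are filled: LB sign-extends the byte and LBU zero-extends it. A minimal Python sketch, with memory modeled as a bytes object and function names chosen here for illustration:

```python
def load_byte_signed(memory, address):
    """LB-style load: copy bit 7 of the byte into the upper 24 register bits."""
    byte = memory[address]
    if byte & 0x80:
        return byte | 0xFFFFFF00   # negative byte: upper bits become ones
    return byte                    # non-negative byte: upper bits stay zero


def load_byte_unsigned(memory, address):
    """LBU-style load: upper 24 register bits are always zero."""
    return memory[address]
```

For the byte 0x80 the signed load leaves 0xFFFFFF80 in the register (the value -128) and the unsigned load leaves 0x00000080 (the value 128). LH and LHU make the same distinction for 16-bit half words.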
DLX Operations (cont)
• Control
  – BEQZ, BNEZ : 16-bit offset from PC+4
  – BFPT, BFPF : 16-bit offset from PC+4
  – J, JR : 26-bit offset from PC+4 (J), target address in register (JR)
  – JAL, JALR : R31 = PC+4
  – TRAP
  – RFE
• Floating point
  – ADDD, ADDF
  – SUBD, SUBF
  – MULTD, MULTF
  – DIVD, DIVF
  – CVTF2D, CVTF2I, CVTD2F, CVTD2I, CVTI2F, CVTI2D
  – __D, __F : “__” = LT, GT, LE, GE, EQ, NE, sets bit in FP status register
DLX Instruction Format
• I-type
• R-type
• J-type
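The slide names the three formats without showing their fields. The Python sketch below assumes the field layout given for DLX in the course text: I-type is opcode(6), rs1(5), rd(5), immediate(16); R-type is opcode(6), rs1(5), rs2(5), rd(5), func(11); J-type is opcode(6), offset(26). The opcode and func numbers used in any call are placeholders, not real DLX encodings, and the function names are illustrative.

```python
def encode_i(opcode, rs1, rd, imm):
    """I-type: opcode(6) | rs1(5) | rd(5) | immediate(16)."""
    return (opcode & 0x3F) << 26 | (rs1 & 0x1F) << 21 | (rd & 0x1F) << 16 | (imm & 0xFFFF)


def encode_r(opcode, rs1, rs2, rd, func):
    """R-type: opcode(6) | rs1(5) | rs2(5) | rd(5) | func(11)."""
    return ((opcode & 0x3F) << 26 | (rs1 & 0x1F) << 21 | (rs2 & 0x1F) << 16
            | (rd & 0x1F) << 11 | (func & 0x7FF))


def encode_j(opcode, offset):
    """J-type: opcode(6) | offset(26)."""
    return (opcode & 0x3F) << 26 | (offset & 0x3FFFFFF)


def decode_i(word):
    """Split a 32-bit word into I-type fields (immediate left unsigned)."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs1": (word >> 21) & 0x1F,
        "rd": (word >> 16) & 0x1F,
        "imm": word & 0xFFFF,
    }


def decode_r(word):
    """Split a 32-bit word into R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs1": (word >> 21) & 0x1F,
        "rs2": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "func": word & 0x7FF,
    }


def decode_j(word):
    """Split a 32-bit word into J-type fields."""
    return {"opcode": (word >> 26) & 0x3F, "offset": word & 0x3FFFFFF}
```

Every format is 32 bits wide with the opcode in the same six bits, so a decoder can read the opcode first and then choose how to split the remaining 26 bits.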
Distribution of Instructions in DLX
Distribution of Instructions in DLX
Effectiveness of DLX