June 18, 2001 Systems Architecture II 1
Systems Architecture II Systems Architecture I Review*
Jeremy R. Johnson
June 18, 2001
*Most figures from Computer Organization and Design: The Hardware/Software Interface, Second Edition, by David Patterson and John Hennessy, are copyrighted material (COPYRIGHT 1998 MORGAN KAUFMANN PUBLISHERS, INC. ALL RIGHTS RESERVED).
A Random Access Machine
[Figure: a control unit holding the accumulator (AC) and the program, connected to a memory whose cells are numbered 1, 2, 3, 4, 5, 6, ...]
AC = accumulator register
Instruction Set
• LDA X; Load the AC with the contents of memory address X
• LDI X; Load the AC indirectly with the contents of address X
• STA X; Store the contents of the AC at memory address X
• STI X; Store the contents of the AC indirectly at address X
• ADD X; Add the contents of address X to the contents of the AC
• SUB X; Subtract the contents of address X from the AC
• JMP X; Jump to the instruction labeled X
• JMZ X; Jump to the instruction labeled X if the AC contains 0
• JMN X; Jump to the instruction labeled X if the AC contains a negative value
• HLT ; Halt execution
Sample RAM Program
1. LDI 3; get i-th entry from A
2. ADD 4; add offset to compute index j
3. STA 5; store index j
4. LDI 5; get j-th entry from B
5. JMZ 9; if entry 0, go to 9
6. LDA 3; if entry 1, get index i
7. STA 2; and store it at 2.
8. HLT ; stop execution
9. LDA 1; get constant 1
10. STI 5; and store it in B
11. LDA 3; get index i
12. SUB 4; subtract limit
13. JMZ 8; if i = limit, stop
14. LDA 3; get index i again
15. ADD 1; increment i
16. STA 3; store new value of i
17. JMP 1;
Memory (shown next to the AC in the figure)
Address  Contents  Meaning
1        1         constant
2        0         answer
3        6         index i
4        9         limit of A
5        0         index j
6        3         A
7        4         A
8        2         A
9        2         A
10       0         B
11       0         B
12       0         B
13       0         B
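The instruction set and sample program above can be traced with a small interpreter. This is a minimal sketch, not the course's simulator: the function name `run_ram`, the dictionary-based memory, and the step limit are assumptions made for illustration, and the initial memory contents are the ones read off the slide.

```python
def run_ram(program, memory, max_steps=10_000):
    """Run a RAM program. `program` maps instruction number -> (opcode, operand)."""
    ac = 0   # accumulator register
    pc = 1   # instructions are numbered from 1, as on the slide
    for _ in range(max_steps):
        op, x = program[pc]
        pc += 1
        if op == "LDA":
            ac = memory[x]
        elif op == "LDI":
            ac = memory[memory[x]]      # indirect load
        elif op == "STA":
            memory[x] = ac
        elif op == "STI":
            memory[memory[x]] = ac      # indirect store
        elif op == "ADD":
            ac += memory[x]
        elif op == "SUB":
            ac -= memory[x]
        elif op == "JMP":
            pc = x
        elif op == "JMZ":
            if ac == 0:
                pc = x
        elif op == "JMN":
            if ac < 0:
                pc = x
        elif op == "HLT":
            return memory
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise RuntimeError("step limit reached without HLT")


# The 17-instruction sample program, in order.
SAMPLE = [
    ("LDI", 3), ("ADD", 4), ("STA", 5), ("LDI", 5), ("JMZ", 9),
    ("LDA", 3), ("STA", 2), ("HLT", None), ("LDA", 1), ("STI", 5),
    ("LDA", 3), ("SUB", 4), ("JMZ", 8), ("LDA", 3), ("ADD", 1),
    ("STA", 3), ("JMP", 1),
]
program = {number: instr for number, instr in enumerate(SAMPLE, start=1)}

# Initial memory: constant, answer, i, limit, j, then A (6..9) and B (10..13).
memory = {1: 1, 2: 0, 3: 6, 4: 9, 5: 0,
          6: 3, 7: 4, 8: 2, 9: 2,
          10: 0, 11: 0, 12: 0, 13: 0}

result = run_ram(program, dict(memory))
print(result[2])  # address of the first entry of A whose value was seen before
```

With this memory image the program marks B as it scans A and halts when it meets the second 2, so the answer cell ends up holding the index of that entry.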
[Figure: datapath and control of the RAM machine. Datapath: Memory, MBR, MAR, AC, ALU, PC, and the instruction register fields IR(C) and IR(O), connected through multiplexors (MUX). Control: a Decoder T producing timing signals t0 to t9, a Decoder producing q0 to q9, and control lines x1 to x13 driving the LOAD, INC, CLEAR, READ/WRITE, and multiplexor select (s) inputs.]
Building Blocks
1. AND gate (c = a . b)
   a b | c
   0 0 | 0
   0 1 | 0
   1 0 | 0
   1 1 | 1
2. OR gate (c = a + b)
   a b | c
   0 0 | 0
   0 1 | 1
   1 0 | 1
   1 1 | 1
3. Inverter (c = NOT a)
   a | c
   0 | 1
   1 | 0
4. Multiplexor (if d == 0, c = a; else c = b)
   d | c
   0 | a
   1 | b
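The multiplexor can itself be built from the first three building blocks. The sketch below is one way to express that, assuming bits are modeled as the Python integers 0 and 1; the function names are chosen here for illustration.

```python
def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def inverter(a):
    return 1 - a


def mux(a, b, d):
    # c = (a AND NOT d) OR (b AND d): selects a when d == 0, b when d == 1
    return or_gate(and_gate(a, inverter(d)), and_gate(b, d))


for d in (0, 1):
    for a in (0, 1):
        for b in (0, 1):
            print(f"d={d} a={a} b={b} -> c={mux(a, b, d)}")
```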
Implementing Logic Gates with Transistors
[Figure: a transistor NOT gate (gate input, output, +V, ground) and a transistor NAND gate (inputs A and B, output A NAND B, +V, ground).]
A Clocked Flip-Flop
[Figure: clocked flip-flop with inputs S, R, and clock, and outputs Q and Q'.]
Memory Cell
[Figure: memory cell built around a flip-flop (S, R, Q, Q') with input, read/write, select, and output lines.]
Memory
[Figure: memory made of rows of cells numbered 0, 1, ..., 7, selected by a decoder; each column has an input line and an output line, and a read/write line is shared.]
MIPS Architecture
• Load/Store architecture
• General purpose register machine (32 registers)
• ALU operations have 3 register operands (2 source + 1 dest)
• 16 bit constants for immediate mode
• Simple instruction set
– simple branch operations (beq, bne)
– use a register to set a condition (e.g. slt)
– operations such as move, li, and blt are built from existing operations
• Uniform encoding
– all instructions are 32 bits long
– opcode is always in the high-order 6 bits
– 3 types of instruction formats
– register fields in the same place for all formats
Design Principles
• Simplicity favors regularity
– uniform instruction length
– all ALU operations have 3 register operands
– register addresses in the same location for all instruction formats
• Smaller is faster
– register architecture
– small number of registers
• Good design demands good compromises
– fixed length instructions and only 16 bit constants
– several instruction formats but consistent length
• Make common cases fast
– immediate addressing
– 16 bit constants
– only beq and bne
MIPS Encoding
• All instructions are 32 bits long
• Opcode is always in the high-order 6 bits
• Only three instruction formats
• 32 registers implies 5 bit register addresses:
– $zero       R0       ; zero register always equal to 0
– $at         R1       ; temporary register
– $v0 - $v1   R2-R3    ; return registers
– $a0 - $a3   R4-R7    ; argument registers
– $t0 - $t7   R8-R15   ; temporary - not preserved across calls
– $s0 - $s7   R16-R23  ; saved registers - preserved across calls
– $t8 - $t9   R24-R25  ; temporary - not preserved across calls
– $k0 - $k1   R26-R27  ; reserved by OS kernel
– $gp         R28      ; global pointer
– $sp         R29      ; stack pointer
– $fp         R30      ; frame pointer
– $ra         R31      ; return address
MIPS Instruction Set
• Arithmetic/Logical
– add, sub, and, or
– addi, andi, ori
• Data Transfer
– lw, lb
– sw, sb
– lui
• Control
– beq, bne
– slt, slti
– j, jal, jr
MIPS Instruction Formats
• R format (register format - add, sub, …)
  op | rs | rt | rd | shamt | func        (6, 5, 5, 5, 5, 6 bits)
• I format (immediate format - lw, sw, …)
  op | rs | rt | address                  (6, 5, 5, 16 bits)
• J format (jump format – j, jal)
  op | 26-bit constant                    (6, 26 bits)
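The three formats can be packed into 32-bit words with shifts and masks. The helper names below are illustrative; the opcode and function values used in the examples (0 and 0x20 for add, 35 for lw, 2 for j) are the standard MIPS encodings, and the register numbers follow the usual naming convention ($t0 = 8, $s0 = 16, $s1 = 17, $sp = 29).

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """R format: op(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(op, rs, rt, imm):
    """I format: op(6) rs(5) rt(5) 16-bit immediate or address."""
    return (op << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(op, target):
    """J format: op(6) followed by a 26-bit word address."""
    return (op << 26) | (target & 0x03FFFFFF)


add_word = encode_r(0, 16, 17, 8, 0, 0x20)   # add $t0, $s0, $s1
lw_word = encode_i(35, 29, 8, 16)            # lw  $t0, 16($sp)
j_word = encode_j(2, 0x0100000)              # j   to word address 0x0100000

print(f"{add_word:08x} {lw_word:08x} {j_word:08x}")
```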
MIPS Addressing Modes
• Immediate Addressing
– 16 bit constant from low order bits of instruction
– addi $t0, $s0, 4
• Register Addressing
– add $t0, $s0, $s1
• Base Addressing (displacement addressing)
– 16-bit constant from low order bits of instruction plus base register
– lw $t0, 16($sp)
• PC-Relative Addressing
– (PC+4) + 16-bit address (word) from instruction
– bne $s0, $s1, Target
• Pseudodirect Addressing
– high order 4 bits of PC+4 concatenated with 26 bit word address (low order 26 bits from instruction, shifted 2 bits to the left)
– j Address
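The PC-relative and pseudodirect computations can be written out as arithmetic. This is a sketch under the rules stated above; the function names are hypothetical, and the example addresses are arbitrary values chosen for the illustration.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, imm16):
    # PC-relative: (PC + 4) plus the word offset converted to a byte offset
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def jump_target(pc, addr26):
    # Pseudodirect: high 4 bits of PC + 4, then the 26-bit word address << 2
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x03FFFFFF) << 2)


print(hex(branch_target(0x00400000, 3)))        # forward by 3 words after PC + 4
print(hex(branch_target(0x00400000, 0xFFFF)))   # offset -1 word: back to the branch itself
print(hex(jump_target(0x00400000, 0x0100000)))
```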
MIPS Memory Convention
Address          Contents
7fff ffff hex    Stack ($sp)
                 Dynamic data
1000 8000 hex    ($gp)
1000 0000 hex    Static data
0040 0000 hex    Text (pc)
0                Reserved
# Recursive factorial: argument n in $a0, result in $v0
fact: addi $sp,$sp,-8     # push stack (room for two words)
      sw   $ra,4($sp)     # save return address
      sw   $a0,0($sp)     # save n
      slti $t0,$a0,1      # test n < 1
      beq  $t0,$zero,L1   # branch if n >= 1
      addi $v0,$zero,1    # return 1
      addi $sp,$sp,8      # pop stack
      jr   $ra            # return to calling procedure
L1:   addi $a0,$a0,-1     # set parameter to n-1
      jal  fact           # call fact(n-1)
      lw   $a0,0($sp)     # restore previous value of n
      lw   $ra,4($sp)     # restore previous return address
      mul  $v0,$a0,$v0    # return n * fact(n-1)
      addi $sp,$sp,8      # pop stack
      jr   $ra            # return to calling procedure
Activation Records (Frames) and the Call Stack
• An activation record (frame) is a segment on the stack containing a procedure’s saved registers and local variables.
• Each time a procedure is called, a frame is placed on the stack; the frame pointer register $fp points to the current frame.
[Figure: three views of the stack (a, b, c) drawn from high address to low address, each marking $fp and $sp. The frame holds saved argument registers (if any), the saved return address, saved saved registers (if any), and local arrays and structures (if any).]
Translation Hierarchy: Compiler, Assembler, Linker, Loader
[Figure: C program -> Compiler -> Assembly language program -> Assembler -> Object: Machine language module, combined with Object: Library routine (machine language) -> Linker -> Executable: Machine language program -> Loader -> Memory]
Numbers
• Bits have no inherent meaning: conventions define the relationship between bits and numbers
• Binary numbers (base 2)
0000 0001 0010 0011 0100 0101 0110 0111 1000 1001 ...
decimal: 0 ... 2^n - 1
• Complications:
– numbers are finite (overflow)
– fractions and real numbers
– negative numbers (e.g., there is no MIPS subi instruction; addi can add a negative number)
• How do we represent negative numbers?
– Which bit patterns will represent which numbers?
Possible Representations
Sign Magnitude    One's Complement    Two's Complement
000 = +0          000 = +0            000 = +0
001 = +1          001 = +1            001 = +1
010 = +2          010 = +2            010 = +2
011 = +3          011 = +3            011 = +3
100 = -0          100 = -3            100 = -4
101 = -1          101 = -2            101 = -3
110 = -2          110 = -1            110 = -2
111 = -3          111 = -0            111 = -1
• Issues: balance, number of zeros, ease of operations
• Which one is best? Why?
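The three interpretations of a bit pattern can be compared directly in code. This is a small sketch with function names chosen here; note that Python integers cannot distinguish -0 from +0, so both zeros print as 0.

```python
def sign_magnitude(bits, n):
    """High bit is the sign, remaining n-1 bits are the magnitude."""
    magnitude = bits & ((1 << (n - 1)) - 1)
    return -magnitude if bits >> (n - 1) else magnitude


def ones_complement(bits, n):
    """A negative number is the bitwise inverse of its magnitude."""
    if bits >> (n - 1):
        return -((~bits) & ((1 << n) - 1))
    return bits


def twos_complement(bits, n):
    """The high bit carries weight -2^(n-1)."""
    return bits - (1 << n) if bits >> (n - 1) else bits


for pattern in range(8):
    print(f"{pattern:03b}",
          sign_magnitude(pattern, 3),
          ones_complement(pattern, 3),
          twos_complement(pattern, 3))
```

Only two's complement has a single zero; the price is one extra negative value (-4 for 3 bits) with no positive counterpart.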
MIPS
• 32 bit signed numbers:
0000 0000 0000 0000 0000 0000 0000 0000_two = 0_ten
0000 0000 0000 0000 0000 0000 0000 0001_two = + 1_ten
0000 0000 0000 0000 0000 0000 0000 0010_two = + 2_ten
...
0111 1111 1111 1111 1111 1111 1111 1110_two = + 2,147,483,646_ten
0111 1111 1111 1111 1111 1111 1111 1111_two = + 2,147,483,647_ten (maxint)
1000 0000 0000 0000 0000 0000 0000 0000_two = – 2,147,483,648_ten (minint)
1000 0000 0000 0000 0000 0000 0000 0001_two = – 2,147,483,647_ten
1000 0000 0000 0000 0000 0000 0000 0010_two = – 2,147,483,646_ten
...
1111 1111 1111 1111 1111 1111 1111 1101_two = – 3_ten
1111 1111 1111 1111 1111 1111 1111 1110_two = – 2_ten
1111 1111 1111 1111 1111 1111 1111 1111_two = – 1_ten
Two's Complement Operations
• Negating a two's complement number: invert all bits and add 1
– remember: “negate” and “invert” are quite different!
• Converting n bit numbers into numbers with more than n bits:
– MIPS 16 bit immediate gets converted to 32 bits for arithmetic
– copy the most significant bit (the sign bit) into the other bits
0010 -> 0000 0010
1010 -> 1111 1010
– "sign extension" (lbu vs. lb)
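Both operations are short bit manipulations. The sketch below assumes numbers are held as unsigned Python integers of a stated width; the function names are illustrative.

```python
def negate(bits, n):
    """Two's complement negation of an n-bit pattern: invert all bits, add 1."""
    mask = (1 << n) - 1
    return ((bits ^ mask) + 1) & mask


def sign_extend(bits, n, m):
    """Widen an n-bit pattern to m bits by copying the sign bit upward."""
    if (bits >> (n - 1)) & 1:
        return bits | (((1 << m) - 1) ^ ((1 << n) - 1))
    return bits


print(f"{negate(0b0110, 4):04b}")           # -6 as a 4-bit pattern
print(f"{sign_extend(0b0010, 4, 8):08b}")   # positive: padded with 0s
print(f"{sign_extend(0b1010, 4, 8):08b}")   # negative: padded with 1s
```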
Addition & Subtraction
• Just like in grade school (carry/borrow 1s)
    0111      0111      0110
  + 0110    - 0110    - 0101
• Two's complement operations easy
– subtraction using addition of negative numbers
    0111
  + 1010
• Overflow (result too large for finite computer word):
– e.g., adding two n-bit numbers does not yield an n-bit number
    0111
  + 0001      note that overflow term is somewhat misleading,
    1000      it does not mean a carry “overflowed”
Detecting Overflow
• No overflow when adding a positive and a negative number
• No overflow when signs are the same for subtraction
• Overflow occurs when the value affects the sign:
Operation    A       B       Result
A+B          >= 0    >= 0    < 0
A+B          < 0     < 0     >= 0
A-B          >= 0    < 0     < 0
A-B          < 0     >= 0    >= 0
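The table translates into a check on sign bits alone. This is a sketch assuming operands are given as n-bit unsigned patterns; the function names are made up for the example.

```python
def sign(bits, n):
    return (bits >> (n - 1)) & 1


def add_overflows(a, b, n):
    """Overflow on A+B: operands share a sign and the result's sign differs."""
    result = (a + b) & ((1 << n) - 1)
    return sign(a, n) == sign(b, n) and sign(result, n) != sign(a, n)


def sub_overflows(a, b, n):
    """Overflow on A-B: operands differ in sign and the result's sign differs from A's."""
    result = (a - b) & ((1 << n) - 1)
    return sign(a, n) != sign(b, n) and sign(result, n) != sign(a, n)


print(add_overflows(0b0111, 0b0001, 4))   # 7 + 1 does not fit in 4 bits
print(add_overflows(0b0111, 0b1010, 4))   # positive + negative never overflows
print(sub_overflows(0b1000, 0b0001, 4))   # -8 - 1 does not fit in 4 bits
```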
Addition and Subtraction
• Carry-ripple adder
    0000 0111
  + 0000 0110
    0000 1101

    0000 0111      0000 0111      0000 0110      0000 0110
  - 0000 0110    + 1111 1010    - 0000 0111    + 1111 1001
    0000 0001      0000 0001      1111 1111      1111 1111
[Figure: bit-by-bit binary addition with the carries shown in parentheses (Patterson Fig 4.03).]
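A carry-ripple adder chains one-bit full adders so that each carry out feeds the next carry in. The sketch below models that in software; the names `full_adder`, `ripple_add`, and `ripple_sub` are chosen for this example, and subtraction is done as addition of the inverted operand with an initial carry of 1.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum bit, carry out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return s, carry_out


def ripple_add(a, b, n=8, carry_in=0):
    """Add two n-bit patterns bit by bit, rippling the carry from bit 0 upward."""
    result = 0
    carry = carry_in
    for i in range(n):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry


def ripple_sub(a, b, n=8):
    """a - b computed as a + (NOT b) + 1."""
    mask = (1 << n) - 1
    return ripple_add(a, b ^ mask, n, carry_in=1)


total, _ = ripple_add(0b00000111, 0b00000110)
diff, _ = ripple_sub(0b00000110, 0b00000111)
print(f"{total:08b} {diff:08b}")
```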
Logical Operations
Operation                   Operator   MIPS instruction
Shift left                  <<         sll
Shift right                 >>         srl
Shift right arithmetic                 sra
Bitwise and                 &          and, andi
Bitwise or                  |          or, ori
Bitwise complement (not)    ~          not (pseudo)
Exclusive or                ^          xor, xori
Representation of Shift Instruction
• Example
sll $t2, $s0, 8 # $t2 = $s0 << 8
$t2 = $10, $s0 = $16
op     rs    rt    rd    shamt func
000000 00000 10000 01010 01000 000000
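The field boundaries can be checked by decoding the bit pattern of the sll example. This is a minimal sketch; `decode_r` is a name introduced here, and the field widths are the R-format widths of 6, 5, 5, 5, 5, and 6 bits.

```python
def decode_r(word):
    """Split a 32-bit R-format instruction word into its six fields."""
    return {
        "op": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "func": word & 0x3F,
    }


# sll $t2, $s0, 8, written as the six fields from the slide
word = int("000000" "00000" "10000" "01010" "01000" "000000", 2)
fields = decode_r(word)
print(hex(word), fields)
```

The source register of a shift sits in rt (16, i.e. $s0), the destination in rd (10, i.e. $t2), and the shift distance in shamt.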
MIPS ALU
[Figure: 32-bit ALU built from 1-bit ALUs ALU0 to ALU31 chained through CarryIn/CarryOut. Inputs a0 to a31, b0 to b31, Bnegate, and Operation; outputs Result0 to Result31, Zero, and Overflow; the Set output of ALU31 drives the Less input of ALU0, and the other Less inputs are 0.]
ALU control lines    Function
000                  and
001                  or
010                  add
110                  subtract
111                  set on less than
[Figure: ALU symbol with inputs a, b, and ALU operation, and outputs Result, Zero, Overflow, and CarryOut.]
Distribution of Floating Point Numbers
• 3 bit mantissa
• exponent {-1,0,1}
e = -1                  e = 0                 e = 1
1.00 x 2^(-1) = 1/2     1.00 x 2^0 = 1        1.00 x 2^1 = 2
1.01 x 2^(-1) = 5/8     1.01 x 2^0 = 5/4      1.01 x 2^1 = 5/2
1.10 x 2^(-1) = 3/4     1.10 x 2^0 = 3/2      1.10 x 2^1 = 3
1.11 x 2^(-1) = 7/8     1.11 x 2^0 = 7/4      1.11 x 2^1 = 7/2
[Figure: these values marked on a number line with ticks at 0, 1, 2, 3.]
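Enumerating the toy format makes the uneven spacing visible: the gap between neighbours doubles with each exponent. The sketch below uses exact fractions; the function name and parameters are assumptions for the illustration.

```python
from fractions import Fraction


def toy_floats(mantissa_bits=3, exponents=(-1, 0, 1)):
    """All values 1.xx * 2^e with an implicit leading 1 and mantissa_bits - 1 fraction bits."""
    fraction_bits = mantissa_bits - 1
    values = []
    for e in exponents:
        for f in range(1 << fraction_bits):
            significand = 1 + Fraction(f, 1 << fraction_bits)
            values.append(significand * Fraction(2) ** e)
    return values


values = toy_floats()
gaps = [b - a for a, b in zip(values, values[1:])]
print([str(v) for v in values])
print([str(g) for g in gaps])
```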
Representation of Floating Point Numbers
• IEEE 754 single precision
Bit 31: Sign (S)
Bits 30-23: Biased exponent (E)
Bits 22-0: Normalized Mantissa (F, implicit 24th bit)
$(-1)^{S} \times F \times 2^{E-127}$
Exponent    Mantissa    Object Represented
0           0           0
0           non-zero    denormalized
1-254       anything    FP number
255         0           ± infinity
255         non-zero    NaN
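The single precision layout and the classification table can be explored with Python's standard `struct` module, which packs a float into its IEEE 754 bytes. The helper names below are hypothetical, and `value` handles only the normalized case from the formula above.

```python
import struct


def decode_single(x):
    """Return (sign, biased exponent, fraction) of x stored as IEEE 754 single precision."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def classify(exponent, fraction):
    """Apply the exponent/mantissa table for single precision."""
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "FP number"


def value(sign, exponent, fraction):
    """(-1)^S * F * 2^(E-127) for a normalized number, with F = 1.fraction."""
    return (-1) ** sign * (1 + fraction / 2**23) * 2.0 ** (exponent - 127)


s, e, f = decode_single(-0.75)
print(s, e, hex(f), classify(e, f), value(s, e, f))
```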
Representation of Floating Point Numbers
• IEEE 754 double precision
Bit 31: Sign (S)
Bits 30-20: Biased exponent (E)
Bits 19-0, continued in a second 32-bit word: Normalized Mantissa (F, implicit 53rd bit)
$(-1)^{S} \times F \times 2^{E-1023}$
Exponent    Mantissa    Object Represented
0           0           0
0           non-zero    denormalized
1-2046      anything    FP number
2047        0           ± infinity
2047        non-zero    NaN
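Floating Point Addition Hardware
[Figure: floating point adder. Each operand is split into Sign, Exponent, and Significand. A small ALU computes the exponent difference; control logic and multiplexors select the operands; a shift right unit aligns the smaller number; the big ALU adds the significands; a shift left or right unit with an exponent increment or decrement normalizes the sum; rounding hardware produces the final Sign, Exponent, and Significand.]
Steps: Compare exponents, Shift smaller number right, Add, Normalize, Round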
Single Cycle Datapath & Control
• Implementation of MIPS
• Simplified to contain only:
– memory-reference instructions: lw, sw
– arithmetic-logical instructions: add, sub, and, or, slt
– control flow instructions: beq, j
• Generic Implementation:
– use the program counter (PC) to supply instruction address
– get the instruction from memory
– read registers
– use the instruction to decide exactly what to do
Timing
• Clocks used in synchronous logic
– when should an element that contains state be updated?
• Edge-triggered timing
[Figure: clock waveform marking the cycle time, a rising edge, and a falling edge.]
Edge Triggered Timing
• State updated at clock edge
• read contents of some state elements,
• send values through some combinational logic
• write results to one or more state elements
[Figure: within one clock cycle, State element 1 feeds Combinational logic, which feeds State element 2.]
Components for Simple Implementation
• Functional Units needed for each instruction
[Figure: instruction memory (Instruction address in, Instruction out), program counter (PC), and adder (Add, Sum).]
[Figure: data memory unit (Address, Write data, Read data, MemRead, MemWrite) and sign-extension unit (16 bits to 32 bits).]
[Figure: registers (Read register 1, Read register 2, and Write register as 5-bit register numbers; Write data; Read data 1; Read data 2; RegWrite) and ALU (3-bit ALU control, Zero, ALU result).]
Datapath with Control
[Figure: single cycle datapath with its control unit. The PC addresses the instruction memory, and an adder computes PC + 4. Instruction [31-26] feeds Control, which produces RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. Instruction [25-21] and [20-16] select the read registers; a multiplexor chooses Instruction [20-16] or [15-11] as the write register. Instruction [15-0] is sign extended from 16 to 32 bits and, shifted left 2, added to PC + 4 for the branch target. Instruction [5-0] feeds ALU control. The ALU (ALU result, Zero) addresses the data memory, and multiplexors select the second ALU operand, the register write data, and the next PC.]
Multicycle Implementation of MIPS
• Break up the instructions into steps, each step takes a cycle
– balance the amount of work to be done
– restrict each cycle to use only one major functional unit
– Functional units: memory, register file, and ALU
• At the end of a cycle
– Use internal registers to store results between steps
[Figure: multicycle datapath. A single Memory (Address, Write data, MemData) serves instructions and data. Internal registers: Instruction register, Memory data register, A, B, and ALUOut. Multiplexors select the memory address (PC or ALUOut), the write register (Instruction [20-16] or [15-11]), the register write data, the first ALU operand (PC or A), and the second ALU operand (B, 4, the sign-extended Instruction [15-0], or that value shifted left 2).]
Execution Steps
1 Instruction fetch
– IR = Memory[PC];
– PC = PC + 4;
2 Instruction decode and register fetch
– A = Reg[IR[25-21]];
– B = Reg[IR[20-16]];
– ALUOut = PC + (sign-extend (IR[15-0]) << 2);
3 Execution, memory address computation, or branch completion
– Memory reference
• ALUOut = A + sign-extend (IR[15-0]);
– Arithmetic-logical instruction (R-type)
• ALUOut = A op B;
– Branch
• if (A == B) PC = ALUOut;
– Jump
• PC = PC [31-28] || (IR[25-0]<<2)
4 Memory access or R-type instruction completion
– Memory reference
• MDR = Memory [ALUOut];
– or
• Memory [ALUOut] = B;
– R-type completion
• Reg [IR[15-11]] = ALUOut;
5 Memory read completion
– Reg [IR[20-16]] = MDR;
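Because instructions finish at different steps, their cycle counts differ. The sketch below tabulates the steps each instruction class passes through and computes an average CPI; the dictionary layout, function names, and the instruction mix are hypothetical values chosen only to show the arithmetic.

```python
# Steps each instruction class goes through, following the five execution steps.
STEPS = {
    "lw": ["fetch", "decode", "address computation", "memory access", "memory read completion"],
    "sw": ["fetch", "decode", "address computation", "memory access"],
    "r-type": ["fetch", "decode", "execution", "r-type completion"],
    "beq": ["fetch", "decode", "branch completion"],
    "j": ["fetch", "decode", "jump completion"],
}


def cycles(instruction_class):
    """Number of clock cycles for one instruction of the given class."""
    return len(STEPS[instruction_class])


def average_cpi(mix):
    """Weighted average cycles per instruction; mix maps class -> fraction of instructions."""
    return sum(fraction * cycles(name) for name, fraction in mix.items())


# Hypothetical instruction mix, for illustration only.
mix = {"lw": 0.25, "sw": 0.10, "r-type": 0.50, "beq": 0.10, "j": 0.05}
print(average_cpi(mix))
```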
Finite State Machine
• Set of states
• Next-state function determined by input and current state
• Output determined by current state and possibly input
• Moore machine (output determined only by current state)
[Figure: the current state (updated by the clock) and the inputs feed a next-state function, which produces the next state, and an output function, which produces the outputs.]
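A Moore machine can be sketched as a next-state function plus an output table indexed by state alone. The example below models the multicycle control for lw, sw, R-type, beq, and j without exceptions; the state names and the choice to list only the asserted one-bit signals (multi-bit fields such as ALUOp are left out) are simplifications assumed for this illustration.

```python
# Output function: depends only on the current state (Moore machine).
OUTPUTS = {
    "fetch": {"MemRead", "IRWrite", "PCWrite"},
    "decode": set(),
    "mem_addr": {"ALUSrcA"},
    "mem_read": {"MemRead", "IorD"},
    "write_back": {"RegWrite", "MemtoReg"},
    "mem_write": {"MemWrite", "IorD"},
    "execute": {"ALUSrcA"},
    "r_complete": {"RegDst", "RegWrite"},
    "branch": {"ALUSrcA", "PCWriteCond"},
    "jump": {"PCWrite"},
}


def next_state(state, op):
    """Next-state function: depends on the current state and the opcode input."""
    if state == "fetch":
        return "decode"
    if state == "decode":
        return {"lw": "mem_addr", "sw": "mem_addr", "r-type": "execute",
                "beq": "branch", "j": "jump"}[op]
    if state == "mem_addr":
        return "mem_read" if op == "lw" else "mem_write"
    if state == "mem_read":
        return "write_back"
    if state == "execute":
        return "r_complete"
    return "fetch"   # every completion state returns to instruction fetch


def trace(op):
    """States visited while executing one instruction with opcode class op."""
    state, visited = "fetch", []
    while True:
        visited.append(state)
        state = next_state(state, op)
        if state == "fetch":
            return visited


for op in ("lw", "sw", "r-type", "beq", "j"):
    print(op, [(s, sorted(OUTPUTS[s])) for s in trace(op)])
```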
Multicycle Datapath with Exception Support
[Figure: the multicycle datapath (PC, Memory, Instruction register, Memory data register, Registers, A, B, ALU, ALUOut, sign extend, shift left 2) extended for exceptions with an EPC register and a Cause register. The control unit takes Op[5-0] from Instruction [31-26] and outputs PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst, EPCWrite, IntCause, and CauseWrite. The PC source multiplexor chooses among the ALU result (0), ALUOut (1), the jump address (2) formed from PC [31-28] and Instruction [25-0] shifted left 2, and a constant shown as C0 00 00 00 (3).]
Multicycle Control with Exceptions
State 0, Instruction fetch (Start): MemRead, ALUSrcA = 0, IorD = 0, IRWrite, ALUSrcB = 01, ALUOp = 00, PCWrite, PCSource = 00; next state 1
State 1, Instruction decode/Register fetch: ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00; next state 2 if (Op = 'LW') or (Op = 'SW'), 6 if (Op = R-type), 8 if (Op = 'BEQ'), 9 if (Op = 'J'), 10 if (Op = other)
State 2, Memory address computation: ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00; next state 3 if (Op = 'LW'), 5 if (Op = 'SW')
State 3, Memory access: MemRead, IorD = 1; next state 4
State 4, Write-back step: RegWrite, MemtoReg = 1, RegDst = 0; next state 0
State 5, Memory access: MemWrite, IorD = 1; next state 0
State 6, Execution: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10; next state 7
State 7, R-type completion: RegDst = 1, RegWrite, MemtoReg = 0; next state 11 on Overflow, otherwise 0
State 8, Branch completion: ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCWriteCond, PCSource = 01; next state 0
State 9, Jump completion: PCWrite, PCSource = 10; next state 0
State 10 (Op = other): IntCause = 0, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11; next state 0
State 11 (Overflow): IntCause = 1, CauseWrite, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 01, EPCWrite, PCWrite, PCSource = 11; next state 0
Implementation
[Figure: microprogrammed control unit. A microprogram counter addresses the microcode memory, whose outputs drive the datapath control signals (PCWrite, PCWriteCond, IorD, MemRead, MemWrite, IRWrite, BWrite, MemtoReg, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, RegDst) and AddrCtl. An adder increments the microprogram counter by 1, and address select logic chooses the next address using Op[5-0] from the instruction register opcode field.]