ECE 232 L8.Arithm.1 Adapted from Patterson 97 ©UCB Copyright 1998 Morgan Kaufmann Publishers
ECE 232
Hardware Organization and Design
Lecture 8
Computer Arithmetic: ALU, Adders
Maciej Ciesielski
www.ecs.umass.edu/ece/labs/vlsicad/ece232/spr2002/index_232.html
Outline
° Number representation
  • Signed and unsigned numbers
  • Comparisons, sign extensions
  • Overflow exception, detection
° Computer arithmetic
  • ALU
  • Adders
  • Overflow detection
° Speed: Carry-Look-Ahead adder
Sign and magnitude
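4-bit encodings (from the number wheel):

Binary  Value    Binary  Value
0000    +0       1000    -0
0001    +1       1001    -1
0010    +2       1010    -2
0011    +3       1011    -3
0100    +4       1100    -4
0101    +5       1101    -5
0110    +6       1110    -6
0111    +7       1111    -7

Sign bit: 0 = positive, 1 = negative
Bit layout: bit n-1 is the sign bit; bits n-2 ... 0 hold the (n-1)-bit integer magnitude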
• Problem: two zeros (+0, –0)
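The two-zeros problem can be checked mechanically. The sketch below is illustrative only; the helper names `sign_magnitude_value` and `zero_patterns` are made up for this example and do not come from the lecture.

```python
def sign_magnitude_value(bits: str) -> int:
    """Decode a sign-and-magnitude bit string: first bit is the sign, the rest is the magnitude."""
    magnitude = int(bits[1:], 2)
    return -magnitude if bits[0] == "1" else magnitude


def zero_patterns(n: int = 4) -> list[str]:
    """Return every n-bit pattern whose sign-and-magnitude value is zero."""
    patterns = [format(i, f"0{n}b") for i in range(2 ** n)]
    return [p for p in patterns if sign_magnitude_value(p) == 0]


if __name__ == "__main__":
    print(zero_patterns())  # two distinct patterns decode to zero
```

Because two patterns encode zero, an equality test in hardware would have to treat 0000 and 1000 as the same number.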
Two’s complement representation
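4-bit encodings (from the number wheel):

Binary  Value    Binary  Value
0000    0        1000    -8
0001    1        1001    -7
0010    2        1010    -6
0011    3        1011    -5
0100    4        1100    -4
0101    5        1101    -3
0110    6        1110    -2
0111    7        1111    -1

Sign bit: 0 = positive, 1 = negative
Bit layout: bit n-1 is the sign bit; bits n-2 ... 0 are the remaining bits
Formulas:
  Negation: $-X_{two} = 2^n - X$
  Value: $X_{two} = -2^{n-1} x_{n-1} + 2^{n-2} x_{n-2} + \dots + 2 x_1 + x_0$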
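As a quick check of the two formulas, here is a minimal sketch that evaluates the weighted sum (the MSB carries weight -2^(n-1)) and forms the negation as 2^n - X. The function names are hypothetical and chosen for this example.

```python
def twos_complement_value(bits: str) -> int:
    """Evaluate -2^(n-1)*x[n-1] + 2^(n-2)*x[n-2] + ... + 2*x[1] + x[0]."""
    n = len(bits)
    x = [int(b) for b in reversed(bits)]  # x[i] is bit i, with x[0] the LSB
    return -(2 ** (n - 1)) * x[n - 1] + sum((2 ** i) * x[i] for i in range(n - 1))


def negate(bits: str) -> str:
    """Negate an n-bit two's complement pattern as 2^n - X, kept to n bits."""
    n = len(bits)
    return format((2 ** n - int(bits, 2)) % (2 ** n), f"0{n}b")


if __name__ == "__main__":
    for pattern in ("0111", "1000", "1111"):
        print(pattern, twos_complement_value(pattern))
    print(negate("0101"))
```

Note that negating 1000 (-8) gives 1000 again, since +8 does not fit in 4 bits.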
Signed vs. Unsigned Comparison
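° Instructions: slt (set on less than, signed) and sltu (set on less than unsigned, which treats the sign bit as an ordinary magnitude bit)
R1 = 0…00 0000 0000 0000 0001 = 1 (two's complement)
R2 = 0…00 0000 0000 0000 0010 = 2 (two's complement)
R3 = 1…11 1111 1111 1111 1111 = -1 (two's complement)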
° After executing these instructions:
slt r4,r2,r1 ; if (r2 < r1) r4=1; else r4=0
slt r5,r3,r1 ; if (r3 < r1) r5=1; else r5=0
sltu r6,r2,r1 ; if (r2 < r1) r6=1; else r6=0
sltu r7,r3,r1 ; if (r3 < r1) r7=1; else r7=0
° What are values of registers r4 - r7? Why?
r4 = ; r5 = ; r6 = ; r7 = ;
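One way to check an answer to the exercise is to model the two comparisons in software. This is a sketch under the assumption of 32-bit registers; `slt`, `sltu`, and `to_signed32` are illustrative Python helpers, not MIPS tooling.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(rs: int, rt: int) -> int:
    """Signed comparison: 1 if rs < rt as two's complement values."""
    return 1 if to_signed32(rs) < to_signed32(rt) else 0


def sltu(rs: int, rt: int) -> int:
    """Unsigned comparison: 1 if rs < rt as plain 32-bit magnitudes."""
    return 1 if (rs & MASK32) < (rt & MASK32) else 0


r1, r2, r3 = 0x00000001, 0x00000002, 0xFFFFFFFF
r4 = slt(r2, r1)
r5 = slt(r3, r1)
r6 = sltu(r2, r1)
r7 = sltu(r3, r1)

if __name__ == "__main__":
    print(r4, r5, r6, r7)
```

Only the comparison involving r3 differs between the two instructions, because the all-ones pattern is -1 when signed but the largest 32-bit value when unsigned.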
Sign Extension
° Extend (left): why do we need it? *
° Extend the MS-bit all the way to the left
° Example:
  4 bits:   5 (decimal) = 0101
  16 bits:  5 (decimal) = 0000000000000101
  4 bits:  -5 (decimal) = 1011
  16 bits: -5 (decimal) = 1111111111111011
* To fill a 32-bit register with a shorter constant
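The rule "copy the MS-bit to the left" is short enough to sketch directly. The function name `sign_extend` is an assumption made for this example.

```python
def sign_extend(bits: str, width: int) -> str:
    """Widen a two's complement bit string by replicating its most significant bit."""
    if width < len(bits):
        raise ValueError("target width is narrower than the input")
    return bits[0] * (width - len(bits)) + bits


if __name__ == "__main__":
    print(sign_extend("0101", 16))  # +5 stays +5
    print(sign_extend("1011", 16))  # -5 stays -5
```

Padding with zeros instead would turn the 4-bit pattern for -5 into +11, which is why the sign bit has to be replicated.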
Data Path Diagram
[Figure: datapath block diagram. Labeled blocks: Program Counter (PC), Instruction Register, Register File (ports Rs, Rt, Rd), ALU, Cache Memory (Address, Data In, Out), Control Logic, and a constant 4.]
Design Process
• Design finishes as assembly
  - Design understood in terms of components and how they have been assembled
  - Top-down decomposition of complex functions (behaviors) into more primitive functions
  - Bottom-up composition of primitive building blocks into more complex assemblies
Hierarchy (top to bottom):
  CPU
  Datapath, Control
  ALU, Regs, Shifter
  Gates
Design is a "creative process," not a simple method
MIPS ALU requirements
° ALU operations:
  • Arithmetic: add, addu, sub, subu, addi, addiu
    2’s complement adder/subtractor with overflow detection
  • Logical: AND, ANDI, OR, ORI, XOR, XORI, NOR
    Logical AND, logical OR, XOR, NOR
  • Decision: slti, sltiu (set less than), beq, bne
    2’s complement adder with inverter, check sign bit of result
° ALU from textbook Chapter 4 supports these OPs
MIPS arithmetic instruction format
° Signed arithmetic generates overflow, no carry
Bit positions:  31    25    20    15         5      0
R-type:         op    Rs    Rt    Rd         funct
I-type:         op    Rs    Rt    Immed 16
Type op funct
ADDI 10 xx
ADDIU 11 xx
SLTI 12 xx
SLTIU 13 xx
ANDI 14 xx
ORI 15 xx
XORI 16 xx
LUI 17 xx
Type op funct
ADD 00 40
ADDU 00 41
SUB 00 42
SUBU 00 43
AND 00 44
OR 00 45
XOR 00 46
NOR 00 47
Type op funct
00 50
00 51
SLT 00 52
SLTU 00 53
Refined Requirements
• Functional Specification
  inputs: two 32-bit operands A, B; 4-bit mode M
  outputs: 32-bit result S; 1-bit carry c; 1-bit overflow
  operations: add, addu, sub, subu, and, or, xor, nor, slt, sltu
• Block Diagram (symbol)
[Figure: ALU symbol with 32-bit inputs A and B, 4-bit mode input M, 32-bit output S, and 1-bit outputs c and overflow.]
Refined Diagram: bit-slice ALU
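[Figure: the 32-bit ALU (32-bit inputs A and B, 4-bit mode M, 32-bit output S, Overflow output) drawn as a chain of 1-bit slices. The slice for bit 0 takes a0, b0, m, and cin and produces s0 and co; the slice for bit 31 takes a31, b31, m, and cin and produces s31 and co.]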
ALU – bit slice design
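° Basic ALU functions
  • Add, AND, OR
[Figure: 1-bit ALU slice. Inputs A and B feed a 1-bit full adder (with Cin and Cout), an AND gate, and an OR gate; a MUX controlled by S-select picks ADD, AND, or OR as Result.]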
ALU - Additional operations
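° Subtract: A - B = A + (-B)
  • form two’s complement by invert and +1 (Cin)
° Set-less-than? Left as an exercise (see text, 4.5)
[Figure: the 1-bit ALU slice with an added "invert" control on the B input. The full adder (Cin, Cout), AND, and OR outputs feed a MUX that selects add, and, or via S-select to produce Result.]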
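The slice with the invert control can be modeled in a few lines. This is a sketch, not the textbook design: the function names, the string-valued `select`, and the 4-bit default width are all assumptions for illustration. Subtraction inverts every B bit and feeds 1 into the carry-in of slice 0, which is the "invert and +1" step.

```python
def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a | b))
    return s, cout


def alu_slice(a: int, b: int, cin: int, invert: int, select: str) -> tuple[int, int]:
    """One bit slice: optional inversion of b, then a mux over add / and / or."""
    b_eff = b ^ invert
    total, cout = full_adder(a, b_eff, cin)
    result = {"add": total, "and": a & b_eff, "or": a | b_eff}[select]
    return result, cout


def alu(a: int, b: int, op: str, width: int = 4) -> int:
    """Chain `width` slices; op is one of "add", "sub", "and", "or"."""
    invert = 1 if op == "sub" else 0
    select = "add" if op in ("add", "sub") else op
    carry = invert  # the +1 of two's complement enters as Cin of slice 0
    result = 0
    for i in range(width):
        bit, carry = alu_slice((a >> i) & 1, (b >> i) & 1, carry, invert, select)
        result |= bit << i
    return result


if __name__ == "__main__":
    print(format(alu(0b0111, 0b0011, "sub"), "04b"))
    print(format(alu(0b0011, 0b0101, "sub"), "04b"))
```

The same adder serves both add and sub; only the invert line and the first carry-in change.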
Revised Diagram
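° LSB and MSB need to do a little extra
  • less than, set, overflow, zero (see Fig. 4.19 in text)
[Figure: the bit-slice ALU (32-bit A and B, 4-bit M, 32-bit S, Overflow) with a combinational logic (C/L) block that turns M into the select, comp, and c-in controls for the slices; the logic that produces Overflow is marked "?".]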
Overflow
° Examples: 7 + 3 = 10 but ...
            -4 - 5 = -9 but ...

Decimal  Binary    Decimal  2’s Complement
0        0000      0        0000
1        0001      -1       1111
2        0010      -2       1110
3        0011      -3       1101
4        0100      -4       1100
5        0101      -5       1011
6        0110      -6       1010
7        0111      -7       1001
                   -8       1000

Worked 4-bit additions:

    0 1 1 1   = 7
  + 0 0 1 1   = 3
    1 0 1 0   = -6   (carry into MSB = 1, carry out of MSB = 0)

    1 1 0 0   = -4
  + 1 0 1 1   = -5
    0 1 1 1   = 7    (carry into MSB = 0, carry out of MSB = 1)
Overflow Detection
° Overflow: the result is too large (or too small) to represent properly
  • Example: -8 <= 4-bit binary number <= 7
° When adding operands with different signs, overflow cannot occur!
° Overflow occurs when adding:
  • 2 positive numbers and the sum is negative
  • 2 negative numbers and the sum is positive
° On your own: Prove you can detect overflow by:
  • Carry into MSB ≠ carry out of MSB

    0 1 1 1   = 7
  + 0 0 1 1   = 3
    1 0 1 0   = -6   (carry into MSB = 1, carry out of MSB = 0)

    1 1 0 0   = -4
  + 1 0 1 1   = -5
    0 1 1 1   = 7    (carry into MSB = 0, carry out of MSB = 1)
Overflow Detection Logic
° Carry into MSB ≠ Carry out of MSB
  • For an N-bit ALU: overflow = Cin[N - 1] XOR Cout[N - 1]
[Figure: four 1-bit ALU slices (inputs A0..A3 and B0..B3, outputs Result0..Result3) chained through their carries; an XOR gate on Cin3 and Cout3 produces Overflow.]
X Y X XOR Y
0 0 0
0 1 1
1 0 1
1 1 0
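The rule overflow = Cin[N - 1] XOR Cout[N - 1] can be tested exhaustively for a small width. The sketch below is not a proof; it compares the carry-based flag against a direct range check for every pair of 4-bit operands. All function names are assumptions for this example.

```python
def add_with_carries(a: int, b: int, width: int = 4) -> tuple[int, int, int]:
    """Ripple-add two width-bit patterns; return (sum, carry into MSB, carry out of MSB)."""
    carry = 0
    cin_msb = 0
    result = 0
    for i in range(width):
        if i == width - 1:
            cin_msb = carry
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai | bi))
    return result, cin_msb, carry


def to_signed(x: int, width: int = 4) -> int:
    """Interpret a width-bit pattern as a two's complement integer."""
    return x - (1 << width) if x & (1 << (width - 1)) else x


def overflow_by_carries(a: int, b: int, width: int = 4) -> int:
    _, cin_msb, cout_msb = add_with_carries(a, b, width)
    return cin_msb ^ cout_msb


def overflow_by_range(a: int, b: int, width: int = 4) -> int:
    true_sum = to_signed(a, width) + to_signed(b, width)
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    return 0 if lo <= true_sum <= hi else 1


def rule_holds(width: int = 4) -> bool:
    """True if both overflow tests agree on every operand pair."""
    return all(
        overflow_by_carries(a, b, width) == overflow_by_range(a, b, width)
        for a in range(1 << width)
        for b in range(1 << width)
    )


if __name__ == "__main__":
    print(rule_holds(4))
```

The two worked examples, 7 + 3 and -4 + (-5), are among the pairs the check covers.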
More Revised Diagram
° LSB and MSB need to do a little extra
[Figure: the bit-slice ALU (32-bit A and B, 4-bit M, 32-bit S). A C/L block turns M into the select, comp, and c-in controls; for signed arithmetic, Overflow is Cin xor Cout of the slice for bit 31.]
What about Performance?
° Critical path of n-bit Ripple-Carry (RC) adder is n•CP, where CP is the carry propagate delay of one 1-bit stage
[Figure: four 1-bit ALU slices (inputs A0..A3 and B0..B3, outputs Result0..Result3) with the carry rippling from Cin0 through Cout0/Cin1, Cout1/Cin2, and Cout2/Cin3 to Cout3.]
Design trick: throw hardware at it
Carry Look Ahead adder - Principle
° Examine the Full Adder table

a  b  Cin  Cout  S
0  0  0    0     0
0  0  1    0     1
0  1  0    0     1
0  1  1    1     0
1  0  0    0     1
1  0  1    1     0
1  1  0    1     0
1  1  1    1     1

$C_{out} = a \cdot b + C_{in} \cdot (a + b)$
$S = a'b'c + a'bc' + ab'c' + abc = a \oplus b \oplus c$, with $c = C_{in}$
[Figure: full adder block with inputs a, b, Cin and outputs Cout, S.]
In general, for bit i: $c_{i+1} = a_i b_i + c_i (a_i + b_i)$
where $c_{i+1}$ = Cout and $c_i$ = Cin
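The two full adder equations can be checked against the truth table row by row. A small sketch, with names chosen for this example:

```python
from itertools import product

# (a, b, Cin) -> (Cout, S), copied from the full adder table
FULL_ADDER_TABLE = {
    (0, 0, 0): (0, 0),
    (0, 0, 1): (0, 1),
    (0, 1, 0): (0, 1),
    (0, 1, 1): (1, 0),
    (1, 0, 0): (0, 1),
    (1, 0, 1): (1, 0),
    (1, 1, 0): (1, 0),
    (1, 1, 1): (1, 1),
}


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Cout = a.b + Cin.(a + b); S = a xor b xor Cin. Returns (Cout, S)."""
    cout = (a & b) | (cin & (a | b))
    s = a ^ b ^ cin
    return cout, s


def sum_of_products(a: int, b: int, c: int) -> int:
    """S written as a'b'c + a'bc' + ab'c' + abc."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return (na & nb & c) | (na & b & nc) | (a & nb & nc) | (a & b & c)


def equations_match_table() -> bool:
    return all(
        full_adder(*row) == FULL_ADDER_TABLE[row] and sum_of_products(*row) == FULL_ADDER_TABLE[row][1]
        for row in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print(equations_match_table())
```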
Designing Carry Look Ahead adder
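° Compute carry-out $c_i$ in terms of the primary inputs:
$c_{i+2} = a_{i+1} b_{i+1} + c_{i+1} (a_{i+1} + b_{i+1})$
$= a_{i+1} b_{i+1} + (c_i (a_i + b_i) + a_i b_i)(a_{i+1} + b_{i+1})$
[Figure: two cascaded full adders. FA_i takes a_i, b_i, c_i and produces S_i and c_{i+1}; FA_{i+1} takes a_{i+1}, b_{i+1}, c_{i+1} and produces S_{i+1} and c_{i+2}.]
° Create auxiliary functions:
Generate: $g_i = a_i b_i$ and Propagate: $p_i = a_i + b_i$
$c_1 = a_0 b_0 + c_0 (a_0 + b_0) = g_0 + p_0 c_0$
$c_2 = a_1 b_1 + (a_1 + b_1)(a_0 b_0 + c_0 (a_0 + b_0)) = g_1 + p_1 g_0 + p_1 p_0 c_0$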
Carry Look Ahead (CLA) adder
A  B  C-out
0  0  0      “kill”
0  1  C-in   “propagate”
1  0  C-in   “propagate”
1  1  1      “generate”

G = A and B
P = A xor B

C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .

[Figure: four 1-bit adder cells; cell i takes Ai, Bi, and a carry and produces Si, Gi, Pi (i = 0..3). The lookahead logic takes Cin and also outputs block-level G and P.]
These can be used in a hybrid 4x4-bit adder
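The lookahead equations for C1 to C3 can be exercised in a short sketch. It assumes the per-bit signals implied by the kill/propagate/generate table (generate when both inputs are 1, propagate when exactly one is), and forms each sum bit as P xor carry-in. The name `cla4` is hypothetical, and C4 is left out because the slide elides it.

```python
def cla4(a: int, b: int, c0: int = 0) -> tuple[int, list[int]]:
    """4-bit carry look-ahead add. Returns (4-bit sum, [C1, C2, C3])."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]  # generate: both inputs are 1
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate: exactly one input is 1

    # Each carry depends only on g, p, and c0, never on another computed carry.
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])

    carries_in = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (p[i] ^ carries_in[i]) << i  # sum bit = a xor b xor carry-in
    return total, [c1, c2, c3]


if __name__ == "__main__":
    s, carries = cla4(0b0111, 0b0011)
    print(format(s, "04b"), carries)
```

Since every carry is a two-level function of the inputs, the carries are available after a fixed number of gate delays instead of rippling through the slices.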
Plumbing as Carry Look Ahead analogy
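[Figure: plumbing analogy for c1, c2, and c4. Each carry is drawn as a pipe fed by c0 and the generate signals (g0 for c1; g0, g1 for c2; g0..g3 for c4) and gated by the matching propagate signals p0..p3.]
c1 = g0 + c0 p0
c2 = g1 + g0 p1 + c0 p0 p1
c4 = g3 + g2 p3 + g1 p2 p3 + g0 p1 p2 p3 + c0 p0 p1 p2 p3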
Cascaded Carry Look Ahead (16-bit): Abstraction
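[Figure: 4-bit adder blocks, each exporting group signals G and P (G0, P0 for the first block), connected to a CLA unit that takes C0 and computes the carries into the blocks.]
C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .
• Carries are generated by CLA, not RC adder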
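A two-level arrangement can be sketched as follows. Several things here are assumptions rather than content from the slides: the group signals are taken to be G = g3 + g2 p3 + g1 p2 p3 + g0 p1 p2 p3 and P = p0 p1 p2 p3 with p = a OR b, the expression for C4 simply continues the pattern of C1 to C3, and plain integer addition stands in for the internals of each 4-bit block. The names `block_gp` and `cla16` are hypothetical.

```python
def block_gp(a: int, b: int) -> tuple[int, int]:
    """Group generate and propagate of one 4-bit block (assumed definitions)."""
    g = [((a >> i) & (b >> i)) & 1 for i in range(4)]
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]
    group_g = g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3]) | (g[0] & p[1] & p[2] & p[3])
    group_p = p[0] & p[1] & p[2] & p[3]
    return group_g, group_p


def cla16(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """16-bit add from four 4-bit blocks plus second-level lookahead. Returns (sum, C4)."""
    blocks = [((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF) for k in range(4)]
    G, P = zip(*(block_gp(x, y) for x, y in blocks))

    # Second-level lookahead: carries into blocks 1..3 and out of block 3.
    c1 = G[0] | (c0 & P[0])
    c2 = G[1] | (G[0] & P[1]) | (c0 & P[0] & P[1])
    c3 = G[2] | (G[1] & P[2]) | (G[0] & P[1] & P[2]) | (c0 & P[0] & P[1] & P[2])
    c4 = (G[3] | (G[2] & P[3]) | (G[1] & P[2] & P[3]) | (G[0] & P[1] & P[2] & P[3])
          | (c0 & P[0] & P[1] & P[2] & P[3]))

    total = 0
    for k, ((x, y), cin) in enumerate(zip(blocks, (c0, c1, c2, c3))):
        total |= ((x + y + cin) & 0xF) << (4 * k)  # stand-in for the 4-bit adder block
    return total, c4


if __name__ == "__main__":
    print(cla16(0x0FFF, 0x0001))
```

No block waits for its neighbor's carry-out; each receives its carry-in from the lookahead unit, which is the point of the slide's last bullet.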
2nd level Carry, Propagate as Plumbing
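[Figure: plumbing analogy for the second level. The per-bit signals p0..p3 and g0..g3 combine into the group generate G0, and the propagate signals in series form the group propagate P0.]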
Carry Select Adder (CSA)
° Design trick: guess
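Two n-bit adders in series: carry propagate delay CP(2n) = 2*CP(n)
Carry-select adder: the low n-bit adder runs in parallel with two upper n-bit adders, one with carry-in 1 and one with carry-in 0; a mux driven by the low adder's Cout selects the upper result
CP(2n) = CP(n) + CP(mux)
Compute both, select one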
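The "compute both, select one" idea can be sketched like this. It is an illustration under assumptions: `add_n` stands in for any n-bit adder, and the function names are made up for this example.

```python
def add_n(a: int, b: int, cin: int, n: int) -> tuple[int, int]:
    """Stand-in for an n-bit adder: returns (n-bit sum, carry out)."""
    total = a + b + cin
    return total & ((1 << n) - 1), total >> n


def carry_select_add(a: int, b: int, n: int) -> tuple[int, int]:
    """Add two 2n-bit numbers using three n-bit adders and a mux."""
    mask = (1 << n) - 1
    a_lo, b_lo = a & mask, b & mask
    a_hi, b_hi = (a >> n) & mask, (b >> n) & mask

    # In hardware these three additions run at the same time.
    lo_sum, lo_cout = add_n(a_lo, b_lo, 0, n)
    hi_if_0 = add_n(a_hi, b_hi, 0, n)  # guess: carry into the upper half is 0
    hi_if_1 = add_n(a_hi, b_hi, 1, n)  # guess: carry into the upper half is 1

    # The mux picks whichever guess matches the real carry.
    hi_sum, cout = hi_if_1 if lo_cout else hi_if_0
    return (hi_sum << n) | lo_sum, cout


if __name__ == "__main__":
    print(carry_select_add(0x7F, 0x01, 4))
```

The cost is one extra n-bit adder and a mux; the gain is that the upper half no longer waits for the lower half's carry to ripple in.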
Carry Skip Adder: reduce worst case delay
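[Figure: two cascaded 4-bit ripple adders (inputs A and B, output S), starting at bits 0 and 4. Each block forms P0 P1 P2 P3; when all four bits propagate, the incoming carry skips around the block.]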
Exercise: optimal design uses variable block sizes
Just speed up the slowest case for each block
Additional MIPS ALU requirements
° Multiply: mult, multu (next lecture), divide: div, divu (?)
  need 32-bit multiply and divide, signed and unsigned
° Shift: sll, srl, sra (next lecture)
  need left shift, right shift, right shift arithmetic by 0 to 31 bits
° NOR (leave as exercise to reader)
  logical NOR or use 2 steps: (A OR B) XOR 1111....1111
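The two-step NOR is easy to confirm in software. A minimal sketch for 32-bit operands; the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def nor_two_step(a: int, b: int) -> int:
    """NOR built from OR followed by XOR with all ones, limited to 32 bits."""
    step1 = (a | b) & MASK32
    return step1 ^ MASK32


if __name__ == "__main__":
    print(hex(nor_two_step(0x0000FFFF, 0x00FF0000)))
```

XOR with all ones flips every bit, so it acts as the inverter that turns OR into NOR.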
Elements of the Design Process
° Divide and Conquer (e.g., ALU)
  • Formulate a solution in terms of simpler components.
  • Design each of the components (subproblems)
° Generate and Test (e.g., ALU)
  • Given a collection of building blocks, look for ways of putting them together that meet the requirements
° Successive Refinement (e.g., carry lookahead)
  • Solve "most" of the problem (i.e., ignore some constraints or special cases), examine and correct shortcomings.
° Formulate High-Level Alternatives (e.g., carry select)
  • Articulate many strategies to "keep in mind" while pursuing any one approach.
° Work on the Things you Know How to Do
  • The unknown will become “obvious” as you make progress.
Summary of the Design Process
Hierarchical Design to manage complexity
Top Down vs. Bottom Up vs. Successive Refinement
Importance of Design Representations:
Block Diagrams
Decomposition into Bit Slices
Truth Tables, K-Maps
Circuit Diagrams
Other Descriptions: state diagrams, timing diagrams, reg xfer, . . .
Optimization Criteria:
Gate Count
[Package Count]
Logic Levels
Fan-in/Fan-out
Power
Area
Delay
Cost
Design time
Pin Out
[Figure annotations: "top down" and "bottom up" arrows; note "mux design meets at TT".]
Lecture Summary
° Computer arithmetic
  • Number systems
  • Consequences for computer organization
  • Example: overflow detection
° An Overview of the Design Process
  • Design is an iterative process, multiple approaches to get started
  • Do NOT wait until you know everything before you start
  • Example: Instruction Set drives the ALU design