Post on 27-Mar-2022
CS365 1
Arithmetic and Logic Unit
CS 365 Lecture 5
Prof. Yih Huang

Inside a Processor
[Figure: processor block diagram with Data Cache, Instruction Cache, (Internal) Bus, Integer Arithmetic Circuits, Floating Point Arithmetic Circuits, Branch Control Logic, and Registers]

Arithmetic and Logic Unit (ALU)
• The part of a processor circuit that actually gets the computations done.
[Figure: ALU block with two 32-bit inputs a and b, an operation control input, and a 32-bit result output]

Numbers
• Bits are just bits (no inherent meaning)
• Binary numbers (base 2) ⇒ decimal: 0 ... 2^n - 1
• ASCII codes
• Of course it gets more complicated:
– numbers are finite (overflow)
– fractions and real numbers
– negative numbers
• How do we represent negative numbers?

Possible Representations
Sign Magnitude    One's Complement    Two's Complement
000 = +0          000 = +0            000 = +0
001 = +1          001 = +1            001 = +1
010 = +2          010 = +2            010 = +2
011 = +3          011 = +3            011 = +3
100 = -0          100 = -3            100 = -4
101 = -1          101 = -2            101 = -3
110 = -2          110 = -1            110 = -2
111 = -3          111 = -0            111 = -1
• Most modern architectures use two's complement.
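
The three columns of the table can be reproduced with a short sketch. The function name `decode` and the dictionary keys are illustrative assumptions, not part of the slides; note that Python has no negative zero, so -0 shows up as 0.

```python
def decode(bits: str) -> dict:
    """Interpret a bit string under the three signed representations."""
    n = len(bits)
    raw = int(bits, 2)                       # unsigned value of the pattern
    sign = raw >> (n - 1)                    # most significant bit
    magnitude = raw & ((1 << (n - 1)) - 1)   # remaining n-1 bits
    return {
        # sign bit plus magnitude
        "sign_magnitude": -magnitude if sign else magnitude,
        # a negative value is the bitwise inverse of its magnitude
        "ones_complement": raw - ((1 << n) - 1) if sign else raw,
        # the top bit carries weight -2^(n-1)
        "twos_complement": raw - (1 << n) if sign else raw,
    }


for pattern in ("011", "100", "111"):
    print(pattern, decode(pattern))
```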

Two’s Complement Numbers
• 0010 =
• 1010 =
• -10 in 8-bit two’s complement =
Bit weights:
X3      X2     X1     X0
-2^3    2^2    2^1    2^0
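
A minimal sketch of the weighting rule, assuming bit strings written most significant bit first. The helper names `twos_complement_value` and `to_twos_complement` are hypothetical; they can be used to check answers to the exercises.

```python
def twos_complement_value(bits: str) -> int:
    """Sum the bit weights; the leftmost bit has weight -2^(n-1)."""
    n = len(bits)
    value = -int(bits[0]) * 2 ** (n - 1)
    for i, bit in enumerate(bits[1:], start=1):
        value += int(bit) * 2 ** (n - 1 - i)
    return value


def to_twos_complement(value: int, width: int) -> str:
    """Encode a signed integer as a width-bit two's complement string."""
    return format(value & ((1 << width) - 1), f"0{width}b")


print(twos_complement_value("0111"))
print(to_twos_complement(-1, 8))
```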

32-bit Signed Numbers
Two’s Complement                           Decimal
0000 0000 0000 0000 0000 0000 0000 0000 =  0
0000 0000 0000 0000 0000 0000 0000 0001 = +1
0000 0000 0000 0000 0000 0000 0000 0010 = +2
...
0111 1111 1111 1111 1111 1111 1111 1110 = +2,147,483,646
0111 1111 1111 1111 1111 1111 1111 1111 = +2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0000 = –2,147,483,648
1000 0000 0000 0000 0000 0000 0000 0001 = –2,147,483,647
1000 0000 0000 0000 0000 0000 0000 0010 = –2,147,483,646
...
1111 1111 1111 1111 1111 1111 1111 1101 = –3
1111 1111 1111 1111 1111 1111 1111 1110 = –2
1111 1111 1111 1111 1111 1111 1111 1111 = –1

Two's Complement Operations
• Negating a two's complement number: invert all bits and add 1
– remember: “negate” and “invert” are different!
• Exercises (in 6 bits)
– Negate 12
– Negate -5
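
The invert-and-add-1 rule can be sketched as follows. The function name `negate` is an assumption for illustration; it works on fixed-width bit strings so that the carry out of the top bit is discarded, as it would be in hardware.

```python
def negate(bits: str) -> str:
    """Negate a two's complement bit string: invert every bit, then add 1."""
    width = len(bits)
    mask = (1 << width) - 1
    inverted = int(bits, 2) ^ mask          # "invert": flip all bits
    negated = (inverted + 1) & mask         # "negate": inverted value plus 1
    return format(negated, f"0{width}b")


# Negating twice should give back the original pattern.
print(negate("000011"), negate(negate("000011")))
```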

Sign Extensions
• MIPS 16-bit immediate gets converted to 32 bits for arithmetic
• copy the most significant bit (the sign bit) into the other bits
4-bit number    8-bit equivalent
0010 ⇒ 0000 0010
1010 ⇒ 1111 1010
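
Sign extension amounts to repeating the leftmost bit. A small sketch, where `sign_extend` is a hypothetical helper operating on bit strings:

```python
def sign_extend(bits: str, width: int) -> str:
    """Widen a two's complement bit string by replicating its sign bit."""
    if width < len(bits):
        raise ValueError("target width is narrower than the input")
    return bits[0] * (width - len(bits)) + bits


print(sign_extend("0010", 8))
print(sign_extend("1010", 8))
# A 16-bit immediate widened to 32 bits, as in the MIPS case.
print(sign_extend("1111111111110110", 32))
```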

Additions & Subtractions
• Just like regular binary numbers
0010 + 0110
1111 + 0001
1111 + 1111
0010 - 0110
1111 - 0001
1111 - 1111

Overflows
• Result too large to store in finite-size computer words
– e.g., adding two n-bit numbers does not always yield an n-bit number
• Depends on the kind of numbers you have in mind: signed or unsigned
0010 + 0110
1000 - 0001

Detecting Overflow
• No overflow when adding a positive and a negative number
• No overflow when signs are the same for subtraction
• Overflow occurs when the value affects the sign:
Operation    A       B       Result
A + B        ≥ 0     ≥ 0     < 0
A + B        < 0     < 0     ≥ 0
A − B        ≥ 0     < 0     < 0
A − B        < 0     ≥ 0     ≥ 0
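
The sign rules in the table translate directly into a check on the three sign bits. This is a sketch under the assumption of fixed-width bit strings; the name `signed_overflow` is illustrative.

```python
def signed_overflow(a_bits: str, b_bits: str, subtract: bool = False):
    """Return (result_bits, overflow) for a signed add or subtract."""
    width = len(a_bits)
    mask = (1 << width) - 1
    a, b = int(a_bits, 2), int(b_bits, 2)
    result = (a - b if subtract else a + b) & mask   # keep only width bits
    sa, sb, sr = a >> (width - 1), b >> (width - 1), result >> (width - 1)
    if subtract:
        # A - B overflows only when the signs differ and the result
        # does not keep the sign of A.
        overflow = sa != sb and sr != sa
    else:
        # A + B overflows only when the signs match and the result
        # has the other sign.
        overflow = sa == sb and sr != sa
    return format(result, f"0{width}b"), overflow


print(signed_overflow("0010", "0110"))
print(signed_overflow("1000", "0001", subtract=True))
```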

Effects of Overflow
• Architecture and case dependent
• Solution 1: just remember it and leave the handling to software.
– The condition/flag register of IA32
• Solution 2: exception/interrupt
– Control jumps to a predefined address for the exception
– The interrupted address is saved for possible resumption
– Used by MIPS

Discussion
• IA32 provides an adc (add with carry) instruction. What is its use?
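
One common use is adding numbers wider than a machine word: the carry out of each word-sized addition is fed into the next. The sketch below models that idea in Python; `add_words` and the word-list layout (least significant word first) are assumptions for illustration, not IA32 code.

```python
def add_words(a_words, b_words, width=32):
    """Add two multi-word integers, least significant word first."""
    mask = (1 << width) - 1
    carry = 0
    result = []
    for a, b in zip(a_words, b_words):
        total = a + b + carry          # what an add-with-carry step computes
        result.append(total & mask)    # the word that fits in the register
        carry = total >> width         # carry flag for the next word
    return result, carry


# 0x00000000_FFFFFFFF + 1 carries into the upper word.
print(add_words([0xFFFFFFFF, 0x0], [0x1, 0x0]))
```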

Review: Boolean Algebra & Gates
• Problem: Consider a logic function with three inputs: A, B, and C.
– Output D is true if at least one input is true
– Output E is true if exactly two inputs are true
– Output F is true only if all three inputs are true
• Show the truth table for these three functions.
• Show the Boolean equations for the three functions.
• Show an implementation consisting of inverters, AND, and OR gates
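
One way to check a hand-written truth table is to enumerate it. The sketch below writes each output as a sum of products using only `and`, `or`, and `not`; the function names are hypothetical.

```python
from itertools import product


def outputs(a, b, c):
    """Return (D, E, F) for inputs A, B, C using only AND, OR, NOT."""
    d = a or b or c
    e = (a and b and not c) or (a and not b and c) or (not a and b and c)
    f = a and b and c
    return int(bool(d)), int(bool(e)), int(bool(f))


def truth_table():
    """List rows (A, B, C, D, E, F) in binary counting order."""
    return [(a, b, c, *outputs(a, b, c)) for a, b, c in product((0, 1), repeat=3)]


print("A B C | D E F")
for a, b, c, d, e, f in truth_table():
    print(a, b, c, "|", d, e, f)
```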

Design An Overflow Detector
• Inputs: SA (sign of A), SB (sign of B), OP (operation, 0 for add, 1 for sub).
• Output: OF = 0 for no overflow, 1 for overflow
• Truth Table:
OP   SA   SB   OF
0    0    0
0    0    1
0    1    0
0    1    1
1    0    0
1    0    1
1    1    0
1    1    1
• Boolean equation for OF.
• A circuit design of OF according to the equation above.

Review: The Multiplexer
• Selects one of the inputs to be the output, based on a control input
• Note: we call this a 2-input multiplexer even though it actually has three inputs (A, B, and Select)
[Figure: 2-input multiplexer with data inputs A and B, a Select control input, and one Output]

More Inputs
• The general case: an N-input multiplexer needs log2(N) select lines.
• You should be able to design its logic circuit.
[Figure: 4-input multiplexer with data inputs A, B, C, and D, a 2-bit Select input, and one Output]
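
A gate-level sketch of the multiplexer, assuming that Select = 0 picks the first input (the slides do not state the encoding). `mux2` and `mux4` are illustrative names; the 4-input version is built as a tree of 2-input ones.

```python
def mux2(a: int, b: int, select: int) -> int:
    """2-input multiplexer on single bits: (a AND NOT s) OR (b AND s)."""
    not_select = select ^ 1
    return (a & not_select) | (b & select)


def mux4(a: int, b: int, c: int, d: int, select: int) -> int:
    """4-input multiplexer with a 2-bit select, built from three mux2 gates."""
    s1, s0 = (select >> 1) & 1, select & 1
    return mux2(mux2(a, b, s0), mux2(c, d, s0), s1)


# With inputs A=1, B=0, C=1, D=0, walk through all four select values.
print([mux4(1, 0, 1, 0, s) for s in range(4)])
```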

Second Exercise
• Let us build a one-bit ALU to support addition and logical or.
– Operation: 0 for add, 1 for or
[Figure: ALU block with inputs a and b, an operation input, and a result output]

Solution
• Truth table
• Sum of products

Supporting MIPS Logic Instructions
• MIPS provides bit-wise and, or, xor, and nor instructions.
• The operation input (3 bits) determines the output.
[Figure: ALU block with inputs a and b, a 3-bit operation input, and a result output]

32-bit ALU
• Both inputs A and B are 32 bits wide.
– Size of the truth table?
• Rather, we will just cascade 32 1-bit ALUs.
– How about carries?
– We need to refine the spec of the 1-bit ALU

Two Solutions
• Truth table and sum of products
• Use a multiplexer

1-bit Adder
• How could we build a 1-bit ALU for add, and, or?
• How could we build a 32-bit ALU?
[Figure: 1-bit adder with inputs A, B, and Cin and outputs Sum and Cout]
Cout = AB + ACin + BCin
Sum = A xor B xor Cin
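
The two equations can be tried out directly, and chaining the 1-bit adder gives one possible answer to the 32-bit question. This is a sketch; `full_adder` and `ripple_add` are hypothetical names, and the loop stands in for 32 adders wired carry-out to carry-in.

```python
def full_adder(a: int, b: int, cin: int):
    """1-bit adder: Sum = A xor B xor Cin, Cout = AB + ACin + BCin."""
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout


def ripple_add(a: int, b: int, width: int = 32, carry_in: int = 0):
    """Chain `width` full adders; each carry-out feeds the next carry-in."""
    result, carry = 0, carry_in
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry          # (sum bits, final carry-out)


print(ripple_add(5, 7))
print(ripple_add(0xFFFFFFFF, 1))
```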

Building a 32-bit ALU
[Figure, left: 1-bit ALU with inputs a, b, and CarryIn; a multiplexer with inputs 0, 1, and 2 selected by Operation produces Result; the adder also produces CarryOut]
[Figure, right: 32-bit ALU built from ALU0 through ALU31; ALU i takes a i and b i and produces Result i; the CarryOut of each 1-bit ALU feeds the CarryIn of the next; all share the Operation lines]
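
What about subtraction (a – b)?
• Two's complement approach: negate b and add.
• How do we negate?
• A clever solution:
[Figure: 1-bit ALU in which a Binvert control selects b or its inverse (multiplexer inputs 0 and 1) before it reaches the adder; Operation selects among multiplexer inputs 0, 1, and 2 to produce Result; CarryIn and CarryOut as before]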
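
A sketch of how the Binvert line can give subtraction, assuming the usual pairing in which the carry-in of the least significant bit is set to 1 whenever b is inverted (invert plus 1 is negation). The name `alu_add_sub` is illustrative.

```python
def alu_add_sub(a: int, b: int, binvert: int, width: int = 32) -> int:
    """Compute a + b (binvert=0) or a - b (binvert=1) on width-bit words."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    if binvert:
        b = ~b & mask                  # invert every bit of b
    # Assumption: CarryIn of bit 0 equals Binvert, supplying the "+1".
    return (a + b + binvert) & mask


print(alu_add_sub(7, 5, binvert=1))
print(hex(alu_add_sub(5, 7, binvert=1)))
```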

Tailoring the ALU to the MIPS
• Need to support the set-on-less-than instruction (slt)
– remember: slt is an arithmetic instruction
– produces a 1 if rs < rt and 0 otherwise
– use subtraction: (a-b) < 0 implies a < b
• Need to support test for equality (beq $t5, $t6, offset)
– use subtraction: (a-b) = 0 implies a = b
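
Supporting slt
[Figure a: 1-bit ALU with inputs a, b, CarryIn, Binvert, and Operation, plus a new Less input wired to multiplexer input 3; outputs Result and CarryOut]
[Figure b: 1-bit ALU for the most significant bit, with the same inputs; it also outputs Set (the adder result) and includes overflow detection producing Overflow]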

[Figure: 32-bit ALU built from ALU0 through ALU31 with shared Binvert, CarryIn, and Operation lines; each CarryOut feeds the next CarryIn; the Less input of ALU0 is driven by Set from ALU31, and the Less inputs of ALU1 through ALU31 are 0; ALU31 also outputs Overflow]

Test for equality
• Notice control lines:
000 = and
001 = or
010 = add
110 = subtract
111 = slt
• Output Zero = 1 when the result is 0.
[Figure: final 32-bit ALU; ALU0 through ALU31 share the Bnegate and Operation lines; each CarryOut feeds the next CarryIn; Set from ALU31 drives Less of ALU0 and the other Less inputs are 0; ALU31 produces Overflow; Zero is derived from Result0 through Result31]
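
The control encoding can be exercised with a behavioral model. This is a sketch, not the gate-level design: it assumes the top control bit is Bnegate and the low two bits select the multiplexer input, and, like the subtraction-based slt described in the slides, it ignores overflow when taking the sign of a - b. The name `alu` is illustrative.

```python
def alu(a: int, b: int, control: int, width: int = 32):
    """Return (result, zero) for control 000/001/010/110/111."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    bnegate = (control >> 2) & 1       # assumed: top bit is Bnegate
    operation = control & 0b11         # assumed: low bits pick the mux input
    if bnegate:
        b = ~b & mask
    total = (a + b + bnegate) & mask   # adder output: a + b or a - b
    if operation == 0:
        result = a & b
    elif operation == 1:
        result = a | b
    elif operation == 2:
        result = total
    else:
        result = total >> (width - 1)  # slt: sign bit of a - b
    zero = int(result == 0)
    return result, zero


print(alu(6, 3, 0b110))   # subtract
print(alu(3, 6, 0b111))   # slt
print(alu(5, 5, 0b110))   # equality test via Zero
```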

• Important points about hardware
– all of the gates are always working
– the speed of a gate is affected by the number of inputs to the gate
– the speed of a circuit is affected by the number of gates in series (on the “critical path” or the “deepest level of logic”)
– What is the critical path in our 32-bit ALU?

Ripple Carry Adder Is Slow
• Logic circuit speed is determined by the number of gates a signal has to pass through in the worst case.
• Assuming each 1-bit ALU adds a delay of x gates, what is the delay of a 32-bit ALU?

Carry Look Ahead
A   B   Cout
0   0   0      “kill”
0   1   Cin    “propagate”
1   0   Cin    “propagate”
1   1   1      “generate”
G = A and B
P = A xor B
[Figure: four 1-bit stages with inputs A0 B0 through A3 B3, each producing S, G, and P; Cin enters the first stage; the block as a whole outputs G, P, and C4]
C1 = G0 + C0 P0
C2 = G1 + G0 P1 + C0 P0 P1
C3 = G2 + G1 P2 + G0 P1 P2 + C0 P0 P1 P2
C4 = . . .
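
The carry equations can be checked against ordinary addition. In this sketch C1 to C3 follow the equations above; C4 is written in its recursive form (G3 + C3 P3) rather than expanded. The name `cla4` is illustrative.

```python
def cla4(a: int, b: int, c0: int = 0):
    """4-bit carry-lookahead add; returns (sum bits, C4)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # propagate
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = (g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2])
          | (c0 & p[0] & p[1] & p[2]))
    c4 = g[3] | (c3 & p[3])        # recursive form, not expanded here
    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))   # S = P xor C
    return total, c4


print(cla4(0b0111, 0b0001))
print(cla4(0b1111, 0b0001))
```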

16-Bit Adder
[Figure: 16-bit adder built from four 4-bit ALUs. ALU0 takes a0 b0 through a3 b3 and produces Result0–3; ALU1 takes a4 b4 through a7 b7 and produces Result4–7; ALU2 takes a8 b8 through a11 b11 and produces Result8–11; ALU3 takes a12 b12 through a15 b15 and produces Result12–15. Each ALU i sends its block signals Pi and Gi (pi, gi through pi+3, gi+3 at the unit inputs) to the carry-lookahead unit, which takes CarryIn and produces C1, C2, and C3 (ci+1 through ci+3) as the CarryIn of ALU1 through ALU3 and C4 (ci+4) as CarryOut]

Exercise
A   B   Cout
0   0   0      “kill”
0   1   Cin    “propagate”
1   0   Cin    “propagate”
1   1   1      “generate”
G = A and B
P = A xor B
[Figure: four 1-bit stages A0 through A3, each producing S, G, and P, with Cin entering the first stage]
C1 =
C2 =
C3 =
C4 = . . .
G =
P =

Multiplications
• N bits × N bits → 2N-bit result
• Paper and pencil example (unsigned):
Multiplicand: 0 0 1 1
Multiplier: 0 1 0 1
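
The paper and pencil method writes one shifted copy of the multiplicand for every 1 bit of the multiplier and then adds the rows. A sketch, with `partial_products` as a hypothetical helper:

```python
def partial_products(multiplicand: int, multiplier: int, width: int = 4):
    """One row per multiplier bit: the multiplicand shifted left, or 0."""
    rows = []
    for i in range(width):
        bit = (multiplier >> i) & 1
        rows.append(multiplicand << i if bit else 0)
    return rows


rows = partial_products(0b0011, 0b0101)
for row in rows:
    print(format(row, "08b"))
print("sum:", format(sum(rows), "08b"))
```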
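
Unsigned Combinational Multiplier
• Stage i accumulates A × 2^i if Bi == 1
[Figure: array multiplier; each stage gates A3 A2 A1 A0 with one multiplier bit (B0 through B3) and adds the shifted row to the running sum, which starts at 0 0 0 0; the outputs are product bits P7 through P0]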

Discussions
• Multiplication is expensive
• A combinational multiplier uses a great deal of silicon
– 32 32-bit adders needed
• We will discuss designs that are slower but demand less silicon.
– Due to its complexity, we will first present a basic but suboptimal design, and refine it twice.

Shift and Add
• One step per clock tick; n clock cycles needed for n-bit multiplications
[Figure: the same four stages (A3 A2 A1 A0 gated by B0 through B3, producing P7 through P0), with one stage evaluated per clock tick]

Example: 0101 × 0011
[Worked example on the slide: columns for Product, Multiplicand, and Multiplier, with a “to add or not to add” decision at each step]

Unsigned Shift-Add Multiplier
• 64-bit Multiplicand reg, 64-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
[Figure: the Multiplicand register (64 bits, Shift Left) and the Product register (64 bits, Write) feed the 64-bit ALU; the Multiplier register (32 bits, Shift Right) feeds Control]
Multiplier = datapath + control
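
A behavioral sketch of this first design: each loop iteration models one clock tick, with the Multiplicand register shifting left and the Multiplier register shifting right. The function name `multiply_v1` and the `width` parameter are assumptions for illustration.

```python
def multiply_v1(multiplicand: int, multiplier: int, width: int = 32) -> int:
    """Shift-add multiply with a double-width multiplicand and product."""
    mask = (1 << (2 * width)) - 1
    product = 0                        # 2*width-bit Product register
    mcand = multiplicand & mask        # 2*width-bit Multiplicand register
    for _ in range(width):             # one iteration per clock tick
        if multiplier & 1:             # low multiplier bit: add or not
            product = (product + mcand) & mask
        mcand = (mcand << 1) & mask    # Shift Left
        multiplier >>= 1               # Shift Right
    return product


print(multiply_v1(0b0101, 0b0011, width=4))
```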

Observations
• Half of the bits in the multiplicand register are always 0
– a 64-bit adder is a waste
• Improvement:
– Use a 32-bit multiplicand register
– Don’t shift the multiplicand left; shift the product right instead

Example: 0101 × 0011
[Worked example on the slide: columns for Product, Multiplicand, and Multiplier, with a “to add or not to add” decision at each step]

Shift-Add Multiplier Version 2
• 32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, 32-bit Multiplier reg
[Figure: the Multiplicand register (32 bits) and the upper half of the Product register (64 bits, Shift Right, Write) feed the 32-bit ALU; the Multiplier register (32 bits, Shift Right) feeds Control]
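
A behavioral sketch of Version 2, in which the multiplicand stays put and is added into the upper half of the product, which then shifts right. The names are illustrative. Python integers are unbounded, so the carry out of the adder is kept automatically; real hardware would have to shift that carry bit into the product as well.

```python
def multiply_v2(multiplicand: int, multiplier: int, width: int = 32):
    """Return (product, trace) where trace lists the product after each step."""
    product = 0                                   # 2*width-bit Product register
    trace = [product]
    for _ in range(width):
        if multiplier & 1:
            product += multiplicand << width      # add into the upper half
            trace.append(product)
        product >>= 1                             # Shift Right (product)
        trace.append(product)
        multiplier >>= 1                          # Shift Right (multiplier)
    return product, trace


product, trace = multiply_v2(0b0010, 0b0011, width=4)
for step in trace:
    print(format(step, "08b"))
```

With multiplicand 0010 and multiplier 0011 in 4 bits, the trace runs through 0010 0000, 0001 0000, 0011 0000, 0001 1000, 0000 1100, and 0000 0110.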

A Second Example
Product      Multiplier   Multiplicand
0000 0000    0011         0010
0010 0000
0001 0000    0001         0010
0011 0000    0001         0010
0001 1000    0000         0010
0000 1100    0000         0010
0000 0110    0000         0010

Observations
• The Product register wastes space that exactly matches the size of the multiplier
• Improvement: combine the Multiplier register and the Product register

Example: 0101 × 0011
[Worked example on the slide: columns for Product (holding the Multiplier) and Multiplicand, with a “to add or not to add” decision at each step]

A Second Example
Multiplicand
Initial Product
Product after 1st shift
Product after 2nd shift
Product after 3rd shift
Product after 4th shift
Product after 5th shift
0 10 0 1
0 00 0 0 1 00 1 1
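
Multiplier Hardware Version 3
• 32-bit Multiplicand reg, 32-bit ALU, 64-bit Product reg, (0-bit Multiplier reg)
[Figure: the Multiplicand register (32 bits) and the upper half of the Product (Multiplier) register (64 bits, Shift Right, Write) feed the 32-bit ALU; Control reads the low bit of the Product register]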
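
A behavioral sketch of Version 3: the multiplier starts in the lower half of the product register, and each right shift both exposes the next multiplier bit and makes room for one more product bit. The name `multiply_v3` and the returned `(hi, lo)` pair are illustrative choices; as in hardware, the carry out of the adder must take part in the shift, which Python's unbounded integers handle implicitly.

```python
def multiply_v3(multiplicand: int, multiplier: int, width: int = 32):
    """Return the 2*width-bit product as (upper half, lower half)."""
    half_mask = (1 << width) - 1
    product = multiplier & half_mask          # multiplier sits in the low half
    for _ in range(width):
        if product & 1:                       # current multiplier bit
            product += multiplicand << width  # add into the upper half
        product >>= 1                         # Shift Right
    hi, lo = product >> width, product & half_mask
    return hi, lo


print(multiply_v3(0b0010, 0b0011, width=4))
hi, lo = multiply_v3(0xFFFFFFFF, 0xFFFFFFFF)
print(hex(hi), hex(lo))
```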

Discussions
• Can you see where the MIPS Hi and Lo registers come from?
• Can you see the special hardware associated with IA32 EAX and EDX?