Grade School Revisited: How To Multiply Two Numbers
description
Transcript of Grade School Revisited: How To Multiply Two Numbers
Grade School Revisited:How To Multiply Two Numbers
Great Theoretical Ideas In Computer Science
Steven Rudich
CS 15-251 Spring 2004
Lecture 16 March 4, 2004 Carnegie Mellon University
2 X 2 = 5
The best way is often far from obvious!
(a+bi)(c+di) Gauss
Gauss’ Complex PuzzleRemember how to multiply 2 complex numbers a + bi and c + di?
(a+bi)(c+di) = [ac –bd] + [ad + bc] i
Input: a,b,c,d Output: ac-bd, ad+bc
If a real multiplication costs $1 and an addition cost a penny. What is the cheapest way to obtain the output from the input?
Can you do better than $4.02?
Gauss’ $3.05 Method:Input: a,b,c,d Output: ac-bd,
bc+ad
X1 = a + b
X2 = c + d
X3 = X1 X2 = ac + ad + bc + bd
X4 = ac
X5 = bd
X6 = X4 – X5 = ac-bd
X7 = X3 – X4 – X5 = bc + ad
The Gauss optimization saves one multiplication
out of four. It requires 25% less
work.
Time complexity of grade school addition
**
*
*
**
*
*
**
*
*
**
*
*
**
*
*
**
*
*
**
*
*
**
*
*
**
*
*
**
*
+*
*
*
**
*
T(n) = The amount of time grade school addition uses to add two n-bit numbers
We saw that T(n) was linear.
Time complexity of grade school multiplication
T(n) = The amount of time grade school multiplication uses
to add two n-bit numbers
We saw that T(n) was quadratic.
-
X* * * * * * * * * * * * * * * *
* * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * ** * * * * * * * * * * * * * * *
n2
Grade School Addition: Linear timeGrade School Multiplication: Quadratic
time
No matter how dramatic the difference in the constants the quadratic curve will eventually
dominate the linear curve
# of bits in numbers
time
Grade School Addition: (n) timeFurthermore, it is optimal
Grade School Multiplication: (n2) time
Is there a clever algorithm to multiply two numbers
in linear time?
Open Problem!
Is there a faster way to multiply two numbers
than the way you learned in grade school?
Grade School Multiplication: (n2) timeKissing Intuition
Intuition: Let’s say that each time an algorithm has to multiply a digit from one
number with a digit from the other number, we call that a
“kiss”. It seems as if any correct algorithm must kiss at
least n2 times.
Divide And Conquer(an approach to faster algorithms)
DIVIDE a problem into smaller subproblems
CONQUER them recursively
GLUE the answers together so as to obtain the answer to the larger problem
Multiplication of 2 n-bit numbers
X = Y =
X = a 2n/2 + b Y = c 2n/2 + d
XY = ac 2n + (ad+bc) 2n/2 + bd
a b
c d
Multiplication of 2 n-bit numbers
X = Y = XY = ac 2n + (ad+bc) 2n/2 + bd
a b
c d
MULT(X,Y):
If |X| = |Y| = 1 then RETURN XY
Break X into a;b and Y into c;d
RETURN
MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)
Multiplying (Divide & Conquer style)
12345678 * 21394276
*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
1*2 1*1 2*2 2*1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
1*2 1*1 2*2 2*1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
2 1 4 2
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
1*2 1*1 2*2 2*1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hence: 12*21 = 2*102 + (1 + 4)101 + 2 = 252
2 1 4 2
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
252 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
1*2 1*1 2*2 2*1 xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hence: 12*21 = 2*102 + (1 + 4)101 + 2 = 252
2 1 4 2
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
252 468 714 1326
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
252 468 714 1326*104 + *102 + *102 + *1
Multiplying (Divide & Conquer style)
12345678 * 21394276
1234*2139 1234*4276 5678*2139 5678*4276
12*21 12*39 34*21 34*39 xxxxxxxxxxxxxxxxxxxxxxxxx
= 2639526
252 468 714 1326*104 + *102 + *102 + *1
Multiplying (Divide & Conquer style)
12345678 * 21394276
2639526 1234*4276 5678*2139 5678*4276
Multiplying (Divide & Conquer style)
12345678 * 21394276
2639526 5276584 12145242 24279128
Multiplying (Divide & Conquer style)
12345678 * 21394276
2639526 5276584 12145242 24279128*108 + *104 + *104 + *1
Multiplying (Divide & Conquer style)
12345678 * 21394276
2639526 5276584 12145242 24279128*108 + *104 + *104 + *1
= 264126842539128
Divide, Conquer, and Glue
MULT(X,Y)
Divide, Conquer, and Glue
MULT(X,Y): If |X|=|Y|=1,Then Return XY
Else ….
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d
Mult(a,c)Mult(a,d)
Mult(b,c)
Mult(b,d)
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d
Mult(a,c)
Mult(a,d)Mult(b,c)
Mult(b,d)
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d ac
Mult(a,d)Mult(b,c)
Mult(b,d)ac
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d ac
ac
Mult(a,d)
Mult(b,c)
Mult(b,d)
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d ac, ad
acMult(b,c)
Mult(b,d)ad
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;dac, ad, bc, bd
acad bc bd
Divide, Conquer, and Glue
MULT(X,Y): X=a;b Y=c;d ac2n + (ad+bc)2n/2 + bd
acad bc bd
Time required by MULT
T(n) = time taken by MULT on two n-bit numbers What is T(n)? What is its growth rate? Is it (n2)?
T(n/2) T(n/2) T(n/2) T(n/2)
T(n) = 4 T(n/2) + k’ n + k’’ , n=|X|
X=a;b Y=c;d ac2n + (ad+bc)2n/2 + bd
Time: k’n + k”
acad bc bd
Divide and glue
Conquer
Recurrence Relation
T(1) = k for some constant k
T(n) = 4 T(n/2) + k’ n + k’’ for some constants k’ and k’’
MULT(X,Y):
If |X| = |Y| = 1 then RETURN XY
Break X into a;b and Y into c;d
RETURN
MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)
Let’s be concrete to keep it simple
T(1) = 1T(n) = 4 T(n/2) + n
Notice that T(n) is inductively defined only for positive powers of 2
How do we unravel T(n) so that we can determine its growth rate?
Technique 1Guess and Verify
Guess: G(n) = 2n2 – n
Verify: G(1) = 1 and G(n) = 4 G(n/2) + n
4 [2(n/2)2 – n/2] + n= 2n2 – 2n + n= 2n2 – n= G(n)
Technique 1Guess and Verify
Guess: G(n) = 2n2 – n
Verify: G(1) = 1 and G(n) = 4 G(n/2) + n
Similarly:T(1) = 1 and T(n) = 4 T(n/2) + n
So T(n) = G(n) = (n2)
Technique 2Guess Form and Calculate Coefficients
Guess: T(n) = an2 + bn + c for some a,b,c
Calculate: T(1) = 1 a + b + c = 1
T(n) = 4 T(n/2) + n an2 + bn + c = 4 [a(n/2)2 + b(n/2) + c] + n = an2 + 2bn + 4c + n (b+1)n + 3c = 0
Therefore: b=-1 c=0 a=2
Technique 3: Labeled Tree Representation
T(n) = n + 4 T(n/2)
n
T(n/2) T(n/2) T(n/2) T(n/2)
T(n) =
T(1)
T(1) = 1
1=
Node labels: time spent NOT conquering.
T(n) = n + 4 T(n/2)
n
T(n/2) T(n/2) T(n/2) T(n/2)
T(n) =
T(1)
T(1) = 1
1=
T(n/2) T(n/2) T(n/2) T(n/2)
T(n) = 4 T(n/2) + k’ n + k’’ , n=|X|
X=a;b Y=c;d ac2n + (ad+bc)2n/2 + bd
Time: k’n + k”
acad bc bd
Divide and glue
Conquer
n
T(n/2) T(n/2) T(n/2) T(n/2)
T(n) =
n
T(n/2) T(n/2) T(n/2)
T(n) =
n/2
T(n/4)T(n/4)T(n/4)T(n/4)
nT(n) =
n/2
T(n/4)T(n/4)T(n/4)T(n/4)
n/2
T(n/4)T(n/4)T(n/4)T(n/4)
n/2
T(n/4)T(n/4)T(n/4)T(n/4)
n/2
T(n/4)T(n/4)T(n/4)T(n/4)
nT(n) =
n/2 n/2 n/2n/2
11111111111111111111111111111111 . . . . . . 111111111111111111111111111111111
n/4 n/4 n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4 n/4
n
n/2 + n/2 + n/2 + n/2
Level i is the sum of 4i copies of n/2i
. . . . . . . . . . . . . . . . . . . . . . . . . .
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4
0
1
2
i
log n2
= 1n
= 2n
= 4n
= 2in
= (n)n
n(1+2+4+8+ . . . +n) = n(2n-1) = 2n2-n
Level i is the sum of 4 i copies of n/ 2 i
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
. . . . . . . . . . . . . . . . . . . . . . . . . .
n/ 2 + n/ 2 + n/ 2 + n/ 2
n
n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4
0
1
2
i
lo g n2
Divide and Conquer MULT: (n2) time Grade School Multiplication: (n2) time
All that work for nothing!
Divide and Conquer MULT: (n2) time Grade School Multiplication: (n2) time
In retrospect, it is obvious that the kissing number for Divide and Conquer MULT is n2,
since the leaves are in correspondence with
the kisses.
MULT revisited
MULT calls itself 4 times. Can you see a way to reduce the number of calls?
MULT(X,Y):
If |X| = |Y| = 1 then RETURN XY
Break X into a;b and Y into c;d
RETURN
MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)
Gauss’ Optimization:Input: a,b,c,d Output: ac, ad+bc,
bd
X1 = a + b
X2 = c + d
X3 = X1 X2 = ac + ad + bc + bd
X4 = ac
X5 = bd
X6 = X3 – X4 - X5 = ad + bc
Karatsuba, Anatolii Alexeevich (1937-)
Sometime in the late 1950’s Karatsuba had formulated the first algorithm to break the kissing barrier!
Gaussified MULT(Karatsuba 1962)
T(n) = 3 T(n/2) + n
Actually: T(n) = 2 T(n/2) + T(n/2 + 1) + kn
MULT(X,Y):
If |X| = |Y| = 1 then RETURN XY
Break X into a;b and Y into c;d
e = MULT(a,c) and f =MULT(b,d)
RETURN e2n + (MULT(a+b, c+d) – e - f) 2n/2 + f
n
T(n/2) T(n/2) T(n/2)
T(n) =
n
T(n/2) T(n/2)
T(n) =
n/2
T(n/4)T(n/4)T(n/4)
nT(n) =
n/2
T(n/4)T(n/4)T(n/4)
n/2
T(n/4)T(n/4)T(n/4)
n/2
T(n/4)T(n/4)T(n/4)
n
n/2 + n/2 + n/2
Level i is the sum of 3i copies of n/2i
. . . . . . . . . . . . . . . . . . . . . . . . . .
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4
0
1
2
i
log n2
= 1n= 3/2n
= 9/4n
= (3/2)in
n(1+3/2+(3/2)2+ . . . + (3/2) ) =3n 2n1 58. .... log n2
(3 / 2) nlog n2
Level i is the sum of 3 i copies of n/ 2 i
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
. . . . . . . . . . . . . . . . . . . . . . . . . .
n/ 2 + n/ 2 + n/ 2
n
n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4
0
1
2
i
lo g n2
1 + X1 + X11 + X + X22 + X + X 33 + … + X + … + Xn-1n-1 + X + Xnn ==
XXn+1 n+1 - 1- 1
XX - 1- 1
The Geometric Series
1 x x x ... xx 1
x 1
x3
2k log n
3232
1
12
3 3
n2
3 3
n2
3n
n2
2 3 kk 1
2
log n
log n log n
log 3log n
log 3 log 3
2
2 2
2 2
2 2
FHIK
af af
Substituting into our formula….
= 1n= 3/2n
= 9/4n
= (3/2)in
3n 2nlog 32
(3 / 2) nlog n2
Level i is the sum of 3 i copies of n/ 2 i
1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1
. . . . . . . . . . . . . . . . . . . . . . . . . .
n/ 2 + n/ 2 + n/ 2
n
n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4
0
1
2
i
lo g n2
n3n
n2
log 32
FHG
IKJ
Dramatic improvement for large n
A huge savings over (n2) when n gets large.
3n 2n (n ) (n )log 3 log 3 1.58...2 2
Strange! The Gauss optimization seems to only be worth a 25%
speedup. It simply replaces every 4 multiplications with 3. Where
did the power come from?
We applied the Gauss optimization
RECURSIVELY!
So what is the kissing number of Karatsuba’s algorithm?
Not even clear that we kiss once.
Mystery MULT
What is going on? Is this a better way?
MULT(X,Y):
If Y = 0 then RETURN 0
If Y = 1 then RETURN X
Break Y into c;d, where |d| =1
RETURN 2 [MULT(X,c)] + MULT(X,d)
That is the Egyptian method sated in
recursive language.
Multiplication Algorithms
Kindergarten n2n
Grade School n2
Karatsuba n1.58…
Fastest Known n logn loglogn
REFERENCES
Karatsuba, A., and Ofman, Y. Multiplication of multidigit numbers on automata. Sov. Phys. Dokl. 7 (1962), 595--596.