Notes on the analysis of multiplication algorithms. Dr. M. Sakalli, Marmara University.

Date posted: 20-Dec-2015

M. Sakalli, CS246 Design & Analysis of Algorithms, Lecture Notes

Integer Multiplication (MIT notes and Wikipedia)

Example: classic high-school math. Let $g = A|B$ and $h = C|D$, where $A$, $B$, $C$ and $D$ are $n/2$-bit integers.

Simple method: $gh = (2^{n/2}A + B)(2^{n/2}C + D)$, the same as given above. This needs 4 multiplication routines:

$XY = 2^{n}AC + 2^{n/2}(AD + BC) + BD$ (with $X = g$, $Y = h$), plus carries $c$.

Long multiplication: $r_i = c + \sum_{j,\; k = i - j} g_j h_k$

Running-time recurrence: $T(n) < 4T(n/2) + 100n$, where $100n$ is an assumed bound on the linear-time work (additions and shifts) at each level. Whether this can be done in place is left as a question.

$T(n) = \Theta(n^2)$
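As a worked step, the recurrence can be unrolled level by level. The sketch below assumes $n$ is a power of two and treats the inequality as an equality, so it gives an upper bound rather than an exact count.

```latex
\begin{align*}
T(n) &= 4\,T(n/2) + 100n \\
     &= 4^{k}\,T(n/2^{k}) + 100n \sum_{i=0}^{k-1} 2^{i} \\
     &= 4^{k}\,T(n/2^{k}) + 100n\,(2^{k} - 1).
\end{align*}
% Setting k = \lg n, so that 2^k = n and 4^k = n^2:
\[
T(n) = n^{2}\,T(1) + 100n\,(n - 1) = \Theta(n^{2}).
\]
```

Level $i$ of the recursion has $4^i$ subproblems of size $n/2^i$, which is where the $100n \cdot 2^i$ term per level comes from.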

The algorithm runs in log space provided that neither $c$ nor the total sum exceeds log space. Indeed, a simple inductive argument shows that the carry $c$ and the total sum for $r_i$ can never exceed $n$ and $2n$ respectively, so storing both takes about $2\lg n$ bits. Space efficiency: $S(n) = O(\log\log N)$, in fact $\Theta(\log\log N)$, where $N = gh$.


Pseudocode: log-space multiplication algorithm.

multiply(g[0..n-1], h[0..n-1])   // arrays holding the binary representations, least significant bit first
    x ← 0
    for i = 0 : 2n-1
        for j = max(0, i-n+1) : min(i, n-1)   // keeps both j and k inside 0..n-1
            k ← i - j
            x ← x + (g[j] × h[k])
        end
        r[i] ← x mod 2      // last bit of the column sum
        x ← floor(x/2)      // carry into the next column
    end
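The pseudocode above can be sketched in Python as follows. This is a minimal illustration, assuming both operands are given as equal-length lists of bits with the least significant bit first; the names `multiply_bits`, `to_bits` and `from_bits` are hypothetical helpers, not part of the original notes.

```python
def multiply_bits(g, h):
    """Multiply two little-endian bit lists column by column."""
    n = len(g)
    assert len(h) == n, "this sketch assumes equal-length operands"
    r = [0] * (2 * n)
    x = 0  # running column sum, including the carry from the previous column
    for i in range(2 * n):
        # j and k = i - j must both stay inside 0..n-1
        for j in range(max(0, i - n + 1), min(i, n - 1) + 1):
            x += g[j] * h[i - j]
        r[i] = x % 2   # output bit for this column
        x //= 2        # carry into the next column
    return r


def to_bits(value, n):
    """Hypothetical helper: n-bit little-endian representation of value."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Hypothetical helper: integer value of a little-endian bit list."""
    return sum(b << i for i, b in enumerate(bits))


product = from_bits(multiply_bits(to_bits(11, 4), to_bits(3, 4)))
```

Only the scalar `x` and the loop indices are working storage beyond the input and output arrays, which is the point of the log-space claim.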

Lattice method, Muhammad ibn Musa al-Khwarizmi. Gauss's complex multiplication algorithm.
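Gauss's complex multiplication is only named here, so the following is a small sketch of the usual three-multiplication form of the trick, not something taken from the slides. The function name `gauss_complex_multiply` and the intermediate names `k1`, `k2`, `k3` are assumptions for illustration.

```python
def gauss_complex_multiply(a, b, c, d):
    """Return (real, imag) of (a + bi)(c + di) using three multiplications."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    real = k1 - k3   # equals a*c - b*d
    imag = k1 + k2   # equals a*d + b*c
    return real, imag


result = gauss_complex_multiply(1, 2, 3, 4)  # (1 + 2i)(3 + 4i)
```

The trade of one multiplication for a few extra additions is the same idea that Karatsuba's algorithm applies to integers.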


Karatsuba’s algorithm: polynomial extensions.

$g = g_1 10^{n/2} + g_2$

$h = h_1 10^{n/2} + h_2$

$gh = g_1 h_1 10^{n} + (g_1 h_2 + g_2 h_1) 10^{n/2} + g_2 h_2$

$(g_1 h_2 + g_2 h_1) = (g_1 + g_2)(h_1 + h_2) - (g_1 h_1 + g_2 h_2)$

$f(n)$ = 4 sums + 1 more final sum = $5n$ for $n > 2$; suppose it is bounded by a constant multiple of $n$ such as $100n$, plus some carries.

In the binary notation $X = A|B$, $Y = C|D$:

$XY = (2^{n} + 2^{n/2})AC - 2^{n/2}(A - B)(C - D) + (2^{n/2} + 1)BD$

$A(n) = 3A(n/2) + 5n$, so $A(n) = O(n^{\lg 3}) \approx O(n^{1.6})$.

Base value: 7 when $n < 2$.


Karatsuba (g, h : n-digit integer; n : integer)   // returns a (2n)-digit integer
is
    g1, g2, h1, h2 : (n/2)-digit integer;
    U, V, W : n-digit integer;
begin
    if n == 1 then
        return g(0)*h(0);
    else
        g1 ← g(n-1) ... g(n/2);
        g2 ← g(n/2-1) ... g(0);
        h1 ← h(n-1) ... h(n/2);
        h2 ← h(n/2-1) ... h(0);
        U ← Karatsuba (g1, h1, n/2);
        V ← Karatsuba (g2, h2, n/2);
        W ← Karatsuba (g1+g2, h1+h2, n/2);   // the sums may carry into one extra digit
        return U*10^n + (W-U-V)*10^(n/2) + V;
    end if;
end Karatsuba;
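A runnable sketch of the same recursion on Python integers is given below. It assumes non-negative inputs and base-10 splitting as in the slide; the function name `karatsuba` and the choice to stop as soon as either factor is a single digit are illustrative assumptions, made so that the extra digit produced by `g1 + g2` needs no special handling.

```python
def karatsuba(g, h):
    """Multiply non-negative integers g and h with three recursive products."""
    if g < 10 or h < 10:
        return g * h  # base case: at least one single-digit factor
    n = max(len(str(g)), len(str(h)))
    half = n // 2
    base = 10 ** half
    g1, g2 = divmod(g, base)  # g = g1 * 10^half + g2
    h1, h2 = divmod(h, base)  # h = h1 * 10^half + h2
    u = karatsuba(g1, h1)            # product of the high parts
    v = karatsuba(g2, h2)            # product of the low parts
    w = karatsuba(g1 + g2, h1 + h2)  # yields the cross terms plus u and v
    return u * base * base + (w - u - v) * base + v


example = karatsuba(1234, 5678)
```

Splitting at `half = n // 2` rather than exactly `n/2` lets the sketch handle odd digit counts and operands of different lengths.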

FFT and fast matrix multiplication.


Quarter square multiplier (1980, Everett L. Johnson):

$gh = \{(g + h)^2 - (g - h)^2\}/4 = \{(g^2 + 2gh + h^2) - (g^2 - 2gh + h^2)\}/4$

Think of a hardware implementation with a lookup table (converter). The difficulty is that the sum of two 8-bit numbers requires at least 9 bits and, when squared, 18 bits. Storing quarter squares $\lfloor x^2/4 \rfloor$ in the table (the slide describes this as dividing by 2 before squaring and discarding the remainder when the value is odd) keeps the entries narrower; the discarded remainders cancel because $g + h$ and $g - h$ have the same parity.

Table lookup: for decimal digits the indices run from 0 to 9+9 and the entries from 0 to 81. $O(3n)$, with working space $S(n) = \Theta(n)$.

For example, to multiply 7 by 3, observe that the sum and difference are 10 and 4 respectively. Looking both of those values up in the table yields 25 and 4, the difference of which is 21.
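The table lookup can be sketched as below. This is an illustration under the assumption that the table stores floor(x²/4) for every possible sum of two operands; `quarter_square_table` and `quarter_square_multiply` are hypothetical names.

```python
def quarter_square_table(max_operand):
    """Table of floor(x*x / 4) for x = 0 .. 2 * max_operand."""
    return [(x * x) // 4 for x in range(2 * max_operand + 1)]


def quarter_square_multiply(g, h, table):
    """Multiply g and h with two lookups and one subtraction."""
    return table[g + h] - table[abs(g - h)]


digit_table = quarter_square_table(9)  # indices 0..18, entries 0..81
product_7_3 = quarter_square_multiply(7, 3, digit_table)
```

Because `g + h` and `g - h` are either both even or both odd, the fractions dropped by the floor are equal and cancel in the subtraction.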


Russian (Egyptian) Peasant’s binary multiplication

Shift and add. An in-place algorithm that may be implemented in $2n$ space. Try more complex examples.

Example, 11 × 3:

     g    h     g (binary)   h (binary)   added
    11    3     1011         11           011
     5    6     101          110          110
     2   12     10           1100         (nothing, g is even)
     1   24     1            11000        11000
                                          sum = 100001 (= 33)

$T(n) = \Theta(n) + O(n^2)$. Think about this: why?

$S(n) = \Theta(\log\log n)$, which is the carry.

Is it invertible? Division is a potential question.
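The halving and doubling procedure can be sketched in a few lines. This is a minimal version assuming non-negative integer inputs; the name `peasant_multiply` is hypothetical.

```python
def peasant_multiply(g, h):
    """Shift-and-add multiplication: halve g, double h, add h when g is odd."""
    total = 0
    while g > 0:
        if g & 1:      # lowest bit of g is set, so this row contributes
            total += h
        g >>= 1        # halve, discarding the remainder
        h <<= 1        # double
    return total


result_11_3 = peasant_multiply(11, 3)
```

For 11 × 3 the loop adds 3, 6 and 24 and skips the row where `g` is 2, matching the rows of the worked example.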

Matrix multiplication: 8 multiplications, $O(n^3)$

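$$\begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix} \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$$

$C_{11} = A_{11}B_{11} + A_{12}B_{21}$

$C_{12} = A_{11}B_{12} + A_{12}B_{22}$

$C_{21} = A_{21}B_{11} + A_{22}B_{21}$

$C_{22} = A_{21}B_{12} + A_{22}B_{22}$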

Pseudocode for MM.

MM(A, B)
    for i ← 1 : N
        for j ← 1 : N
            C(i, j) ← 0;
            for k ← 1 : N
                C(i, j) ← C(i, j) + A(i, k) * B(k, j)
            end
        end
    end

The time complexity of this algorithm is $n^3$ multiplications and additions.
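The triple loop translates directly into Python. The sketch below assumes square matrices stored as lists of rows and uses 0-based indices instead of the 1-based indices of the pseudocode; `matrix_multiply` is a hypothetical name.

```python
def matrix_multiply(a, b):
    """Classical N x N matrix product with N^3 scalar multiplications."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


c_example = matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The innermost statement runs once per (i, j, k) triple, which is where the count of multiplications and additions comes from.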

Can we do better using divide and conquer? Subdivide each matrix into four sub-matrices: $T(n) = b$ for $n \le 2$, and $T(n) = 8T(n/2) + cn^2$ for $n > 2$, which has $T(n) = O(n^{\lg 8}) = O(n^3)$, so eight recursive multiplications give no improvement.

Strassen’s Algorithm

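$P_1 = (A_{11} + A_{22})(B_{11} + B_{22})$

$P_2 = (A_{21} + A_{22})\,B_{11}$

$P_3 = A_{11}\,(B_{12} - B_{22})$

$P_4 = A_{22}\,(B_{21} - B_{11})$

$P_5 = (A_{11} + A_{12})\,B_{22}$

$P_6 = (A_{21} - A_{11})(B_{11} + B_{12})$

$P_7 = (A_{12} - A_{22})(B_{21} + B_{22})$

$C_{11} = P_1 + P_4 - P_5 + P_7$

$C_{12} = P_3 + P_5$

$C_{21} = P_2 + P_4$

$C_{22} = P_1 + P_3 - P_2 + P_6$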
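The seven products and four combinations can be checked with a short recursive sketch. It uses the standard Strassen signs, assumes square matrices whose size is a power of two, and recurses down to 1 × 1 blocks for clarity rather than speed; `strassen`, `add`, `sub` and `split` are hypothetical helper names.

```python
def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]


def split(m):
    """Return the four quadrants m11, m12, m21, m22 of a square matrix."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(a, b):
    """Multiply two n x n matrices (n a power of two) with 7 recursive products."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    p1 = strassen(add(a11, a22), add(b11, b22))
    p2 = strassen(add(a21, a22), b11)
    p3 = strassen(a11, sub(b12, b22))
    p4 = strassen(a22, sub(b21, b11))
    p5 = strassen(add(a11, a12), b22)
    p6 = strassen(sub(a21, a11), add(b11, b12))
    p7 = strassen(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(p1, p4), p5), p7)
    c12 = add(p3, p5)
    c21 = add(p2, p4)
    c22 = add(sub(add(p1, p3), p2), p6)
    # Stitch the quadrants back into one matrix.
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


c_strassen = strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

A practical version would switch to the classical triple loop below some block size instead of recursing to single entries.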

Strassen: 7 multiplies, 18 additions. $T(n) = b$ for $n \le 2$, and $T(n) = 7T(n/2) + (7m + 18s)n^2$ for $n > 2$, which has $T(n) = O(n^{2.81})$.

$7n^2(1/4 + 1/16 + \dots)$

Strassen-Winograd: 7 multiplies, 15 additions.

Coppersmith-Winograd: $O(n^{2.376})$ (not easily implementable).

In practice Strassen is faster (the hidden constants are not large) even for relatively small $n$, around 64, and is generally stable, but it has been demonstrated that for some matrices Strassen and Strassen-Winograd are too unstable.