Numerical Methods for Quantum ComputingNumerical Methods for Quantum Computing Thomas Huckle...

Post on 25-May-2020

9 views 0 download

Transcript of Numerical Methods for Quantum ComputingNumerical Methods for Quantum Computing Thomas Huckle...

Numerical Methods forQuantum Computing

Thomas HuckleComputer Science / MathematicsTechnische Universität München

Collaboration with MPQO, Garching

Introduction to Quantum Computing

Photon Polarization:

source filter A detector

Photon: Polarization direction

Two orthogonal Filter:

source filter A filter C detector

Three filter in 45°

source A B C detector

Explanation:

Photon polarization can be described as a directionthat is a linear combination of up↑ or right→ .

Description in a vector space of the form

ψ = a |↑> + b |→> with |a|2 + |b|2 = 1

|a|2 the probability for state |↑>

The three filters in the previous exampleare related to polarization directions

↑ (A) , (B) , and → (C ).

Filter A restricts the photon ψ to ist component ↑ .

If this photon beam reaches filter C, it is restricted to its→ orthogonal component, that is zero.

With additional filter B in between, ↑ is restrictedto its 45° component φ.

Additional filter C restricts φ to → .

A

C: Output 0

A

B

C: Output: 1/8

State space representation for quantum system:

Quantum state can be measured as |0> or |1>.

General state can be described as a |0> + b |1>with |a|2 + |b|2 =1 as superposition:

⎟⎟⎠

⎞⎜⎜⎝

⎛+⎟⎟⎠

⎞⎜⎜⎝

⎛10

01

baa

b

Measuring forces the state to be |0> or |1> .

Inner product between state vectors |x> and |y>:<x| |y> = <x|y> in bra-ket notation (equiv. to xTy).

Hilbert space containing the state vectors.

Physical quantities are described by observables a.

Observables are described by Hermitian operator A acting on the Hilbert space.

Let λ1 and λ2 be eigenvalues of A.Assume system is in superposition c1|λ1> + c2|λ2>.

Measuring a the system undergoes a change to oneof the eigenstates related to the observed eigenvalue.

c1|λ1> + c2|λ2> |λ1>.

ci denotes the amplitude of the wave function and theProbability for a to be in state |λi> is given by |ci|2.

Outer product (matrix): |x><y|, e.g. ( ) ⎟⎟⎠

⎞⎜⎜⎝

⎛=⎟⎟

⎞⎜⎜⎝

⎛=><

0010

1001

|10|

Outer product can also be used to describe transformationsof quantum states:

⎟⎟⎠

⎞⎜⎜⎝

⎛=><+><=

0110

|01||10|X

( ) ( )( )>+>=

=>+>><+><=>+>0|1|

1|0||01||10|1|0|ba

babaX

⎟⎟⎠

⎞⎜⎜⎝

⎛=⎟⎟

⎞⎜⎜⎝

⎛⋅⎟⎟⎠

⎞⎜⎜⎝

⎛ab

ba

0110

>→>>→>

0|1|1|0|:X

Different descriptions for operator X:

Quantum Bit = QubitQubit is a unit vector in a two-dimensional complexvector space with fixed basis, denoted by { |0>, |1> }

The orthonormal basis can be related e.g. to - polarization ↑ and →- spin up and down (1/2 or -1/2) of an electron or a nucleus.

The basis states |0> and |1> represent the classicalbit values 0 and 1. Each measurement gives only |0> or |1>.

But qubits can be in a superposition of |0> and |1>, e.g.a |0> + b |1> with complex a,b of norm 1.

|a|2 and |b|2 giving the probabilities of state |0> or |1>.

Quantum bit can be in infinitely many superposition states,but we can extract only a single bit‘s worth of information:

Measurement forces the state to |0> or |1> with only twopossible results.

Measurement changes the system!

Realization of Qubits?

Qubits can be realized by Nuclear Magnetic Resonance NMRas spin of a number of nuclei of a molecule in a liquid thatcontains a large number of these molecules.

Molecule: 13C trichloroethylene (TCE).

Nuclei: hydrogen nucleus (proton) has strong magnetic moment.

Inside powerful external magnetic field, each proton‘s spinprefers to align itself with the field.

By RF pulses spin direction can be induced to tip off-axisstatic field leads to precession around the main axis.

The magnetic field induced by the precessing protons can bedetected by magnetic induction.

Two possible states of spin can have different energy levelin view of the external magnetic field.

Multiple Qubits:

Consider n-particle quantum system.

Classical: the space of n-particle system, where eachparticle has two possible states, can be describedby a 2n-dimensional vector space.Cartesian product:

Quantum system: Description in 2n-dimensional vector space!Tensor product: )dim()dim()dim( YXYX ⋅=⊗

)dim()dim()dim( YXYX +=×

Four basis vector for two states:

>⊗>>⊗>>⊗>>⊗> 1|1|,0|1|,1|0|,0|0|

Excursion: Tensor product of matrices and vectors:

( ) ( ) ( ) mnkjkj

vusrsr

mnkjkj BaBAbBaA ,

1,,,

1,,,

1,, ,===

⋅=⊗⇒==

Example:

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

=⎟⎟⎠

⎞⎜⎜⎝

⎛⊗⎟⎟⎠

⎞⎜⎜⎝

52483936444033302624131222201110

13121110

4321

( ) ( ) ( )yxyxyxyyxx nmrr

njj L111

, =⊗⇒== ==

Matrices

Vectors

( ) ( ) ( )3330222011101110321 =⊗

Abbreviation:first second qubit

>>>>>=⊗> 11|,10|,01|,00|0|0|

>+>+>+>>= 11|10|01|00|| dcbax

4 basis states for two qubits:

Dimension of two qubit state space: 2*2=4

⎟⎟⎠

⎞⎜⎜⎝

⎛>⎟

⎠⎞

⎜⎝⎛+>⎟

⎠⎞

⎜⎝⎛⊗>+⎟⎟

⎞⎜⎜⎝

⎛>⎟

⎠⎞

⎜⎝⎛+>⎟

⎠⎞

⎜⎝⎛⊗>=

=>+>⊗>+>+>⊗>=>=+>+>+>>=

1|0|1|1|0|0|

)1|0|(1|)1|0|(0|11|10|01|00||

vd

vcv

ub

uau

dcbadcbax

Measuring the first bit to be |0> with probability u2=a2+b2.

Basis is given by |000> , … , |111> and each state canbe described in this basis as superposition by

a1 |000> + a2 |001> + … + a8 |111> with a1 … a8, , ||a||=1.

>>>=⎟⎟⎠

⎞⎜⎜⎝

⎛⊗⎟⎟⎠

⎞⎜⎜⎝

⎛⊗⎟⎟⎠

⎞⎜⎜⎝

⎛=

⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜

>= 0|0|0|01

01

01

00000001

000|

3 qubits lead to 23=8 dimensional space, q qubits to 2q dimensional space.

Example: Three qubits

Some states cannot be described by the decompositioninto two separated component states by a tensor product:

>+>>= 11|00|| y( ) ( ) ybdbcadacdcba >≠+>+>+>=>+>⊗>+> 11|10|01|00|1|0|1|0|

Such states are called entangled states, they have noclassical counterpart.Opposite: Pure states.

Exponential growth of state space suggests possibleexponential speed-up of computationson quantum computers!

Entangled States:

EPR ParadoxConsider two maximally entangled particles

( )>+> 11|00|2

1

One particles is send to Alice and the other to Bob.

Alice measures her particle and observes |0>.Therfore, the combined state of the two particles is |00>.

Hence, a measurement by Bob has to deliver also |0>.

There is no possibility to use this mechanism forcommunication!Alice first or Bob first?

Quantum Gates – Pauli Matrices

>−→>

>→>

>→>

>−→>

>→>

>→>

>→>

>→>

1|1|

0|0|:

0|1|

1|0|:

0|1|1|0|:

1|1|

0|0|:

Z

i

iY

X

I⎟⎟⎠

⎞⎜⎜⎝

⎛1001

⎟⎟⎠

⎞⎜⎜⎝

⎛0110

⎟⎟⎠

⎞⎜⎜⎝

⎛− 01

10i

⎟⎟⎠

⎞⎜⎜⎝

⎛−1001

Four unitary basis matrices for four-dim. space of unitary 2 x 2 matrices.

Possible transformations (quantum gate) for one qubit.

Identity

Negation, Px

Py = i PzPx

Phase shift, Pz

Two-Qubit Gate: Cnot

>→>>→>>→>>→>

10|11|11|10|01|01|00|00|:notC

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

0100100000100001

Cnot acts as the identity on the second qubit iff thefirst qubit is in state |0>;If the first qubit is |1>, the second qubit is changed like X.

( )( ) 2/)()(

2/XZIIZI

XZXIIZIICnot

⊗−+⊗+==⊗−⊗+⊗+⊗=

Representation by tensor product and Pauli matrices:

2mod:,||| +⊕>⊕>>= jiiijCnot

Graphical representation of gates:

control bit

conditional negation

Single bit operations: X

Y

Z

Two-Qubit Gate: Walsh-Hadamard Transformation

( )

( )>−>→>

>+>→>

1|0|2

11|

1|0|2

10|:HOne-dim. case:

n-qubit case: HHHW ⊗⊗⊗= L

( )

( ) ( ) ( )( )

∑−

=

>=

=>+>⊗⊗>+>⊗>+>=

>=⊗⊗⊗

12

0|

21

1|0|1|0|1|0|21

000|

n

jn

n

j

HHH

L

LL

Application generates a superposition of all 2n

possible states:

⎟⎟⎠

⎞⎜⎜⎝

⎛−

=11

112

1H

Toffoli GateControlled-controlled-not CCnot

Negates the last bit of three iff the first two are both 1.

( ) ||| >⊕∧>→ kjiijk

NAND:

i

j

1 ( ) ( ) 1⊕∧=¬ jiij

No cloningAn unknown system cannot be cloned by a unitary transform!

Proof: Assume unitary U does cloning.>>>→ ϕϕϕϕ ||0:| anyforU

( )( ) ( )

( ) ( )( )>+>+>+>=

=>+>⊗>+>>==

>+>=>+>>=>+>=>

>>=>>=

φφφϕϕφϕϕ

φϕφϕψψ

φφϕϕφϕψφϕψ

φφφϕϕϕ

|||||||||

||0|0|0|||:|

|0||0|

2

2!

cc

cUUcUc

UandU

which is not true in general.

Then it must hold:

Quantum Gate ArraysQuantum gates are always unitary operators.In order to model a classical function f(x) we considera quantum gate array Uf defined by

>⊕→> )(,|,|: xfyxyxU f

where denotes the bitwise exclusive-OR.

Uf is unitary and can be realized by quantum gate array.

To compute f(x) we apply Uf on |x,0>

( )⎩⎨⎧

=>=>

>=⊕=>0)(0,|1)(1,|

)(0,|0,|xfifxxfifx

xfxxU f

( )( ) ( ) ( ) >>=⊕⊕=>⊕=> yxxfxfyxxfyxUyxUU fff ,|)()(,|)(,|,|

Quantum ParallelismUf is applied to an input vector in superposition.Hence, Uf is applied to all basis vectors in thesuperposition simultaneously and will generatea superposition of the results.In this way it is possible to compute f(x) for 2n values of x in a single application of Uf.

Start with n-qubit state |00…0>.Apply Walsh-Hadamard transformation W superposition

( )

( ) ( )( ) ∑−

=

>=>+>⊗⊗>+>=

>=⊗⊗⊗12

0|

211|0|1|0|

21

000|n

xnn

x

HHH

L

LL

( )

∑∑−

=

=

=

>=

=>=⎟⎟⎠

⎞⎜⎜⎝

⎛>

12

0

12

0

12

0

)(,|21

0,|210,|

21

n

nn

xn

xfn

xnf

xfx

xUxU

computes f(x) by n qubits with 2n states simultaneously.Problem: Measurement and interpretation of output.

Famous quantum algorithms:-Quantum Fourier Transform

(of length 2m with m(m+1)/2 gates)-Shor‘s algorithm for factoring n-digit numbers

(in polynomial time)- Grover‘s search algorithm (O(√n))

Unitary Matrices and Lie Algebras

Quantum Algorithm Unitary Matrix

Space of unitary matrices U is a Lie Group, that is:U is a smooth manifold and a group, where multiplicationand inversion are smooth mappings.

The tangential space T to U in the identity I is a Lie Algebra:

hermitianHUiHU U ,),exp( ∈=

Mapping from the vector space of hermitian matricesin the unitary Lie group:

is called the exponential mapping.

Important Lie Groups and Lie Algebras:

SU(n): unitary matrices with det(U)=1su(n): hermitian matrices with trace = 0

exp(iH): su(n) → SU(n)

trace(H)=0 →

U(n): unitary matricesu(n): hermitian matrices

exp(iH): u(n) → U(n)HH UiUUiUiH ⋅Λ⋅=Λ= )exp()exp()exp(

1)0exp(exp

)exp())det(exp())det(exp(

==⎟⎟⎠

⎞⎜⎜⎝

⎛=

==Λ=

jj

jj

i

iiiH

λ

λ

Universal Gates

A set of elementary gates such that every QuantumTransform can be composed

A set of elementary unitary matrices such thatevery unitary matrix can be represented as a product.

Classical: NAND is universal

Quantum setting: C not with together with all single qubit gates (U(2)) in unitary group

h>>=

∂∂

)(|)()(| ttHitt

ψψ

Quantum DynamicsWave function ψ(t) describes the state of a quantum systemdepending of time t.Vector |ψ(t)> element of Hilbert space with orthogonalbasis |ψk(t)>, k=1,2,…

∑ >=<>>=k kkkk ctct ψψψψ |,)(|)(|

Change of |ψ(t)> in time is described by theHamilton operator H and follows the Schrödinger equation

Stationary solution: >⋅−>=>= )0(|)exp()0(|)()(| ψψψ iHttUt

Hamiltonian H is Hermitian.

The real eigenvalues of H are the energy levelsof stationary states described by the relatedeigenvectors:

>>= kkk EH ψψ ||

Hamiltonian H = Hdrift + Hcontrolinternal coupling external pulse

Density MatrixLet a state |ψi> appear with probability pi. Then the expectation value of observable a is <ψi|A|ψi>.Mean value of a is

∑=

><>=<n

iiii ApA

1

|| ψψ

The density matrix is defined by

∑=

><=n

iiiip

1|| ψψρ

Therefore, the expectation value can be written as

( )ATraceA ρ>=<

QFT and ShorQuantum Fourier Transform maps a quantum state intoanother state where the new amplitudes are the Fouriertransform of the original ones:

∑∑ >→>y

QFT

xyyGxxg |)(|)(

where

and x and c range over the binary representations ofintegers between 0 and N-1 = 2m-1.

))(()( xgDFTyG =

∑−

=

>→>12

0

22 |21:|

mm

c

icx

mQFT cexU π

Quantum gates for the QFT:

Shor shows that QFT can be implemented with m(m+1)/2 gates

Fourier transform is itself a unitary matrix

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

=

⎟⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜⎜

−−−−−

− 1

2

1

0

)1)(1()1(21

)1(242

12

1

2

1

0

1

11

1111

1

nnnnn

n

n

n v

vvv

n

c

ccc

M

L

MMMM

L

L

L

M

ωωω

ωωωωωω

)2exp()( niconj πωω −==

Special case n=2, 1 qubit:

HF =⎟⎟⎠

⎞⎜⎜⎝

⎛−

=11

112

12

Special case n=4, 2 qubit:

⎟⎟⎟⎟⎟

⎜⎜⎜⎜⎜

−−−−

−−=

ii

iiF

111111

111111

21

4

Fn can be reduced to F2=H, permutations, and diagonal gates:

122,

001

+−− =⎟⎟⎠

⎞⎜⎜⎝

⎛= jkjkijk jke

B πθθ SWAP:

2 qubit QFT:

|x1>

|x0>

SWAP

H

H

B12

UQFT,2(x)

If g(x) has a period r, that is a power of 2, then

∑∑ >=⎟⎠

⎞⎜⎝

⎛>

j

m

jx

QFT rjcxxgU 2||)(

So only coefficients to multiples of 2m/r are nonzero.When the period r does not divide 2m, the transformapproximates the exact case, so most of the amplitudeis attached to integers close to multiples of 2m/r.

Outline of ShorAim: Find prime factors of given integer N=pq.

Step 1: Take m < N randomly. Calculate the greatest commondivisor gcd(m,N) (Euclid).If gcd(m,N) neq 1: Great, done.

Step 2: Define function fN(a)=ma mod N.Find the smallest P with mP=1 mod N.P is called the order or period.This is done by QFT. All the rest is classical.

Step 3: If P is odd, it cannot be used: Go back to step 1.Step 4: P is even, and therefore it holds

Nmmm pPPmod0111 22 =−=⎟

⎠⎞⎜

⎝⎛ +⎟⎠⎞⎜

⎝⎛ −

If second factor = 0 mod N, go back and try other m

Step 4 continued:If second factor is not 0 mod N, thenmP/2-1 contains either p or q.(Note, that mP/2-1 cannot be a multiple of N in this case;otherwise it would hold mP/2=1 mod N which contradictsthe assumption that P is the smallest number withmP=1 mod N.)

Step 5: The number d = gcd(mP/2 -1,N) is either p or q.

Numerical Problems

1. Mathematical properties of related matrices

2. Matrix exponential

3. Quantum compiler and parallel matrix multiplication

4. Approximating the smallest eigenvalue of huge H

5. Solving special linear systems

1. Properties of Matrices in Quantum Computing

Numerical methods should take into account special propertiesof the considered matrices, as

-sparsity (e.g. PDE,…)

-structure (e.g. FFT, symplectic, tensor structure)

-general dense

Typical Matrices I

∑ ⊗⊗⊗k

kp

kkk QQQ )()(

2)(

1 Lα

{ } { } matricesPauliPPPIQ zyxk

j :,,)( ∪∈

Qj(k) describing the interaction between different

quantum states.Usually, most of the Qj

(k) are I.The other are Px, Py, or Pz

2222 IIPIIj

x ⊗⊗⊗⊗⊗⊗ LL

222222 IIPIIPIIk

z

j

z LLL ⊗⊗⊗⊗⊗⊗⊗

Typical Matrices II1D spin chain, drift Hamiltonian

1 2 3 4 5

xx

xx

xx

xx

PPIIIIPPIIIIPPIIIIPPH

⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅=

4

3

2

1:

αααα

Typical Pattern

Sparsity:O(n log(n))

Structured:constantalongdiagonals

1D spin chain, control Hamiltonian

2. Computation of exponential of a matrix

Definition: ∑∞

=

==0

!/)exp(k

kA kAeA

„19 dubious ways to compute the exponential of a matrix“Moler, van Loan

Example: Taylor expansion is numerically instable

⎟⎟⎟⎟

⎜⎜⎜⎜

−=⎟⎟

⎞⎜⎜⎝

−=⎟⎟

⎞⎜⎜⎝

⎛− ∑

∑∑

!)100(0

0!

100

!/)100(0

010010000100

exp

k

kk k

k

k

k

Cancellation for exp(-x)

Basic facts:

)exp()exp()exp( BABA ⋅≠+

Scaling and Squaring:pp

BAA p 22)2/exp()exp( ==

Allows numerical stable computation byB = exp(A/2p);for j = 1 : p

B = B*B;end

Compute exp(A/2p) and recover exp(A)!

Fastest evaluation of dense matrix polynomial:

1: 2 −= knk

( )[ ( )[ ( )

( ) ] ]LL

LL

L

L

11,11,0

12,12,0

11,11,0

10,10,0)(

−−−−

−−

−−

−−

++

⋅++++

⋅+++

⋅+++=

kkkk

kkk

kkk

kkkn

AaIa

AAaIa

AAaIa

AAaIaxs

Takes matrix multiplications: A2 ,…, Ak-1, Ak, and in theHorner scheme the products of Ak with partial polynomials:

(k-1) + (k-1) = 2k-2 = O(√n) matrix multiplications.

Total costs: O(√n N3) for N x N matrix A.

Example: n=8 and k=3

[ ( )( ) ]2

2,22,12,0

321,21,11,0

320,20,10,0)(

AaAaIa

AAaAaIa

AAaAaIaxsn

++

⋅+++

⋅+++=

Takes matrix multiplications for: A2 , A3, and two Hornerproducts of A3 with partial polynomials:

4 matrix multiplications (instead of 7 with standard Horner).

For sparse N x N matrix A:

Use Horner scheme and sparse-dense matrix multiplications.

( )( )( )( )LL

L

AaaAaAaAaaAIaAaAaIaAs

nn

nnn

++++++==+++=

−143210

10)(

sparse – dense

Costs: n-1 sparse-dense products = O(nN2)

3. Quantum Compiler

Quantum algorithm is described by unitary matrix W.Find optimal implementation of W, e.g. on NMR, using elementary Quantum gates!

Find short factorization of W in terms of a sum ofelementary tensor products of Pauli matrices.

Determine M control pulses, that lead in M time steps to W.

)()exp()exp()exp()( 11

!TUtHitHitHitU MMM =Δ−Δ−⋅Δ−= − L

∑ ⊗⊗=k

kpj

kjkjj QQH )()(

1 Lα , Qrj(k) Pauli matrices or I.

Optimization Problem( ) ( )( )

( )( )( ))(22min

)()(min)(min

)(

)(

2

)(

MH

tu

MH

MtuFMtu

tUWtrN

WtUWtUtrWtU

kj

kjkj

ℜ−=

−⋅−=−

Maximize

( )( )1

))(())(( 100:))((

UUWtr

eeWtrtuf

MH

HtuHiHtuHiHMj

j jjj jMj

L

L

ℜ=

⎟⎠⎞

⎜⎝⎛

⎟⎠⎞⎜

⎝⎛ ∑∑ℜ=

+−+−

Partial derivatives:

( )( )

( )( ))()(

)()())((

1

11

kjH

k

kjHk

HM

H

kj

kj

tUiHWtr

UUiHUUWtrtutuf

−ℜ=

−ℜ=∂

+

+ LL

GRAPE1. Define quality funktion f(uj(tk)) 2. Set initial control amplitudes u(0)

j(tk) for all times tk3. For l=1,2,…

a: Calculate the forward-propagation U(tk) for all k=1,…,Mb: Calculate the back-propagation W(tk+1) for all kc: Calculate the gradientd: Update the controls uj(tk):

)()()( )()1(

kjk

ljk

lj tu

ftutu∂∂

+=+ ε

Repeat gradient step until convergence.

Numerical Tasks

- Computation of exp(iH)- Computation of all products Uk…U1- Computation of all products Uk…U1W- Computation of the trace(WH(-iHj)U)

Parallel Multiple Matrix Multiplication

Compute

for all k=1,2,…,N with n x n – matrices U1, … , UN

kk UUUH ⋅⋅⋅= L21,1

NUUUU

UUUUU

U

L

M

321

321

21

1

⋅⋅

⋅⋅⋅

There exist fast matrix-matrix algorithms that arefaster than n3 (Strassen, group-theoretic approach)Conjecture: O(n2+ε)

Total costs sequentially: N*n3

Block Column Parallel

U1 U2 U3 U4 U5 U6 U7 U8

Distribute U8 on k processors p1 … pk together with full U7 .

p1….…..pk

):1(:,):1(:,):1(:, 1872187187

21

nnUUnnUUnUU

ppp

k

k

+⋅+⋅⋅ −L

L

Gives H7,8 = U7 U8

Block Column Parallel

U1 U2 U3 U4 U5 U6 H78

Send full U6 to all processors p1 … pk .

p1….…..pk

):1(:,):1(:,):1(:, 1786217861786

21

nnHUnnHUnHU

ppp

k

k

+⋅+⋅⋅ −L

L

Gives H6,8 = U6 U7 U8

Block Column Parallel

U1 U2 U3 U4 U5 H68

Send full U5 to all processors p1 … pk .

p1….…..pk

):1(:,):1(:,):1(:, 1685216851685

21

nnHUnnHUnHU

ppp

k

k

+⋅+⋅⋅ −L

L

Gives H5,8 = U5 U6 U7 U8

Block Column Parallel

U1 H28

Send full U1 to all processors p1 … pk .

p1….…..pk

):1(:,):1(:,):1(:, 1281212811281

21

nnHUnnHUnHU

ppp

k

k

+⋅+⋅⋅ −L

L

Gives H1,8 = U1 … U6 U7 U8

Costs in Parallel:

U1 U2 U3 U4 U5 U6 U7 U8

N-1 times n2 * n/k = (N-1)*n3 / k

For N matrices of n x n size with k processors.

Especially for 8 matrices and 4 processors: (7/4)*n3

p1….…..pk

Parallel Prefix Tree

U1 U2 U3 U4 U5 U6 U7 U8

P1 P2 P3 P4

U1 U2 U3 U4 U5 U6 U7 U8

U12 U3 U12 U34 U56 U7 U56 U78

U1234 U5 U1234U56 U1234 U567 U1234

U5678

Parallel Prefix Tree

P1 P2 P3 P4

U1 U2 U3 U4 U5 U6 U7 U8

U12 U34 U56 U78

U123 U1234 U567 U5678

U12345 U123456 U1234567 U12345678

Parallel Prefix Tree

Costs: log(N)*n3 with N/2 processors

Especially: 3*n3

A little bit more expensive than the columnwise method,

but less communication/storage.