Numerical Methods for Quantum ComputingNumerical Methods for Quantum Computing Thomas Huckle...
Transcript of Numerical Methods for Quantum ComputingNumerical Methods for Quantum Computing Thomas Huckle...
Numerical Methods forQuantum Computing
Thomas HuckleComputer Science / MathematicsTechnische Universität München
Collaboration with MPQO, Garching
Introduction to Quantum Computing
Photon Polarization:
source filter A detector
Photon: Polarization direction
Two orthogonal Filter:
source filter A filter C detector
Three filter in 45°
source A B C detector
Explanation:
Photon polarization can be described as a directionthat is a linear combination of up↑ or right→ .
Description in a vector space of the form
ψ = a |↑> + b |→> with |a|2 + |b|2 = 1
|a|2 the probability for state |↑>
The three filters in the previous exampleare related to polarization directions
↑ (A) , (B) , and → (C ).
Filter A restricts the photon ψ to ist component ↑ .
If this photon beam reaches filter C, it is restricted to its→ orthogonal component, that is zero.
With additional filter B in between, ↑ is restrictedto its 45° component φ.
Additional filter C restricts φ to → .
A
C: Output 0
A
B
C: Output: 1/8
State space representation for quantum system:
Quantum state can be measured as |0> or |1>.
General state can be described as a |0> + b |1>with |a|2 + |b|2 =1 as superposition:
⎟⎟⎠
⎞⎜⎜⎝
⎛+⎟⎟⎠
⎞⎜⎜⎝
⎛10
01
baa
b
Measuring forces the state to be |0> or |1> .
Inner product between state vectors |x> and |y>:<x| |y> = <x|y> in bra-ket notation (equiv. to xTy).
Hilbert space containing the state vectors.
Physical quantities are described by observables a.
Observables are described by Hermitian operator A acting on the Hilbert space.
Let λ1 and λ2 be eigenvalues of A.Assume system is in superposition c1|λ1> + c2|λ2>.
Measuring a the system undergoes a change to oneof the eigenstates related to the observed eigenvalue.
c1|λ1> + c2|λ2> |λ1>.
ci denotes the amplitude of the wave function and theProbability for a to be in state |λi> is given by |ci|2.
Outer product (matrix): |x><y|, e.g. ( ) ⎟⎟⎠
⎞⎜⎜⎝
⎛=⎟⎟
⎠
⎞⎜⎜⎝
⎛=><
0010
1001
|10|
Outer product can also be used to describe transformationsof quantum states:
⎟⎟⎠
⎞⎜⎜⎝
⎛=><+><=
0110
|01||10|X
( ) ( )( )>+>=
=>+>><+><=>+>0|1|
1|0||01||10|1|0|ba
babaX
⎟⎟⎠
⎞⎜⎜⎝
⎛=⎟⎟
⎠
⎞⎜⎜⎝
⎛⋅⎟⎟⎠
⎞⎜⎜⎝
⎛ab
ba
0110
>→>>→>
0|1|1|0|:X
Different descriptions for operator X:
Quantum Bit = QubitQubit is a unit vector in a two-dimensional complexvector space with fixed basis, denoted by { |0>, |1> }
The orthonormal basis can be related e.g. to - polarization ↑ and →- spin up and down (1/2 or -1/2) of an electron or a nucleus.
The basis states |0> and |1> represent the classicalbit values 0 and 1. Each measurement gives only |0> or |1>.
But qubits can be in a superposition of |0> and |1>, e.g.a |0> + b |1> with complex a,b of norm 1.
|a|2 and |b|2 giving the probabilities of state |0> or |1>.
Quantum bit can be in infinitely many superposition states,but we can extract only a single bit‘s worth of information:
Measurement forces the state to |0> or |1> with only twopossible results.
Measurement changes the system!
Realization of Qubits?
Qubits can be realized by Nuclear Magnetic Resonance NMRas spin of a number of nuclei of a molecule in a liquid thatcontains a large number of these molecules.
Molecule: 13C trichloroethylene (TCE).
Nuclei: hydrogen nucleus (proton) has strong magnetic moment.
Inside powerful external magnetic field, each proton‘s spinprefers to align itself with the field.
By RF pulses spin direction can be induced to tip off-axisstatic field leads to precession around the main axis.
The magnetic field induced by the precessing protons can bedetected by magnetic induction.
Two possible states of spin can have different energy levelin view of the external magnetic field.
Multiple Qubits:
Consider n-particle quantum system.
Classical: the space of n-particle system, where eachparticle has two possible states, can be describedby a 2n-dimensional vector space.Cartesian product:
Quantum system: Description in 2n-dimensional vector space!Tensor product: )dim()dim()dim( YXYX ⋅=⊗
)dim()dim()dim( YXYX +=×
Four basis vector for two states:
>⊗>>⊗>>⊗>>⊗> 1|1|,0|1|,1|0|,0|0|
Excursion: Tensor product of matrices and vectors:
( ) ( ) ( ) mnkjkj
vusrsr
mnkjkj BaBAbBaA ,
1,,,
1,,,
1,, ,===
⋅=⊗⇒==
Example:
⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜
⎝
⎛
=⎟⎟⎠
⎞⎜⎜⎝
⎛⊗⎟⎟⎠
⎞⎜⎜⎝
⎛
52483936444033302624131222201110
13121110
4321
( ) ( ) ( )yxyxyxyyxx nmrr
njj L111
, =⊗⇒== ==
Matrices
Vectors
( ) ( ) ( )3330222011101110321 =⊗
Abbreviation:first second qubit
>>>>>=⊗> 11|,10|,01|,00|0|0|
>+>+>+>>= 11|10|01|00|| dcbax
4 basis states for two qubits:
Dimension of two qubit state space: 2*2=4
⎟⎟⎠
⎞⎜⎜⎝
⎛>⎟
⎠⎞
⎜⎝⎛+>⎟
⎠⎞
⎜⎝⎛⊗>+⎟⎟
⎠
⎞⎜⎜⎝
⎛>⎟
⎠⎞
⎜⎝⎛+>⎟
⎠⎞
⎜⎝⎛⊗>=
=>+>⊗>+>+>⊗>=>=+>+>+>>=
1|0|1|1|0|0|
)1|0|(1|)1|0|(0|11|10|01|00||
vd
vcv
ub
uau
dcbadcbax
Measuring the first bit to be |0> with probability u2=a2+b2.
Basis is given by |000> , … , |111> and each state canbe described in this basis as superposition by
a1 |000> + a2 |001> + … + a8 |111> with a1 … a8, , ||a||=1.
>>>=⎟⎟⎠
⎞⎜⎜⎝
⎛⊗⎟⎟⎠
⎞⎜⎜⎝
⎛⊗⎟⎟⎠
⎞⎜⎜⎝
⎛=
⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜⎜
⎝
⎛
>= 0|0|0|01
01
01
00000001
000|
3 qubits lead to 23=8 dimensional space, q qubits to 2q dimensional space.
Example: Three qubits
Some states cannot be described by the decompositioninto two separated component states by a tensor product:
>+>>= 11|00|| y( ) ( ) ybdbcadacdcba >≠+>+>+>=>+>⊗>+> 11|10|01|00|1|0|1|0|
Such states are called entangled states, they have noclassical counterpart.Opposite: Pure states.
Exponential growth of state space suggests possibleexponential speed-up of computationson quantum computers!
Entangled States:
EPR ParadoxConsider two maximally entangled particles
( )>+> 11|00|2
1
One particles is send to Alice and the other to Bob.
Alice measures her particle and observes |0>.Therfore, the combined state of the two particles is |00>.
Hence, a measurement by Bob has to deliver also |0>.
There is no possibility to use this mechanism forcommunication!Alice first or Bob first?
Quantum Gates – Pauli Matrices
>−→>
>→>
>→>
>−→>
>→>
>→>
>→>
>→>
1|1|
0|0|:
0|1|
1|0|:
0|1|1|0|:
1|1|
0|0|:
Z
i
iY
X
I⎟⎟⎠
⎞⎜⎜⎝
⎛1001
⎟⎟⎠
⎞⎜⎜⎝
⎛0110
⎟⎟⎠
⎞⎜⎜⎝
⎛− 01
10i
⎟⎟⎠
⎞⎜⎜⎝
⎛−1001
Four unitary basis matrices for four-dim. space of unitary 2 x 2 matrices.
Possible transformations (quantum gate) for one qubit.
Identity
Negation, Px
Py = i PzPx
Phase shift, Pz
Two-Qubit Gate: Cnot
>→>>→>>→>>→>
10|11|11|10|01|01|00|00|:notC
⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜
⎝
⎛
0100100000100001
Cnot acts as the identity on the second qubit iff thefirst qubit is in state |0>;If the first qubit is |1>, the second qubit is changed like X.
( )( ) 2/)()(
2/XZIIZI
XZXIIZIICnot
⊗−+⊗+==⊗−⊗+⊗+⊗=
Representation by tensor product and Pauli matrices:
2mod:,||| +⊕>⊕>>= jiiijCnot
Graphical representation of gates:
control bit
conditional negation
Single bit operations: X
Y
Z
Two-Qubit Gate: Walsh-Hadamard Transformation
( )
( )>−>→>
>+>→>
1|0|2
11|
1|0|2
10|:HOne-dim. case:
n-qubit case: HHHW ⊗⊗⊗= L
( )
( ) ( ) ( )( )
∑−
=
>=
=>+>⊗⊗>+>⊗>+>=
>=⊗⊗⊗
12
0|
21
1|0|1|0|1|0|21
000|
n
jn
n
j
HHH
L
LL
Application generates a superposition of all 2n
possible states:
⎟⎟⎠
⎞⎜⎜⎝
⎛−
=11
112
1H
Toffoli GateControlled-controlled-not CCnot
Negates the last bit of three iff the first two are both 1.
( ) ||| >⊕∧>→ kjiijk
NAND:
i
j
1 ( ) ( ) 1⊕∧=¬ jiij
No cloningAn unknown system cannot be cloned by a unitary transform!
Proof: Assume unitary U does cloning.>>>→ ϕϕϕϕ ||0:| anyforU
( )( ) ( )
( ) ( )( )>+>+>+>=
=>+>⊗>+>>==
>+>=>+>>=>+>=>
>>=>>=
φφφϕϕφϕϕ
φϕφϕψψ
φφϕϕφϕψφϕψ
φφφϕϕϕ
|||||||||
||0|0|0|||:|
|0||0|
2
2!
cc
cUUcUc
UandU
which is not true in general.
Then it must hold:
Quantum Gate ArraysQuantum gates are always unitary operators.In order to model a classical function f(x) we considera quantum gate array Uf defined by
>⊕→> )(,|,|: xfyxyxU f
where denotes the bitwise exclusive-OR.
Uf is unitary and can be realized by quantum gate array.
⊕
To compute f(x) we apply Uf on |x,0>
( )⎩⎨⎧
=>=>
>=⊕=>0)(0,|1)(1,|
)(0,|0,|xfifxxfifx
xfxxU f
( )( ) ( ) ( ) >>=⊕⊕=>⊕=> yxxfxfyxxfyxUyxUU fff ,|)()(,|)(,|,|
Quantum ParallelismUf is applied to an input vector in superposition.Hence, Uf is applied to all basis vectors in thesuperposition simultaneously and will generatea superposition of the results.In this way it is possible to compute f(x) for 2n values of x in a single application of Uf.
Start with n-qubit state |00…0>.Apply Walsh-Hadamard transformation W superposition
( )
( ) ( )( ) ∑−
=
>=>+>⊗⊗>+>=
>=⊗⊗⊗12
0|
211|0|1|0|
21
000|n
xnn
x
HHH
L
LL
( )
∑
∑∑−
=
−
=
−
=
>=
=>=⎟⎟⎠
⎞⎜⎜⎝
⎛>
12
0
12
0
12
0
)(,|21
0,|210,|
21
n
nn
xn
xfn
xnf
xfx
xUxU
computes f(x) by n qubits with 2n states simultaneously.Problem: Measurement and interpretation of output.
Famous quantum algorithms:-Quantum Fourier Transform
(of length 2m with m(m+1)/2 gates)-Shor‘s algorithm for factoring n-digit numbers
(in polynomial time)- Grover‘s search algorithm (O(√n))
Unitary Matrices and Lie Algebras
Quantum Algorithm Unitary Matrix
Space of unitary matrices U is a Lie Group, that is:U is a smooth manifold and a group, where multiplicationand inversion are smooth mappings.
The tangential space T to U in the identity I is a Lie Algebra:
hermitianHUiHU U ,),exp( ∈=
Mapping from the vector space of hermitian matricesin the unitary Lie group:
is called the exponential mapping.
Important Lie Groups and Lie Algebras:
SU(n): unitary matrices with det(U)=1su(n): hermitian matrices with trace = 0
exp(iH): su(n) → SU(n)
trace(H)=0 →
U(n): unitary matricesu(n): hermitian matrices
exp(iH): u(n) → U(n)HH UiUUiUiH ⋅Λ⋅=Λ= )exp()exp()exp(
1)0exp(exp
)exp())det(exp())det(exp(
==⎟⎟⎠
⎞⎜⎜⎝
⎛=
==Λ=
∑
∏
jj
jj
i
iiiH
λ
λ
Universal Gates
A set of elementary gates such that every QuantumTransform can be composed
A set of elementary unitary matrices such thatevery unitary matrix can be represented as a product.
Classical: NAND is universal
Quantum setting: C not with together with all single qubit gates (U(2)) in unitary group
h>>=
∂∂
)(|)()(| ttHitt
ψψ
Quantum DynamicsWave function ψ(t) describes the state of a quantum systemdepending of time t.Vector |ψ(t)> element of Hilbert space with orthogonalbasis |ψk(t)>, k=1,2,…
∑ >=<>>=k kkkk ctct ψψψψ |,)(|)(|
Change of |ψ(t)> in time is described by theHamilton operator H and follows the Schrödinger equation
Stationary solution: >⋅−>=>= )0(|)exp()0(|)()(| ψψψ iHttUt
Hamiltonian H is Hermitian.
The real eigenvalues of H are the energy levelsof stationary states described by the relatedeigenvectors:
>>= kkk EH ψψ ||
Hamiltonian H = Hdrift + Hcontrolinternal coupling external pulse
Density MatrixLet a state |ψi> appear with probability pi. Then the expectation value of observable a is <ψi|A|ψi>.Mean value of a is
∑=
><>=<n
iiii ApA
1
|| ψψ
The density matrix is defined by
∑=
><=n
iiiip
1|| ψψρ
Therefore, the expectation value can be written as
( )ATraceA ρ>=<
QFT and ShorQuantum Fourier Transform maps a quantum state intoanother state where the new amplitudes are the Fouriertransform of the original ones:
∑∑ >→>y
QFT
xyyGxxg |)(|)(
where
and x and c range over the binary representations ofintegers between 0 and N-1 = 2m-1.
))(()( xgDFTyG =
∑−
=
>→>12
0
22 |21:|
mm
c
icx
mQFT cexU π
Quantum gates for the QFT:
Shor shows that QFT can be implemented with m(m+1)/2 gates
Fourier transform is itself a unitary matrix
⎟⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜⎜
⎝
⎛
⋅
⎟⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜⎜
⎝
⎛
=
⎟⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜⎜
⎝
⎛
−−−−−
−
−
− 1
2
1
0
)1)(1()1(21
)1(242
12
1
2
1
0
1
11
1111
1
nnnnn
n
n
n v
vvv
n
c
ccc
M
L
MMMM
L
L
L
M
ωωω
ωωωωωω
)2exp()( niconj πωω −==
Special case n=2, 1 qubit:
HF =⎟⎟⎠
⎞⎜⎜⎝
⎛−
=11
112
12
Special case n=4, 2 qubit:
⎟⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜⎜
⎝
⎛
−−−−
−−=
ii
iiF
111111
111111
21
4
Fn can be reduced to F2=H, permutations, and diagonal gates:
122,
001
+−− =⎟⎟⎠
⎞⎜⎜⎝
⎛= jkjkijk jke
B πθθ SWAP:
2 qubit QFT:
|x1>
|x0>
SWAP
H
H
B12
UQFT,2(x)
If g(x) has a period r, that is a power of 2, then
∑∑ >=⎟⎠
⎞⎜⎝
⎛>
j
m
jx
QFT rjcxxgU 2||)(
So only coefficients to multiples of 2m/r are nonzero.When the period r does not divide 2m, the transformapproximates the exact case, so most of the amplitudeis attached to integers close to multiples of 2m/r.
Outline of ShorAim: Find prime factors of given integer N=pq.
Step 1: Take m < N randomly. Calculate the greatest commondivisor gcd(m,N) (Euclid).If gcd(m,N) neq 1: Great, done.
Step 2: Define function fN(a)=ma mod N.Find the smallest P with mP=1 mod N.P is called the order or period.This is done by QFT. All the rest is classical.
Step 3: If P is odd, it cannot be used: Go back to step 1.Step 4: P is even, and therefore it holds
Nmmm pPPmod0111 22 =−=⎟
⎠⎞⎜
⎝⎛ +⎟⎠⎞⎜
⎝⎛ −
If second factor = 0 mod N, go back and try other m
Step 4 continued:If second factor is not 0 mod N, thenmP/2-1 contains either p or q.(Note, that mP/2-1 cannot be a multiple of N in this case;otherwise it would hold mP/2=1 mod N which contradictsthe assumption that P is the smallest number withmP=1 mod N.)
Step 5: The number d = gcd(mP/2 -1,N) is either p or q.
Numerical Problems
1. Mathematical properties of related matrices
2. Matrix exponential
3. Quantum compiler and parallel matrix multiplication
4. Approximating the smallest eigenvalue of huge H
5. Solving special linear systems
1. Properties of Matrices in Quantum Computing
Numerical methods should take into account special propertiesof the considered matrices, as
-sparsity (e.g. PDE,…)
-structure (e.g. FFT, symplectic, tensor structure)
-general dense
Typical Matrices I
∑ ⊗⊗⊗k
kp
kkk QQQ )()(
2)(
1 Lα
{ } { } matricesPauliPPPIQ zyxk
j :,,)( ∪∈
Qj(k) describing the interaction between different
quantum states.Usually, most of the Qj
(k) are I.The other are Px, Py, or Pz
2222 IIPIIj
x ⊗⊗⊗⊗⊗⊗ LL
222222 IIPIIPIIk
z
j
z LLL ⊗⊗⊗⊗⊗⊗⊗
Typical Matrices II1D spin chain, drift Hamiltonian
1 2 3 4 5
xx
xx
xx
xx
PPIIIIPPIIIIPPIIIIPPH
⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅+⊗⊗⊗⊗⋅=
4
3
2
1:
αααα
Typical Pattern
Sparsity:O(n log(n))
Structured:constantalongdiagonals
1D spin chain, control Hamiltonian
2. Computation of exponential of a matrix
Definition: ∑∞
=
==0
!/)exp(k
kA kAeA
„19 dubious ways to compute the exponential of a matrix“Moler, van Loan
Example: Taylor expansion is numerically instable
⎟⎟⎟⎟
⎠
⎞
⎜⎜⎜⎜
⎝
⎛
−=⎟⎟
⎠
⎞⎜⎜⎝
⎛
−=⎟⎟
⎠
⎞⎜⎜⎝
⎛− ∑
∑∑
!)100(0
0!
100
!/)100(0
010010000100
exp
k
kk k
k
k
k
Cancellation for exp(-x)
Basic facts:
)exp()exp()exp( BABA ⋅≠+
Scaling and Squaring:pp
BAA p 22)2/exp()exp( ==
Allows numerical stable computation byB = exp(A/2p);for j = 1 : p
B = B*B;end
Compute exp(A/2p) and recover exp(A)!
Fastest evaluation of dense matrix polynomial:
1: 2 −= knk
( )[ ( )[ ( )
( ) ] ]LL
LL
L
L
11,11,0
12,12,0
11,11,0
10,10,0)(
−−−−
−−
−−
−−
++
⋅++++
⋅+++
⋅+++=
kkkk
kkk
kkk
kkkn
AaIa
AAaIa
AAaIa
AAaIaxs
Takes matrix multiplications: A2 ,…, Ak-1, Ak, and in theHorner scheme the products of Ak with partial polynomials:
(k-1) + (k-1) = 2k-2 = O(√n) matrix multiplications.
Total costs: O(√n N3) for N x N matrix A.
Example: n=8 and k=3
[ ( )( ) ]2
2,22,12,0
321,21,11,0
320,20,10,0)(
AaAaIa
AAaAaIa
AAaAaIaxsn
++
⋅+++
⋅+++=
Takes matrix multiplications for: A2 , A3, and two Hornerproducts of A3 with partial polynomials:
4 matrix multiplications (instead of 7 with standard Horner).
For sparse N x N matrix A:
Use Horner scheme and sparse-dense matrix multiplications.
( )( )( )( )LL
L
AaaAaAaAaaAIaAaAaIaAs
nn
nnn
++++++==+++=
−143210
10)(
sparse – dense
Costs: n-1 sparse-dense products = O(nN2)
3. Quantum Compiler
Quantum algorithm is described by unitary matrix W.Find optimal implementation of W, e.g. on NMR, using elementary Quantum gates!
Find short factorization of W in terms of a sum ofelementary tensor products of Pauli matrices.
Determine M control pulses, that lead in M time steps to W.
)()exp()exp()exp()( 11
!TUtHitHitHitU MMM =Δ−Δ−⋅Δ−= − L
∑ ⊗⊗=k
kpj
kjkjj QQH )()(
1 Lα , Qrj(k) Pauli matrices or I.
Optimization Problem( ) ( )( )
( )( )( ))(22min
)()(min)(min
)(
)(
2
)(
MH
tu
MH
MtuFMtu
tUWtrN
WtUWtUtrWtU
kj
kjkj
ℜ−=
−⋅−=−
Maximize
( )( )1
))(())(( 100:))((
UUWtr
eeWtrtuf
MH
HtuHiHtuHiHMj
j jjj jMj
L
L
ℜ=
⎟⎠⎞
⎜⎝⎛
⎟⎠⎞⎜
⎝⎛ ∑∑ℜ=
+−+−
Partial derivatives:
( )( )
( )( ))()(
)()())((
1
11
kjH
k
kjHk
HM
H
kj
kj
tUiHWtr
UUiHUUWtrtutuf
−ℜ=
−ℜ=∂
∂
+
+ LL
GRAPE1. Define quality funktion f(uj(tk)) 2. Set initial control amplitudes u(0)
j(tk) for all times tk3. For l=1,2,…
a: Calculate the forward-propagation U(tk) for all k=1,…,Mb: Calculate the back-propagation W(tk+1) for all kc: Calculate the gradientd: Update the controls uj(tk):
)()()( )()1(
kjk
ljk
lj tu
ftutu∂∂
+=+ ε
Repeat gradient step until convergence.
Numerical Tasks
- Computation of exp(iH)- Computation of all products Uk…U1- Computation of all products Uk…U1W- Computation of the trace(WH(-iHj)U)
Parallel Multiple Matrix Multiplication
Compute
for all k=1,2,…,N with n x n – matrices U1, … , UN
kk UUUH ⋅⋅⋅= L21,1
NUUUU
UUUUU
U
L
M
321
321
21
1
⋅⋅
⋅⋅⋅
There exist fast matrix-matrix algorithms that arefaster than n3 (Strassen, group-theoretic approach)Conjecture: O(n2+ε)
Total costs sequentially: N*n3
Block Column Parallel
U1 U2 U3 U4 U5 U6 U7 U8
Distribute U8 on k processors p1 … pk together with full U7 .
p1….…..pk
):1(:,):1(:,):1(:, 1872187187
21
nnUUnnUUnUU
ppp
k
k
+⋅+⋅⋅ −L
L
Gives H7,8 = U7 U8
Block Column Parallel
U1 U2 U3 U4 U5 U6 H78
Send full U6 to all processors p1 … pk .
p1….…..pk
):1(:,):1(:,):1(:, 1786217861786
21
nnHUnnHUnHU
ppp
k
k
+⋅+⋅⋅ −L
L
Gives H6,8 = U6 U7 U8
Block Column Parallel
U1 U2 U3 U4 U5 H68
Send full U5 to all processors p1 … pk .
p1….…..pk
):1(:,):1(:,):1(:, 1685216851685
21
nnHUnnHUnHU
ppp
k
k
+⋅+⋅⋅ −L
L
Gives H5,8 = U5 U6 U7 U8
Block Column Parallel
U1 H28
Send full U1 to all processors p1 … pk .
p1….…..pk
):1(:,):1(:,):1(:, 1281212811281
21
nnHUnnHUnHU
ppp
k
k
+⋅+⋅⋅ −L
L
Gives H1,8 = U1 … U6 U7 U8
Costs in Parallel:
U1 U2 U3 U4 U5 U6 U7 U8
N-1 times n2 * n/k = (N-1)*n3 / k
For N matrices of n x n size with k processors.
Especially for 8 matrices and 4 processors: (7/4)*n3
p1….…..pk
Parallel Prefix Tree
U1 U2 U3 U4 U5 U6 U7 U8
P1 P2 P3 P4
U1 U2 U3 U4 U5 U6 U7 U8
U12 U3 U12 U34 U56 U7 U56 U78
U1234 U5 U1234U56 U1234 U567 U1234
U5678
Parallel Prefix Tree
P1 P2 P3 P4
U1 U2 U3 U4 U5 U6 U7 U8
U12 U34 U56 U78
U123 U1234 U567 U5678
U12345 U123456 U1234567 U12345678
Parallel Prefix Tree
Costs: log(N)*n3 with N/2 processors
Especially: 3*n3
A little bit more expensive than the columnwise method,
but less communication/storage.