Algorithms complexity

Algorithms complexity: Parallel computing. Yair Toaff 027481498, Gil Ben Artzi 025010679, Orly Margalit

description

Algorithms complexity. Parallel computing. Yair Toaff 027481498, Gil Ben Artzi 025010679, Orly Margalit 037616638. Parallel computing - MST. The problem: Given a graph G = (V, E) with weights.

Transcript of Algorithms complexity

Page 1: Algorithms complexity

Algorithms complexity

Parallel computing

Yair Toaff 027481498

Gil Ben Artzi 025010679

Orly Margalit 037616638

Page 2: Algorithms complexity

Parallel computing - MST

The problem:

Given a connected graph G = (V, E) with edge weights,

we need to find a spanning tree

with the minimum total weight (a minimum spanning tree).

Page 3: Algorithms complexity

Parallel computing - MST

Kruskal algorithm

• Sort the graph's edges by weight.

• In each step add the edge with the minimal weight that doesn’t close a cycle.
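The two Kruskal steps above can be sketched in Python as follows. This is only a minimal illustration: the function name `kruskal`, the `(weight, u, v)` edge format, and the union-find structure used for the cycle test are assumptions, not part of the slides.

```python
def kruskal(n, edges):
    """Return (total_weight, tree_edges) for a graph on vertices 0..n-1.

    edges is a list of (weight, u, v) tuples (an assumed input format).
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative of x's tree.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, tree = 0, []
    for w, u, v in sorted(edges):       # step 1: edges in order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # step 2: skip edges that close a cycle
            parent[ru] = rv
            total += w
            tree.append((u, v, w))
    return total, tree


example_edges = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]
total, tree = kruskal(4, example_edges)
```

On the small example graph the edge of weight 3 is rejected because it would close the cycle 0-1-2.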

Page 4: Algorithms complexity

Parallel computing - MST

Complexity

Single processor:

Sorting: O(m log m) = O(n^2 log n), since m = O(n^2)

Each step takes O(1), and there are O(n^2) steps

Total: O(n^2 log n)

Page 5: Algorithms complexity

Parallel computing - MST

O(m) processors:

Sorting: O(log^2 m)

Each step: O(1)

Total: O(n^2)

Page 6: Algorithms complexity

Parallel computing - MST

Prim algorithm

• Randomly choose a vertex for tree initialization.

• In every step choose the edge with minimal weight from a vertex in the tree to a vertex not in the tree.
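A minimal Python sketch of Prim's two steps follows. It assumes a connected graph given as a weight matrix in which `None` means "no edge"; the name `prim`, the `start` parameter (standing in for the random initial vertex), and the `best` array are illustrative choices rather than the slides' notation.

```python
def prim(weights, start=0):
    """Return the total MST weight of a connected graph.

    weights[u][v] is the edge weight, or None if there is no edge
    (an assumed representation).
    """
    n = len(weights)
    in_tree = [False] * n
    in_tree[start] = True
    # best[u] = lightest known edge from the tree to vertex u
    best = list(weights[start])
    total = 0
    for _ in range(n - 1):
        # Choose the lightest edge from the tree to a vertex outside it.
        v = min(
            (u for u in range(n) if not in_tree[u] and best[u] is not None),
            key=lambda u: best[u],
        )
        total += best[v]
        in_tree[v] = True
        # The new tree vertex may offer lighter edges to outside vertices.
        for u in range(n):
            w = weights[v][u]
            if not in_tree[u] and w is not None and (best[u] is None or w < best[u]):
                best[u] = w
    return total


example = [
    [None, 1, 3, None],
    [1, None, 2, None],
    [3, 2, None, 4],
    [None, None, 4, None],
]
```

Keeping the `best` array makes each step cost O(n) on one processor, which is the quantity the following slides divide among more processors.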

Page 7: Algorithms complexity

Parallel computing - MST

Complexity

Single processor:

Finding the edge in step i: O(n * i)

Total: n + 2n + … + n^2 = O(n^3)

Page 8: Algorithms complexity

Parallel computing - MST

O(n) processors:

There is a processor for each vertex so

every step takes O(n)

Total O(n^2)

Page 9: Algorithms complexity

Parallel computing - MST

O(m) processors

In each step there are more processors than edges, so

finding the minimum takes O(log n)

Total O ( n log n)

Page 10: Algorithms complexity

Parallel computing - MST

O(m^2) processors

In each step finding the minimum takes O( 1)

Total O ( n)

Page 11: Algorithms complexity

Parallel computing - MST

Sollin algorithm

• Treat every vertex as a tree

• In each step randomly choose a tree and

find the edge with the minimal weight from

a vertex in the tree to a vertex not in the tree,

then merge the two trees joined by that edge
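The tree-merging idea can be sketched in Python as below. Note the swapped-in detail: instead of picking one random tree per step, this sketch uses the round-based form in which every tree selects its lightest outgoing edge in the same round, which is the form the parallel analysis relies on. The name `sollin`, the `(weight, u, v)` edge format, and the union-find helper are assumptions.

```python
def sollin(n, edges):
    """Return the total MST weight; edges is a list of (weight, u, v)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, trees = 0, n
    while trees > 1:
        cheapest = {}  # tree root -> lightest edge leaving that tree
        for edge in edges:
            _, u, v = edge
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # both ends already in the same tree
            for r in (ru, rv):
                # Tuple comparison breaks weight ties consistently.
                if r not in cheapest or edge < cheapest[r]:
                    cheapest[r] = edge
        if not cheapest:
            break  # graph is not connected; nothing left to merge
        for w, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:  # the same edge may be chosen by both of its trees
                parent[ru] = rv
                total += w
                trees -= 1
    return total
```

Every tree is merged with at least one other tree per round, so the number of trees at least halves each round.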

Page 12: Algorithms complexity

Parallel computing - MST

Complexity:

Single processor

Same as Kruskal algorithm

Page 13: Algorithms complexity

Parallel computing - MST

O(n) processors:

There is a processor for every vertex so finding the

minimum takes O( n )

In each step only half of the trees remain so there are

O ( log n ) steps

Total O( n log n)

Page 14: Algorithms complexity

Parallel computing - MST

O(n^2) processors:

There are n processors for every vertex

so finding the minimum takes O(log n)

Total O(log^2 n)

Page 15: Algorithms complexity

Parallel computing - MST

O(n^3) processors:

There are n^2 processors for every vertex

so finding the minimum takes O(1)

Total O(log n)

Page 16: Algorithms complexity

Merge Sort

MS(p, q, c): p, q are indexes (inclusive), c is the array

If (p < q)

{MS(p, (p+q)/2, c)

MS((p+q)/2 + 1, q, c)

merge(p, (p+q)/2, q, c)

}

Page 17: Algorithms complexity

Merge Sort

Single processor

In every step the merge takes O(n), there are

O(log n) steps.

Total O( n log n )

Page 18: Algorithms complexity

Merge Sort

O(n) processors:

In every step the merge is done in parallel

time(MS(n)) = time(MS(n/2)) + time(merge(n))

By using a regular (sequential) merge we get

O(1 + 2 + 4 + … + n) = O(2^(log n + 1)) = O(n)

Page 19: Algorithms complexity

Merge Sort

Parallel merge

The problem: given 2 sorted arrays A,B

with size n/2 we need to merge them

efficiently while keeping them sorted

Page 20: Algorithms complexity

Merge Sort

Let us define 2 sub arrays:

ODD A = [a1 , a3 , a5 …]

EVEN A = [a0 , a2 , a4 …]

Page 21: Algorithms complexity

Merge Sort

And 2 functions:

Combine( A , B ) = [ a0 , b0 , a1 , b1 , … ]

Sort-combined(A): for each pair a(2i), a(2i+1), if

they are in the right order do nothing, otherwise

swap them

Page 22: Algorithms complexity

Merge Sort

Parallel merge ( A , B )

{C = parallel merge ( ODD A , EVEN B )

D = parallel merge ( ODD B , EVEN A )

L = combine ( C , D )

Return (sort-combined ( L ) )

}
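The parallel merge above can be simulated sequentially in Python. This is a sketch under stated assumptions: both inputs are sorted and have the same power-of-two length, and a base case for single-element arrays is added because the slide does not show one. The function names mirror the slide's operations; in a real parallel setting the two recursive calls and every pair in `sort_combined` would run simultaneously.

```python
def combine(c, d):
    """Interleave two equal-length lists: [c0, d0, c1, d1, ...]."""
    out = []
    for x, y in zip(c, d):
        out += [x, y]
    return out


def sort_combined(l):
    """Put each pair (l[2i], l[2i+1]) in order; pairs are independent."""
    out = list(l)
    for i in range(0, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def parallel_merge(a, b):
    """Merge sorted lists a and b of equal power-of-two length."""
    if len(a) == 1:
        # Assumed base case: one compare-exchange.
        return sort_combined([a[0], b[0]])
    c = parallel_merge(a[1::2], b[0::2])  # ODD A with EVEN B
    d = parallel_merge(b[1::2], a[0::2])  # ODD B with EVEN A
    return sort_combined(combine(c, d))
```

For example, merging [1, 4] and [2, 3] gives C = [2, 4] and D = [1, 3], the combined list [2, 1, 4, 3], and after the pairwise fix-up [1, 2, 3, 4].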

Page 23: Algorithms complexity

Merge Sort

Complexity:

Time ( parallel merge ( n ) ) =

Time ( parallel merge ( n/2) ) + O(1)

= O(log n)

Page 24: Algorithms complexity

Merge Sort

What is left is to prove that the algorithm is correct.

Theorem (zero-one principle): if a comparison-based algorithm whose comparisons do not depend on the data sorts every array of

(0 , 1) it will sort every array.

Page 25: Algorithms complexity

Merge Sort

Let us mark the number of ‘1’ in A as 1a

and in B as 1b

The number of ‘1’ in ODD A is ⌈1a/2⌉ (A is sorted and of even length, so its last element has an odd index)

The number of ‘1’ in EVEN A is ⌊1a/2⌋

Page 26: Algorithms complexity

Merge Sort

As a result, the difference between the

number of ‘1’ in C and in D is at most 1.

Array L will be sorted except maybe at the one

point where the ‘0’ and ‘1’ meet,

so sort-combined will do 1 swap at most.

Page 27: Algorithms complexity

Merge Sort

Complexity of merge sort using parallel merge:

log 1 + log 2 + log 4 + log 8 + … + log n =

0 + 1 + 2 + 3 + … + log n = O(log^2 n)

Page 28: Algorithms complexity

Sum

• Input : Array of n elements of type integer.

• Output : Sum of elements.

• One processor - O(n) operations.

• Two processors - Still O(n) operations.

Page 29: Algorithms complexity

Sum

• What could we do if we have O(n) processors?

• Parallel algorithm

– For each phase, until only one element remains:

• Each processor adds two elements together

• We now have N/2 new elements

• Complexity

– We have done more operations, so what have we gained?

– Since in each phase we stay with only half of the elements, we can view it as a binary tree where each level represents the current elements. The overall depth is O(log n) levels, each level takes O(1), for a total of O(log n) time.
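The phases of the parallel sum can be simulated in Python as below. The function name `tree_sum` and the padding of odd-length levels with 0 are assumptions made for this sketch; within a phase all additions are independent, so with one processor per pair each phase would take O(1) time.

```python
def tree_sum(values):
    """Return (sum, number_of_phases) using pairwise addition phases."""
    level = list(values)
    phases = 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad so every element has a partner
        # One phase: each "processor" adds one pair of elements.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        phases += 1
    return (level[0] if level else 0), phases
```

For eight elements the simulation needs three phases, matching the depth log n of the binary tree.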

Page 30: Algorithms complexity

Max1 – Max2

• Input: Array of n elements of type integer.

• Output: The first and the second maximum elements in the array.

• One processor: 2n operations.

• Two processors: each insertion takes 3 operations (compare to each of the other elements that are candidates), 2n/3 operations.

Page 31: Algorithms complexity

Max1 – Max2

• Parallel algorithm - recursive solution

– Divide into 2 groups (G1, G2).

– Find MAX for each group (LocalM1, LocalM2)

– If LocalM1 > LocalM2

• Create new group G3 := (LocalM2 + G1)

• MAX2 must be in G3, since in G2 there is no element that is bigger than LocalM2

Page 32: Algorithms complexity

Max1 – Max2

• Example

– End of recursion

M1[10] * M1[7] * M1[1] * M1[3] * M1[100] * M1[8] * M1[55] * M1[6]

– Up one phase

M1[10],M2[7] * M1[3],M2[1] * M1[100],M2[8] * M1[55],M2[6]

– Up one phase

M1[10],M2[7,3] * M1[100],M2[8,55]

– The result

M1[100] * M2[10,8,55]
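The recursion traced in the example can be sketched in Python as follows. Each call returns the local maximum together with the list of candidates for the second maximum, that is, the elements that lost a direct comparison to it. The names `max1_max2` and `solve` are hypothetical, and the sketch assumes at least two elements.

```python
def max1_max2(xs):
    """Return (largest, second largest) of a list with >= 2 elements."""

    def solve(lo, hi):
        # Returns (local max, elements that lost directly to it).
        if hi - lo == 1:
            return xs[lo], []
        mid = (lo + hi) // 2
        m1, c1 = solve(lo, mid)   # the two halves are independent,
        m2, c2 = solve(mid, hi)   # so they could run in parallel
        if m1 >= m2:
            return m1, c1 + [m2]
        return m2, c2 + [m1]

    best, candidates = solve(0, len(xs))
    return best, max(candidates)
```

The candidate list gains one element per level, so it holds about log n elements at the end, which is why the final search for Max2 is cheap.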

Page 33: Algorithms complexity

Max1 – Max2

• Complexity

– 1 processor

• n operations comparing all elements in the tree for Max1, log n operations comparing elements for Max2, total (n + log n)

– O(n) processors

• We could find Max1 and rerun the algorithm to find Max2, each in log n, total of 2 log n.

• However, we can use the previous algorithm and build G3 in parallel, and we get log n for finding Max1, log log n for finding Max2

Page 34: Algorithms complexity

Max & Min groups

• Input: 2 groups (G1, G2) of sorted elements

• Output: 2 groups (G1`, G2`), where in one group all elements are bigger than all the elements in the other group

• One processor: insert all elements into 2 stacks, always compare the stack heads; the minimum is inserted into the Min group.

• Complexity: O(n) operations

Page 35: Algorithms complexity

Max & Min groups

• There is a major subtlety when trying to apply the previous algorithm to parallel computing: each element must be compared again and again until we find an element that is larger than it.

• We would like a method that compares each element with the others as few times as possible; the best is only one comparison per element.

• Any member of the min group is necessarily smaller than at least half of the elements.

• If we could establish this for an element, we could classify it into the right group immediately.

• Any suggestions?

Page 36: Algorithms complexity

Max & Min groups

• Parallel algorithm

– Insert all elements from G1 into list L1 in reverse order, and all elements of G2 into list L2 in regular order (each list has n elements, indexed from 0)

– Element j in L1 is bigger than n-j-1 elements of its list

– Element j in L2 is bigger than j elements of its list

– So, by comparing element i in both lists we get

• If L1[i] > L2[i], L1[i] is bigger than n-i-1 elements in L1 and i+1 elements in L2 (including L2[i]), a total of n elements. L2[i] is smaller than n-i-1 elements of L2 and i+1 elements of L1, a total of n elements.

• And vice versa

– We can now insert the elements immediately into their groups

Page 37: Algorithms complexity

Max & Min groups

• Example

– Groups

• G1 = 7,10,100,101

• G2 = 1,11,18,99

– Lists

• L1 = 101,100,10,7

• L2 = 1,11,18,99

– Comparing: (101,1),(100,11),(10,18),(7,99)

– Result: G1’ = 101,100,18,99, G2’ = 1,11,10,7
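The single-comparison split can be written as a short Python sketch. It assumes both groups are sorted in ascending order and have the same size; the name `split_max_min` is hypothetical. Each position i is handled independently, which is what allows one processor per position.

```python
def split_max_min(g1, g2):
    """Split two sorted, equal-size groups into (max group, min group)."""
    l1 = g1[::-1]   # G1 in reverse order
    l2 = list(g2)   # G2 in regular order
    max_group = [max(a, b) for a, b in zip(l1, l2)]
    min_group = [min(a, b) for a, b in zip(l1, l2)]
    return max_group, min_group


result = split_max_min([7, 10, 100, 101], [1, 11, 18, 99])
```

Running it on the groups from the example reproduces the result listed there.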

Page 38: Algorithms complexity

Max & Min groups

• Complexity

– We have compared element i of each list

– Each element has only one comparison

– O(n) processors, O(1) time!

– Can we do better for one processor now?

Page 39: Algorithms complexity

Signed elements

• Input: Array of elements, some of them are signed

• Output: 2 arrays of elements, one containing the signed elements and the other the unsigned ones, keeping the order between the elements

• One processor

– Make one pass, drop each element into the correct array

– O(n) operations

• Since we need to maintain the order between the elements, we must know for each element how many elements should come before it

• How could we improve the algorithm by adding more processors?

Page 40: Algorithms complexity

Signed elements array

• Parallel algorithm

– Create another array (A2), where each location of a signed element holds 1 and each location of an unsigned element holds 0

– Now we can run the parallel prefix algorithm to obtain each signed element's position in the destination array

– We can do the same for the unsigned elements

Page 41: Algorithms complexity

Signed elements array

• Example

– Input: [x1, x2, x3`, x4, x5`, x6, x7`, x8`, x9]

– A2: [0, 0, 1, 0, 1, 0, 1, 1, 0]

– Prefix: [0, 0, 1, 1, 2, 2, 3, 4, 4]

– Result: x3` at position 1, x5` at position 2, x7` at position 3, x8` at position 4

• Complexity

– O(n) processors, O(log n) time!
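A Python sketch of the prefix-based split follows. The prefix routine simulates a doubling scheme with about log n rounds, which is one common way to realize a parallel prefix and is an assumption here, since the slides do not say which variant they use. The names `prefix_sums` and `split_signed` and the 0/1 `flags` list (playing the role of A2) are illustrative.

```python
def prefix_sums(values):
    """Inclusive prefix sums computed in about log2(n) doubling rounds."""
    sums = list(values)
    step = 1
    while step < len(sums):
        # In one round every position reads the value `step` places back;
        # all positions can be updated at the same time.
        sums = [
            sums[i] + (sums[i - step] if i >= step else 0)
            for i in range(len(sums))
        ]
        step *= 2
    return sums


def split_signed(items, flags):
    """Stable split: flags[i] == 1 marks items[i] as signed."""
    if not items:
        return [], []
    pos = prefix_sums(flags)
    neg = prefix_sums([1 - f for f in flags])
    signed = [None] * pos[-1]
    unsigned = [None] * neg[-1]
    for i, (x, f) in enumerate(zip(items, flags)):
        # Each element knows its destination from its own prefix value.
        if f:
            signed[pos[i] - 1] = x
        else:
            unsigned[neg[i] - 1] = x
    return signed, unsigned
```

Because every element computes its destination from its own prefix value, the final placement needs no coordination between elements.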

Page 42: Algorithms complexity

Scheduling

• Input: Array of jobs, containing the execution time of each job and the deadline for finishing it.

• Output: Is there a schedule satisfying all the deadlines?

• Parallel algorithm

– Sort the jobs by deadline

– Compute the prefix sums of the execution times

– A schedule exists iff PrefixExecTime(i) <= DeadLine[i] for every i

• Complexity with O(n) processors

– O(log^2 n) to sort, O(log n) to do prefix, O(1) to compare
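The feasibility test can be sketched sequentially in Python. Assumptions made for this illustration: a single machine, all jobs available at time 0, no preemption, and a job that finishes exactly at its deadline counts as on time. The name `feasible` and the `(exec_time, deadline)` tuple format are hypothetical.

```python
from itertools import accumulate


def feasible(jobs):
    """jobs is a list of (exec_time, deadline) pairs."""
    ordered = sorted(jobs, key=lambda job: job[1])   # sort by deadline
    finish = accumulate(t for t, _ in ordered)       # prefix of exec times
    # Every job must finish no later than its own deadline.
    return all(f <= d for f, (_, d) in zip(finish, ordered))
```

In the parallel version the sort, the prefix sums, and the per-job comparisons map onto the three complexity terms listed above.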

Page 43: Algorithms complexity

CAG - Clique

• Input: CAG

• Output: the maximum clique

• Reminder

– Clique: a set of vertexes in which there is an edge between every two vertexes of the set

– CAG: Circular Arc Graph, a graph where each vertex is an arc on a circle. There is an edge between two vertexes iff their arcs share a segment of the circle

Page 44: Algorithms complexity

CAG – Clique

• Examples

– Clique [V1,V2,V3]

– CAG

[Figure: two diagrams, each on v1, v2, v3, v4: a graph illustrating the clique and a CAG]

Page 45: Algorithms complexity

CAG - Clique

• Parallel algorithm

– Loop through the element list twice

• If Element == start of a vertex, BoundriesArray[i] = +1;

• If Element == end of a vertex, and we have already passed the start of this vertex, BoundriesArray[i] = -1;

– PrefixArray := Prefix(BoundriesArray)

– MaxClique := Max(PrefixArray)

Page 46: Algorithms complexity

CAG - Clique

• Example, CAG from previous slide

– BoundriesArray [(v1,+),(v2,+),(v1,-),(v4,+),(v3,-),(v4,-),(v2,+),(v1,+),(v3,+),(v2,-),(v1,-)]

– PrefixArray [1,2,1,2,1,0,1,2,3,2,1]

– MaxClique is 3!

• Note: We need to loop twice through the list of vertexes, since we count the end of a vertex only if we have already passed its start.

Page 47: Algorithms complexity

CAG – Clique

• Complexity

– One processor: O(n)

– O(n) processors: log n + log n

– O(n^2) processors: log n + O(1)

Page 48: Algorithms complexity

Exclusive Read & Exclusive Write

• EREW

• The simplest computer

• Only one processor can read/write to a certain memory block at a time

Page 49: Algorithms complexity

Concurrent Read & Exclusive Write

• CREW

• Only one processor can write to a certain memory block at a time.

• Multiple processors can simultaneously read from a common memory block.

Page 50: Algorithms complexity

Exclusive Read & Concurrent Write

• ERCW

• Only one processor can read a certain memory block at a time.

• Multiple processors can simultaneously write to a common memory block.

Page 51: Algorithms complexity

Concurrent Read & Concurrent Write

• CRCW

• Most powerful computer

• Very complex memory control

• Multiple processors can simultaneously read/write to a common memory block

Page 52: Algorithms complexity

Concurrent Write

Problem:

• When multiple processors write different values to a common memory block, every processor overwrites the previous processor’s value.

[Figure: Processor 1, Processor 2 and Processor 3 all writing to one MemoryBlock]

Page 53: Algorithms complexity

Concurrent Write

Solution 1:

• Restrict Write: only one agreed value may be written to the memory block, so all writers store the same value.

[Figure: Processor 1, Processor 2 and Processor 3 each write 1; the memory block holds 1]

Page 54: Algorithms complexity

Concurrent Write

Solution 2:

• Combine Write: a unique value is stored for every distinct processor in the shared memory block.

[Figure: Processor 1 writes 1, Processor 2 writes 2, Processor 3 writes 4; the memory block holds 1,2,4]

Page 55: Algorithms complexity

Restrict Write

A good example of Restrict Write is a Boolean problem.

[Figure: inputs X1, X2, X3 and a Result cell]

Page 56: Algorithms complexity

Restrict Write

Inputs X1, X2, X3 and a shared Result cell.

Initial value: Result = 0

Only the value one is written to Result

result = 0;

For i = 1 to n doip (do in parallel) {

if (Xi == 1)

then result = 1;

}

Page 57: Algorithms complexity

Max Value - O(n^2) Processors

Reminder:

One processor : O(n) operations.

O(n) processors : O(log n) operations.

O(n^2) processors : ?

We can represent the comparison between numbers as a matrix. If x1 < x2 then coordinate (1,2) gets a value of one, else it gets a value of zero.

Page 58: Algorithms complexity

Max Value - O(n^2) Processors

• A processor is allocated for each cell in the matrix.

• All the processors with “value = 1” write simultaneously to the result cell in their row.

[Figure: 3 x 3 matrix of cells (1,1) to (3,3), with columns X1, X2, X3, rows Row1, Row2, Row3 for X1, X2, X3, and a Result column]

Page 59: Algorithms complexity

Max Value - O(n^2) Processors

Total operations with O(n^2) processors : O(1)

– Generating the Matrix : O(1) operations (one processor per cell)

– Generating the result column : O(1) operations

x   3   6   4   Result
3   0   1   1   1
6   0   0   0   0   (Max Value)
4   0   1   0   1
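The comparison-matrix idea can be simulated in Python as below. The nested loops stand in for the n^2 processors, one per cell (i, j); in the concurrent-write model all of them would act in a single step, and since every writer stores the same value 1, the write conflict is harmless. The name `crcw_max` is hypothetical and the list is assumed to be non-empty.

```python
def crcw_max(xs):
    """Return the maximum of a non-empty list via a comparison matrix."""
    n = len(xs)
    result = [0] * n  # one result cell per row, initially 0
    for i in range(n):
        for j in range(n):
            # Cell (i, j) holds 1 when xs[i] < xs[j]; such a cell writes 1
            # into the result cell of row i.
            if xs[i] < xs[j]:
                result[i] = 1
    # The maximum is an element whose result cell was never written.
    winners = [x for x, r in zip(xs, result) if r == 0]
    return winners[0]
```

For the values 3, 6, 4 only the row of 6 keeps a result of 0, so 6 is returned.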

Page 60: Algorithms complexity

Sort - O(n^2) Processors

Reminder:

One processor : O(n log n) operations.

O(n) processors : O(log^2 n) operations (merge sort)

O(n^2) processors : ?

• As before, we generate a comparison matrix.

• The result cells will receive the sum of the current row. Each row has O(n) processors, therefore the sum operation takes O(log n) operations.

• The result column gives each element's index in the array sorted in descending order.

Page 61: Algorithms complexity

Sort - O(n^2) Processors

Total operations with O(n^2) processors : O(log n)

– Generating the Matrix : O(1) operations (one processor per cell)

– Generating the result column : O(log n) operations

x   3   6   4   Result
3   0   1   1   2
6   0   0   0   0
4   0   1   0   1
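The rank-based sort can be sketched in Python as follows. It assumes distinct values, since equal elements would receive the same rank and collide; the name `rank_sort_desc` is hypothetical. Each row sum is written with `sum` here, where the parallel version would use the O(log n) tree sum.

```python
def rank_sort_desc(xs):
    """Return (ranks, list sorted in descending order); values distinct."""
    n = len(xs)
    # Comparison matrix: less[i][j] = 1 when xs[i] < xs[j].
    less = [[1 if xs[i] < xs[j] else 0 for j in range(n)] for i in range(n)]
    # Row sum = number of elements larger than xs[i] = its target index.
    ranks = [sum(row) for row in less]
    out = [None] * n
    for i, r in enumerate(ranks):
        out[r] = xs[i]
    return ranks, out
```

For the values 3, 6, 4 the row sums are 2, 0, 1, which places 6 first, 4 second and 3 last.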

Page 62: Algorithms complexity

Multiplication Of Matrix

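• Matrixes that can be multiplied must obey the dimension law: RnCm * RmCk (the number of columns of the first matrix equals the number of rows of the second)

| a11 a12 |   | b11 b12 |   | a11b11 + a12b21   a11b12 + a12b22 |
| a21 a22 | * | b21 b22 | = | a21b11 + a22b21   a21b12 + a22b22 |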

Page 63: Algorithms complexity

Multiplication Of Matrix

Input: Two matrixes of size n*n (Mnn)

Output: One matrix Mnn

Total operations with one processor : O(n^3)

• n^2 cells

• Sum of each cell with O(n) variables and one processor, O(n) operations

Page 64: Algorithms complexity

Multiplication Of Matrix

Total operations with O(n) processors : O(n^2)

• Processor per cell in a column.

• n columns

• Sum of each cell with O(n) variables and one processor, O(n) operations

O(n) sum * n columns = O(n^2)

Page 65: Algorithms complexity

Multiplication Of Matrix

Total operations with O(n^2) processors : O(n)

• n^2 cells

• Processor per cell

• Sum of each cell with O(n) variables and one processor, O(n) operations

O(n) sum * 1 cell = O(n)

Each cell is summed simultaneously

Page 66: Algorithms complexity

Multiplication Of Matrix

Total operations with O(n^3) processors : O(log n)

• n^2 cells

• O(n) processors per cell

• Sum of each cell with O(n) variables and O(n) processors, O(log n) operations

O(log n) sum * 1 cell = O(log n)

Each cell is summed simultaneously

Page 67: Algorithms complexity

Multiplication Of Boolean Matrix

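Total operations with O(n^3) processors : O(1)

• n^2 cells

• O(n) processors per cell

• Sum of each cell with O(n) variables and O(n) processors, O(1) operations (the Boolean sum is an OR, which a concurrent write computes in one step, as in the Restrict Write example)

O(1) sum * 1 cell = O(1)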

Each cell is summed simultaneously

Page 68: Algorithms complexity

Shortest Path Between Vertexes

Problem:

• Finding if a path exists between 2 vertexes

• Finding the shortest path between 2 vertexes

[Figure: graph on vertexes V1, V2, V3, V4 with four arcs of weight 1]

Page 69: Algorithms complexity

Shortest Path Between Vertexes

• Represent the graph as a matrix Ann.

• If an arc exists between vertex X1 and X2, then coordinates (1,2) & (2,1) get a value of one, otherwise zero.

• Matrix Ann - all the vertexes that are of one arc distance from each other.

[Figure: the example graph on V1, V2, V3, V4 and its 4 x 4 matrix Ann, whose entries are 0 and 1]

Page 70: Algorithms complexity

Shortest Path Between Vertexes

• Matrix Ann^2 - all the vertexes that are of two arcs distance from each other.

• Ann + Ann^2 = all routes of distance of one and two arcs.

[Figure: the example graph on V1, V2, V3, V4 and the 4 x 4 matrix Ann^2, whose entries are 0 and 2]

Page 71: Algorithms complexity

Shortest Path Between Vertexes

• Ann + Ann^2 + Ann^3 + … + Ann^n = B - all routes of distance 1 to n arcs.

• Any zero value in matrix B means that no link exists between the two vertexes.

[Figure: the example graph on V1, V2, V3, V4 and a 4 x 4 matrix whose entries are 1 and 2]
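The matrix-power idea can be sketched sequentially in Python. The sketch records, for every pair of vertexes, the smallest power k for which entry (i, j) of the k-th power is non-zero, that is, the length in arcs of a shortest route; `None` means no route of up to n arcs exists. The 4-cycle used as sample data is an assumption (the slide's figure is not recoverable in detail), and the names `mat_mul` and `path_lengths` are hypothetical.

```python
def mat_mul(x, y):
    """Plain O(n^3) product of two n x n matrices."""
    n = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def path_lengths(adj):
    """dist[i][j] = fewest arcs on a route from i to j, or None."""
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    power = adj  # holds adj^k at the start of iteration k
    for k in range(1, n + 1):
        for i in range(n):
            for j in range(n):
                if dist[i][j] is None and power[i][j] > 0:
                    dist[i][j] = k  # first power with a route i -> j
        power = mat_mul(power, adj)
    return dist


# Assumed sample graph: the cycle V1 - V2 - V3 - V4 - V1.
cycle = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
dist = path_lengths(cycle)
```

Note that a diagonal entry such as `dist[0][0]` reports the shortest closed route (length 2 on the cycle), not zero.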

Page 72: Algorithms complexity

Shortest Path Between Vertexes

Total operations with 1 processor : O(n^4)

• Building of Matrix Ann : O(n) operations

• Multiplication of matrix : O(n^3) operations

• Creation of Ann, Ann^2, Ann^3, …, Ann^n : O(n^4) operations

• Sum of the Matrixes : O(n^3) operations

Page 73: Algorithms complexity

Shortest Path Between Vertexes

Total operations with O(n) processors : O(n^3)

• Building of Matrix Ann : O(1) operations

• Multiplication of matrix : O(n^2) operations

• Creation of Ann, Ann^2, Ann^3, …, Ann^n : O(n^3) operations

• Sum of the Matrixes : O(n^2) operations (n cells * n columns)

Page 74: Algorithms complexity

Shortest Path Between Vertexes

Total operations with O(n^2) processors : O(n^2)

• Building of Matrix Ann : O(1) operations

• Multiplication of matrix : O(n) operations

• Creation of Ann, Ann^2, Ann^3, …, Ann^n : O(n^2) operations

• Sum of the Matrixes : O(n) operations (processor per cell)

Page 75: Algorithms complexity

Shortest Path Between Vertexes

Total operations with O(n^3) processors : O(n log n)

• Building of Matrix Ann : O(1) operations

• Multiplication of matrix : O(log n) operations

• Creation of Ann, Ann^2, Ann^3, …, Ann^n : O(n log n) operations

• Sum of the Matrixes : O(log n) operations (O(n) processors per cell)

Page 76: Algorithms complexity

Shortest Path Between Vertexes

Total operations with O(n^4) processors : O(log^2 n)

• Building of Matrix Ann : O(1) operations

• Multiplication of matrix : O(log n) operations with O(n^3) processors

• Creation of Ann, Ann^2, Ann^3, …, Ann^n : O(log^2 n) operations (prefix algorithm)

• Sum of the Matrixes : O(log n) operations

• Boolean Output (link exists, True or False) : O(log n) operations