
A Simpler 1.5-Approximation Algorithm for Sorting by Transpositions

Combinatorial Pattern Matching (CPM) 2003

Authors: T. Hartman & R. Shamir

Speaker: Chen Ming-Chiang


Outline

Sorting by Transpositions
Previous Works
Linear & Circular Permutation
The Breakpoint Graph
Algorithm
Performance Ratio & Running Time
Discussions
References


Sorting by Transpositions

Given two sequences representing two species, find the smallest number of transpositions needed to transform one sequence into the other.

Transposition: swap two adjacent substrings of any length without changing the order of the elements inside each substring.

Example: 2 4 1 3 → 1 3 2 4 → 1 2 3 4
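The example above can be reproduced with a small sketch. The function name `apply_transposition` and the 0-based cut positions `i < j < k` are illustrative assumptions, not notation from the talk.

```python
def apply_transposition(perm, i, j, k):
    """Swap the adjacent blocks perm[i:j] and perm[j:k] (0-based, i < j < k)."""
    if not 0 <= i < j < k <= len(perm):
        raise ValueError("need 0 <= i < j < k <= len(perm)")
    # The order inside each block is preserved; only the blocks trade places.
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


step1 = apply_transposition([2, 4, 1, 3], 0, 2, 4)  # swap "2 4" with "1 3"
step2 = apply_transposition(step1, 1, 2, 3)         # swap "3" with "2"
print(step1, step2)
```

Two transpositions sort this permutation, matching the chain shown on the slide.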


Previous Works

$O(n^2)$ 1.5-approximation algorithm [BP98]: V. Bafna & P. A. Pevzner, SIAM Journal on Discrete Mathematics, 1998.

$O(n^4)$ 1.5-approximation algorithm [C99]: D. A. Christie, 1999.

At most $2n/3$ transpositions are needed to sort any permutation of size $n$ [EEKSW01].


Linear & Circular Permutations

Theorem 1: The problem of sorting linear permutations by transpositions is equivalent to the problem of sorting circular permutations by transpositions.

Linear: 1 4 7 2 6 5 3 → 1 2 6 4 7 5 3

Circular: [Figure: a transposition on a circular permutation of the elements 1 to 7.]


The Breakpoint Graph

Permutation $\pi$: 1 6 5 4 7 3 2

Step 1: Replace each element $i$ by $2i-1$ and $2i$.

Permutation $f(\pi)$: 1 2 11 12 9 10 7 8 13 14 5 6 3 4
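Step 1 can be sketched in a few lines. The function name `double_permutation` is a hypothetical label for the map $f$, chosen here for illustration.

```python
def double_permutation(perm):
    """Map each element i of perm to the pair 2i-1, 2i, keeping the order."""
    doubled = []
    for value in perm:
        doubled.extend((2 * value - 1, 2 * value))
    return doubled


print(double_permutation([1, 6, 5, 4, 7, 3, 2]))
# [1, 2, 11, 12, 9, 10, 7, 8, 13, 14, 5, 6, 3, 4]
```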

[Figure: the permutation and its doubled form $f(\pi)$ drawn as vertices on a circle.]


The Breakpoint Graph

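Graph $G(\pi)$ is an edge-colored graph on the vertices $1, \dots, 2n$. Below, $\pi_j$ denotes the $j$-th element of $f(\pi)$, and indices are taken modulo $2n$ because the permutation is circular.

For every $1 \le i \le n$:

Black edge: $(\pi_{2i}, \pi_{2i+1})$

Gray edge: $(2i, 2i+1)$

[Figure: the breakpoint graph of the example on vertices 1 to 14.]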


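The Breakpoint Graph

$c(\pi)$: the number of cycles in $G(\pi)$.

$c_{odd}(\pi)$: the number of odd cycles. An odd cycle is a cycle with an odd number of black edges.

For a transposition $\tau$: $\Delta c_{odd}(\pi, \tau) = c_{odd}(\tau \cdot \pi) - c_{odd}(\pi)$

[Figure: the breakpoint graph of the example on vertices 1 to 14.] For this example, $c(\pi) = 3$ and $c_{odd}(\pi) = 1$.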
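The cycle counts can be checked with a short sketch. It assumes the circular convention used in the example (black edges between consecutive pairs of $f(\pi)$, gray edges between $2i$ and $2i+1$, both wrapping around), and the name `cycle_counts` is hypothetical.

```python
def cycle_counts(perm):
    """Return (c, c_odd) for the circular breakpoint graph of perm.

    Assumes perm is a permutation of 1..n that is read circularly.
    """
    n = len(perm)
    doubled = []
    for value in perm:
        doubled.extend((2 * value - 1, 2 * value))

    # Black edge: right copy of one element to the left copy of the next one.
    black = {}
    for i in range(n):
        u = doubled[2 * i + 1]
        v = doubled[(2 * i + 2) % (2 * n)]
        black[u], black[v] = v, u

    # Gray edge: vertex 2i to vertex 2i+1, where 2n+1 wraps around to 1.
    gray = {}
    for i in range(1, n + 1):
        u = 2 * i
        v = 2 * i + 1 if i < n else 1
        gray[u], gray[v] = v, u

    seen = set()
    c = c_odd = 0
    for start in range(1, 2 * n + 1):
        if start in seen:
            continue
        black_edges = 0
        v = start
        # Walk the alternating cycle: one black edge, then one gray edge.
        while v not in seen:
            seen.add(v)
            w = black[v]
            seen.add(w)
            black_edges += 1
            v = gray[w]
        c += 1
        c_odd += black_edges % 2
    return c, c_odd


print(cycle_counts([1, 6, 5, 4, 7, 3, 2]))  # (3, 1)
```

For the running example this gives one 3-cycle and two 2-cycles, which agrees with the values stated on the slide.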


The Breakpoint Graph

For a sorted permutation of size $n$, there are $n$ cycles, and all of them are odd: $c(\pi) = n$ and $c_{odd}(\pi) = n$.

The idea is to increase the number of cycles from $c(\pi)$ to $n$.

[Figure: a breakpoint graph on vertices 1 to 14.]


The Breakpoint Graph

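Lemma [BP98]: For all permutations $\pi$ and transpositions $\tau$, $\Delta c(\pi, \tau) \in \{-2, 0, 2\}$ and $\Delta c_{odd}(\pi, \tau) \in \{-2, 0, 2\}$.

Theorem (lower bound): For all permutations $\pi$, $d(\pi) \ge \frac{n - c_{odd}(\pi)}{2}$, where $d(\pi)$ is the minimum number of transpositions needed to sort $\pi$.

The goal now is to increase the number of odd cycles.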
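A short derivation sketch of the lower bound, under the assumption that $t$ transpositions $\tau_1, \dots, \tau_t$ sort $\pi$. The intermediate permutations $\pi^{(s)}$ are notation introduced here for illustration only. The derivation uses two facts from the slides: a sorted permutation has $n$ odd cycles, and each transposition changes $c_{odd}$ by at most 2.

```latex
% Notation (assumed): \pi^{(0)} = \pi, \pi^{(s)} = \tau_s \cdot \pi^{(s-1)},
% and \pi^{(t)} is the sorted permutation, so c_{odd}(\pi^{(t)}) = n.
\begin{align*}
n - c_{odd}(\pi)
  &= c_{odd}(\pi^{(t)}) - c_{odd}(\pi^{(0)}) \\
  &= \sum_{s=1}^{t} \Delta c_{odd}(\pi^{(s-1)}, \tau_s) \\
  &\le 2t
  \quad\Longrightarrow\quad
  t \ge \frac{n - c_{odd}(\pi)}{2}.
\end{align*}
```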


The Breakpoint Graph

Simple graph: A graph is called simple if it contains only cycles with at most 3 black edges. A cycle with $k$ black edges is called a $k$-cycle.

[Figure: a 1-cycle, a 2-cycle, and a 3-cycle.]


Algorithm

Transform the graph $G(\pi)$ into a simple graph $G(\hat{\pi})$.

While a 2-cycle exists, apply a 2-transposition.

While 3-cycles exist:
    If an oriented cycle exists, apply a 2-transposition.
    If interleaving unoriented cycles exist, apply a (0,2,2)-sequence of transpositions (three transpositions whose $\Delta c_{odd}$ values are 0, 2, and 2).
    If a shattered unoriented cycle exists, apply a (0,2,2)-sequence of transpositions.

Mimic the sorting of $\pi$ using the sorting of $\hat{\pi}$.


Algorithm

Transforming the graph into a simple graph ([HP99] & [LX2001]):

Lemma: Every permutation can be transformed into a simple one by safe splits.

Lemma: Let $\hat{\pi}$ be a simple permutation that is equivalent to $\pi$. Then every sorting of $\hat{\pi}$ mimics a sorting of $\pi$ with the same number of operations.


Algorithm

Example: 1 3 5 7 4 2 6 → 1 3 5 7 4 x 2 6

1 odd cycle → 1 odd cycle + 1 cycle

[Figure: the breakpoint graph on vertices 1 to 14 before and after the split, with labels b1, b2, b3, g, v, and w.]


Algorithm

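Why is the transformation safe?

The process breaks a $k$-cycle into a 3-cycle and a $(k-2)$-cycle, so

$c_{odd}(\hat{\pi}) = c_{odd}(\pi) + 1$ and $\hat{n} = n + 1$.

Hence $\hat{n} - c_{odd}(\hat{\pi}) = n - c_{odd}(\pi)$, and the lower bound $d(\pi) \ge \frac{n - c_{odd}(\pi)}{2}$ is unchanged.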


Algorithm

From here on, a simple graph contains only three types of cycles: 1-cycles, 2-cycles, and 3-cycles.


Algorithm

While a 2-cycle exists, apply a 2-transposition [C99]:

Lemma: If $\pi$ is a permutation that contains a 2-cycle, then there exists a 2-transposition on $\pi$.


Algorithm

The transposition increases the number of odd cycles by two, so it is a 2-transposition: $\Delta c_{odd}(\pi, \tau) = 2$.

[Figure: a breakpoint graph on vertices 1 to 8 before and after the 2-transposition.]


Algorithm

There are only two possible configurations of a 3-cycle:

Oriented cycle: [figure]

Unoriented cycle: [figure]


Algorithm

While 3-cycles exist, if an oriented cycle exists, apply a 2-transposition [BP98]:

An oriented cycle can be eliminated by a 2-transposition.


Algorithm

It is a 2-transposition, because $\Delta c_{odd}(\pi, \tau) = 2$.

[Figure: a breakpoint graph on vertices 1 to 12 before and after the 2-transposition.]


Algorithm

Now we focus only on unoriented cycles.

Lemma: Let $C$ be an unoriented cycle. Then every pair of black edges in $C$ intersects with some other cycle in $G(\pi)$.

[Figure: example on the sequence 0 3 2 1 4.]


For an unoriented cycle there are two cases: an interleaving cycle and a shattered cycle.

Interleaving cycle: [figure]

Shattered cycle: [figure]
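Since the defining figures are not available here, the following is a minimal sketch of how the interleaving case could be tested. It assumes that each cycle is given by the positions of its black edges along the circle, that two pairs of black edges intersect when they alternate along the circle, and that two cycles interleave when all their black edges alternate. The function names and the integer-position representation are hypothetical.

```python
def pairs_intersect(pair_a, pair_b):
    """True if two pairs of black-edge positions alternate along the circle."""
    a1, a2 = sorted(pair_a)
    b1, b2 = sorted(pair_b)
    return a1 < b1 < a2 < b2 or b1 < a1 < b2 < a2


def cycles_interleave(cycle_c, cycle_d):
    """True if the black edges of the two cycles alternate along the circle."""
    if len(cycle_c) != len(cycle_d):
        return False
    labelled = sorted([(p, "C") for p in cycle_c] + [(p, "D") for p in cycle_d])
    owners = [owner for _, owner in labelled]
    # Index -1 wraps around, so the last and first edges are compared as well.
    return all(owners[i] != owners[i - 1] for i in range(len(owners)))


print(cycles_interleave({1, 3, 5}, {2, 4, 6}))  # True
print(cycles_interleave({1, 2, 3}, {4, 5, 6}))  # False
```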


Algorithm

While 3-cycles exist, if interleaving unoriented cycles exist, apply a (0,2,2)-sequence of transpositions:


[Figure: four breakpoint graphs on vertices 1 to 12, illustrating the (0,2,2)-sequence applied to interleaving unoriented cycles.]


Algorithm

While 3-cycles exist, if a shattered unoriented cycle exists, apply a (0,2,2)-sequence of transpositions:

Shattered cycle: Cycle $E$ is shattered by cycles $C$ and $D$ if the black edges of $E$ belong to different intervals induced by either $C$ or $D$.


Algorithm

First case: Two out of the three cycles are non-intersecting.


Algorithm

Second case: The three cycles are mutually intersecting.

[Figure: three breakpoint graphs on vertices 1 to 14 for the mutually intersecting case.]


Algorithm

Transform the graph $G(\pi)$ into a simple graph $G(\hat{\pi})$.

While a 2-cycle exists, apply a 2-transposition.

While 3-cycles exist:
    If an oriented cycle exists, apply a 2-transposition.
    If interleaving unoriented cycles exist, apply a (0,2,2)-sequence of transpositions.
    If a shattered unoriented cycle exists, apply a (0,2,2)-sequence of transpositions.

Mimic the sorting of $\pi$ using the sorting of $\hat{\pi}$.


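Performance Ratio & Running Time

The performance ratio is 1.5. A 2-transposition gains 2 odd cycles in one step and a (0,2,2)-sequence gains 4 odd cycles in three steps, so the algorithm uses at most $\frac{3}{4}(n - c_{odd}(\pi))$ transpositions, while at least $\frac{n - c_{odd}(\pi)}{2}$ are required:

$\frac{3\,(n - c_{odd}(\pi))/4}{(n - c_{odd}(\pi))/2} = \frac{3}{4} \cdot 2 = 1.5$

The running time of the algorithm is $O(n^2)$.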
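The two bounds behind the ratio can be evaluated for a concrete instance. The helper name `transposition_bounds` is hypothetical. It relies on the slides' claim that safe splits leave $n - c_{odd}$ unchanged, so the original $n$ and $c_{odd}(\pi)$ can be used directly.

```python
from fractions import Fraction


def transposition_bounds(n, c_odd):
    """Return (lower, upper) in terms of the gap n - c_odd.

    lower: the lower bound (n - c_odd) / 2 on the transposition distance.
    upper: the bound 3 (n - c_odd) / 4 on the transpositions the algorithm uses.
    """
    gap = n - c_odd
    return Fraction(gap, 2), Fraction(3 * gap, 4)


# Running example from the breakpoint graph slides: n = 7, c_odd = 1.
lower, upper = transposition_bounds(7, 1)
print(lower, upper, upper / lower)  # 3 9/2 3/2
```

Exact fractions are used so that the ratio comes out as exactly 3/2 whenever the gap is nonzero.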


Discussions

Working on circular permutation & better running time.

The complexity of the problem is still an open problem.

There are many other sorting problems in genome rearrangement.


References

[BP98] V. Bafna and P. A. Pevzner, Sorting by Transpositions, SIAM Journal on Discrete Mathematics, Vol. 11, No. 2, 1998, pp. 224-240.

[C99] D. A. Christie, Genome Rearrangement Problems, PhD thesis, University of Glasgow, 1999.

[EEKSW01] H. Eriksson, K. Eriksson, J. Karlander, L. Svensson, and J. Wästlund, Sorting a Bridge Hand, Discrete Mathematics, Vol. 241, 2001, pp. 289-300.

[HP99] S. Hannenhalli and P. A. Pevzner, Transforming Cabbage into Turnip: Polynomial Algorithm for Sorting Signed Permutations by Reversals, Journal of the ACM, Vol. 46, 1999, pp. 1-27.

[LX2001] G. H. Lin and G. Xue, Signed Genome Rearrangements by Reversals and Transpositions: Models and Approximations, Theoretical Computer Science, 2001, pp. 513-531.