Sorting Cancer Karyotypes by Elementary Operations
Michal Ozery-Flato and Ron Shamir
School of Computer Science, Tel Aviv University
Outline
• Introduction
• Modeling the evolution of cancer karyotypes
• The karyotype sorting problem
• Combinatorial analysis
• Results
Introduction
http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
Normal female karyotype
The "Philadelphia chromosome"
http://www.ncbi.nlm.nih.gov/sky/skyweb.cgi
Breast cancer karyotype (MCF-7)
Chromosomal Instability
• A phenotype of most cancer cells.
– Losses or gains of chromosomes result from errors during mitosis
– Chromosome rearrangements are associated with "double strand breaks"
multi-polar mitoses
Double Strand Breaks
• Constitute the most dangerous type of DNA damage
– A successful repair ligates two matching broken ends
– Mis-repair can result in rearrangements (e.g. translocations) or deletions
M.C. Escher, 1953
Double strand break
The Challenge
Analyze the evolution of aberration events in cancer karyotypes
The Mitelman Database of Chromosome Aberrations in Cancer
• Over 55,000 cancer karyotypes, culled from over 8,000 scientific publications
• Can be parsed automatically (CyDAS parser, www.cydas.org)
• The largest current data resource on cancer genomes' organization
Modeling the Evolution of Cancer Karyotypes
The Normal Karyotype
• Band = the basic unit observable in a karyotype: a unique region in the genome, identified by an integer
• Normal chromosome = an interval of bands
– Two normal chromosomes are either disjoint or equivalent
• Normal karyotype = a collection of normal chromosomes
– Usually contains two copies of each chromosome (with the possible exception of the sex chromosomes)
The Cancer Karyotype
• Fragment = a sub-interval (>1 bands) of a normal chromosome
• Chromosome =
– One fragment, or a concatenation of several fragments
– Orientation-less: [1,4]::[37,40] ≡ [40,37]::[4,1]
• Cancer karyotype = a collection of chromosomes
concatenation (breakpoint)
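A minimal sketch of how these definitions could be represented in Python. Only the `[a,b]::[c,d]` notation and the orientation-less equivalence come from the slide; the names `Fragment`, `parse_chromosome`, `reverse` and `canonical` are hypothetical, chosen for illustration.

```python
from collections import Counter
from typing import Tuple

Fragment = Tuple[int, int]          # (first band, last band); (4, 1) is [1,4] read backwards
Chromosome = Tuple[Fragment, ...]   # fragments in reading order


def parse_chromosome(text: str) -> Chromosome:
    """Parse the slide notation, e.g. '[1,4]::[37,40]'."""
    fragments = []
    for part in text.split("::"):
        first, last = part.strip().strip("[]").split(",")
        fragments.append((int(first), int(last)))
    return tuple(fragments)


def reverse(chrom: Chromosome) -> Chromosome:
    """Read the same chromosome starting from its other end."""
    return tuple((last, first) for first, last in reversed(chrom))


def canonical(chrom: Chromosome) -> Chromosome:
    """Pick one fixed representative of the two orientations (assumed convention)."""
    return min(chrom, reverse(chrom))


a = parse_chromosome("[1,4]::[37,40]")
b = parse_chromosome("[40,37]::[4,1]")
same = canonical(a) == canonical(b)   # both readings describe one chromosome

# A karyotype is a collection with repeats, so a multiset is a natural fit.
karyotype = Counter([canonical(a), canonical(b)])
```

Storing every chromosome in its canonical orientation lets ordinary equality and counting stand in for the equivalence on the slide.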
Elementary Operations
Breakage
Fusion
Duplication
Deletion
These operations can generate all known chromosomal aberrations!
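The four operations can be sketched on a toy representation. This is an illustration under stated assumptions, not the authors' implementation: a chromosome is a tuple of band numbers, a karyotype is a list of chromosomes, and duplication and deletion are assumed to act on whole chromosomes (the slide only names the operations). All function and parameter names are hypothetical.

```python
from typing import List, Tuple

Chromosome = Tuple[int, ...]   # bands in reading order
Karyotype = List[Chromosome]   # a collection with repeats, kept as a list


def breakage(k: Karyotype, idx: int, cut: int) -> Karyotype:
    """Split chromosome idx into two pieces, just before position cut."""
    chrom = k[idx]
    if not 0 < cut < len(chrom):
        raise ValueError("cut must fall strictly inside the chromosome")
    return k[:idx] + k[idx + 1:] + [chrom[:cut], chrom[cut:]]


def fusion(k: Karyotype, i: int, j: int, flip_second: bool = False) -> Karyotype:
    """Concatenate chromosomes i and j, optionally reversing the second one."""
    if i == j:
        raise ValueError("fusion needs two distinct chromosomes")
    second = k[j][::-1] if flip_second else k[j]
    rest = [c for n, c in enumerate(k) if n not in (i, j)]
    return rest + [k[i] + second]


def duplication(k: Karyotype, idx: int) -> Karyotype:
    """Add a second copy of chromosome idx (whole-chromosome assumption)."""
    return k + [k[idx]]


def deletion(k: Karyotype, idx: int) -> Karyotype:
    """Remove chromosome idx (whole-chromosome assumption)."""
    return k[:idx] + k[idx + 1:]


normal = [(1, 2, 3, 4), (1, 2, 3, 4)]
step1 = breakage(normal, 0, 2)                # one copy splits into (1,2) and (3,4)
step2 = deletion(step1, 2)                    # the (3,4) piece is lost
step3 = duplication(step2, 0)                 # the intact copy is duplicated
cancer = fusion(step3, 1, 2, flip_second=True)
```

Each function returns a new list, so every intermediate karyotype in a sequence of operations stays available for inspection.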
The Karyotype Sorting Problem
The Karyotype Sorting (KS) Problem
• Find a shortest sequence of elementary operations that transforms the normal karyotype into the given cancer karyotype
• Find the elementary distance = #operations in such a solution to KS.
The Karyotype Sorting (KS) Problem (inverse formulation)
• Find a shortest sequence of inverse elementary operations that transforms the given cancer karyotype into the normal karyotype
Inverse Elementary Operations
Breakage ⇄ Fusion (each is the inverse of the other)
Duplication → inverse: c-deletion (constrained deletion)
Deletion → inverse: addition
Assumptions
• ~95% of the karyotypes in the Mitelman Database have no recurrent breakpoints
• Assumptions:
– The cancer karyotype contains no recurrent breakpoints
– Every added chromosome contains no breakpoints
[20,39]::[12,1] Breakpoint ID={390,120}
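A rough sketch of how the first assumption could be checked. It assumes that "recurrent" means the same breakpoint occurs more than once in a karyotype, and it identifies a breakpoint by the unordered pair of fragment ends that were joined. The `(band, "hi"/"lo")` encoding of an end is a hypothetical stand-in, not the slide's breakpoint-ID notation.

```python
from collections import Counter
from typing import FrozenSet, List, Set, Tuple

Fragment = Tuple[int, int]          # (first band, last band) in reading order
Chromosome = Tuple[Fragment, ...]
End = Tuple[int, str]               # (band, exposed side): hypothetical encoding


def breakpoint_ids(chrom: Chromosome) -> List[FrozenSet[End]]:
    """One orientation-independent ID per concatenation point."""
    ids = []
    for (a, b), (c, d) in zip(chrom, chrom[1:]):
        left_end = (b, "hi" if b > a else "lo")    # where the left fragment stops
        right_end = (c, "hi" if c > d else "lo")   # where the right fragment starts
        ids.append(frozenset((left_end, right_end)))
    return ids


def recurrent_breakpoints(karyotype: List[Chromosome]) -> Set[FrozenSet[End]]:
    """Breakpoint IDs that occur more than once in the karyotype."""
    counts = Counter(bp for chrom in karyotype for bp in breakpoint_ids(chrom))
    return {bp for bp, n in counts.items() if n > 1}


karyotype = [
    ((20, 39), (12, 1)),    # the chromosome from the slide
    ((1, 12), (39, 20)),    # the same junction, read from the other end
    ((1, 4), (37, 40)),
]
repeated = recurrent_breakpoints(karyotype)
```

Because the ID is an unordered set, the first two chromosomes yield the same breakpoint even though they are written in opposite orientations.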
The Reduced Karyotype Sorting (RKS) Problem
• The assumptions yield a reduced problem:
– No breakpoints in the cancer karyotype (i.e. every chromosome is an interval)
– No breakpoints created by fusions / additions
– All the normal chromosomes are identical
• Operations: breakage, fusion, c-deletion, addition
[Figure: the normal karyotype (identical chromosomes) and the cancer karyotype, drawn as intervals over a common band axis]
Combinatorial Analysis
(RKS Problem)
Extending the karyotypes
[Figure: the normal karyotype and the cancer karyotype after extension, drawn as intervals over a common band axis]
Parameter 1: f = #disjoint pairs of complementing interval ends
• Observation:
– Δf = -1 for fusion; Δf = 1 for breakage
– Δf ∈ {0,-1,-2} for c-deletion
– Δf ∈ {0,1,2} for addition
[Figure: example karyotype with f = 5]
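A sketch of one possible reading of f, offered as an assumption rather than the authors' definition: two interval ends are taken to be complementing when one interval stops at band i and another starts at band i+1, and f is the largest number of such pairs in which no end is used twice. The function name is hypothetical, and the sketch ignores the karyotype extension step.

```python
from collections import Counter
from typing import List, Tuple

Interval = Tuple[int, int]   # (lowest band, highest band)


def count_complementing_pairs(intervals: List[Interval]) -> int:
    """Largest number of disjoint (right end at i, left end at i+1) pairs."""
    right_ends = Counter(hi for _, hi in intervals)
    left_ends = Counter(lo for lo, _ in intervals)
    # Pairs only form across one boundary i | i+1, so boundaries are independent.
    return sum(min(n, left_ends[band + 1]) for band, n in right_ends.items())


before = [(1, 4), (5, 10), (5, 7), (8, 10), (1, 10)]
f_before = count_complementing_pairs(before)

# Fusing (5,7) with (8,10) joins one complementing pair into the interval (5,10).
after = [(1, 4), (5, 10), (5, 10), (1, 10)]
f_after = count_complementing_pairs(after)
```

Under this reading, the fusion in the example lowers f by exactly one, which agrees with the observation Δf = -1 for fusion.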
The histogram
• Parameter 2: w = #bricks
• Observations:
– w is even
– Δw = 0 for breakage / fusion
– Δw ∈ {0,2} for addition / c-deletion
[Figure: the cancer karyotype and its histogram, with labels "A wall with 2 bricks" and "A brick"]
Simple Bricks
• A brick is simple if
– no lower brick (in the same wall), and
– no complementing interval ends
• Parameter 3: s = #simple bricks
• Observation:
– Δs ∈ {0,-1} for breakage
– Δs = 0 for constrained-deletion
– |Δs| ≤ 2 for addition
[Figure: example histogram with the simple bricks marked]
The Weighted Bipartite Graph of Bricks
• Parameter 4: m = the minimum weight of a perfect matching
Edge weights:
weight    | v-, v+: simple | v-, v+: non-simple | otherwise
v- < v+   | 2              | 0                  | 1
v+ < v-   | 0              | 2                  | 1
[Figure: the bipartite graph of bricks, drawn over the band axis]
Positive bricks
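The weight table and the parameter m can be illustrated with a brute-force sketch. Several things here are assumptions, not taken from the slide: the graph is treated as complete bipartite between negative and positive bricks, there are equally many of each, and "v- < v+" is read as a comparison of brick positions on the band axis. The `Brick` type and both function names are hypothetical.

```python
from itertools import permutations
from typing import List, NamedTuple


class Brick(NamedTuple):
    position: int   # hypothetical: where the brick sits on the band axis
    simple: bool    # whether the brick is a simple brick


def edge_weight(neg: Brick, pos: Brick) -> int:
    """Weight of the edge (v-, v+), following the table on the slide."""
    if neg.simple != pos.simple:
        return 1                      # the "otherwise" column
    neg_first = neg.position < pos.position
    if neg.simple:                    # both simple
        return 2 if neg_first else 0
    return 0 if neg_first else 2      # both non-simple


def min_weight_perfect_matching(negs: List[Brick], poss: List[Brick]) -> int:
    """Try every pairing; fine for a handful of bricks, exponential beyond that."""
    if len(negs) != len(poss):
        raise ValueError("a perfect matching needs equally many bricks per side")
    return min(
        sum(edge_weight(n, p) for n, p in zip(negs, order))
        for order in permutations(poss)
    )


m_first = min_weight_perfect_matching(
    [Brick(2, True), Brick(7, False)], [Brick(4, True), Brick(9, False)]
)
m_second = min_weight_perfect_matching(
    [Brick(6, True), Brick(7, False)], [Brick(4, True), Brick(9, False)]
)
```

For more than a few bricks, a polynomial assignment solver such as `scipy.optimize.linear_sum_assignment` could replace the permutation loop.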
Results
Main Theorem
• The elementary distance, d, satisfies:
w/2 + f + s + m - 2N ≤ d ≤ 3w/2 + f + s + m - 2N
N = #intervals in the normal karyotype
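The two bounds are simple arithmetic on the four parameters, sketched here as a small helper. The function name and the sample parameter values are made up for illustration; they do not come from a real karyotype.

```python
from typing import Tuple


def elementary_distance_bounds(w: int, f: int, s: int, m: int, n_normal: int) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the elementary distance d."""
    if w % 2 != 0:
        raise ValueError("w is even by the observation on the histogram slide")
    shared = f + s + m - 2 * n_normal      # the part common to both bounds
    return w // 2 + shared, 3 * w // 2 + shared


# Illustrative values only.
lower, upper = elementary_distance_bounds(w=4, f=5, s=1, m=2, n_normal=2)
gap = upper - lower   # always equals w, whatever the other parameters are
```

When w = 0 the two bounds coincide, so the theorem pins down d exactly in that case.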
Results (2)
• Used the main theorem to devise a polynomial-time 3-approximation algorithm
• Combined with a greedy heuristic on real data (95% of the Mitelman DB): optimal solutions computed for 100% of the karyotypes
– 99.99% of cases: the lower bound is achieved (hence the solution is optimal)
– 30 cases: lower bound + 2, but actually optimal (manual verification)
Summary
• A new framework for analyzing chromosomal aberrations in cancer
• A 3-approximation algorithm when there are no recurrent breakpoints
– 100% success on 57,252 karyotypes (with no recurrent breakpoints) from the Mitelman DB.
• Future work: handle recurrent breakpoints
– Analyze the remaining 5% of the karyotypes in the Mitelman DB.
Thank you for your attention.
Questions?