The Genome Assembly Problem
The Genome Assembly Problem
a lightning talk
Mark Chang
The Genome Assembly Problem
Genome
ACCTCAGAACCCCGCAGTCACGTAGCGTTTGTGGGTACCTCGTGTCTAGT
Fragmented and Sequenced
Short reads
ACCTCAGAACCC
AACCCCGCAGTCACGTAG
CGTAGCGTTTGTGGGT
GTGGGTACCTCGTG
TCGTGTCTAGT
Finding Overlaps
Reconstruction
Genome
ACCTCAGAACCCCGCAGTCACGTAGCGTTTGTGGGTACCTCGTGTCTAGT
Finding Overlaps
• It is very hard to find the overlaps between millions of short reads
ACCTCAGAACCC
AACCCCGCAGTCACGTAG
CGTAGCGTTTGTGGGT
GTGGGTACCTCGTG
TCGTGTCTAGT
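To illustrate why this is hard, here is a minimal sketch of the naive approach: compare every ordered pair of reads and look for the longest suffix of one that is a prefix of the other. The function name `overlap` and the `min_len=3` threshold are assumptions made for illustration, and the read boundaries are taken from the slide as reconstructed against the genome.

```python
from itertools import permutations

def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 when the overlap is shorter than `min_len` (an assumed cutoff).
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

# The five short reads from the slide.
reads = [
    "ACCTCAGAACCC",
    "AACCCCGCAGTCACGTAG",
    "CGTAGCGTTTGTGGGT",
    "GTGGGTACCTCGTG",
    "TCGTGTCTAGT",
]

overlaps = {}
comparisons = 0
for a, b in permutations(reads, 2):  # every ordered pair of distinct reads
    comparisons += 1
    n = overlap(a, b)
    if n:
        overlaps[(a, b)] = n

for (a, b), n in overlaps.items():
    print(f"{a} -> {b}: overlap {n}")
print("pairwise comparisons:", comparisons)
```

With n reads this performs n(n - 1) comparisons, so a million reads would already need on the order of 10^12 of them, which is the cost the de Bruijn graph approach avoids.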
Finding Overlaps
• Using de Bruijn graphs
AABCCDCCDA
convert to de Bruijn graph
AAB ABC BCC CCD CDC DCC CDA
graph traversal
AABCCDCCDA
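The toy example above can be sketched in a few lines. This assumes the convention the slide appears to use, where each node is a k-mer (here k = 3) and an edge joins two k-mers that are consecutive in the sequence; the function name `de_bruijn` is hypothetical.

```python
from collections import defaultdict

def de_bruijn(sequence, k):
    """Map each k-mer of `sequence` to the set of k-mers that follow it."""
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    graph = defaultdict(set)
    for left, right in zip(kmers, kmers[1:]):
        graph[left].add(right)  # consecutive k-mers overlap by k - 1 symbols
    graph[kmers[-1]]  # make sure the final k-mer appears as a node
    return dict(graph)

graph = de_bruijn("AABCCDCCDA", 3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Note that CCD occurs twice in the sequence but is a single node with two outgoing edges (to CDC and to CDA), which is how the graph represents the repeat.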
Using de Bruijn Graphs
• Convert the short reads into k-mers
ACCTCAGAACCC
ACCTC CCTCA CTCAG TCAGA CAGAA AGAAC GAACC AACCC
AACCCCGCAGTCACGTAG
AACCC ACCCC CCCCG CCCGC CCGCA CGCAG GCAGT CACGT ACGTA CGTAG
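As a sketch of this step, the k-mers of a read are simply every window of length k slid along it one base at a time. The value k = 5 is inferred from the k-mers on the slide, and the helper name `kmers` is an assumption. A full enumeration of the second read yields a few more k-mers than the slide displays.

```python
def kmers(read, k):
    """Return every length-k substring of `read`, in order of position."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

reads = ["ACCTCAGAACCC", "AACCCCGCAGTCACGTAG"]
k = 5  # inferred from the slide's 5-letter k-mers

per_read = {read: kmers(read, k) for read in reads}
for read, parts in per_read.items():
    print(read, "->", " ".join(parts))

# k-mers that occur in both reads reveal where the reads overlap.
shared = set(per_read[reads[0]]) & set(per_read[reads[1]])
print("shared:", shared)
```

A read of length L produces L - k + 1 k-mers, so the two reads give 8 and 14 k-mers respectively, with AACCC common to both.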
Using de Bruijn Graphs
• Build the de Bruijn graph from the k-mers
GAACC AGAAC CAGAA
ACCTC CCTCA CTCAG TCAGA
AACCC
Using de Bruijn Graphs
• Build the de Bruijn graph from the k-mers
GAACC AGAAC CAGAA
ACCTC CCTCA CTCAG TCAGA
AACCC
ACGTA CACGT GCAGT CGCAG
CCGCA CCCGC ACCCC CCCCG
CGTAG
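One way to sketch the graph construction is to pool the distinct k-mers from all reads and connect k-mer u to k-mer v whenever the last k - 1 bases of u equal the first k - 1 bases of v. Indexing k-mers by their (k - 1)-base prefix avoids comparing every pair. The names `kmers` and `build_graph` are hypothetical, and k = 5 is inferred from the slide.

```python
from collections import defaultdict

def kmers(read, k):
    """Return every length-k substring of `read`."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def build_graph(reads, k):
    """Nodes are distinct k-mers; edge u -> v when u[1:] == v[:-1]."""
    nodes = {kmer for read in reads for kmer in kmers(read, k)}
    by_prefix = defaultdict(list)
    for kmer in nodes:
        by_prefix[kmer[:-1]].append(kmer)  # index by the first k - 1 bases
    return {u: sorted(by_prefix.get(u[1:], [])) for u in sorted(nodes)}

graph = build_graph(["ACCTCAGAACCC", "AACCCCGCAGTCACGTAG"], 5)
for node, successors in graph.items():
    print(node, "->", successors)
```

Because AACCC appears in both reads, it becomes a single node, and that shared node is what joins the two reads into one path without any read-to-read comparison.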
Using de Bruijn Graphs
• Graph traversal
ACCTCAGAACCCCGCAGTCACGTAG
GAACC AGAAC CAGAA
ACCTC CCTCA CTCAG TCAGA
AACCC
ACGTA CACGT GCAGT CGCAG
CCGCA CCCGC ACCCC CCCCG
CGTAG
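The traversal can be sketched as a walk along the graph that starts at the k-mer with no incoming edge and appends the last base of each k-mer it visits. This is a simplified illustration that assumes the graph is a single unbranched path, as it is for these two reads; real genomes contain repeats that create branches and call for a proper Eulerian-path style traversal. The function names are hypothetical and k = 5 is inferred from the slide.

```python
def kmers(read, k):
    """Return every length-k substring of `read`."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]

def assemble(reads, k):
    """Rebuild one contig by walking an unbranched k-mer path."""
    nodes = {kmer for read in reads for kmer in kmers(read, k)}
    # Edge u -> v when the last k - 1 bases of u start v.
    successors = {u: [v for v in nodes if u[1:] == v[:-1]] for u in nodes}
    has_incoming = {v for targets in successors.values() for v in targets}
    starts = [u for u in nodes if u not in has_incoming]
    if len(starts) != 1:
        raise ValueError("expected exactly one start node for a simple path")

    node = starts[0]
    contig = node
    visited = {node}
    # Follow the path while it is unambiguous; stop at a branch or dead end.
    while len(successors[node]) == 1:
        node = successors[node][0]
        if node in visited:  # guard against cycles
            break
        visited.add(node)
        contig += node[-1]  # each step contributes one new base
    return contig

print(assemble(["ACCTCAGAACCC", "AACCCCGCAGTCACGTAG"], 5))
```

For the two reads used on the slides this walk visits 21 k-mers and yields a 25-base sequence, the same one shown under "Graph traversal".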
Reference
• The Genome Assembly Problem
• http://homolog.us/Tutorials/index.php?p=1.1&s=1