High-throughput data on gene function
![Page 1: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/1.jpg)
Bioinformatics and Evolutionary Genomics
High-throughput “functional” data / functional genomics / Omics
![Page 2: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/2.jpg)
High-throughput data on gene function
• What do I mean: omics, microarray, ChIP-on-chip
• Why are people generating these data?
– Post-genomic era / systems biology: the challenge to understand the roles of the e.g. 6,000 gene products in yeast and how they interact to create a eukaryotic organism.
– Because they can: apply automation also to other areas of molecular biology beyond sequencing
– To have “screens” for the research question at hand rather than having to test one guess at a time
• What about evolutionary genomics?
• Yeast
• Accuracy / noise
![Page 3: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/3.jpg)
HTP data
• What do they mean: experimental knowledge, but still, what do they mean in terms of e.g. function?
• A deluge
• Bioinformatics is needed for basic data handling, and has IMHO only scratched the surface in terms of coming up with biological questions with which we can probe these data
![Page 4: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/4.jpg)
Microarray data
![Page 5: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/5.jpg)
Microarray data
Two conditions, often used for “screens”
![Page 6: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/6.jpg)
(Correlated) mRNA expression
• mRNA levels are systematically measured under a variety of different cellular conditions, and genes are grouped if they show a similar transcriptional response to these conditions.
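The grouping step described above can be sketched in a few lines. This is only a minimal illustration: the gene names, the expression values, and the 0.9 correlation threshold are made up for the example, and real analyses typically use clustering over thousands of genes rather than a pairwise cutoff.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpressed_pairs(profiles, threshold=0.9):
    """Return gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                pairs.append((g, h))
    return pairs

# Hypothetical log-ratio profiles over five conditions (invented numbers).
profiles = {
    "geneA": [0.1, 1.2, 2.0, -0.5, 0.3],
    "geneB": [0.2, 1.0, 2.2, -0.4, 0.2],
    "geneC": [1.5, -0.8, 0.1, 0.9, -1.1],
}
pairs = coexpressed_pairs(profiles)
print(pairs)
```

Here geneA and geneB respond similarly across the conditions and end up linked, while geneC does not.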
![Page 7: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/7.jpg)
Profile Similarity Identifies Sterol-Pathway Disturbance Resulting from Deletion of Uncharacterized ORF YER044c (ERG28) and from Dyclonine Treatment
(A) Prominent gene clusters responding to interference with ergosterol biosynthesis.
(B) Comparison of the transcript profile of an erg28Δ strain to that of an erg3Δ strain.
(C) Sterol content of wild-type (left) and erg28Δ (right) strains.
Hughes et al. 2000 Cell
![Page 8: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/8.jpg)
Conventional hierarchical clustering of co-expression data can fail, because genes can play a role in multiple cellular processes and their common regulatory element can only be detected in a subset of experiments.
Alternative: detect genes that are co-expressed under a subset of conditions, which gives a comprehensive set of overlapping ‘transcriptional modules’.
Ihmels et al. 2002 Nature Genetics
![Page 9: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/9.jpg)
Citric acid cycle? Different activity under different experimental conditions
![Page 10: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/10.jpg)
Rapid divergence in expression between duplicate genes inferred from microarray & promoter data
0.1 = 3.2 My
![Page 11: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/11.jpg)
Clustering conditions where the conditions are genes: yet another way to get to functional “links”
![Page 12: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/12.jpg)
Yeast-2-hybrid (Y2H)
Pairs of proteins to be tested for interaction are expressed as fusion proteins ('hybrids') in yeast: one protein is fused to a DNA-binding domain, the other to a transcriptional activator domain. Any interaction between them is detected by the formation of a functional transcription factor.
![Page 13: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/13.jpg)
Examples from the original Ito publication: (A) autophagy, (B) spindle pole body function, and (C) vesicular transport.
Arrows indicate the orientation of the two-hybrid interaction, pointing from the bait to the prey.
![Page 14: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/14.jpg)
Accuracy of Y2H and how to improve it
![Page 15: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/15.jpg)
Improving reliability using protein-complex reasoning / internal consistency
Internal filtering!
![Page 16: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/16.jpg)
Accuracy of Y2H and how to improve it
![Page 17: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/17.jpg)
Mass spectrometry of purified complexes
• Individual proteins are tagged and used as 'hooks' to biochemically purify whole protein complexes. These are then separated and their components identified by mass spectrometry.
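A purification result is a bait plus the proteins that co-purify with it, not a list of pairwise interactions. Two conventions commonly used to turn such results into protein pairs are usually called the "spoke" model (bait to each prey only) and the "matrix" model (all members paired with each other). The sketch below illustrates the difference; the protein names are hypothetical and neither model is claimed to be the one used in the studies shown on these slides.

```python
from itertools import combinations

def spoke_pairs(bait, preys):
    """Spoke model: link the bait to every co-purified prey."""
    return {frozenset((bait, p)) for p in preys if p != bait}

def matrix_pairs(bait, preys):
    """Matrix model: link every member of the purification to every other."""
    members = set(preys) | {bait}
    return {frozenset(pair) for pair in combinations(sorted(members), 2)}

# Hypothetical purification: one tagged bait and three co-purified proteins.
purification = ("bait1", ["preyA", "preyB", "preyC"])
spoke = spoke_pairs(*purification)
matrix = matrix_pairs(*purification)
print(len(spoke), len(matrix))
```

The matrix model produces more pairs and so tends to include more pairs that are not in direct contact, which matters when purification data are benchmarked against pairwise reference sets.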
![Page 18: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/18.jpg)
![Page 19: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/19.jpg)
![Page 20: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/20.jpg)
![Page 21: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/21.jpg)
socio-affinity indices: dotted lines, 5–10; dashed lines, 10–15; plain lines, >15. Bait proteins are shown in bold and shaded circles around groups of proteins indicate cores and modules.
Exosome Ski
Stages in mRNA degradation
![Page 22: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/22.jpg)
PDB / Y2H / Cellular function / Phylogenetic profile
![Page 23: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/23.jpg)
Protein interactions: literature databases
• Literature derived, normally manually curated (as opposed to text mining)
• Biased?
• No new knowledge
• Useful for benchmarking & for the study of the evolution of e.g. protein complexes
• For example: Munich Information Center for Protein Sequences (MIPS)
• Databases that contain literature and omics: Database of Interacting Proteins (DIP), Biomolecular Interaction Network Database (BIND)
![Page 24: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/24.jpg)
Systematic screening for lethality of knockouts on a rich medium
• The functions of many open reading frames (ORFs) identified in genome-sequencing projects are unknown. New, whole-genome approaches are required to systematically determine their function. A total of 6925 Saccharomyces cerevisiae strains were constructed, by a high-throughput strategy, each with a precise deletion of one of 2026 ORFs. Of the deleted ORFs, 17 percent were essential for viability in rich medium.
Winzeler et al. 1999 Science
![Page 25: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/25.jpg)
Genetic interactions (synthetic lethal/sick)
• Two nonessential genes that cause lethality when mutated at the same time form a synthetic lethal interaction. Such genes are often functionally associated and their encoded proteins may also interact physically.
Tong et al. 2001 Science
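The definition of a synthetic lethal interaction can be written down as a small filter. This is a minimal sketch under simplifying assumptions: viability is treated as a yes/no value (so "synthetic sick" is not modelled), and the gene names and results are invented for illustration.

```python
from itertools import combinations

def synthetic_lethal_pairs(single_viable, double_viable):
    """Return pairs where both single deletions are viable but the double deletion is not."""
    pairs = []
    for a, b in combinations(sorted(single_viable), 2):
        key = frozenset((a, b))
        if key not in double_viable:
            continue  # this double mutant was not tested
        if single_viable[a] and single_viable[b] and not double_viable[key]:
            pairs.append((a, b))
    return pairs

# Hypothetical screen results (True = strain grows).
single_viable = {"gene1": True, "gene2": True, "gene3": True, "gene4": False}
double_viable = {
    frozenset(("gene1", "gene2")): False,
    frozenset(("gene1", "gene3")): True,
    frozenset(("gene2", "gene3")): True,
    frozenset(("gene1", "gene4")): False,
}
sl_pairs = synthetic_lethal_pairs(single_viable, double_viable)
print(sl_pairs)
```

The gene1/gene4 double mutant is also dead, but gene4 is essential on its own, so that pair does not count as a synthetic lethal interaction.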
![Page 26: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/26.jpg)
![Page 27: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/27.jpg)
One thing we can do with synthetic lethals
• Ideker: protein interactions
![Page 28: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/28.jpg)
What to do with synthetic lethals?
Kelley and Ideker 2005 Nature Biotech
![Page 29: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/29.jpg)
![Page 30: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/30.jpg)
ChIP-on-chip
• Tagged strains (one strain for each regulator).
• Microarray for each strain to see which pieces of DNA are found in excess if you isolate the regulator plus bound DNA.
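"Found in excess" can be made concrete as an enrichment ratio between the immunoprecipitated sample and a control. The sketch below is an illustration only: the region names, intensities, and the log2 cutoff of 1.0 are assumptions for the example, and published analyses generally rely on an error model and significance values rather than a fixed cutoff.

```python
import math

def enriched_regions(ip_signal, control_signal, min_log2_ratio=1.0):
    """Return regions whose IP/control log2 ratio meets the cutoff."""
    enriched = {}
    for region, ip in ip_signal.items():
        ratio = math.log2(ip / control_signal[region])
        if ratio >= min_log2_ratio:
            enriched[region] = ratio
    return enriched

# Hypothetical array intensities for one tagged regulator.
ip_signal = {"promoter_1": 800.0, "promoter_2": 210.0, "promoter_3": 1500.0}
control_signal = {"promoter_1": 200.0, "promoter_2": 200.0, "promoter_3": 500.0}
bound = enriched_regions(ip_signal, control_signal)
print(sorted(bound))
```

Regions passing the cutoff are candidate binding sites for the tagged regulator; repeating this per strain gives one list of candidate targets per regulator.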
![Page 31: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/31.jpg)
GFP localization
• Mating of fluorescent protein markers specific for organelles plus fluorescent protein tags for each gene
![Page 32: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/32.jpg)
Other functional genomics data: the omes
• Quantitative proteomics
• Kinome
• PTMome
• (Almost) all of these data are freely and publicly available
• Take-home message: “wow, this exists!!!”
![Page 33: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/33.jpg)
Bioinformatics for Benchmarking & Integration
[Figure: accuracy versus coverage of interaction data sets. x-axis (accuracy): fraction of data confirmed by reference set. y-axis (coverage): fraction of reference set covered by data. Data sets: purified complexes (TAP), purified complexes (HMS-PCI), yeast two-hybrid, mRNA co-expression, genomic context, synthetic lethality, and combined evidence (two methods, three methods). Legend: raw data, filtered data, parameter choices.]
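The two quantities plotted in this benchmark, accuracy (fraction of data confirmed by the reference set) and coverage (fraction of the reference set covered by the data), are straightforward to compute once interactions are treated as unordered pairs. The following is a minimal sketch with invented protein names and data sets. It also shows the typical effect of requiring support from two methods: accuracy goes up and coverage goes down.

```python
def accuracy_and_coverage(predicted, reference):
    """Accuracy: share of predictions found in the reference.
    Coverage: share of the reference recovered by the predictions."""
    pred = {frozenset(p) for p in predicted}
    ref = {frozenset(p) for p in reference}
    confirmed = pred & ref
    accuracy = len(confirmed) / len(pred) if pred else 0.0
    coverage = len(confirmed) / len(ref) if ref else 0.0
    return accuracy, coverage

# Hypothetical reference set and two hypothetical high-throughput data sets.
reference = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
y2h = [("B", "A"), ("A", "X"), ("C", "D"), ("Y", "Z")]
tap = [("A", "B"), ("B", "C"), ("Q", "R")]

# Combined evidence: keep only pairs reported by both methods.
tap_set = {frozenset(p) for p in tap}
both = [p for p in y2h if frozenset(p) in tap_set]

for name, data in [("y2h", y2h), ("tap", tap), ("both", both)]:
    print(name, accuracy_and_coverage(data, reference))
```

Note that accuracy measured this way is a lower bound whenever the reference set is incomplete, since a true interaction missing from the reference counts as unconfirmed.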
![Page 34: High-throuhput data on gene function](https://reader031.fdocuments.in/reader031/viewer/2022032606/56812e50550346895d93f102/html5/thumbnails/34.jpg)
Advanced integration