High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of...
Transcript of High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of...
![Page 1: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/1.jpg)
High throughput, multiplex analysis of gene expression*
David Goldman, Laboratory of Neurogenetics, NIH
*With special thanks to Chiara Mazzanti, Ph.D.
![Page 2: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/2.jpg)
Revolutions in the analysis of gene expression
Protein functionProtein concentrationPost-translational modificationTranscriptional controlGenetic variationGenomics
![Page 3: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/3.jpg)
Multiplex gene analysis: Why?
Complex diseases involve expression modulation40,000 genes, perhaps moreWe are not so cleverThe ultimate goal is to reconstruct the biochemical and functional networks of the cell. One can not understand a network by studying one gene.
![Page 4: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/4.jpg)
Methods for multiplex expression analysis
Two dimensional protein electrophoresis [2DE]Differential displayRepresentational difference analysis [RDA]Serial analysis of gene expression [SAGE]RNA array methods
![Page 5: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/5.jpg)
Array Systems
cDNA
Probes: amplified spotted cDNAshttp://cmgm.standford.eduhttp://www.nhgri.nih.gov
Oligonucleotide chips
Probes: Oligos synthesized in situ or conventionally & followed by immobilizationOligos as long as 80bphttp://www.affymetrix.comhttp://www.rii.com (Agilent)
![Page 6: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/6.jpg)
Array preparationMicroarray Genechip
Available genes: 10k – NCI [$50] 20k - Affymetrix
![Page 7: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/7.jpg)
Sample preparation (direct labeling)
Tissue
Total RNA
Cy3 or Cy5 Labelled cDNA
Cy3 Cy5
Hybridization mixing
Hybridization mixing
First strand cDNAsynthesis
![Page 8: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/8.jpg)
![Page 9: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/9.jpg)
10kArrays
Hs-UniGem2-v2p9-060701Blocks: (18x18) x32
Features/Spots: 9984235um spacing
![Page 10: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/10.jpg)
Post-hybridizationmicroarray
analysis
ScannerGenePix 4000A
GenePix Prosoftware
![Page 11: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/11.jpg)
Image scanning
Cy3/Cy5 signal normalization
Image cleaning
GAL (gene array list) loading
Image analysis
Typical software (GenPix Pro)
![Page 12: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/12.jpg)
Case Cy3Control Cy5
Case Cy5Control Cy3
Reciprocal labelling
![Page 13: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/13.jpg)
cDNA Microarray
Oligo-microchip
Case Cy5Control Cy3Ratio >2, < 0.5
Array vs Chip
Ratio Cy5/Cy3
Ratio Array1/Array2
![Page 15: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/15.jpg)
![Page 16: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/16.jpg)
LOCAL APPROACH GLOBAL APPROACH
Schulze et al, Nature Cell Biology, August 2001
![Page 17: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/17.jpg)
Array Analysis Methods• Gene Discovery
– Outlier detection – simple and group logic retrieval tools; single and multiple array viewers
– Scatter plots
• Pattern Discovery– Clustering – Hierarchical, K-means– Principal Components Analysis; Multidimensional
Scaling
• Pattern Prediction
![Page 18: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/18.jpg)
Expression profile
Expressi o n si gn a tu re
Sample1 Sample 2…….Sample N
Gene 1
Gene 2....
Gene N
Expression signatures and profiles
![Page 19: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/19.jpg)
![Page 20: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/20.jpg)
What is Clustering?
Clustering algorithms are discovery algorithms that find structure in data.
Discovery algorithms develop their own sorting parameters without human intervention.
![Page 21: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/21.jpg)
Hierarchical Clustering
![Page 22: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/22.jpg)
Hierarchical clustering
Mock phylogenetic dendrograms
![Page 23: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/23.jpg)
Dendrogram Construction for Hierarchical Clustering
• Merge nearest neighbors• Merge more distant datapoints
![Page 24: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/24.jpg)
K-means clustering
Genes are divided into k user-defined, equally-sized groups.
Gene group centroids and individual genes are then reassigned.
The process is reiterated until group compositions stabilize.
![Page 25: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/25.jpg)
ClusteringClusters are also developed from random data:
• Are clustered genes related in function?
• Do clusters group related samples/tissues/diseases/treatments?
![Page 26: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/26.jpg)
1 2 34
56
7 89
1011
12 1314
15 1617
1819 20
2122
23 2425
26 2728 29
30 31
0.2
0.3
0.4
0.5
0.6
0.7
0.8
Clustering of Melanoma Tumors Using Average Linkage
1 2 3
4
5
6
7 8
910
1112 13
1415 16
17 1819 20
21
2223 24
25
26 27
28 29
30 31
0.2
0.4
0.6
0.8
1.0
Clustering of Melanoma Tumors Using Complete Linkage
1 2 3
4
5
6
7
8
910
1112 13
14
15 16
17 1819 20
21
2223 24
25
26
27
28 29
30
31
0.2
0.3
0.4
0.5
0.6
Clustering of Melanoma Tumors Using Single Linkage
(Bittner et al., Nature, 2000)
Clustering algorithms: Differing results
![Page 27: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/27.jpg)
mAdb BioInformatics Project: Current• Links to external data sources
• dbEST
• GeneCards
• LocusLink
• MGD
• GenBank
• Stanford GO (Genome Ontology)
• PubMed
• UniGene
•Automatic updates of external data sources
![Page 28: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/28.jpg)
![Page 29: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/29.jpg)
![Page 30: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/30.jpg)
Hierarchical Clustering (Alizadeh et al., Nature, Feb. 2000)
![Page 31: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/31.jpg)
Van’t Veer et al,Nature, Jan. 2002
![Page 32: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/32.jpg)
Issues for Study Design
Appropriateness of modelExposure, genotype, timingTissue and cellsControls: Sieving the genes
The issue of completeness# genes expressed in tissue% modulated at chosen level
![Page 33: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/33.jpg)
Pitfalls in multiplex analysis of gene expression
Nonspecific changes in gene expressionQuantitatively small changes can be criticalmRNA modulation is often not the mechanism
Allosteric regulation of enzymesReceptor desensitization
![Page 34: High throughput, multiplex analysis of gene expression* · High throughput, multiplex analysis of gene expression* David Goldman, Laboratory of Neurogenetics, NIH dgneuro@box-d.nih.gov](https://reader034.fdocuments.in/reader034/viewer/2022042708/5f3cb36aff75673483665876/html5/thumbnails/34.jpg)
Conclusions
Multiplex analysis of gene expression has great potential for identifying intermediate processes in neural diseasesRealization of the potential of these inherently industrial methods requires more incisive experimental designsThe downstream biology is inherently nonindustrialAppropriate controls can narrow the field of candidate genesThese methods should lead to the identification of new pathways in disease and an understanding of sharing of pathways between diseases