Two Color Microarrays
EPP 245/298
Statistical Analysis of
Laboratory Data
November 10, 2004 EPP 245 Statistical Analysis of Laboratory Data
Two-Color Arrays
• Two-color arrays are designed to account for variability in slides and spots by using two samples on each slide, each labeled with a different dye.
• If a spot is too large, for example, both signals will be too large, and taking the difference or ratio of the two signals removes that source of variability.
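A small R sketch of why the ratio cancels a spot-level effect, assuming the spot size acts as a common multiplicative factor on both channels. The intensities and the factor are made-up values, not data from the course.

```r
# Hypothetical true signals for one gene in the two channels
red   <- 800   # Cy5-labeled sample
green <- 400   # Cy3-labeled sample

# Assumed spot effect: an oversized spot inflates both channels equally
spot_factor <- 1.5
red_obs   <- red   * spot_factor
green_obs <- green * spot_factor

# The individual signals change, but the log ratio does not
log_ratio_true <- log2(red / green)
log_ratio_obs  <- log2(red_obs / green_obs)
print(c(true = log_ratio_true, observed = log_ratio_obs))
```

If the spot effect were additive rather than multiplicative, the difference of the two signals would cancel it in the same way.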
Dyes
• The most common dye set is Cy3 (green) and Cy5 (red), with absorption maxima at approximately 550 nm and 649 nm and emission maxima at approximately 570 nm and 670 nm respectively (red light ~ 700 nm, green light ~ 550 nm)
• The dyes are excited with lasers at 532 nm (Cy3 green) and 635 nm (Cy5 red)
• The emissions are read via filters using a CCD device
File Format
• A slide scanned with Axon GenePix produces a file with extension .gpr that contains the results: http://www.axon.com/gn_GenePix_File_Formats.html
• The file contains 29 rows of headers followed by 43 columns of data (in our example files)
• For a full analysis one may also need a .gal file that describes the layout of the arrays
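A minimal sketch of reading such a file in R, using a tiny stand-in written to a temporary file. The stand-in has only 2 header rows and 4 columns, and its contents are invented for illustration; the real example files would need skip=29.

```r
# Write a tiny tab-delimited stand-in for a .gpr file (hypothetical contents)
tmp <- tempfile(fileext = ".gpr")
writeLines(c(
  "ATF\t1.0",
  "Type=GenePix Results (hypothetical header line)",
  "\"Name\"\t\"ID\"\t\"F635 Median\"\t\"B635 Median\"",
  "\"geneA\"\t\"id1\"\t1200\t100",
  "\"geneB\"\t\"id2\"\t300\t90"
), tmp)

# Skip the header rows, then treat the next row as column names
d <- read.table(tmp, header = TRUE, skip = 2, sep = "\t")

# read.table makes syntactic names: "F635 Median" becomes "F635.Median"
print(names(d))
print(d)
```

Selecting columns by position, as in the course code, avoids depending on the converted names.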
"Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." "F635 Median" "F635 Mean" "F635 SD" "B635 Median" "B635 Mean" "B635 SD" "% > B635+1SD" "% > B635+2SD" "F635 % Sat."
"F532 Median" "F532 Mean" "F532 SD" "B532 Median" "B532 Mean" "B532 SD" "% > B532+1SD" "% > B532+2SD" "F532 % Sat."
"Ratio of Medians (635/532)" "Ratio of Means (635/532)" "Median of Ratios (635/532)" "Mean of Ratios (635/532)" "Ratios SD (635/532)" "Rgn Ratio (635/532)" "Rgn R² (635/532)" "F Pixels" "B Pixels" "Sum of Medians" "Sum of Means" "Log Ratio (635/532)" "F635 Median - B635" "F532 Median - B532" "F635 Mean - B635" "F532 Mean - B532" "Flags"
Analysis Choices
• Mean or median foreground intensity
• Background corrected or not
• Log transform (base 2, e, or 10) or glog transform
• Log is compatible only with no background correction
• Glog is best with background correction
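The last two points can be illustrated with a minimal R sketch. The glog form used here, log(y + sqrt(y^2 + lambda)), is one common parameterization; the function name `glog`, the value of `lambda`, and the intensities are assumptions for illustration and are not taken from the slides.

```r
# Generalized log: defined for zero and negative values when lambda > 0
glog <- function(y, lambda) log(y + sqrt(y^2 + lambda))

fg <- c(1500, 220, 95, 60)    # made-up foreground intensities
bg <- c(100, 100, 100, 100)   # made-up background intensities
bc <- fg - bg                 # background-corrected: 1400, 120, -5, -40

# The ordinary log is undefined (NaN) for the negative corrected values
plain <- suppressWarnings(log(bc))

# glog stays finite; lambda would normally be estimated from the data
gl <- glog(bc, lambda = 100^2)

print(rbind(bc = bc, log = plain, glog = gl))
```

Background correction can produce zero or negative intensities for weakly expressed genes, which is why the plain log is paired with uncorrected data here.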
d41 <- read.table("037841.gpr",header=T,skip=29)
d41 <- d41[,c(4,5,9,10,12,13,18,19,21,22)]
d50 <- read.table("037850.gpr",header=T,skip=29)
d50 <- d50[,c(4,5,9,10,12,13,18,19,21,22)]
d46 <- read.table("037846.gpr",header=T,skip=29)
d46 <- d46[,c(4,5,9,10,12,13,18,19,21,22)]
d47 <- read.table("037847.gpr",header=T,skip=29)
d47 <- d47[,c(4,5,9,10,12,13,18,19,21,22)]
d48 <- read.table("037848.gpr",header=T,skip=29)
d48 <- d48[,c(4,5,9,10,12,13,18,19,21,22)]
d49 <- read.table("037849.gpr",header=T,skip=29)
d49 <- d49[,c(4,5,9,10,12,13,18,19,21,22)]
d43 <- read.table("037843.gpr",header=T,skip=29)
d43 <- d43[,c(4,5,9,10,12,13,18,19,21,22)]
dataprep <- function(method="median",bc=F)
{
  # Columns 3:6 (635 nm) and 7:10 (532 nm) of each data frame hold, in order:
  # foreground median, foreground mean, background median, background mean
  if ((method=="median")&(bc)) cvec <- c(1,0,-1,0)
  if ((method=="mean")&(bc)) cvec <- c(0,1,0,-1)
  if ((method=="median")&(!bc)) cvec <- c(1,0,0,0)
  if ((method=="mean")&(!bc)) cvec <- c(0,1,0,0)

  d41a <- as.matrix(d41[,3:6]) %*% cvec
  d41b <- as.matrix(d41[,7:10]) %*% cvec
  d50a <- as.matrix(d50[,3:6]) %*% cvec
  d50b <- as.matrix(d50[,7:10]) %*% cvec
  d46a <- as.matrix(d46[,3:6]) %*% cvec
  d46b <- as.matrix(d46[,7:10]) %*% cvec
  ... ... ... ... ... ... ... ... ... ...
  d45a <- as.matrix(d45[,3:6]) %*% cvec
  d45b <- as.matrix(d45[,7:10]) %*% cvec
  alldata <- cbind(d41a,d41b,d50a,d50b,d46a,d46b,d47a,d47b,
                   d48a,d48b,d49a,d49b,d43a,d43b,d44a,d44b,d42a,d42b,d45a,d45b)
  return(alldata)
}
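The role of the contrast vector cvec can be seen on a small made-up matrix. This is only an illustrative sketch: the matrix `x`, its column labels, and the variable names are hypothetical, with columns arranged in the same order as in the data frames above.

```r
# Columns: foreground median, foreground mean, background median,
# background mean (made-up values for three spots)
x <- matrix(c(1200, 1250, 100, 110,
               300,  320,  90,  95,
                80,   85,  70,  75),
            ncol = 4, byrow = TRUE,
            dimnames = list(NULL, c("Fmed", "Fmean", "Bmed", "Bmean")))

cvec_median_bc <- c(1, 0, -1, 0)  # median foreground minus median background
cvec_mean_nobc <- c(0, 1, 0, 0)   # mean foreground, no background correction

# Matrix product picks out (and combines) the requested columns
med_bc    <- drop(x %*% cvec_median_bc)
mean_nobc <- drop(x %*% cvec_mean_nobc)
print(med_bc)     # 1100 210 10
print(mean_nobc)  # 1250 320 85
```

Encoding the choice as a vector lets the same matrix product serve all four combinations of method and background correction.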
alldata <- dataprep(method="median",bc=F)
rownames(alldata) <- d41[,1]
dye <- as.factor(rep(c("Cy5","Cy3"),10))
slide <- as.factor(rep(1:10,each=2))
treat <- c(1,0,0,1,0,1,1,0,0,3,3,0,0,3,3,0,0,1,1,0)
geneID <- d41[,1:2]
Array normalization
• Array normalization is meant to increase the precision of comparisons by adjusting for variations that cover entire arrays
• Without normalization, the analysis would be valid, but possibly less sensitive
• However, a poor normalization method will be worse than none at all.
Possible normalization methods
• We can equalize the mean or median intensity by adding or multiplying a correction term
• We can use different normalizations at different intensity levels (intensity-based normalization), for example by lowess or quantiles
• We can normalize for other things such as print tips
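Intensity-based normalization can be sketched in R with lowess on simulated data. Everything here is an assumption for illustration: the simulated dye bias, the span f = 0.4, and the names `A` (average log intensity) and `M` (log ratio) are not from the course files.

```r
set.seed(1)
n <- 500
A <- runif(n, 6, 14)              # average log2 intensity (simulated)
bias <- 0.6 - 0.08 * (A - 6)      # assumed intensity-dependent dye bias
M <- bias + rnorm(n, sd = 0.2)    # log2 ratio, no true differential expression

# Fit the trend of M against A with lowess and subtract it.
# lowess returns fitted values in order of increasing x, so sort first.
o <- order(A)
fit <- lowess(A[o], M[o], f = 0.4)
M_norm <- numeric(n)
M_norm[o] <- M[o] - fit$y

print(c(before = mean(M), after = mean(M_norm)))
```

Equalizing only the mean would remove the average bias but leave the trend with intensity; subtracting the fitted curve removes both.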
Example for Normalization
              Group 1             Group 2
          Array 1  Array 2    Array 3  Array 4
Gene 1      1100      900       425      550
Gene 2       110       95        85      110
Gene 3        80       65        55       80
> normex <- matrix(c(1100,110,80,900,95,65,425,85,55,550,110,80),ncol=4)
> normex
     [,1] [,2] [,3] [,4]
[1,] 1100  900  425  550
[2,]  110   95   85  110
[3,]   80   65   55   80
> group <- as.factor(c(1,1,2,2))
> anova(lm(normex[1,] ~ group))
Analysis of Variance Table

Response: normex[1, ]
          Df Sum Sq Mean Sq F value  Pr(>F)
group      1 262656  262656  18.888 0.04908 *
Residuals  2  27812   13906
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> anova(lm(normex[2,] ~ group))
Analysis of Variance Table

Response: normex[2, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   25.0    25.0  0.1176 0.7643
Residuals  2  425.0   212.5

> anova(lm(normex[3,] ~ group))
Analysis of Variance Table

Response: normex[3, ]
          Df Sum Sq Mean Sq F value Pr(>F)
group      1   25.0    25.0  0.1176 0.7643
Residuals  2  425.0   212.5
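The three per-gene tests can also be run in one pass. This is a sketch of one way to collect the p-values, not the course's own code; the name `pvals` is introduced here for illustration.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80),
                 ncol = 4)
group <- as.factor(c(1, 1, 2, 2))

# One-way ANOVA per gene (row); keep the p-value for the group effect
pvals <- sapply(seq_len(nrow(normex)), function(i) {
  anova(lm(normex[i, ] ~ group))[["Pr(>F)"]][1]
})
names(pvals) <- paste("Gene", seq_len(nrow(normex)))
print(round(pvals, 4))
```

Before normalization only Gene 1 shows a group difference at the 0.05 level, which matches the individual tables.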
Additive Normalization by Means
              Group 1             Group 2
          Array 1  Array 2    Array 3  Array 4
Gene 1       975      851       541      608
Gene 2       -15       46       201      168
Gene 3       -45       16       171      138
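> cmn <- colMeans(normex)   # array (column) means; this step is not shown on the slide, so the exact call is an assumption
> mn <- mean(cmn)
> normex - rbind(cmn,cmn,cmn)+mn
         [,1]   [,2]   [,3]     [,4]
cmn 974.58333 851.25 541.25 607.9167
cmn -15.41667  46.25 201.25 167.9167
cmn -45.41667  16.25 171.25 137.9167
> normex.1 <- normex - rbind(cmn,cmn,cmn)+mn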
> mn <- mean(cmn)
> anova(lm(normex.1[1,] ~ group))
Analysis of Variance Table

Response: normex.1[1, ]
          Df Sum Sq Mean Sq F value  Pr(>F)
group      1 114469  114469  23.295 0.04035 *
Residuals  2   9828    4914

> anova(lm(normex.1[2,] ~ group))
Analysis of Variance Table

Response: normex.1[2, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 28617.4 28617.4  23.295 0.04035 *
Residuals  2  2456.9  1228.5

> anova(lm(normex.1[3,] ~ group))
Analysis of Variance Table

Response: normex.1[3, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 28617.4 28617.4  23.295 0.04035 *
Residuals  2  2456.9  1228.5
Multiplicative Normalization by Means
              Group 1             Group 2
          Array 1  Array 2    Array 3  Array 4
Gene 1       779      776       687      679
Gene 2        78       82       137      136
Gene 3        57       56        89       99
> normex*mn/rbind(cmn,cmn,cmn)
         [,1]      [,2]      [,3]      [,4]
cmn 779.16667 775.82547 687.33407 679.13851
cmn  77.91667  81.89269 137.46681 135.82770
cmn  56.66667  56.03184  88.94912  98.78378
> normex.2 <- normex*mn/rbind(cmn,cmn,cmn)
> anova(lm(normex.2[1,] ~ group))

Response: normex.2[1, ]
          Df Sum Sq Mean Sq F value   Pr(>F)
group      1 8884.9  8884.9  453.71 0.002197 **
Residuals  2   39.2    19.6

> anova(lm(normex.2[2,] ~ group))

Response: normex.2[2, ]
          Df Sum Sq Mean Sq F value   Pr(>F)
group      1 3219.7  3219.7  696.33 0.001433 **
Residuals  2    9.2     4.6

> anova(lm(normex.2[3,] ~ group))

Response: normex.2[3, ]
          Df  Sum Sq Mean Sq F value  Pr(>F)
group      1 1407.54 1407.54  57.969 0.01682 *
Residuals  2   48.56   24.28
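Both mean-based normalizations can be written compactly with sweep(). This sketch assumes cmn is the vector of array (column) means, which is consistent with the printed tables; the names `normex_add` and `normex_mul` are introduced here for illustration.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80),
                 ncol = 4)
cmn <- colMeans(normex)   # per-array means: 430, 353.33, 188.33, 246.67
mn  <- mean(cmn)          # common target level for every array

# Additive: shift each array so that its mean equals mn
normex_add <- sweep(normex, 2, cmn, "-") + mn

# Multiplicative: rescale each array so that its mean equals mn
normex_mul <- sweep(normex, 2, cmn, "/") * mn

print(round(normex_add, 2))
print(round(normex_mul, 2))
print(colMeans(normex_add))   # all arrays now share the same mean
print(colMeans(normex_mul))
```

After either normalization every array has the same mean, but the two methods give very different per-gene values and test results, so the choice of normalization matters.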
Multiplicative Normalization by Medians
              Group 1             Group 2
          Array 1  Array 2    Array 3  Array 4
Gene 1
Gene 2
Gene 3
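The table entries are not present in this transcript. As a sketch of how such a normalization could be computed, the following rescales each array by its median. Using the mean of the array medians as the common target is an assumption made here by analogy with the means-based version; the names `cmed`, `target`, and `normex_med` are hypothetical.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80),
                 ncol = 4)

cmed   <- apply(normex, 2, median)   # per-array medians: 110, 95, 85, 110
target <- mean(cmed)                 # assumed common target level: 100

# Rescale each array so that its median equals the target
normex_med <- sweep(normex, 2, cmed, "/") * target
print(round(normex_med, 1))
```

With only three genes, the median of each array is the Gene 2 value, so Gene 2 becomes identical across all arrays after this normalization.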