Microarray normalization, error models, quality Wolfgang Huber EMBL-EBI Brixen 16 June 2008.
Microarray Technology Types Normalization Microarray Technology Microarray: –New Technology (first...
Microarray Technology
• Microarray:
  – New technology (first paper: 1995)
• Allows study of thousands of genes at the same time
  – Glass slide of DNA molecules
    • Molecule: string of bases (25 bp – 500 bp)
    • Uniquely identifies the gene or unit to be studied
http://kbrin.a-bldg.louisville.edu/CECS694/
Differing Conditions
• Ultimate goal:
  – Understand the expression level of genes under different conditions
• Helps to:
  – Determine genes involved in a disease
  – Identify pathways to a disease
  – Serve as a screening tool
Gene Conditions
• Cell types (brain vs. liver)
• Developmental (fetal vs. adult)
• Response to stimulus
• Gene activity (wild vs. mutant)
• Disease states (healthy vs. diseased)
Expressed Genes
• Genes under a given condition
  – mRNA extracted from cells
  – mRNA labeled
  – Labeled mRNA is the mRNA present in a given condition
  – Labeled mRNA will hybridize (base pair) with the corresponding sequence on the slide
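The hybridization step can be illustrated with a small sketch. It is an idealized model, not a description of real hybridization chemistry: it treats the labeled sample as DNA (cDNA), requires a perfect match, and the function names `reverse_complement` and `hybridizes` are hypothetical.

```python
# Idealized sketch: a probe "hybridizes" with a target if the target contains
# the probe's reverse complement. Real hybridization tolerates mismatches and
# depends on temperature and salt conditions; none of that is modeled here.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hybridizes(probe, target):
    """True if the target contains a perfect base-pairing partner for the probe."""
    return reverse_complement(probe) in target.upper()
```

For example, `hybridizes("ATGC", "TTGCATTT")` is true because the target contains `GCAT`, the reverse complement of the probe.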
Two Different Types of Microarrays
• Custom spotted arrays (up to 20,000 sequences)
  – cDNA
  – Oligonucleotide
• High-density (up to 100,000 sequences) synthetic oligonucleotide arrays
  – Affymetrix (25 bases)
  – SHOW AFFYMETRIX LAYOUT
Custom Arrays
• Mostly cDNA arrays
• 2-dye (2-channel)
  – RNA from two sources (cDNA created)
    • Source 1: labeled with red dye
    • Source 2: labeled with green dye
Two Channel Microarrays
• Microarrays measure gene expression
• Two different samples:
  – Control (green label)
  – Sample (red label)
• Both are washed over the microarray
  – Hybridization occurs
  – Each spot is one of 4 colors
Microarray Image Analysis
• Microarrays detect gene interactions: each spot shows one of 4 colors
  – Green: high in control
  – Red: high in sample
  – Yellow: equal in both
  – Black: none (no signal in either)
• Problem is to quantify image signals
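One way to turn the four-color reading into numbers is to compare the red and green intensities of a spot. The sketch below is only an illustration: the function name `classify_spot`, the minimum-signal cut-off, and the two-fold threshold are assumptions that do not come from the slides.

```python
import math


def classify_spot(red, green, background=0.0, min_signal=100.0, fold=2.0):
    """Classify one spot as 'red', 'green', 'yellow', or 'black'.

    min_signal and fold are illustrative assumptions, not values from the source.
    """
    # Subtract background and clamp at zero so intensities stay non-negative.
    r = max(red - background, 0.0)
    g = max(green - background, 0.0)

    # Black: neither channel shows appreciable signal.
    if r < min_signal and g < min_signal:
        return "black"

    # log2 ratio of sample (red) to control (green); +1 avoids log of zero.
    log_ratio = math.log2((r + 1.0) / (g + 1.0))
    threshold = math.log2(fold)

    if log_ratio >= threshold:
        return "red"      # higher in sample
    if log_ratio <= -threshold:
        return "green"    # higher in control
    return "yellow"       # roughly equal in both
```

Working on the log2 ratio makes up- and down-regulation symmetric around zero, which is why the same threshold is applied with opposite signs.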
Information Extraction
• Spot intensities
  – Mean of pixel intensities
  – Median of pixel intensities
• Background values
  – Local
  – Morphological opening
  – Constant (global)
  – None
• Quality information
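The options above can be combined into a single background-corrected value per spot. The following is a minimal sketch under stated assumptions: the function name `spot_signal` and its parameters are hypothetical, the local background is summarized with a median, and morphological opening is left out because it needs the full image rather than a list of pixels.

```python
from statistics import mean, median


def spot_signal(spot_pixels, background_pixels=(), summary="median",
                background="local", constant=0.0):
    """Return a background-corrected intensity for one spot.

    summary:    'mean' or 'median' of the spot's pixel intensities
    background: 'local' (median of nearby background pixels),
                'constant' (one global value), or 'none'
    """
    if summary not in ("mean", "median"):
        raise ValueError("summary must be 'mean' or 'median'")
    foreground = median(spot_pixels) if summary == "median" else mean(spot_pixels)

    if background == "local":
        bg = median(background_pixels)
    elif background == "constant":
        bg = constant
    elif background == "none":
        bg = 0.0
    else:
        raise ValueError("background must be 'local', 'constant', or 'none'")

    return foreground - bg
```

The median is less sensitive than the mean to a few saturated or dusty pixels, which is one reason both summaries are usually reported.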
Take the average of the pixel intensities
[Figure: spot image with the signal and background regions marked]
Source: Speed Group Microarray Page, http://stat-www.berkeley.edu/users/terry/zarray/Html/image.html
Single Color Microarrays
• Prefabricated
  – Affymetrix (25mers)
• Custom
  – cDNA (500 bases or so)
  – Spotted oligos (70-80 bases)
Single Color Microarrays
• Expressed sequences washed over chips
• Expressed genes hybridize
• Light passed under to see intensity (or hybridized oligos show dark color)
Lithography
• It is a printing technology.
• Lithography was invented by Alois Senefelder in Germany in 1798.
• The printing and non-printing areas of the plate are all at the same level, as opposed to intaglio and relief processes in which the design is cut into the printing block.
• Lithography is based on the chemical repellence of oil and water.
Lithography
Designs are drawn or painted with greasy ink or crayons on specially prepared limestone. The stone is moistened with water, which the stone accepts in areas not covered by the crayon. An oily ink, applied with a roller, adheres only to the drawing and is repelled by the wet parts of the stone. The print is then made by pressing paper against the inked drawing.
Affymetrix Technology
• Biotin (one dye) instead of 2 colors
• One treatment per chip
• 11, 16, or 20 probe (gene marker) pairs per gene
DESOKY, 2003
Affymetrix Data
• Each gene labeled as “present”, “marginal”, or “absent”
  – Present: gene expressed and reliably detected in the RNA sample
• Label chosen based on a p-value
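How a p-value becomes a label can be sketched as a pair of cut-offs. The slides do not state the thresholds, so the values 0.04 and 0.06 below are placeholder assumptions for illustration, and `detection_call` is a hypothetical helper, not part of any Affymetrix software.

```python
def detection_call(p_value, present_below=0.04, marginal_below=0.06):
    """Map a detection p-value to 'present', 'marginal', or 'absent'.

    The two cut-offs are illustrative assumptions; the source only says
    that the label is chosen based on a p-value.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < present_below:
        return "present"    # small p-value: reliably detected
    if p_value < marginal_below:
        return "marginal"   # borderline evidence
    return "absent"         # no reliable evidence of expression
```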
PM (perfect match) probe: designed to maximize hybridization
MM (mismatch) probe: used to ascertain the degree of cross-hybridization
Affymetrix Design of probes
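A probe pair can be sketched in code. The slides do not say how the mismatch probe is built; the sketch assumes the common convention that the MM probe equals the PM probe except for the middle base (position 13 of a 25-mer), which is swapped for its complement. The function name `mismatch_probe` is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm_probe):
    """Build an MM probe from a PM probe by complementing the middle base.

    Assumption (not from the source): the mismatch sits at the central
    position, i.e. index 12 (the 13th base) of a 25-mer.
    """
    pm = pm_probe.upper()
    middle = len(pm) // 2
    return pm[:middle] + COMPLEMENT[pm[middle]] + pm[middle + 1:]
```

Because the two probes differ at a single base, signal on the MM probe is read as an estimate of cross-hybridization rather than specific binding.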
Inferential statistics
Paradigm                       Parametric test    Nonparametric test
Compare two unpaired groups    Unpaired t-test    Mann-Whitney test
Compare two paired groups      Paired t-test      Wilcoxon test
Compare 3 or more groups       ANOVA
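To make the difference between the unpaired and paired t-tests concrete, the sketch below computes both test statistics with the standard library only. The unpaired version assumes equal variances (pooled Student's t); the function names are hypothetical, and converting the statistic to a p-value would additionally need the t distribution, which is not shown.

```python
from math import sqrt
from statistics import mean, stdev, variance


def unpaired_t(a, b):
    """Student's t statistic for two independent groups (pooled variance)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def paired_t(a, b):
    """t statistic for paired measurements, based on per-pair differences."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
```

On the same numbers the paired statistic is often larger, because pairing removes the variation between subjects and leaves only the variation of the differences.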
Inferential statistics
Is it appropriate to set the significance level to p < 0.05? If you hypothesize that a specific gene is up-regulated, you can set the probability value to 0.05.
You might measure the expression of 10,000 genes and hope that any of them are up- or down-regulated. But you can expect to see 5% (500 genes) regulated at the p < 0.05 level by chance alone. To account for the thousands of repeated measurements you are making, some researchers apply a Bonferroni correction. The level for statistical significance is divided by the number of measurements, e.g. the criterion becomes:
p < 0.05/10,000, or p < 5 × 10⁻⁶
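The arithmetic can be checked with a few lines of code. This is a minimal sketch; the function names and the example gene identifiers are hypothetical.

```python
def expected_false_positives(alpha, n_tests):
    """Number of tests expected to pass the alpha level by chance alone."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after a Bonferroni correction."""
    return alpha / n_tests


def significant_genes(p_values, alpha=0.05):
    """Return genes whose p-value passes the Bonferroni-corrected level.

    p_values maps a gene name to its p-value; the number of tests is
    taken to be the number of genes in the mapping.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [gene for gene, p in p_values.items() if p < threshold]
```

With `alpha = 0.05` and 10,000 genes, `expected_false_positives` gives about 500 and `bonferroni_threshold` gives about 5 × 10⁻⁶, matching the figures above. The correction is conservative: it limits false positives at the cost of missing genes with modest but real changes.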