DNA Microarray
T. Mohapatra
NRC on Plant Biotechnology, IARI, New Delhi
DNA Microarray
A microarray is like a massive RNA or protein blotting experiment... in reverse
• Labeled RNA or protein: ‘TARGET(s)’
• Fixed surface: microscope slide or silicon chip
• Short DNA or protein (e.g. antibody): ‘PROBES’
Basic principles
• Main novelty is one of scale
– Hundreds or thousands of probes rather than tens
• Probes are attached to solid supports
• Robotics are used extensively
• Informatics is a central component at all stages
Major technologies
• cDNA probes (> 200 nt), usually produced by PCR, attached to either nylon or glass supports
• Oligonucleotides (25-80 nt) attached to glass support
• Oligonucleotides (25-30 nt) synthesized in situ on silica wafers (Affymetrix)
cDNA chips
• Probes are cDNA fragments, usually amplified by PCR
• Probes are deposited on a solid support, either positively charged nylon or a glass slide (poly-L-lysine-coated microscope slides)
• Attachment is commonly through ionic bonding
• Samples (normally poly(A)+ RNA) are labelled using fluorescent dyes
• At least two samples are hybridized to the chip
• Fluorescence at different wavelengths is measured by a scanner
• Technology requires no bioinformatic preselection, but the weak attachment chemistry yields increased background following hybridization
• Probes are not gene-specific
cDNA chip design
• Probe selection
– Non-redundant set of probes
– Includes genes of interest to project
– Corresponds to physically available clones
• Chip layout
– Grouping of probes by function
– Correspondence between wells in microtitre plates and spots on the chip
Probe selection
• Make sure that database entries are cDNA
– Preference for RefSeq entries
• Criteria for non-redundancy
– >98% identity over >100 nt
– Accession number is unique
• Mapping of sequence to clone
– Use Unigene clusters
– Directly use data from sequence-verified collection (e.g. Research Genetics)
– Independently verify sequence
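The non-redundancy criterion above can be sketched as a simple greedy filter. This is only an illustration: it assumes pairwise alignments between candidate probes have already been computed by some alignment tool, and the `Hit` record, the accession names, and the greedy keep-first strategy are all hypothetical choices, not something the slides specify.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    """One precomputed pairwise alignment between two candidate probes (hypothetical record)."""
    query: str       # accession of one candidate probe
    subject: str     # accession of the other candidate probe
    identity: float  # percent identity of the alignment
    length: int      # alignment length in nucleotides


def is_redundant(hit: Hit, min_identity: float = 98.0, min_length: int = 100) -> bool:
    """Apply the slide's criterion: >98% identity over >100 nt."""
    return hit.identity > min_identity and hit.length > min_length


def non_redundant(accessions: Iterable[str], hits: Iterable[Hit]) -> List[str]:
    """Keep an accession unless it is redundant with one that was already kept."""
    redundant_pairs = {
        frozenset((h.query, h.subject))
        for h in hits
        if h.query != h.subject and is_redundant(h)
    }
    kept: List[str] = []
    for acc in accessions:
        if all(frozenset((acc, k)) not in redundant_pairs for k in kept):
            kept.append(acc)
    return kept


# Made-up example data: A2 duplicates A1; A3 is distinct enough from A1.
hits = [
    Hit("A1", "A2", 99.2, 450),  # redundant: high identity over a long stretch
    Hit("A1", "A3", 97.0, 500),  # not redundant: identity too low
    Hit("A2", "A3", 99.5, 80),   # not redundant: alignment too short
]
selected = non_redundant(["A1", "A2", "A3"], hits)
print(selected)
```

The order of `accessions` decides which member of a redundant pair survives, so preferred entries (for instance RefSeq entries) would be listed first.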
cDNA arrays on nylon and glass
• Nylon arrays
– Up to about 1000 probes per filter
– Use radiolabeled cDNA target
– Can use phosphorimager or X-ray film
• Glass arrays
– Up to about 40’000 probes per slide, or 10’000 per 2 cm² area (limited by arrayer’s capabilities)
– Use fluorescent targets
– Require specialized scanner
‘Long oligo (70mer)’ Arrays
• Long oligonucleotide probes are designed based upon fine-scale bioinformatic analyses of genome or EST databases
• Probes are usually targeted to 3’ UTRs and are gene-specific
• Oligos are synthesized with a 5’- or 3’-amine modification that binds covalently to e.g. aminosilane-coated slides
• Cross-hybridization amongst gene family members is minimized
• Attachment chemistry forms strong bonds
• Hybridization Tm is uniform and background is relatively low
Glass chip manufacturing
• Choice of coupling method
– Physical (charge), non-specific chemical, specific chemical (modified PCR primer)
• Choice of printing method
– Mechanical pins: flat tip, split tip, pin & ring
– Piezoelectric deposition (“ink-jet”)
• Robot design
– Precision of movement in 3 axes
– Speed and throughput
– Number of pins, number of spots per pin load
‘Spotted’ Array Technology
Labeling and hybridization
• Targets are normally prepared by oligo(dT)-primed cDNA synthesis
– Probes should contain 3’ end of mRNA
– Need Cot-1 DNA as competitor
– Specific activity will limit sensitivity of assay
• Alternative protocol is to make ds cDNA containing a bacteriophage promoter (e.g. T7), then cRNA
– Can work with smaller amount of RNA
– Less quantitative
• Hybridization usually under coverslips
Standard protocol for comparative hybridization
Single-channel Array Hybridization
Two-channel Array Hybridization
Scanning the arrays
• Laser scanners
– Excellent spatial resolution
– Good sensitivity, but can bleach fluorochromes
– Still rather slow
• CCD scanners
– Spatial resolution can be a problem
– Sensitivity easily adjustable (exposure time)
– Faster and cheaper than lasers
• In all cases, raw data are images showing fluorescence on surface of chip
Zeptosens: Planar Waveguide Principle for High-Sensitivity Fluorescence Microarray Detection
[Figure labels: microarray on chip; excitation of bound label; free label; imaging of surface-confined fluorescence; CCD camera]
Image Capture
• Function is to convert the scanner's digital signal into a viewable image linked to quantitative data in each channel for each feature (cDNA or oligo). The software subtracts local background and computes ratiometric data.
• Affy system uses GeneChip Operating Software (GCOS)
• For spotted array systems there are many software options; the most popular is arguably the GenePix Pro image analysis software.
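The background subtraction and ratio step can be illustrated with a few lines of Python. This is a minimal sketch, not what GCOS or GenePix Pro actually do: the function name, the `floor` guard against non-positive signals, and the example intensities are assumptions made for the illustration.

```python
import math


def log2_ratio(fg_ch1: float, bg_ch1: float,
               fg_ch2: float, bg_ch2: float,
               floor: float = 1.0) -> float:
    """Background-subtracted log2 ratio (channel 2 over channel 1) for one spot.

    fg_* are foreground (spot) intensities, bg_* the local background.
    Signals at or below background are clamped to `floor` so the ratio
    stays defined; the choice of floor is an assumption of this sketch.
    """
    signal_ch1 = max(fg_ch1 - bg_ch1, floor)
    signal_ch2 = max(fg_ch2 - bg_ch2, floor)
    return math.log2(signal_ch2 / signal_ch1)


# Made-up spot measurements: (fg_ch1, bg_ch1, fg_ch2, bg_ch2)
spots = {
    "spot_a": (1200.0, 200.0, 4300.0, 300.0),  # 4000 / 1000: four-fold up in channel 2
    "spot_b": (900.0, 100.0, 300.0, 100.0),    # 200 / 800: four-fold down in channel 2
}
ratios = {name: log2_ratio(*values) for name, values in spots.items()}
for name, m in ratios.items():
    print(f"{name}: log2 ratio = {m:+.2f}")
```

Working on the log2 scale makes a four-fold increase (+2) and a four-fold decrease (-2) symmetric around zero, which plain ratios are not.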
The Affymetrix approach
• Probes are oligos synthesized in situ using a photolithographic approach
• There are at least 5 oligos per cDNA, plus an equal number of negative controls
• The apparatus requires a fluidics station for hybridization and a special scanner
• Only a single fluorochrome is used per hybridization
• It is very expensive!
Affymetrix chip production
NimbleGen Arrays: Maskless Array Synthesizer (MAS) Technology
Commercial chips
• Clontech, Incyte, Research Genetics - filter-based arrays with up to about 8000 clones
• Incyte / Synteni - 10’000 probe chips, not distributed (have to send them target RNA)
• Affymetrix - oligo-based chips with 12’000 genes of known function (16 oligos/gene) and 4x10’000 genes from ESTs
Alternative technologies
• Synthesis of probes on microbeads
– Hybridization in solution
– Identification of beads by fluorescent bar coding or by embedding transponders
– Readout using micro-flow cells or optic fiber arrays
• Production of “universal” arrays
– Array uses a unique combination of oligos, and probes containing the proper complements
Arrays for genetic analysis
• Mutation detection
– Oligos (Affymetrix type) representing all known alleles
– PCR followed by primer extension, with detection of alleles by MALDI-TOF mass spectrometry (Sequenom)
• Gene loss and amplification
– Measure gene dosage in genomic DNA by hybridization to genomic probes
Bioinformatics of microarrays
• Array design: choice of sequences to be used as probes
• Analysis of scanned images
– Spot detection, normalization, quantitation
• Primary analysis of hybridization data
– Basic statistics, reproducibility, data scattering, etc.
• Comparison of multiple samples
– Clustering, classification …
• Sample tracking and databasing of results
Microarrays – A Statistician’s Feast
• Chip layout design (mainly for spotted arrays)
– How many replicates of each probe?
– Where to spot or print on the array?
• Non-biological variability introduced through labeling and hybridization steps
• Signal variability per feature
– Amongst replicate probes within chips (spotted arrays)
– Between dyes within chips (two-channel array experiments)
– Amongst chips (one-channel and two-channel)
• Data analyses via multivariate statistical approaches
Experimental Design
• Two-channel (spotted array) experiment – 2 targets, e.g. steady-state wild type versus mutant
– Chip 1 and Chip 2 (chip replicates): wild type x mutant
– Chip 3 and Chip 4 (chip replicates): dye swap of the same comparison
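One reason for including a dye swap is that it lets a dye-specific bias be separated from the biological effect. The sketch below assumes each chip yields, for a given gene, a log2 ratio of channel 2 over channel 1, that the mutant sits in channel 2 on the forward chips and in channel 1 on the swapped chips, and that the bias is additive on the log scale. The function name and the numbers are illustrative only.

```python
from statistics import mean
from typing import Sequence, Tuple


def dye_swap_estimate(forward: Sequence[float],
                      swapped: Sequence[float]) -> Tuple[float, float]:
    """Combine replicate log2 ratios from forward and dye-swapped chips.

    Model assumed here (per gene, on the log2 scale):
        forward chip:  M = +effect + dye_bias
        swapped chip:  M = -effect + dye_bias
    Half the difference of the two means therefore estimates the
    mutant-versus-wild-type effect, and half the sum estimates the dye bias.
    """
    m_forward = mean(forward)
    m_swapped = mean(swapped)
    effect = (m_forward - m_swapped) / 2
    dye_bias = (m_forward + m_swapped) / 2
    return effect, dye_bias


# Made-up log2 ratios for one gene: two forward replicates, two swapped replicates.
effect, dye_bias = dye_swap_estimate([1.2, 1.4], [-0.8, -0.6])
print(f"estimated effect = {effect:.2f}, estimated dye bias = {dye_bias:.2f}")
```

Without the swapped chips, the 0.3 bias in this example would be indistinguishable from real differential expression.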
Experimental Design
• Two-channel (spotted array) experiment – 3 targets
“THE LOOP DESIGN”
– wild type x mutant-1, plus dye swap
– mutant-1 x mutant-2, plus dye swap
– wild type x mutant-2, plus dye swap
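The list of hybridizations in a loop design with dye swaps can be enumerated mechanically. The following is a small sketch under the assumption that each target is paired with its neighbour around a closed loop and that every pairing is repeated with the channels exchanged; the function name and the tuple convention (channel 1, channel 2) are hypothetical.

```python
from typing import List, Sequence, Tuple


def loop_design(targets: Sequence[str]) -> List[Tuple[str, str]]:
    """Return (channel 1, channel 2) assignments for a loop design with dye swaps.

    Each target is co-hybridized with the next one around the loop, and
    every pairing is followed by its dye swap. Intended for three or more
    targets; with only two, the loop collapses onto a single pairing.
    """
    if len(targets) < 3:
        raise ValueError("a loop design needs at least three targets")
    chips = []
    n = len(targets)
    for i in range(n):
        a, b = targets[i], targets[(i + 1) % n]
        chips.append((a, b))  # forward labeling
        chips.append((b, a))  # dye swap of the same pairing
    return chips


design = loop_design(["wild type", "mutant-1", "mutant-2"])
for number, (ch1, ch2) in enumerate(design, start=1):
    print(f"chip {number}: channel 1 = {ch1}, channel 2 = {ch2}")
```

For three targets this gives six chips, and each target is labeled equally often in each channel, which keeps the design balanced with respect to dye.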
Normalization Issues
• Unequal RNA/cDNA used
• Different labeling concentrations
• Hybridization techniques leading to different rates of bonding
• Gene expression data have meaning only in the context of the particular biological sample and the exact conditions under which the samples were taken. For instance, if we are interested in finding out how different cell types react to treatments with various chemical compounds, we must record unambiguous information about the cell types and compounds used in the experiments.
Normalization Techniques
• Use housekeeping genes as standards – too much fluctuation in biological systems
• Median of all signal intensities – good approximation
• Combination of all samples – most accurate
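Median normalization, the second technique listed above, can be sketched in a few lines. The sketch assumes raw (not log-transformed) intensities and rescales every chip so that its median matches a common target, taken here as the median of the per-chip medians; that choice of target, the function name, and the numbers are assumptions for illustration.

```python
from statistics import median
from typing import Dict, List


def median_normalize(chips: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Scale each chip's intensities so that all chips share the same median."""
    medians = {name: median(values) for name, values in chips.items()}
    target = median(medians.values())  # common reference level (an assumption)
    return {
        name: [x * target / medians[name] for x in values]
        for name, values in chips.items()
    }


# Made-up intensities: chip2 is globally twice as bright as chip1, chip3 half as bright.
raw = {
    "chip1": [100.0, 200.0, 400.0],
    "chip2": [200.0, 400.0, 800.0],
    "chip3": [50.0, 100.0, 200.0],
}
normalized = median_normalize(raw)
for name, values in normalized.items():
    print(name, values)
```

After scaling, the three example chips become identical, because they differed only by a global brightness factor. Real differences in expression would remain visible as deviations from the shared median.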
Identification of Seed-specific Genes Using Microarray
[Figure, panels A and B: ROOT (7-days-old seedlings) versus SEED (0-2 DAP), marking root-specific genes, seed-specific genes, and genes showing low level but highly differential expression; plot labels (100), (50), (10), (2)]
Analysis of the Normalized Intensity Data
• Many software options exist for analyses of microarray data.
• Employ an application that uses SOUND STATISTICS, especially one capable of ANOVA.
– ANOVA lets you evaluate statistical significance of expression data by comparing more than 2 observations simultaneously (i.e. it goes beyond simple ratios).
– ANOVA captures variability introduced by dye bias (spotted arrays) and chip-to-chip differences amongst replicates: the more variability, the higher the p-value on the expression ratio, and the less you should trust the result.
• Good applications that employ ANOVA:
– Bioconductor (free, open-source software based on the R programming language)
– SAS Microarray Suite (commercial offering in SAS language)
– S+ArrayAnalyzer (commercial offering in S programming language)
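To make the ANOVA idea concrete, here is a minimal one-way ANOVA F statistic for a single gene measured under several conditions. It is a hand-rolled sketch using only the standard library, not how Bioconductor or the commercial packages implement it, and it ignores dye and chip terms that a full microarray model would include. The function name and the data are made up.

```python
from statistics import mean
from typing import Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """F statistic of a one-way ANOVA: between-group over within-group variance."""
    k = len(groups)                       # number of conditions
    n_total = sum(len(g) for g in groups)  # total number of observations
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Sum of squares between groups, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Sum of squares within groups (replicate-to-replicate noise).
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Made-up normalized log intensities for one gene: three conditions, three replicates each.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
print(f"F = {f_stat:.2f}")
```

The p-value follows from comparing F with the F distribution on (k - 1, N - k) degrees of freedom. Noisier replicates inflate the within-group term, lower F, and raise the p-value, which is the behaviour described above.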
Uses of Microarrays
• Genome-scale gene expression analysis
– Differentiation
– Responses to environmental factors
– Disease processes
– Effects of drugs
• Detection of sequence variation
– Genetic typing
– Detection of somatic mutations (e.g. in oncogenes)
– Direct sequencing
Uses of Microarrays
• Transcriptome profiling:
– Massive parallel analysis can reveal patterns of gene expression, allowing researchers to predict gene associations and pathways
– Microarrays have been used over the past five years for transcriptome analyses in organisms ranging from bacteria to fungi to humans to higher plants under hundreds of different conditions
• Diagnostics: RNA or DNA samples are hybridized to small arrays to detect expression of marker genes (stress/pathogens, developmental events)
• Molecular genetics: Genomic DNA samples are hybridized to a limited number of probes to detect single nucleotide polymorphisms (SNPs) for mutational analyses (pedigrees, gene mapping)
• Gene promoter analyses: ChIP-on-chip assays – overlapping or ‘tiled’ DNA probes representing an entire genome are hybridized with labeled DNA that was co-immunoprecipitated with putative trans-acting proteins, to detect binding to cis elements in promoter regions
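The idea of overlapping, or 'tiled', probes can be shown with a short sketch that slides a fixed-length window along a sequence. The probe length, step size, function name, and toy sequence are all illustrative assumptions; real tiling arrays use much longer genomic sequences and additional probe-quality filters.

```python
from typing import List, Tuple


def tile_probes(sequence: str, probe_len: int, step: int) -> List[Tuple[int, str]]:
    """Return (start position, probe sequence) for tiles along `sequence`.

    Consecutive tiles overlap by probe_len - step bases when step < probe_len.
    Trailing bases that do not fill a whole probe are left uncovered.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    return [
        (start, sequence[start:start + probe_len])
        for start in range(0, len(sequence) - probe_len + 1, step)
    ]


# Toy 16-base sequence, 8-base probes, 4-base step (so a 4-base overlap).
tiles = tile_probes("ATGGCCATTGTAATGG", probe_len=8, step=4)
for start, probe in tiles:
    print(start, probe)
```

A smaller step gives denser coverage and finer mapping of binding sites, at the cost of more probes per genome.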
Microarray data on the Web
• Many groups have made their raw data available, but in many formats
• Some groups have created searchable databases
• There are several initiatives to create “unified” databases
– EBI: ArrayExpress
– NCBI: Gene Expression Omnibus
• Companies are beginning to sell microarray expression data (e.g. Incyte)
Web links
• Leming Shi’s Gene-Chips.com page – very rich source of basic information and commercial and academic links
• DNA chips for dummies animation
• A step-by-step description of a microarray experiment by Jeremy Buhler
• The Big Leagues: Pat Brown and NHGRI microarray projects