DNA Microarray

Post on 26-Dec-2015


T. Mohapatra

NRC on Plant Biotechnology

IARI, New Delhi

DNA Microarray

A microarray is like a massive RNA or protein blotting experiment... in reverse

• Labeled RNA or protein: ‘TARGET(s)’
• Fixed surface: microscope slide or silicon chip
• Short DNA or protein (e.g. antibody): ‘PROBES’

Basic principles

• Main novelty is one of scale
– hundreds or thousands of probes rather than tens
• Probes are attached to solid supports
• Robotics are used extensively
• Informatics is a central component at all stages

Major technologies

• cDNA probes (> 200 nt), usually produced by PCR, attached to either nylon or glass supports

• Oligonucleotides (25-80 nt) attached to glass support

• Oligonucleotides (25-30 nt) synthesized in situ on silica wafers (Affymetrix)

cDNA chips

• Probes are cDNA fragments, usually amplified by PCR
• Probes are deposited on a solid support, either positively charged nylon or a glass slide (poly-L-lysine coated microscope slides)
• Attachment is commonly through ionic bonding
• Samples (normally poly(A)+ RNA) are labelled using fluorescent dyes
• At least two samples are hybridized to the chip
• Fluorescence at different wavelengths is measured by a scanner
• Technology requires no bioinformatic preselection, but the attachment chemistry is weak, which yields increased background following hybridizations
• Probes are not gene-specific

cDNA chips

cDNA chip design

• Probe selection
– Non-redundant set of probes
– Includes genes of interest to project
– Corresponds to physically available clones
• Chip layout
– Grouping of probes by function
– Correspondence between wells in microtitre plates and spots on the chip

Probe selection

• Make sure that database entries are cDNA
– Preference for RefSeq entries
• Criteria for non-redundancy
– Entries sharing >98% identity over >100 nt are treated as redundant
– Accession number is unique
• Mapping of sequence to clone
– Use Unigene clusters
– Directly use data from a sequence-verified collection (e.g. Research Genetics)
– Independently verify sequence

cDNA arrays on nylon and glass

• Nylon arrays
– Up to about 1000 probes per filter
– Use radiolabeled cDNA target
– Can use phosphorimager or X-ray film
• Glass arrays
– Up to about 40’000 probes per slide, or 10’000 per 2 cm² area (limited by arrayer’s capabilities)
– Use fluorescent targets
– Require specialized scanner

‘Long oligo (70mer)’ Arrays

• Long oligonucleotide probes are designed based upon fine-scale bioinformatic analyses of genome or EST databases
• Probes are usually targeted to 3’ UTRs and are gene-specific
• Oligos are synthesized with 5’- or 3’-amine modification that binds covalently to e.g. aminosilane-coated slides
• Cross-hyb amongst family members is minimized
• Attachment chemistry forms strong bonds
• Hybridization Tm is uniform and background is relatively low
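One reason designed oligos can share a similar hybridization Tm is that, for probes of equal length, GC content is a major determinant of duplex stability. The sketch below screens candidate probes to a GC window. It is an illustration only, not the design procedure behind these slides; the function names and the 40-60% cutoffs are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def within_gc_window(probes, low=0.40, high=0.60):
    """Keep probes whose GC fraction lies in [low, high].

    The cutoffs are hypothetical defaults chosen for illustration.
    """
    return [p for p in probes if low <= gc_fraction(p) <= high]
```

A real design pipeline would also check cross-hybridization against the rest of the genome or EST set, which this sketch does not attempt.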

Glass chip manufacturing

• Choice of coupling method
– Physical (charge), non-specific chemical, specific chemical (modified PCR primer)
• Choice of printing method
– Mechanical pins: flat tip, split tip, pin & ring
– Piezoelectric deposition (“ink-jet”)
• Robot design
– Precision of movement in 3 axes
– Speed and throughput
– Number of pins, number of spots per pin load

‘Spotted’ Array Technology

Labeling and hybridization

• Targets are normally prepared by oligo(dT)-primed cDNA synthesis
– Probes should contain 3’ end of mRNA
– Need Cot-1 DNA as competitor
– Specific activity will limit sensitivity of assay
• Alternative protocol is to make ds cDNA containing a bacteriophage promoter (e.g. T7), then cRNA
– Can work with smaller amount of RNA
– Less quantitative
• Hybridization usually under coverslips

Standard protocol for comparative hybridization

Single-channel Array Hybridization

Two-channel Array Hybridization

Scanning the arrays

• Laser scanners
– Excellent spatial resolution
– Good sensitivity, but can bleach fluorochromes
– Still rather slow
• CCD scanners
– Spatial resolution can be a problem
– Sensitivity easily adjustable (exposure time)
– Faster and cheaper than lasers
• In all cases, raw data are images showing fluorescence on surface of chip

Zeptosens: Planar Waveguide Principle for High-Sensitivity Fluorescence Microarray Detection

[Figure: imaging of surface-confined fluorescence. Labels: free label, excitation of bound label, microarray on chip, CCD camera.]

Image Capture

• The image analysis software converts the scanner's digital signal into a viewable image linked to quantitative data in each channel for each feature (cDNA or oligo). It subtracts local background and computes ratiometric data.

• Affy system uses GeneChip Operating Software (GCOS)

• For spotted array systems, many, many software options! Most popular is arguably the GenePix Pro image analysis software.
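The local background subtraction and ratio computation for one spot can be sketched as follows. This is a minimal illustration, not the algorithm of GCOS or GenePix Pro; the function name, argument names, and the `floor` value used to avoid non-positive signals are assumptions.

```python
import math


def log2_ratio(fg_red, bg_red, fg_green, bg_green, floor=1.0):
    """Background-corrected log2(red/green) ratio for one spot.

    fg_* are foreground (spot) intensities and bg_* are local background
    intensities. Corrected signals are clamped to `floor` (an assumed
    value) so that the logarithm is always defined.
    """
    red = max(fg_red - bg_red, floor)
    green = max(fg_green - bg_green, floor)
    return math.log2(red / green)
```

For example, a spot with red 1100 over background 100 and green 600 over background 100 has corrected signals of 1000 and 500, a two-fold ratio, so `log2_ratio(1100, 100, 600, 100)` gives 1.0.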

The Affymetrix approach

• Probes are oligos synthesized in situ using a photolithographic approach

• There are at least 5 oligos per cDNA, plus an equal number of negative controls

• The apparatus requires a fluidics station for hybridization and a special scanner

• Only a single fluorochrome is used per hybridization

• It is very expensive!

Affymetrix chip production

NimbleGen Arrays

Maskless Array Synthesizer (MAS) Technology

Commercial chips

• Clontech, Incyte, Research Genetics - filter-based arrays with up to about 8000 clones

• Incyte / Synteni - 10’000 probe chips, not distributed (have to send them target RNA)

• Affymetrix - oligo-based chips with 12’000 genes of known function (16 oligos/gene) and 4x10’000 genes from ESTs

Alternative technologies

• Synthesis of probes on microbeads
– Hybridization in solution
– Identification of beads by fluorescent bar coding or by embedding transponders
– Readout using micro-flow cells or optic fiber arrays
• Production of “universal” arrays
– Array uses a unique combination of oligos, and probes containing the proper complements

Arrays for genetic analysis

• Mutation detection
– Oligos (Affymetrix type) representing all known alleles
– PCR followed by primer extension, with detection of alleles by MALDI-TOF mass spectroscopy (Sequenom)
• Gene loss and amplification
– Measure gene dosage in genomic DNA by hybridization to genomic probes

Bioinformatics of microarrays

• Array design: choice of sequences to be used as probes

• Analysis of scanned images
– Spot detection, normalization, quantitation
• Primary analysis of hybridization data
– Basic statistics, reproducibility, data scattering, etc.
• Comparison of multiple samples
– Clustering, classification …
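Clustering of expression profiles needs a measure of how similar two genes behave across samples. One common choice is a correlation-based distance, sketched here with the standard library only. The function name is hypothetical, and the slides do not say which distance or clustering method was used.

```python
import math


def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles.

    0 means perfectly correlated profiles, 2 means perfectly
    anti-correlated ones.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant profile has no defined correlation")
    return 1.0 - cov / math.sqrt(var_a * var_b)
```

A matrix of such pairwise distances could then be passed to any hierarchical clustering routine.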

• Sample tracking and databasing of results

Microarrays – A Statistician's Feast

• Chip layout design (mainly for spotted arrays)
– how many replicates of each probe?
– where to spot or print on the array?
• Non-biological variability introduced through labeling and hybridization steps
• Signal variability per feature
– amongst replicate probes within chips (spotted arrays)
– dyes within chips (two-channel array experiments)
– amongst chips (one-channel and two-channel)
• Data analyses via multivariate statistical approaches

Experimental Design

• Two-channel (spotted array) experiment – 2 targets, e.g. steady-state wild type versus mutant

[Figure: four-chip design for wild type x mutant. Chips 1 and 2 are chip replicates; Chips 3 and 4 are chip replicates of the dye swap.]
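A dye swap lets dye bias be separated from the biological signal. The sketch below assumes the bias is additive on the log2 scale and identical on both chips, so it flips sign relative to the true ratio when the dyes are exchanged. The function and argument names are hypothetical.

```python
def combine_dye_swap(m_forward, m_swapped):
    """Combine log2 ratios from a dye-swap pair of chips.

    m_forward is log2(red/green) with the mutant in the red channel;
    m_swapped is log2(red/green) with the wild type in the red channel.
    Under the additive-bias assumption:
        m_forward = true + bias
        m_swapped = -true + bias
    Returns (true log2 ratio mutant/wild type, dye bias).
    """
    true_ratio = (m_forward - m_swapped) / 2
    dye_bias = (m_forward + m_swapped) / 2
    return true_ratio, dye_bias
```

For instance, forward and swapped readings of 1.5 and -0.5 give a bias-corrected log2 ratio of 1.0 and an estimated dye bias of 0.5.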

Experimental Design

• Two-channel (spotted array) experiment – 3 targets

“THE LOOP DESIGN”

[Figure: loop design linking wild type, mutant-1, and mutant-2 through pairwise hybridizations (wild type x mutant-1, mutant-1 x mutant-2, wild type x mutant-2), each repeated with a dye swap.]

Normalization Issues

• Unequal RNA/cDNA used
• Different labeling concentrations
• Hybridization techniques leading to different rates of bonding
• Gene expression data have meaning only in the context of the particular biological sample and the exact conditions under which the samples were taken. For instance, if we are interested in finding out how different cell types react to treatments with various chemical compounds, we must record unambiguous information about the cell types and compounds used in the experiments.

Normalization Techniques

• Use housekeeping genes as standards – too much fluctuation in biological systems

• Median of all signal intensities – good approximation

• Combination of all samples – most accurate
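Normalizing to the median of all signal intensities can be sketched as a simple rescaling: each channel or chip is multiplied by a factor that brings its median to a common target. The function name and the default target of 1.0 are assumptions for illustration.

```python
from statistics import median


def median_normalize(intensities, target=1.0):
    """Rescale intensities so that their median equals `target`.

    Applying this to every channel or chip with the same target puts
    them on a common scale before ratios are compared.
    """
    med = median(intensities)
    if med <= 0:
        raise ValueError("median intensity must be positive")
    return [x * target / med for x in intensities]
```

With intensities 100, 200, and 400 the median is 200, so the default call returns 0.5, 1.0, and 2.0. This corrects only a global scale difference, not intensity-dependent dye effects.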

Identification of Seed-specific Genes Using Microarray

[Figure: panels A and B plotting ROOT (7-days-old seedlings) against SEED (0-2 DAP). Labeled groups: root-specific genes, seed-specific genes, and genes showing low level but highly differential expression. Marked values: (100), (50), (10), (2).]

Analysis of the Normalized Intensity Data

• Many, many software options for analyses of microarray data.
• Employ an application that uses SOUND STATISTICS, especially one capable of ANOVA.
– ANOVA lets you evaluate the statistical significance of expression data by comparing more than 2 observations simultaneously (i.e. it goes beyond simple ratios).
– ANOVA captures variability introduced by dye bias (spotted arrays) and chip-to-chip differences amongst replicates: the more variability, the higher the p-value on the expression ratio, and the less you should trust the result.
• Good applications that employ ANOVA:
– Bioconductor (freeware based in R programming language)
– SAS Microarray Suite (commercial offering in SAS language)
– S+ArrayAnalyzer (commercial offering in S programming language)
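To make the ANOVA idea concrete, the sketch below computes the one-way ANOVA F statistic for a single gene measured in several groups (e.g. wild type and two mutants, with replicates). It is a textbook calculation written with the standard library, not the model used by any of the packages listed above; the function name is hypothetical. Turning F into a p-value requires the F distribution, which is available for example through `scipy.stats.f_oneway`.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups.

    Each group is a list of replicate measurements for one condition.
    F is the between-group mean square divided by the within-group
    mean square; more replicate variability lowers F.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and some replication")
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variability; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

For the groups [1, 2, 3], [2, 3, 4], and [5, 6, 7], the between-group sum of squares is 26 and the within-group sum of squares is 6, giving F = 13.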

Uses of Microarrays

• Genome-scale gene expression analysis
– Differentiation
– Responses to environmental factors
– Disease processes
– Effects of drugs
• Detection of sequence variation
– Genetic typing
– Detection of somatic mutations (e.g. in oncogenes)
– Direct sequencing

Uses of Microarrays

• Transcriptome profiling:
– Massive parallel analysis can reveal patterns of gene expression allowing researchers to predict gene associations and pathways

– Microarrays have been used over the past five years for transcriptome analyses in organisms ranging from bacteria to fungi to humans to higher plants under hundreds of different conditions

• Diagnostics: RNA or DNA samples hybridized to small arrays to detect expression of marker genes (stress/pathogens, developmental events)

• Molecular genetics: Genomic DNA samples are hybridized to a limited number of probes to detect single nucleotide polymorphisms (SNPs) for mutational analyses (pedigrees, gene mapping)

• Gene promoter analyses: ChIP-on-chip assays – overlapping or ‘tiled’ DNA probes representing an entire genome are hybridized with labeled DNA co-immunoprecipitated with putative trans-acting proteins, to detect binding to cis elements in promoter regions

Microarray data on the Web

• Many groups have made their raw data available, but in many formats

• Some groups have created searchable databases

• There are several initiatives to create “unified” databases
– EBI: ArrayExpress
– NCBI: Gene Expression Omnibus

• Companies are beginning to sell microarray expression data (e.g. Incyte)

Web links

• Leming Shi’s Gene-Chips.com page – very rich source of basic information and commercial and academic links

• DNA chips for dummies animation

• A step by step description of a microarray experiment by Jeremy Buhler

• The Big Leagues: Pat Brown and NHGRI microarray projects