Hybridization and data acquisition –Hybridization –Scanning –Image analysis –Background...


Page 1: Hybridization and data acquisition –Hybridization –Scanning –Image analysis –Background correction and filtering –Data transformation Methods for normalization.

• Hybridization and data acquisition
  – Hybridization
  – Scanning
  – Image analysis
  – Background correction and filtering
  – Data transformation
• Methods for normalization


Hybridization

• Hybridization of DNA microarrays can be done in two different ways.
• The “classical” approach
  – The labeled, denatured target is placed on a slide, and the slide is carefully covered with a coverslip.
    • This takes skill: the coverslip needs to be level to prevent gradients in hybridization and to avoid trapped air bubbles.
  – The slide is then placed in a humid chamber, which prevents desiccation during hybridization, and incubated at the hybridization temperature.
  – The slides are washed to remove nonspecifically bound target.
    • More stringent washing steps are performed at the end of the washing procedure, by decreasing the ionic strength or increasing the washing temperature.
      » Typical protocols use decreasing SSC buffer concentrations, first with small concentrations of sodium dodecyl sulfate (SDS),
      » then without SDS.
  – The slides are finally dried by centrifugation.
• It is important to scan the arrays within several hours after hybridization because the fluorescence signal deteriorates with time.


Hybridization temperature

• Hybridization temperature: 40 to 65°C for 5 to 12 h.
  – It is critical for oligonucleotide slides and has to be carefully optimized.
    • Start at a hybridization temperature 15°C below the mean melting temperature of the oligonucleotides used.
  – It depends on the organism studied and the composition of the hybridization buffer.
    • Saline sodium citrate (SSC) buffer with added detergent
      – Addition of Denhardt’s solution, sheared salmon sperm DNA, or tRNA reduces the background.
      – Addition of formamide, dextran sulfate, or polyethylene glycol can improve binding of low-copy-number transcripts.
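
The starting-point rule for the hybridization temperature (15°C below the mean melting temperature) is simple arithmetic and can be sketched as follows. The function name and the melting temperatures are hypothetical; the sketch assumes the melting temperatures of the oligonucleotides are already known and does not compute them.

```python
from statistics import mean


def starting_hybridization_temp(melting_temps_c, offset_c=15.0):
    """Return a starting hybridization temperature in degrees Celsius.

    melting_temps_c: melting temperatures of the oligonucleotides used,
        assumed to be known already (this sketch does not calculate them).
    offset_c: distance below the mean melting temperature (15 per the rule).
    """
    if not melting_temps_c:
        raise ValueError("need at least one melting temperature")
    return mean(melting_temps_c) - offset_c


# Hypothetical melting temperatures for a small oligonucleotide set.
example_tms = [72.0, 74.5, 70.5, 75.0]
start_temp = starting_hybridization_temp(example_tms)
print(f"start optimization at {start_temp:.1f} °C")
```

The result is only where optimization begins; the final temperature still has to be tuned for the organism and buffer in use.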


Hybridization

• Automatic array hybridization stations
  – They provide hassle-free hybridization and washing of the slides by running programmed protocols.
  – The results do not depend on the skill of the researcher and are very reproducible.
  – However, hybridization and washing conditions have to be fine-tuned in earlier experiments to the probes and slide chemistry used.
    • They are therefore best suited to cases where a large number of arrays based on the same chemistry have to be handled identically.
    • They are not well suited to small numbers of arrays based on varying chemistries.


Scanning

• The microarrays are scanned with microarray scanners.
  – Their driver and image analysis software determines the raw values.
    • GenePix and ArrayVision are examples of widely used software packages for image analysis and raw data acquisition.
• In principle, it is possible to scan standard-sized slides with any scanner.
  – Exceptions are a few slide types with a nonplanar surface that exclude confocal scanners.
• For successful data acquisition, a data file is needed that identifies the features and defines their dimensions and locations.
  – The GenePix array list (.gal) file format is often used for this purpose.
• The scanners mostly use lasers to excite the surface of the hybridized microarray with a resolution of a few micrometers.
• The scanning resolution should be 10% of the spot size or finer; that is, features of 150-μm size need to be scanned at a resolution of 15 μm or finer.
• The fluorescence emitted from the dyes hybridized to the features is collected and quantified by photomultiplier tubes or charge-coupled device (CCD) cameras.
• Normally, the scanner generates gray-scale images of the fluorescence at 532 and 635 nm. The data are stored in a lossless tagged image file format (TIFF) that is used for quantification by image analysis.
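
The 10% rule for scanning resolution can be written as a small check. This is a minimal sketch; the function names and the parameter `min_pixels_across` are illustrative and do not come from any scanner software.

```python
def max_pixel_size_um(feature_diameter_um, min_pixels_across=10):
    """Largest acceptable pixel size (in micrometers) for a given feature size.

    A resolution of 10% of the spot size means at least 10 pixels
    across the diameter of each feature.
    """
    return feature_diameter_um / min_pixels_across


def resolution_is_sufficient(pixel_size_um, feature_diameter_um):
    """True if the chosen scan resolution satisfies the 10% rule."""
    return pixel_size_um <= max_pixel_size_um(feature_diameter_um)


print(max_pixel_size_um(150))             # features of 150 μm
print(resolution_is_sufficient(10, 150))  # a 10 μm scan of 150 μm features
```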


• A color depth of 2 bytes (16 bits) is characteristic of most scanners, which means that each pixel can assume 65,536 different intensity levels (0 to 65,535).

• The sensitivity of the scanner has to be adjusted to ensure that most of the pixels in the picture do not saturate its dynamic range.

• It is convenient to roughly adjust the sensitivity of the scanner during a prescan so that constitutive controls result in roughly equal signals.

• For microarrays made from double-stranded DNA, this can also be done by spotting chromosomal DNA and adjusting these spots to a ratio of 1 during scanning.
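
Both checks described for the prescan, saturation and channel balance, are easy to express in code. The sketch below assumes pixel intensities are available as plain lists of integers; reading the TIFF is not shown, and the function names are hypothetical.

```python
MAX_16BIT = 2**16 - 1  # 65,535, the largest value a 2-byte pixel can hold


def saturated_fraction(pixels, max_value=MAX_16BIT):
    """Fraction of pixels that sit at the top of the dynamic range."""
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels given")
    return sum(1 for p in pixels if p >= max_value) / len(pixels)


def channel_ratio(controls_635, controls_532):
    """Ratio of summed control-spot signals between the two channels.

    A value near 1 indicates that the constitutive controls (or the
    spotted chromosomal DNA) give roughly equal signals in both channels.
    """
    return sum(controls_635) / sum(controls_532)


# Hypothetical prescan values.
print(saturated_fraction([0, 100, 65535, 65535]))
print(channel_ratio([1500, 1500], [1000, 2000]))
```

If the saturated fraction is high, the sensitivity should be lowered; if the channel ratio is far from 1, one channel's sensitivity should be adjusted before the final scan.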


Image analysis

• To quantify the fluorescence of the features via image analysis, pixels have to be assigned either to a spot or to the background.
• The resulting boundary is often visualized in the raw-data acquisition software as a circle surrounding the feature and is then called a feature indicator.

• In most cases, the image analysis software allows the placement of the feature indicators in a semiautomatic or automatic manner.

• Even if an algorithm places the feature indicators automatically, it is advisable to check this placement manually.

• In real life, the spots might be irregular, or fluorescent impurities on the chip surface may confuse algorithms. It is common to define all pixels inside a feature indicator as foreground and all adjacent pixels within a radius of three times the feature diameter as the local background.

• The next step is the quantification of the image data by calculating the arithmetic mean or, better, the median of the intensities of the foreground and background pixels. The resulting data are stored in the form of a table. Common spreadsheet programs and all sorts of commercial and free software tools can be used for the next steps of data analysis.
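
The pixel assignment and quantification described above can be sketched for a single feature. This is an illustration under simplifying assumptions, not the algorithm of any particular image analysis package: the feature indicator is a perfect circle, the image is a list of rows of intensities, and pixels of neighboring spots are not excluded from the local background, which real software would have to handle.

```python
from math import hypot
from statistics import median


def quantify_feature(image, center_row, center_col, diameter):
    """Return (foreground median, local background median) for one feature.

    Pixels inside the circular feature indicator count as foreground;
    pixels outside it but within three feature diameters of the center
    count as local background. All lengths are in pixels.
    """
    radius = diameter / 2
    background_radius = 3 * diameter
    foreground, background = [], []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            distance = hypot(r - center_row, c - center_col)
            if distance <= radius:
                foreground.append(value)
            elif distance <= background_radius:
                background.append(value)
    return median(foreground), median(background)


# Hypothetical 9 x 9 image: background of 100 with a bright 3 x 3 spot.
image = [[100] * 9 for _ in range(9)]
for r in range(3, 6):
    for c in range(3, 6):
        image[r][c] = 1000

fg, bg = quantify_feature(image, center_row=4, center_col=4, diameter=3)
print(fg, bg)
```

The median is used because a few very bright pixels, such as dust or other fluorescent impurities, shift the mean far more than the median.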


Background correction and filtering

• The first and most important quality assessment is to exclude features whose intensity is smaller than the background, or to assign them a “floor” value, which is often the local or global background.
  – The assignment of a floor value allows interpretation of genes that are transcribed under one condition but are totally switched off under the other condition.
  – This is not uncommon in bacteria, where some operons are specifically induced by an inducer but are not transcribed in its absence.
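
The two options, excluding a feature or assigning it a floor value, can be sketched as below. The function and parameter names are hypothetical, and the sketch covers only this filtering step; any further background handling is left out.

```python
def apply_floor(foreground, local_background, global_background=None,
                exclude=False):
    """Filter one feature intensity against its background.

    Features at or above the background are returned unchanged.
    Features below the background are either excluded (None) or set to
    a floor value: the global background if one is given, otherwise the
    local background.
    """
    if foreground >= local_background:
        return foreground
    if exclude:
        return None
    if global_background is not None:
        return global_background
    return local_background


# Hypothetical operon that is switched off in one condition.
induced = apply_floor(3200, 100)
uninduced = apply_floor(80, 100)   # below background, floored to 100
print(induced / uninduced)
```

With exclusion the ratio for such a gene could not be calculated at all, whereas the floor value yields a finite, conservative estimate of the induction.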


Data transformation

• The untransformed values of expression ratios have the disadvantage of treating up- and downregulated genes differently.

• That means that a fourfold upregulated gene has an expression ratio of 4, whereas a fourfold downregulated gene has an expression ratio of 0.25.

• To circumvent this problem, the expression ratios are often handled as their logarithms to the base 2.

• This results in the values 2 and (−2) for a fourfold up- and downregulated gene, which has a number of practical advantages.
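
A short check of the symmetry that the logarithm to the base 2 provides, as a minimal sketch with an illustrative function name:

```python
from math import log2


def log2_ratio(expression_ratio):
    """Return the base-2 logarithm of an expression ratio.

    Ratios must be positive, which is one reason features at or below
    background are filtered or floored beforehand.
    """
    if expression_ratio <= 0:
        raise ValueError("expression ratio must be positive")
    return log2(expression_ratio)


for ratio in (4, 1, 0.25):
    print(ratio, log2_ratio(ratio))
```

Fourfold up- and downregulation map to values of equal magnitude and opposite sign, and an unchanged gene maps to 0.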


Methods for normalization

• Systematic biases
  – different amounts of RNA used for labeling
  – different incorporation efficiencies of the Cy-3 and Cy-5 dyes in the labeling protocols
  – different detection efficiencies of the dyes
• Method 1: the assumption is that the total sum of intensities should be equal in both channels, and therefore the ratio between them should be one.
  – A normalization factor is calculated from the overall ratio, and the ratios for all features are scaled accordingly.
• A somewhat similar approach uses spotted chromosomal DNA for normalization, although this strategy is confined to microarrays made by spotting double-stranded DNA.
• Linear regression analysis
• Intensity-dependent locally weighted linear regression (LOWESS) normalization, which corrects for intensity-dependent effects sometimes observed in the data.
• These methods imply statistically that there must be a downregulated gene for every upregulated one.
  – This assumption might be more appropriate from a statistical point of view with large eukaryotic genomes.
  – It is not necessarily correct for bacteria because the number of genes is much smaller.
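
Method 1, normalization by total intensity, can be sketched in a few lines. The function names are hypothetical, and putting Cy-5 in the numerator of the ratio is an assumption made for illustration. Linear regression and LOWESS normalization are not shown.

```python
def total_intensity_factor(cy5, cy3):
    """Overall ratio of summed intensities between the two channels."""
    return sum(cy5) / sum(cy3)


def normalize_ratios(cy5, cy3):
    """Scale every feature's Cy-5/Cy-3 ratio by the overall ratio.

    After scaling, the total intensities of both channels correspond to
    an overall ratio of one, as Method 1 assumes.
    """
    factor = total_intensity_factor(cy5, cy3)
    return [(red / green) / factor for red, green in zip(cy5, cy3)]


# Hypothetical intensities: the Cy-5 channel is uniformly twice as bright.
cy5 = [200, 400, 800, 600]
cy3 = [100, 200, 400, 300]
print(total_intensity_factor(cy5, cy3))
print(normalize_ratios(cy5, cy3))
```

In this example every raw ratio is 2 purely because of the channel bias, and all normalized ratios come out as 1. The same scaling would wrongly remove a genuine global shift in expression, which is the caveat raised for small bacterial genomes.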