10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006....

36
Microarrays 10-810: Advanced Algorithms and Models for Computational Biology

Transcript of 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006....

Page 1: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Microarrays

10-810: Advanced Algorithms and Models for Computational Biology

Page 2: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Why sequence is not enoughIdentifying genes and control regions is not enough to decipher the inner workings of the cell:• We need to determine the function of genes.• We would like to determine which genes are activated in which cells and under which conditions.• We would like to know the relationships between genes (protein-DNA, protein-protein interactions etc.).•We would like to model the various dynamic systems in the cell

Page 3: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Central Dogma

DNA

mRNA

transcription

translation

Proteins

Transcription factors

Page 4: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Genes and Gene ExpressionTechnologyDisplay of Expression Information

Page 5: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Genomic DNA

Promoter Protein coding sequence Terminator

What is a gene?

Page 6: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

TFIIB

RNAPII

TT

T T

M

T

A

How are Genes Regulated?DNA-binding Activators Are Key To Specific Gene Expression

G

Cc

Ta

Page 7: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

T

R

TT

T T

M

T

A

How are Genes Regulated?DNA-binding activators are key, but there are additional factors

G

Cc

Ta a

rcctcRRR

Page 8: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Activators

Genome-wide Gene Expression (mRNA) can be Measured with DNA Microarrays

GeneRNAPIITFIIH

Transcription apparatus

mRNA

mRNA label hybridization ATGC

TACG

Page 9: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’
Page 10: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Some Additional Numbers

Yeast:6200 genes~200 transcriptional regulators

Human:30,000 - 50,000 genes~1700 transcriptional regulators>100 cell types

Page 11: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Genes and Gene ExpressionTechnologyDisplay of Expression Information

Page 12: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Microarray Hybridization• Watson-Crick base pairing of complementary DNA

sequences.

• Microarrays have thousands of spots, each representing a piece of one gene, immobilized on a glass slide.

• The intensity (or intensity ratio) of each spot indicates the amount of labeled cDNA hybridized, thus, representing the starting mRNA transcript abundance.

Page 13: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Two major technologies• cDNA arrays

- probes are placed on the slides- allows comparison of different cell types

• Oligonucleotide arrays- partial sequences are printed on the array- measure values in one tissue type

Page 14: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Hybridization and Scanning— cDNA arrays

- Prepare Cy3, Cy5-labeled ss cDNA

- Hybridize 600 ng of labeled ss cDNA toglass slide array

- Scan

Page 15: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Cartesian PixSys 5500 withquill printing technology

• Complete subsequences are printed on the array•10,000 spots/slide• Spots are 100-200 µm in diameter• Hybridization volumes: 20-100ul

Page 16: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Array Scanning

Laser based - fluorescent emission

Page 17: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Hybridization and Scanning—oligo arrays

Page 18: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

cDNA vs. Oligo: Pros and Cons

cDNA

• Does not require sequence

• Cheap

• Direct comparisons

• Inaccurate

• Cannot measure individual samples

Oligo

• Can be designed to minimize cross hybridization

• Allows for internal control

• Both lead to better accuracy

• expensive

• limited to certain species

Page 19: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Errors

Microarrays introduce many errors which should be taken into account when working with measured expression values:• Scanning errors• Spotting errors• Cross hybridization• Errors related to day / reading device / experimentalist• Background differences between slides

Page 20: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Error typesMicroarrays introduce many types of errors which should be taken into account when working with measured expression values:• Scanning errors additive + multiplicative• Spotting errors multiplicative• Cross hybridization multiplicative• Errors related to day / reading device / experimentalistadditive + multiplicative• Background differences between slides additive

Page 21: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Handling the Different Errors• Scanning errors• Spotting errors• Cross hybridization• Errors related to day / reading device / experimentalist• Background differences between slides

Analysis of image data (we assume it was performed)

Page 22: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Handling the Different Errors• Scanning errors• Spotting errors• Cross hybridization• Errors related to day / reading device / experimentalist• Background differences between slides

Use ratio instead of individual values:

Yi = Ri / Gi

Page 23: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Handling the Different Errors• Scanning errors• Spotting errors• Cross hybridization• Errors related to day / reading device / experimentalist• Background differences between slides

For Oligo arrays, use the match / mismatch spots

Page 24: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Match / Mismatch

• Presence and absent calls can be made using the Match / Mismatch information.

• However, it has been reported that in some cases the mismatch was higher than the match.

Page 25: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Handling the Different Errors• Scanning errors• Spotting errors• Cross hybridization• Errors related to day / reading device / experimentalist• Background differences between slides

Normalization (next lecture)

Page 26: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Binding arrays

• Instead of printing the genes on the microarray, we can print the intergenic region (an area upstream of the gene).

• We tag a protein of interest (a transcription factor) and fuse all proteins to DNA.

• Next, we hybridize the extracted portions of DNA onto the array, resulting in areas that are bound by the TF being spotted on the microarray.

Page 27: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Genes and Gene ExpressionTechnologyDisplay of Expression Information

Page 28: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Yeast cell cycle

expression program

genes

Experiments (over time)

100 20 70 80

gene 1

Higherexpression compared to baseline

Lower expression compared to baseline

baseline expression

Spellman et al Mol. Biol. Cell 1998

Page 29: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

alph

a 0’

alph

a 7’

alph

a 14

’al

pha

21’

alph

a 28

’al

pha

35’

alph

a 42

’al

pha

49’

alph

a 56

’al

pha

63’

alph

a 70

’al

pha

77’

alph

a 84

’al

pha

91’

alph

a 98

’al

pha

105’

alph

a 11

2’al

pha

119’

/Ndd1

Mcm1

1 3.5 >5

1/3 1 3

Yeast Cell Cycle Gene Expression Program

Spellman et al. and Cho et al., 1998

800 Genes

Page 30: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

>9 >6 >3 1:1 >3 >6 >9Fold repression Fold induction

amming in Response to Oxidant0 10 20 40 60 120

minutes

enome expression is rogrammed

6218 genes

ROS response

Page 31: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

600 Conditions/Mutations

6200

Gen

es

Single-gene MutationsEnvironment

Exercising the Genome

Page 32: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Visualization:Relative vs. absolute expression

Page 33: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Using annotation databases• Statistical tests to identify

the overlap with various functional categories

Page 34: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’
Page 35: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

Genome wide binding

genes

experiments (transcription factors)

TF1 TF8

gene 1

Probably bound by this TF

no binding by this TF

Lee et al Science 2002

Page 36: 10-810: Advanced Algorithms and Models for Computational …epxing/Class/10810-06/new7.pdf · 2006. 11. 27. · alpha 0’ alpha 7’ alpha 14’ alpha 21’ alpha 28’ alpha 35’

What you should know• The basic idea behind microarray profiling• The two different microarray technologies• Pros and cons for each• Noise factors in microarray experiments (more next time)