Microarrays. Regulation of Gene Expression Cells respond to environment Heat Food Supply Responds to...

57
Microarrays
  • date post

    20-Dec-2015

Page 1: Microarrays. Regulation of Gene Expression Cells respond to environment Heat Food Supply Responds to environmental conditions Various external messages.

Microarrays

Page 2:

Regulation of Gene Expression

Page 3:

Cells respond to environment

Heat

Food Supply

Responds to environmental conditions

Various external messages

Page 4:

Where gene regulation takes place

• Opening of chromatin

• Transcription

• Translation

• Protein stability

• Protein modifications

Page 5:

Transcriptional Regulation

• Strongest regulation happens during transcription

• Best place to regulate: No energy wasted making intermediate products

• However, slow response time

After a receptor notices a change:

1. Cascade message to nucleus

2. Open chromatin & bind transcription factors

3. Recruit RNA polymerase and transcribe

4. Splice mRNA and send to cytoplasm

5. Translate into protein

Page 6:

Transcription Factors Binding to DNA

Transcription regulation:

Certain transcription factors bind DNA

Binding recognizes DNA substrings:

Regulatory motifs

Page 7:

Promoter and Enhancers

• Promoter necessary to start transcription

• Enhancers can affect transcription from afar

[Figure: Gene X with a TATA box and three enhancers; labels: DNA binding sites, transcription factors, TBP, RNA polymerase.]

Page 8:

Example: A Human heat shock protein

Motifs:

• TATA box: positioning transcription start

• TATA, CCAAT: constitutive transcription

• GRE: glucocorticoid response

• MRE: metal response

• HSE: heat shock element

[Figure: promoter of heat shock hsp70, drawn from position -158 to 0 next to the GENE; motifs labelled on it: TATA, SP1, CCAAT, AP2, HSE.]

Page 9:

The Cell as a Regulatory Network

[Figure: gene B and gene D with their promoters, regulated by the factors A, B, C and D.]

Promoter D (gene D, "Make D"):

• If C then D

• If B then NOT D

• If A and B then D

Promoter B (gene B, "Make B"):

• If D then B

Page 10:

DNA Microarrays

Measuring gene transcription in a high-throughput fashion

Page 11:

Measuring transcription

AAAAAAAAA

Gene (DNA)

Transcript (RNA)

RNA polymerase – cellular enzyme

AAAAAAAAATTTTTTTTT

Synthetic primer (oligo dT)

Reverse transcriptase (RT) – Retroviral enzyme

- Fluorescence tags

Extract RNA

Complementary DNA (cDNA)

Expression ~ RNA ~ fluorescence

Page 12:

What is a microarray

Page 13:

What is a microarray (2)

• A 2D array of DNA sequences from thousands of genes

• Each spot has many copies of same gene

• Allow mRNAs from a sample to hybridize

• Measure number of hybridizations per spot

Page 14:

How to make a microarray

• Method 1: Printed Slides (Stanford)

– Use PCR to amplify a 1 kb portion of each gene / EST

– Apply each sample on glass slide

• Method 2: DNA Chips (Affymetrix)

– Grow oligonucleotides (20 bp) on glass

– Several words per gene (choose unique words)

If we know the gene sequences, we can sample all genes in one experiment!

Page 15:

Microarray Experiment

RT-PCR

RT-PCR

LASER

DNA “Chip”

High glucose

Low glucose

Page 16:

Raw data – images

• Red (Cy5) dot – overexpressed or up-regulated

• Green (Cy3) dot – underexpressed or down-regulated

• Yellow dot – equally expressed

• Intensity – “absolute” level

[Figure: cDNA plotted microarray]

Page 17:

Levels of analysis

• Level 1: Which genes are induced / repressed? Gives a good understanding of the biology. Methods: factor-2 rule, t-test.

• Level 2: Which genes are co-regulated? Inference of function. Methods: clustering algorithms.

• Level 3: Which genes regulate others? Reconstruction of networks. Methods: transcription factor binding sites.

Page 18:

Experiment: time course

[Figure: expression table at Time 0; rows are genes; labels: Sample annotations, Gene annotations, Intensity (Red), Intensity (Green).]

Page 19:

Experiment: time course

[Figure: expression tables at Time 0 and Time 0.5; rows are genes; labels: Intensity (Red), Intensity (Green).]

Page 20:

Experiment: time course

[Figure: genes (rows) against time points (columns); x-axis: Time (hours).]

Page 21:

Gene expression database

[Figure: gene expression matrix holding the gene expression levels, with genes on one axis and samples on the other, accompanied by gene annotations and sample annotations.]

Page 22:

Gene expression database

[Figure: gene expression matrix with genes on one axis and samples on the other.]

Samples: time series, conditions A, B, …, mutants in genes a, b, …, etc.

Page 23:

Data normalization

$E_i$: expression of gene x in experiment i

$R_i$: expression of gene x in the reference

$$X_i = \log(E_i / R_i)$$

Taking the logarithm of the ratio treats induction and repression of identical magnitude as numerically equal but with opposite sign.

red/green: ratio of expression

– 2: 2x overexpressed

– 0.5: 2x underexpressed

log2(red/green): “log ratio”

– 1: 2x overexpressed

– -1: 2x underexpressed
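The log-ratio values listed on this slide can be checked with a few lines of Python. This is only a sketch: the function name `log_ratio` and the example intensities are illustrative, and base 2 is assumed because the slide quotes log2(red/green).

```python
import math

def log_ratio(red, green):
    """Return log2(red / green) for one spot; both intensities must be positive."""
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(red / green)

# Illustrative red/green ratios: 2x over, 2x under, equal expression.
ratios = [2.0, 0.5, 1.0]
print([log_ratio(r, 1.0) for r in ratios])  # [1.0, -1.0, 0.0]
```

A 2x induction and a 2x repression come out as +1 and -1, which is the symmetry the slide describes.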

Page 24:

Analysis of multiple experiments

$$X_i = \log(E_i / R_i)$$

Expression of gene x in m experiments can be represented by an expression vector with m elements:

$$X = (X_1, \ldots, X_m)$$

Z-transformation: if $X \sim N(\mu, \sigma)$, then

$$Z = \frac{X - \mu}{\sigma}$$

In practice the mean and standard deviation are estimated from the vector itself:

$$\bar{X} = \frac{1}{m} \sum_{i=1}^{m} X_i$$

$$Z_i = \frac{X_i - \bar{X}}{\mathrm{stdev}(X)}$$
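As a rough illustration of the Z-transformation, the sketch below rescales one expression vector to zero mean and unit standard deviation. The function name is illustrative, and the use of the sample standard deviation (m - 1 in the denominator) is an assumption, since the slide does not say which estimator is meant.

```python
import statistics

def z_transform(x):
    """Subtract the mean of x and divide by its standard deviation."""
    mean = statistics.fmean(x)
    sd = statistics.stdev(x)  # assumption: sample standard deviation (m - 1)
    if sd == 0:
        raise ValueError("constant vector: standard deviation is zero")
    return [(xi - mean) / sd for xi in x]

# Illustrative log ratios for one gene across three experiments.
print(z_transform([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```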

Page 25:

Level 1

• 2-fold rule: Is a gene 2-fold up (or down) regulated?

• Student's t-test: Is the regulation significantly different from background variation? (Needs repeated measurements)
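A minimal sketch of the 2-fold rule, assuming expression is supplied as a log2(red/green) ratio so that a 2-fold change up or down corresponds to +1 or -1. The function name and the `fold` parameter are illustrative.

```python
import math

def classify_fold_change(log2_ratio, fold=2.0):
    """Label a gene 'up', 'down' or 'unchanged' under the fold-change rule."""
    threshold = math.log2(fold)  # a 2-fold change means |log2 ratio| >= 1
    if log2_ratio >= threshold:
        return "up"
    if log2_ratio <= -threshold:
        return "down"
    return "unchanged"

print(classify_fold_change(1.3))   # up
print(classify_fold_change(-1.0))  # down
print(classify_fold_change(0.4))   # unchanged
```

The rule uses a single measurement and says nothing about variability, which is why the slide pairs it with a t-test on repeated measurements.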

Page 26:

T-test

$X \sim N(\mu, \sigma)$

$H_0: \bar{X} = \mu$

$H_a: \bar{X} \neq \mu$

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{m}}$$

[Figure: distribution of the test statistic with regions labelled "Cannot reject H0" and "Reject H0".]

The p-value is the probability, assuming the null hypothesis is true, of a result at least as extreme as the one observed. A small p-value therefore means a small risk of drawing the wrong conclusion by rejecting a true null hypothesis.
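The test statistic can be sketched as follows for repeated log-ratio measurements of one gene. This is an assumption-laden illustration: it tests against a hypothesised mean `mu0` of 0 (no regulation), estimates the standard deviation from the sample, and stops at the statistic because the Python standard library has no Student t distribution.

```python
import math
import statistics

def one_sample_t(x, mu0=0.0):
    """Return the one-sample t statistic of x against the hypothesised mean mu0."""
    m = len(x)
    if m < 2:
        raise ValueError("need at least two repeated measurements")
    mean = statistics.fmean(x)
    sd = statistics.stdev(x)  # sample standard deviation
    return (mean - mu0) / (sd / math.sqrt(m))

# Illustrative replicate log ratios for a single gene.
print(round(one_sample_t([1.0, 2.0, 3.0]), 4))  # 3.4641
```

Turning the statistic into a p-value requires the t distribution with m - 1 degrees of freedom; a statistics library such as SciPy (`scipy.stats.ttest_1samp`) provides this.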

Page 27:

Multiple testing

In a microarray experiment, we perform 1 test / gene

$\alpha_c$: error rate of a single comparison

$\alpha_e$: error rate of the whole experiment (n comparisons)

Prob(correct) $= 1 - \alpha_c$

Prob(globally correct) $= (1 - \alpha_c)^n$

Prob(wrong somewhere) $= 1 - (1 - \alpha_c)^n$

$$\alpha_e = 1 - (1 - \alpha_c)^n$$

For small $\alpha_e$: $\alpha_c \approx \alpha_e / n$

This is the Bonferroni correction for multiple testing of independent events.
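The arithmetic on this slide can be sketched in a few lines. The names `alpha_c` (error rate of a single comparison) and `alpha_e` (error rate of the whole experiment) are chosen here for illustration, and the tests are assumed independent, as the slide states.

```python
def familywise_error(alpha_c, n):
    """Probability of at least one false positive among n independent tests."""
    return 1.0 - (1.0 - alpha_c) ** n

def bonferroni_threshold(alpha_e, n):
    """Per-test threshold that keeps the experiment-wide error near alpha_e."""
    return alpha_e / n

# With 20 genes each tested at 0.05, a false positive somewhere is likely.
print(round(familywise_error(0.05, 20), 4))  # 0.6415
# The Bonferroni per-gene threshold for an experiment-wide level of 0.05:
print(bonferroni_threshold(0.05, 20))
```

Plugging the corrected threshold back in gives an experiment-wide error just under 0.05, which is the sense in which the approximation holds for small error rates.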

Page 28:

Multiple testing

Genes Treated 1 Treated 2 Control 1 Control 2 p-value

Gene 1 0.659081 0.97234 0.372675 0.69511 0.010362

Gene 2 0.341119 0.100549 0.56026 0.285965 0.052948

Gene 3 0.667136 0.29554 0.498284 0.019279 0.150739

Gene 4 0.880788 0.871784 0.552085 0.208167 0.20722

Gene 5 0.092942 0.756629 0.488266 0.84595 0.358535

Gene 6 0.07958 0.736049 0.022873 0.406469 0.391526

Gene 7 0.534497 0.146925 0.659746 0.951731 0.401714

Gene 8 0.062087 0.678039 0.979814 0.795904 0.418683

Gene 9 0.224166 0.17082 0.650215 0.16222 0.512849

Gene 10 0.372998 0.184738 0.353879 0.451197 0.545602

Gene 11 0.537619 0.853997 0.606766 0.083149 0.556954

Gene 12 0.232855 0.77575 0.275746 0.438622 0.58056

Gene 13 0.760863 0.508516 0.823947 0.074637 0.591919

Gene 14 0.568507 0.932771 0.72373 0.027096 0.60806

Gene 15 0.838437 0.549377 0.92673 0.100789 0.623721

Gene 16 0.017407 0.723751 0.310977 0.220452 0.836162

Gene 17 0.893638 0.293472 0.542273 0.886285 0.840617

Gene 18 0.536479 0.887943 0.859521 0.382404 0.861986

Gene 19 0.675622 0.604696 0.445713 0.916473 0.904506

Gene 20 0.836653 0.397073 0.438522 0.778742 0.986562

Significance level: 0.05
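As a hedged illustration, the p-value column of the table above can be screened at the 0.05 level with and without a Bonferroni correction. The p-values are copied from the table; the variable names are illustrative.

```python
# p-values for Gene 1 .. Gene 20, copied from the table.
p_values = [
    0.010362, 0.052948, 0.150739, 0.20722, 0.358535,
    0.391526, 0.401714, 0.418683, 0.512849, 0.545602,
    0.556954, 0.58056, 0.591919, 0.60806, 0.623721,
    0.836162, 0.840617, 0.861986, 0.904506, 0.986562,
]
alpha = 0.05
n = len(p_values)

# Genes passing the uncorrected threshold (gene numbers are 1-based).
uncorrected = [i + 1 for i, p in enumerate(p_values) if p < alpha]
# Genes passing the Bonferroni threshold alpha / n.
corrected = [i + 1 for i, p in enumerate(p_values) if p < alpha / n]

print(uncorrected)  # [1]
print(corrected)    # []
```

Gene 1 passes the uncorrected 0.05 level, but no gene passes once the threshold is divided by the 20 tests performed.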

Page 29:

Clustering

Hierarchical clustering:

- Transforms the n (genes) * m (experiments) matrix into a symmetric n * n similarity (or distance) matrix

Similarity (or distance) measures:

- Euclidean distance

- Pearson's correlation coefficient

Eisen et al. 1998 PNAS 95:14863-14868

Page 30:

Vectors in space: distances

[Figure: Gene 1 and Gene 2 drawn as points in a space with axes Experiment 1, Experiment 2 and Experiment 3; d marks the distance between them.]

Page 31:

Distance Measures: Minkowski Metric

Suppose two objects x and y both have m features:

$$x = (x_1, x_2, \ldots, x_m)$$

$$y = (y_1, y_2, \ldots, y_m)$$

The Minkowski metric is defined by

$$d(x, y) = \sqrt[r]{\sum_{i=1}^{m} |x_i - y_i|^r}$$

Page 32:

Most Common Minkowski Metrics

1, $r = 2$ (Euclidean distance):

$$d(x, y) = \sqrt{\sum_{i=1}^{m} |x_i - y_i|^2}$$

2, $r = 1$ (Manhattan distance):

$$d(x, y) = \sum_{i=1}^{m} |x_i - y_i|$$

3, $r = +\infty$ ("sup" distance):

$$d(x, y) = \max_{1 \le i \le m} |x_i - y_i|$$

Page 33:

An Example

[Figure: points x and y separated by 4 along one axis and 3 along the other.]

1, Euclidean distance: $\sqrt{4^2 + 3^2} = 5$.

2, Manhattan distance: $4 + 3 = 7$.

3, "sup" distance: $\max\{4, 3\} = 4$.
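The three distances in this example can be reproduced with a small Minkowski function. This is a sketch: the function name is illustrative, and the coordinates are chosen only so that the two points differ by 4 and 3, matching the figure.

```python
import math

def minkowski(x, y, r):
    """Minkowski distance of order r; r = math.inf gives the "sup" distance."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(r):
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)

x, y = (0.0, 0.0), (4.0, 3.0)
print(minkowski(x, y, 2))         # 5.0 (Euclidean)
print(minkowski(x, y, 1))         # 7.0 (Manhattan)
print(minkowski(x, y, math.inf))  # 4.0 ("sup")
```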

Page 34:

Similarity Measures: Correlation Coefficient

$$s(x, y) = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{m} (x_i - \bar{x})^2 \times \sum_{i=1}^{m} (y_i - \bar{y})^2}}$$

averages: $\bar{x} = \frac{1}{m} \sum_{i=1}^{m} x_i$ and $\bar{y} = \frac{1}{m} \sum_{i=1}^{m} y_i$.

$$|s(x, y)| \le 1$$
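A short sketch of the correlation coefficient s(x, y) for two expression vectors. The function name and the example profiles are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(y) / m
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - mean_x) ** 2 for a in x) * sum((b - mean_y) ** 2 for b in y)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return numerator / denominator

gene_a = [1.0, 2.0, 3.0, 4.0]
print(round(pearson(gene_a, [10.0, 20.0, 30.0, 40.0]), 6))  # 1.0
print(round(pearson(gene_a, [4.0, 3.0, 2.0, 1.0]), 6))      # -1.0
```

The first pair differs only in scale and still correlates perfectly, so correlation compares the shape of two profiles rather than their magnitude, unlike the distance measures.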

Page 35:

Similarity Measures: Correlation Coefficient

[Figure: three panels, each plotting Expression Level against Time for Gene A and Gene B.]

Page 36:

Clustering of Genes and Conditions

• Unsupervised:

– Hierarchical clustering

– K-means clustering

– Self Organizing Maps (SOMs)

Page 37:

Ordered dendrograms

Hierarchical clustering:

Hypothesis: guilt-by-association

Common regulation -> common function

Eisen et al. 1998

Page 38:

Hierarchical Clustering

Given a set of n items to be clustered, and an n*n distance (or similarity) matrix, the basic process of hierarchical clustering is this:

1. Start by assigning each item to its own cluster, so that if you have n items, you now have n clusters, each containing just one item. Let the distances (similarities) between the clusters equal the distances (similarities) between the items they contain.

2. Find the closest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.

3. Compute distances (similarities) between the new cluster and each of the old clusters.

4. Repeat steps 2 and 3 until all items are clustered into a single cluster of size n.

Page 39:

Merge two clusters by:

• Single-Link Method / Nearest Neighbor (NN): minimum of pairwise dissimilarities

• Complete-Link / Furthest Neighbor (FN): maximum of pairwise dissimilarities

• Unweighted Pair Group Method with Arithmetic Mean (UPGMA): average of pairwise dissimilarities

Page 40:

Single-Link Method

Symmetric n*n distance matrix (Euclidean distance), lower triangle shown:

     a    b    c
b    2
c    5    3
d    6    5    4

(1) Merge a and b (distance 2). Clusters: a,b / c / d

     a,b  c
c    3
d    5    4

(2) Merge a,b and c (distance 3). Clusters: a,b,c / d

     a,b,c
d    4

(3) Merge a,b,c and d (distance 4). Clusters: a,b,c,d

Page 41:

Complete-Link Method

Distance matrix (Euclidean distance), lower triangle shown:

     a    b    c
b    2
c    5    3
d    6    5    4

(1) Merge a and b (distance 2). Clusters: a,b / c / d

     a,b  c
c    5
d    6    4

(2) Merge c and d (distance 4). Clusters: a,b / c,d

     a,b
c,d  6

(3) Merge a,b and c,d (distance 6). Clusters: a,b,c,d

Page 42:

Compare Dendrograms

[Figure: two dendrograms over the leaves a, b, c, d, drawn against a height axis with ticks at 0, 2, 4 and 6. Left: Single-Link. Right: Complete-Link.]
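The two dendrograms can be reproduced with a naive agglomerative sketch. It assumes the four-item distance matrix from the worked example reads d(a,b) = 2, d(a,c) = 5, d(a,d) = 6, d(b,c) = 3, d(b,d) = 5, d(c,d) = 4; the function name and the dictionary layout are illustrative, and the quadratic search is written for clarity, not speed.

```python
def agglomerate(items, dist, linkage=min):
    """Merge the closest pair of clusters until one remains.

    dist maps frozenset({a, b}) to the distance between items a and b.
    linkage=min gives single-link, linkage=max gives complete-link.
    Returns a list of (sorted members of the merged cluster, merge height).
    """
    clusters = [frozenset([item]) for item in items]
    merges = []

    def cluster_distance(c1, c2):
        return linkage(dist[frozenset((a, b))] for a in c1 for b in c2)

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        height, i, j = min(
            (cluster_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        merges.append((sorted(merged), height))
    return merges

dist = {
    frozenset("ab"): 2, frozenset("ac"): 5, frozenset("ad"): 6,
    frozenset("bc"): 3, frozenset("bd"): 5, frozenset("cd"): 4,
}
single = agglomerate("abcd", dist, linkage=min)
complete = agglomerate("abcd", dist, linkage=max)
print([h for _, h in single])    # [2, 3, 4]
print([h for _, h in complete])  # [2, 4, 6]
```

Single-link attaches c and then d to the growing a,b cluster, while complete-link pairs c with d before joining the two pairs, which is the difference between the two dendrograms.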

Page 43:

Serum stimulation of human fibroblasts (24h)

Cholesterol biosynthesis

Cell cycle

I-E response

Signalling / Angiogenesis

Wound healing

Page 44:

Partitioning

• k-means clustering

• Self organizing maps (SOMs)

Page 45:

k-means clustering

Tavazoie et al. 1999 Nature Genet. 22:281-285

Page 46:

k-Means Clustering Algorithm

1) Select an initial partition of k clusters

2) Assign each object to the cluster with the closest centre

3) Compute the new centres of the clusters

4) Repeat steps 2 and 3 until no object changes cluster
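The four steps can be sketched in plain Python. Two assumptions are made here: the initial partition is supplied as a list of starting centres, not chosen by the function, and closeness is measured by Euclidean distance. The function name, the `max_iter` guard and the example points are illustrative.

```python
import math

def kmeans(points, centres, max_iter=100):
    """Lloyd-style k-means; returns (cluster index per point, final centres)."""
    centres = [tuple(c) for c in centres]
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each object to the cluster with the closest centre.
        new_assignment = [
            min(range(len(centres)), key=lambda k: math.dist(p, centres[k]))
            for p in points
        ]
        # Step 4: stop once no object changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: recompute each centre as the mean of its members.
        for k in range(len(centres)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:  # an empty cluster keeps its previous centre
                centres[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignment, centres

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, final_centres = kmeans(points, centres=[(0.0, 0.0), (10.0, 10.0)])
print(labels)         # [0, 0, 1, 1]
print(final_centres)  # [(0.0, 0.5), (10.0, 10.5)]
```

The result depends on the starting centres, so different initialisations can converge to different partitions.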

Page 47:

Page 48:

1. centroid

Page 49:

1. centroid

2. centroid

3. centroid

4. centroid

5. centroid

6. centroid

k = 6

Page 50:

1. centroid

2. centroid

3. centroid

5. centroid

6. centroid

k = 6

Page 51:

1. centroid

2. centroid

3. centroid

4. centroid

5. centroid

6. centroid

k = 6

Page 52:

Self organizing maps

Tamayo et al. 1999 PNAS 96:2907-2912

Page 53:

Page 54:

1. centroid 2. centroid 3. centroid

4. centroid 5. centroid 6. centroid

k = (2,3) = 6

Page 55:

k = 6

Page 56:

k = 6

Page 57:

k = 6