ChIP-chip Data


Page 1: ChIP-chip Data

ChIP-chip Data

Page 2: ChIP-chip Data

DNA-binding proteins

• Constitutive proteins (mostly histones)

– Organize DNA

– Regulate access to DNA

– Have many modifications: acetylation, methylation, …

• Sporadic proteins (Transcription Factors)

– Mediate docking of transcription apparatus

– Modify histones

– Methylate DNA

Page 3: ChIP-chip Data

Histones

Histones are an ancient family of proteins which serve as the scaffold for DNA

Four types of core histones (H2A, H2B, H3, H4) assemble in pairs to form the protein core of a nucleosome

DNA is wrapped almost twice (about 147 bp) around each nucleosome core

Page 4: ChIP-chip Data

Histones and Modifications

DNA contacts histones on their tails

Histone tails can be modified

Histones can stay loose or assemble tightly – this compacts the DNA

Page 5: ChIP-chip Data

Transcription Factors

• General – help to set up transcription of many genes

• Specific – draw in general factors or RNA Pol II to specific genes

TATA-binding protein

Page 6: ChIP-chip Data

DNA Methylation

Adding a Methyl to Cytosine

Cytosine methylation is passed on to daughter cells

Page 7: ChIP-chip Data

Chromatin Immuno-precipitation

Page 8: ChIP-chip Data

Tiling Array

• One probe every n base pairs over some length of chromosome

– Interrupted by repeat regions

• Promoter array: each (known) promoter tiled

An Affymetrix tiling design
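The tiling idea above can be sketched in a few lines. This is only an illustration, not the Affymetrix design procedure: the function name `tile_probes`, the 25 bp probe length, and the repeat interval are assumptions made for the example.

```python
def tile_probes(start, end, spacing, probe_len=25, repeats=()):
    """Place one probe every `spacing` bp in [start, end).

    A probe covering [pos, pos + probe_len) is skipped when it overlaps any
    masked (repeat_start, repeat_end) interval, so repeats interrupt the tiling.
    """
    positions = []
    for pos in range(start, end, spacing):
        overlaps_repeat = any(
            pos < r_end and pos + probe_len > r_start
            for r_start, r_end in repeats
        )
        if not overlaps_repeat:
            positions.append(pos)
    return positions


# Hypothetical 1 kb region with a masked repeat at 300-550.
probes = tile_probes(0, 1000, 100, repeats=[(300, 550)])
print(probes)
```

The gap in the output is why tiling-array plots show stretches with no probes.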

Page 9: ChIP-chip Data

What the data look like

[Figure: histone acetylation on 15 samples over one promoter (raw). x-axis: loc[nn], 1206600 to 1207400; y-axis: lr(ecog1.h3k9)[nn, ], -2 to 4]

Page 10: ChIP-chip Data

Multiple Promoters

[Figure: log.R[mm, ] - log.G[mm, ] against loc[mm]. x-axis: 10120000 to 10135000; y-axis: -4 to 2]

Page 11: ChIP-chip Data

Normalized by Medians

[Figure: xx against loc[mm]. x-axis: 10120000 to 10135000; y-axis: -2 to 3]
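The slides do not say exactly how the medians were applied. One plausible reading is that each array's log ratio (log red minus log green) is centered on that array's median. The sketch below assumes this reading; the function names and the intensities are made up.

```python
import math
from statistics import median


def log_ratio(red, green):
    """Per-probe log2 ratio of the two channels: log R - log G."""
    return [math.log2(r) - math.log2(g) for r, g in zip(red, green)]


def median_center(values):
    """Subtract the array-wide median so the typical probe sits at zero."""
    m = median(values)
    return [v - m for v in values]


# Made-up intensities for five probes on one two-color array.
red = [400.0, 800.0, 1600.0, 200.0, 800.0]
green = [200.0, 200.0, 200.0, 200.0, 200.0]

m_values = log_ratio(red, green)
centered = median_center(m_values)
print(centered)
```

Centering on the median rather than the mean keeps a few strongly enriched probes from shifting the baseline.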

Page 12: ChIP-chip Data

Methods and Issues

• Normalization

– Different enrichment ratios

– Different probe thermodynamics

– Dye and probe bias

• Estimation

– Categorical or continuous?

– Individual values are noisy:

• For TF binding: where is the peak?

[Figure: xx against loc[mm]. x-axis: 10120000 to 10135000; y-axis: -2 to 3]

Page 13: ChIP-chip Data

Normalization

• Basic idea: compensate for technical variables

• Differences in technique should affect different probes differently

• Try to estimate what part of the signal can be attributed to technical factors

• Easiest variable to access: sequence
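As a rough illustration of using sequence as the technical variable, the sketch below fits a straight line of log intensity against probe GC count and keeps the residual as the adjusted signal. This is a deliberately simple stand-in, not the model of any published method. The probe sequences, intensities, and function names are all hypothetical.

```python
from statistics import mean


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def sequence_adjust(seqs, log_intensity):
    """Remove the part of the signal that a GC-count line can explain."""
    gc = [gc_count(s) for s in seqs]
    slope, intercept = fit_line(gc, log_intensity)
    return [y - (intercept + slope * g) for g, y in zip(gc, log_intensity)]


# Hypothetical probes (shortened to 8 bases) and log intensities.
seqs = ["ATATATAT", "ATGCATAT", "GCGCATAT", "GCGCGCAT", "GCGCGCGC"]
log_intensity = [7.0, 8.1, 8.9, 10.2, 10.8]
residuals = sequence_adjust(seqs, log_intensity)
print(residuals)
```

What remains after the fit is the part of each probe's signal that base composition alone does not predict.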

Page 14: ChIP-chip Data

MAT

• One-color Affy array

– Needs separate array for comparison

• Normalizes probe thermodynamics & enrichment ratio

• Estimation by (robust) moving average
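One common way to make a moving average robust is to take a trimmed mean inside each window. The sketch below shows that idea only and does not reproduce MAT's actual statistic. The window width, trim fraction, and data are assumptions.

```python
def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def robust_moving_average(positions, values, half_width=300, trim=0.1):
    """For each probe, the trimmed mean of all probes within `half_width` bp."""
    smoothed = []
    for center in positions:
        window = [
            v for p, v in zip(positions, values)
            if abs(p - center) <= half_width
        ]
        smoothed.append(trimmed_mean(window, trim))
    return smoothed


# Made-up normalized scores on a 100 bp grid, with one outlier probe at 300.
positions = list(range(0, 1000, 100))
values = [0.1, 0.0, 0.2, 9.0, 0.1, 1.5, 1.8, 1.6, 1.7, 0.0]
smooth = robust_moving_average(positions, values, half_width=200, trim=0.2)
print(smooth)
```

In full five-probe windows the single outlier is trimmed away, while the run of moderately high values around 600 to 800 survives. Windows at the edges hold too few probes to trim with these settings.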

Page 15: ChIP-chip Data

Normalized Data – Rare Event

Page 16: ChIP-chip Data

Normalized Data – Common Event

Page 17: ChIP-chip Data

Estimation

• Try to build an intelligent moving average

• Not all neighbors will be similar

• Typical TF binds to 8 bp

– Pol II may spread wider

• Typical fragment is 100-200 bp

• Cannot resolve < 200 bp

[Figure: Pol II binding on a 100 bp grid. xx against loc[mm]; x-axis: 10120000 to 10135000; y-axis: -2 to 3]
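One way to let nearby probes count more than distant ones is a distance-weighted average. The sketch below uses a triangular kernel. It is an illustration, not the estimator used in the slides. The 300 bp bandwidth is an assumed value chosen to be on the order of the fragment length, and the data are made up.

```python
def triangular_weight(distance, bandwidth):
    """Weight 1 at distance 0, falling linearly to 0 at `bandwidth` bp."""
    return max(0.0, 1.0 - abs(distance) / bandwidth)


def kernel_smooth(positions, values, bandwidth=300):
    """Distance-weighted moving average of probe values."""
    smoothed = []
    for center in positions:
        weights = [triangular_weight(p - center, bandwidth) for p in positions]
        total = sum(weights)  # at least 1, from the probe's own weight
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


# Made-up signal on a 100 bp grid with a single binding event near 400.
positions = list(range(0, 900, 100))
values = [0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]
smooth = kernel_smooth(positions, values, bandwidth=300)
peak_position = positions[smooth.index(max(smooth))]
print(peak_position, smooth)
```

The smoothed peak is lower and broader than the raw one. This matches the resolution limit noted above: events closer together than the fragment length blur into one.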

Page 18: ChIP-chip Data

TileMap

• Ignores normalization

• ‘Shrinkage’ estimator of variance

– Improves individual scores

• Smooths noise by moving average
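The general idea of variance shrinkage can be sketched as follows: each probe's variance is pulled toward the average variance across probes before a t-like score is formed. This is a generic version with a fixed, hand-picked weight, not TileMap's own estimator. The function names, weight, and replicate values are assumptions.

```python
from statistics import mean, variance


def shrink_variances(sample_vars, weight=0.5):
    """Pull each probe's variance toward the mean variance across probes.

    `weight` = 0 keeps the raw variances; `weight` = 1 uses the pooled value.
    """
    pooled = mean(sample_vars)
    return [(1 - weight) * s2 + weight * pooled for s2 in sample_vars]


def moderated_score(replicates, shrunk_var):
    """t-like score: probe mean over its standard error from the shrunk variance."""
    n = len(replicates)
    return mean(replicates) / (shrunk_var / n) ** 0.5


# Made-up log ratios for three probes, three replicates each.
probes = [
    [1.0, 1.1, 0.9],    # small variance, probably by chance
    [0.2, -0.1, 0.5],
    [2.0, 0.1, -1.5],   # large variance
]
raw_vars = [variance(r) for r in probes]
shrunk = shrink_variances(raw_vars, weight=0.5)
scores = [moderated_score(r, v) for r, v in zip(probes, shrunk)]
print(raw_vars, shrunk, scores)
```

With only a few replicates, a probe can show a very small variance by chance and get an inflated score. Shrinkage damps that effect, which is how it improves individual scores.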