Transcription as a Permutation Algorithm

By M. Nickenig

Mentor: Prof. Robert Vellanoweth


Transcription: Overview

• Where does transcription fit into the larger genetic schema?

         Replication          Transcription          Translation
DNA ------------------> DNA ------------------> mRNA ------------------> Protein
              |                      |                       |
          Regulation             Regulation              Regulation

• Specifically, regulation at the transcriptional level involves four major modes:

1) Regulation via the combinatorics of components of the basal transcriptional machinery.

2) Induction of response elements through inducible transcription factors (TFs).

3) Regulation through the action of interfering RNAs.

4) Chromatin remodeling.

Biochemistry, C. Matthews, K.E. van Holde, K.G. Ahern, 3rd ed.

Transcription: Regulation by TFs

• Transcription factor: a protein that binds DNA at a specific promoter or enhancer site, where it regulates transcription.

• Basal transcription factors are involved in the formation of a pre-initiation complex.

http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm

Transcription Factors: Structure

• Regulatory factors: activate or repress transcription.

Note: Not all transcription factors bind DNA; some bind only other transcription factors.

• Basic structural features of transcription factors:

1) Activation domain. Three different types: acidic domain, glutamine-rich domain, proline-rich domain.

* Activation domains interact with the basal machinery to activate transcription.

http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm

Transcription Factors: Structure

2) DNA-binding domain

• Helix-turn-helix (HTH) proteins bind the major groove of the DNA. The motif consists of two anti-parallel alpha-helical regions interrupted by a turn region; DNA binding occurs via the recognition helix.

• Zinc fingers function as structural platforms for DNA binding. Transcription factors of this type have an absolute requirement for zinc for their formation.

• Two types: 2-His, 2-Cys Zn finger and multi-Cys Zn finger.

http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm

Transcription Factors: Structure

• B-Zip: Leucine zippers function in associating the transcription factors with each other. B-Zip proteins possess a basic DNA-binding domain (B-domain) adjacent to a leucine zipper dimerization domain, and they function as dimers.

• The leucine zipper dimerization domain is found in many transcription factors.

http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm

Transcription Factors: Regulation

• Consider the following example of transcriptional regulation by an inducible TF:


Transcription Factors: Regulation

• Consider an additional example of transcriptional regulation by an inducible TF:

Bioinformatics Vol. 15, 1999, 563-577

Transcription: Probabilities

Goal:

• We want to find all transcription factor binding sequences in the Arabidopsis thaliana transcriptome using a suitable motif-finding program

Assumptions:

• Functionally related DNA sequences are generally expected to share some common sequence elements.

• The pattern shared by a set of functionally related sequences is commonly identified during the process of aligning the sequences to maximize sequence conservation.

• A good alignment is assumed to be one whose alignment matrix is rarely expected to occur by chance.

• Furthermore, we assume that the letters of a sequence are independent and identically distributed. Thus, the probability of an alignment matrix is given by the multinomial distribution:

\[ P_{matrix} = \prod_{j=1}^{L} \left[ \frac{N!}{\prod_{i=1}^{A} n_{ij}!} \prod_{i=1}^{A} p_i^{n_{ij}} \right] \]

Transcription: Probabilities

Mathematical Terms:

• Here i indexes the rows of the alignment matrix (i.e. the bases A, C, G, T), j indexes the columns of the matrix (i.e. the positions within the alignment pattern), A is the total number of letters in the sequence alphabet, L is the total number of columns in the matrix, p_i is the a priori probability of the letter i, n_ij is the number of occurrences of the letter i at position j, and N is the total number of sequences in the alignment (Reference: Bioinformatics Vol. 15, 1999, 563-577).

• Furthermore, the above formula can be extended to calculate the probabilities associated with cis-regulatory modules:

\[ P_{module} = \frac{1}{m} \sum_{L_{all}} P_{matrix} \]

such that the sum is taken over all sequences in a module (L_all), and the factor 1/m is a normalization constant, where m equals the number of sequences of length L comprising the module.

Thesis: A. Mortazavi, 2004 Vellanoweth Lab, CSULA

Transcription: Probabilities

• cis-Regulatory module: a set of motifs that bind transcription factors cooperatively.

• For example, consider the following Cistematic-derived sequence data, which correspond to the Lipid Transfer Protein (LTP) module (Thesis: A. Mortazavi, 2004, Vellanoweth Lab):

[Figure: Cistematic-derived sequence data for the LTP module]

First, we calculate the probability associated with this alignment using the method of Hertz and Stormo. We then repeat the calculation with the aligned sequences broken up into blocks, each block being treated as a mutually exclusive event.

Thesis: A. Mortazavi, 2004 Vellanoweth Lab

Transcription: Probabilities

• An alignment matrix can be formed from a gapped alignment and the probability subsequently calculated, e.g. for the T-Coffee-derived gapped alignment of the LTP module:

[Figure: T-Coffee gapped alignment of the LTP module, with regions 1, 2 and 3 marked]

Bioinformatics Vol.15, 1999, 563-577

Transcription: Probabilities

Sample Calculation: LTP module regions 1 and 3 (method of Hertz and Stormo)

ltp30  REGION 1   T G G G A G A
ltp 3  REGION 1   T G G G A G A
ltp 4  REGION 1   T G G G A G A
                  * * * * * * *

ALIGNMENT MATRIX REGION 1 (N=3, L=7):

A   0 0 0 0 3 0 3
T   3 0 0 0 0 0 0
G   0 3 3 3 0 3 0

Base term:    3.22E-02 6.03E-03 6.03E-03 6.03E-03 3.22E-02 6.03E-03 3.22E-02
Coefficient:  1 1 1 1 1 1 1
Column term:  3.22E-02 6.03E-03 6.03E-03 6.03E-03 3.22E-02 6.03E-03 3.22E-02 = 4.39E-14

ltp30  REGION 3   A A T T G G A C G G T A
ltp 3  REGION 3   A A T C A G A C G G C A
ltp 4  REGION 3   A A T T G G A C G G T A

ALIGNMENT MATRIX REGION 3 (N=3, L=12):

A   3 3 0 0 1 0 3 0 0 0 0 3
T   0 0 3 2 0 0 0 0 0 0 2 0
G   0 0 0 0 2 3 0 0 3 3 0 0
C   0 0 0 1 0 0 0 3 0 0 1 0
    * * * * * * * * * * * *

Base term:    3.22E-02 3.22E-02 3.22E-02 1.84E-02 1.05E-02 6.03E-03 3.22E-02 6.03E-03 6.03E-03 6.03E-03 1.84E-02 3.22E-02
Coefficient:  1 1 1 3 3 1 1 1 1 1 3 1
Column term:  3.22E-02 3.22E-02 3.22E-02 5.52E-02 3.16E-02 6.03E-03 3.22E-02 6.03E-03 6.03E-03 6.03E-03 5.52E-02 3.22E-02 = 4.37E-21

3.327E-05  7.54E-14  2.51E-18
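The region 1 and region 3 calculations above can be reproduced with a short Python sketch. It assumes the multinomial form of the alignment-matrix probability (a multinomial coefficient times the product of background frequencies raised to the column counts). As a further assumption, it sets the T frequency equal to the A frequency (0.3180185) so that the four background values sum to 1. The function names are illustrative and are not taken from Cistematic or from the Hertz and Stormo software.

```python
from math import factorial, prod

# Background base frequencies for Arabidopsis as quoted on the slides.
# Assumption: T is set equal to A (0.3180185) so the four values sum to 1.
BACKGROUND = {"A": 0.3180185, "T": 0.3180185, "G": 0.1819815, "C": 0.1819815}


def alignment_matrix(sequences):
    """Count each base in each column of an ungapped alignment."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    return [
        {base: sum(s[j] == base for s in sequences) for base in BACKGROUND}
        for j in range(length)
    ]


def column_probability(counts, background=BACKGROUND):
    """Multinomial probability of one column: N!/prod(n_i!) * prod(p_i**n_i)."""
    coefficient = factorial(sum(counts.values()))
    for n in counts.values():
        coefficient //= factorial(n)
    return coefficient * prod(background[b] ** n for b, n in counts.items())


def matrix_probability(sequences, background=BACKGROUND):
    """Product of the column probabilities over all L columns."""
    columns = alignment_matrix(sequences)
    return prod(column_probability(c, background) for c in columns)


# Aligned sequences copied from the sample calculation (ltp30, ltp 3, ltp 4).
region1 = ["TGGGAGA", "TGGGAGA", "TGGGAGA"]
region3 = ["AATTGGACGGTA", "AATCAGACGGCA", "AATTGGACGGTA"]

print(f"region 1: {matrix_probability(region1):.2e}")
print(f"region 3: {matrix_probability(region3):.2e}")
```

Under these assumptions the two printed values should agree, to three significant figures, with the 4.39E-14 shown for region 1 and the 4.37E-21 shown for the full twelve-column region 3.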


Transcription: Probabilities

Sample Calculation: LTP module region 2 (as mutually exclusive events)

ALIGNMENT MATRIX REGION 2 (N=3, L=50):

ltp30  REGION 2   T T G T C A T G T T T G A
ltp 3  REGION 2   T T C T G A T G T C T A G
ltp 4  REGION 2   T T G T G A T G T T T A G

A   0 0 0 0 0 3 0 0 0 0 0 2 1
T   3 3 0 3 0 0 3 0 3 2 3 0 0
G   0 0 2 0 2 0 0 3 0 0 0 1 2
C   0 0 1 0 1 0 0 0 0 1 0 0 0
    * * * * * * * * * * * * *

Base term:    3.22E-02 3.22E-02 6.03E-03 3.22E-02 6.03E-03 3.22E-02 3.22E-02 6.03E-03 3.22E-02 1.84E-02 3.22E-02 1.84E-02 1.05E-02
Coefficient:  1 1 3 1 3 1 1 1 1 3 1 3 3
Column term:  3.22E-02 3.22E-02 1.81E-02 3.22E-02 1.81E-02 3.22E-02 3.22E-02 6.03E-03 3.22E-02 5.52E-02 3.22E-02 5.52E-02 3.16E-02

2.1809E-15  1.20E-16

    T G T T G G A T A C A C G T G
    T G T T G G G T A C A C G T G
    T G T T G G A T A C A C G T G

A   0 0 0 0 0 0 2 0 3 0 3 0 0 0 0
T   3 0 3 3 0 0 0 3 0 0 0 0 0 3 0
G   0 3 0 0 3 3 1 0 0 0 0 0 3 0 3
C   0 0 0 0 0 0 0 0 0 3 0 3 0 0 0
    * * * * * * * * * * * * * * *

Base term:    3.22E-02 6.03E-03 3.22E-02 3.22E-02 6.03E-03 6.03E-03 1.84E-02 3.22E-02 3.22E-02 6.03E-03 3.22E-02 6.03E-03 6.03E-03 3.22E-02 6.03E-03
Coefficient:  1 1 1 1 1 1 3 1 1 1 1 1 1 1 1
Column term:  3.22E-02 6.03E-03 3.22E-02 3.22E-02 6.03E-03 6.03E-03 5.52E-02 3.22E-02 3.22E-02 6.03E-03 3.22E-02 6.03E-03 6.03E-03 3.22E-02 6.03E-03

2.26E-17

    T C C C T C A G T T A T A C A T T G C A C T
    T C C G T C A G T T A C A C A T A G C A T T
    T C T A T C A G C T G C A C A T C G C A T T

A   0 0 0 1 0 0 3 0 0 0 2 0 3 0 3 0 1 0 0 3 0 0
T   3 0 1 0 3 0 0 0 2 3 0 1 0 0 0 3 1 0 0 0 2 3
G   0 0 0 1 0 0 0 3 0 0 1 0 0 0 0 0 0 3 0 0 0 0
C   0 3 2 1 0 3 0 0 1 0 0 2 0 3 0 0 1 0 3 0 1 0
    * * * X * * * * * * * * * * * * X * * * * *

Base term:    3.22E-02 6.03E-03 1.05E-02 1.05E-02 3.22E-02 6.03E-03 3.22E-02 6.03E-03 1.84E-02 3.22E-02 1.84E-02 1.05E-02 3.22E-02 6.03E-03 3.22E-02 3.22E-02 1.84E-02 6.03E-03 6.03E-03 3.22E-02 1.84E-02 3.22E-02
Coefficient:  1 1 3 6 1 1 1 1 3 1 3 3 1 1 1 1 6 1 1 1 3 1
Column term:  3.22E-02 6.03E-03 3.16E-02 6.32E-02 3.22E-02 6.03E-03 3.22E-02 6.03E-03 5.52E-02 3.22E-02 5.52E-02 3.16E-02 3.22E-02 6.03E-03 3.22E-02 3.22E-02 1.10E-01 6.03E-03 6.03E-03 3.22E-02 5.52E-02 3.22E-02 = 7.94E-84

2.74E-19  3.76E-08  2.324E-16  4.59E-17  = 4.22E-16

Transcription: Probabilities

Table 1: Probabilities, LTP Module (Hertz/Stormo method; reference: Bioinformatics Vol. 15, 1999, 563-577)

Note: Calculations adjusted according to a background model based on Arabidopsis genome base frequencies:

A: 0.3180185   T: 0.318015   G: 0.1819815   C: 0.1819815

Width   Region   P matrix (Hertz/Stormo)   P matrix (Mutually Exclusive)
7       1        4.39E-14                  4.39E-14
40      2        1.23E-66                  4.22E-16
10      3        2.51E-18                  2.51E-18
71      Module   ?1.35E-97?                2.22E-14
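The module-level entry in the "Mutually Exclusive" column can be illustrated with a minimal sketch of the 1/m-normalized sum described earlier. The function name and the choice m = 2 are assumptions: the slides do not state the value of m used, and m = 2 is simply the value that reproduces the tabulated 2.22E-14 from the three region probabilities.

```python
def module_probability(block_probabilities, m):
    """Sum mutually exclusive block probabilities and normalize by 1/m."""
    if m <= 0:
        raise ValueError("m must be positive")
    return sum(block_probabilities) / m


# Region values from the "Mutually Exclusive" column of Table 1.
regions = [4.39e-14, 4.22e-16, 2.51e-18]

# Assumption: m = 2 (not stated in the source).
print(f"{module_probability(regions, m=2):.2e}")
```

Because mutually exclusive events add rather than multiply, the module value stays on the order of its most probable region instead of collapsing to the very small product obtained when all columns are multiplied together.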

Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo

Transcription: P-Value Probabilities

• From the calculated probabilities, statements concerning statistical significance can be formulated by estimating the P-value using large-deviation statistics.

• In particular, Hertz and Stormo provide a statistical analysis method based upon the observation that when “the information content is small and the number of sequences is large, 2NI tends to a chi-squared distribution…” with L(A-1) degrees of freedom.

• In particular, the probability of a sequence alignment containing gaps is:

where n_{-j} is the number of gaps at position j in the alignment. N, L, A and n_ij have been defined previously.

Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo

Transcription: P-Value

• Then the information content (large-deviation rate function) of the corresponding sequence alignment is:

where f_ij = n_ij/N.

• “To calculate the overall statistical significance, we consider the probability distribution of

and its large-deviation rate function of

Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo

Transcription: P-Value

• The overall statistical significance… is equal to the inverse of the product of 2NL and the probability of a large-deviation rate function greater than or equal to (I gap matrix + L ln 2) based on the probability distribution, P, above.

Sample Calculation: P-Value, LTP module region 1 (based on the method of Hertz/Stormo)

ltp30  REGION 1   T G G G A G A
ltp 3  REGION 1   T G G G A G A
ltp 4  REGION 1   T G G G A G A
                  * * * * * * *

ALIGNMENT MATRIX REGION 1 (N=3, L=7):

A   0 0 0 0 3 0 3
T   3 0 0 0 0 0 0
G   0 3 3 3 0 3 0

Base term:    4.02E-03 7.53E-04 7.53E-04 7.53E-04 4.02E-03 7.53E-04 4.02E-03
Coefficient:  1 1 1 1 1 1 1
Column term:  4.02E-03 7.53E-04 7.53E-04 7.53E-04 4.02E-03 7.53E-04 4.02E-03

P = 2.09E-20
P-Value Region 1 = 6.8793E-07
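The chi-squared observation quoted earlier (2NI with L(A-1) degrees of freedom) can be sketched for region 1 as follows. Since the slide's own equation did not survive, the sketch assumes the standard ungapped information content, I = sum over columns j and letters i of f_ij ln(f_ij / p_i) with f_ij = n_ij/N; the gapped variant used on the slides is not reproduced. The function names and the choice T = A = 0.3180185 are likewise assumptions.

```python
from math import log

# Assumption: T is set equal to A so the background frequencies sum to 1.
BACKGROUND = {"A": 0.3180185, "T": 0.3180185, "G": 0.1819815, "C": 0.1819815}


def information_content(sequences, background=BACKGROUND):
    """Ungapped information content: sum_j sum_i f_ij * ln(f_ij / p_i)."""
    n_seqs = len(sequences)
    total = 0.0
    for column in zip(*sequences):
        for base in set(column):  # letters with n_ij = 0 contribute nothing
            f = column.count(base) / n_seqs
            total += f * log(f / background[base])
    return total


def chi_squared_statistic(sequences, background=BACKGROUND):
    """Return (2*N*I, degrees of freedom L*(A-1)) for an ungapped alignment."""
    n_seqs = len(sequences)
    length = len(sequences[0])
    statistic = 2 * n_seqs * information_content(sequences, background)
    degrees_of_freedom = length * (len(background) - 1)
    return statistic, degrees_of_freedom


region1 = ["TGGGAGA", "TGGGAGA", "TGGGAGA"]
info = information_content(region1)
stat, dof = chi_squared_statistic(region1)
print(f"I = {info:.4f}, 2NI = {stat:.3f}, degrees of freedom = {dof}")
```

For a fully conserved alignment such as region 1, every multinomial coefficient is 1, so exp(-N*I) equals the alignment-matrix probability; this gives a quick consistency check against the 4.39E-14 reported for region 1. With only N = 3 sequences the chi-squared approximation itself is far from its large-N regime, so the sketch illustrates the arithmetic only.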

Transcription: Probabilities

Table 1: Probabilities, LTP Module (Hertz/Stormo method; reference: Bioinformatics Vol. 15, 1999, 563-577)

Note: Calculations adjusted according to a background model based on Arabidopsis genome base frequencies:

A: 0.3180185   T: 0.318015   G: 0.1819815   C: 0.1819815

Width   Region   Prob matrix (Hertz/Stormo)   Prob matrix (Mutually Exclusive)   P-Value
7       1        4.39E-14                     4.39E-14                           6.88E-07
40      2        1.23E-66                     4.22E-16                           1.01E-45
10      3        2.51E-18                     2.51E-18                           2.10E-11
71      Module   ?1.35E-97?                   2.22E-14                           1.54E-66

Transcription: Permutation

• Furthermore, it is desired to devise a method to arrive at groupings of genes that are coregulated.

• These coexpressed gene clusters are expected to respond to either internal or external stimuli which can be visualized, as a first approximation, in a microarray.

• This concerted genetic response is presumed to be governed by the action of a conserved set of response elements interacting with a distinct set of transcription factors.

• By focusing on gene clustering we expect to detect the presence of transcription factor binding sites using the motif finding program Cistematic augmented with a statistical method, which will be described below.

Transcription: Permutation

Statistical Method:

[Figure: 6-mer Reference Distribution. x-axis: Probabilities (0.000036321, 0.000063473, 0.0001109, 0.0001938, 0.0003387, 0.000592, 0.0010345); y-axis: Occurrences (0 to 1400)]

It occurred to the author that a simple plot of occurrences versus probabilities would reveal trends in the data output by Cistematic.
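A reference distribution of this kind can be sketched without enumerating every k-mer. The sketch assumes an independent-base background in which A and T share one frequency and G and C share another (taking T = A = 0.3180185 is an assumption), so that a k-mer's probability depends only on how many of its bases are A or T. The function name is illustrative.

```python
from math import comb

# Assumption: A and T share one background frequency, G and C another.
P_AT = 0.3180185
P_GC = 0.1819815


def reference_distribution(k, p_at=P_AT, p_gc=P_GC):
    """Return (probability, occurrences) pairs for all k-mers, grouped by A/T count.

    A k-mer with n A/T bases has probability p_at**n * p_gc**(k - n), and
    there are comb(k, n) * 2**k such k-mers (two choices at every position).
    """
    rows = []
    for n_at in range(k + 1):
        probability = p_at ** n_at * p_gc ** (k - n_at)
        occurrences = comb(k, n_at) * 2 ** k
        rows.append((probability, occurrences))
    return rows


for probability, occurrences in reference_distribution(6):
    print(f"{probability:.4e}  {occurrences}")
```

For k = 6 this yields seven probability classes, from about 3.63E-05 (all G/C) to about 1.03E-03 (all A/T), which matches the seven x-axis values of the 6-mer reference plot; the largest class holds 1280 of the 4096 possible 6-mers.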


www.arabidopsis.org

Transcription: Permutation

• Microarray 17808T7 was designed to identify gene expression changes that occur during shoot development in Arabidopsis.

• Root explants were incubated on a callus induction medium (CIM), during which time they acquired 'competence' to respond to hormones that induce shoot formation. Explants were then transferred to cytokinin-rich shoot induction medium (SIM), where they organized meristems and underwent shoot morphogenesis.

[Figure: microarray scans labeled Shoot Development Scan 1, Shoot Development Scan 2, Vascular Development, Shoot Development Scan 3, Shoot Development Scan, Vascular Development 2, Shoot Development in tissue culture 1, Shoot Development in tissue culture 2]

www.arabidopsis.org

Transcription: Genes

Genes       Gene Info

AT2G22430   Homeobox-leucine zipper protein 6 (HB-6) / HD-ZIP transcription factor 6, identical to homeobox-leucine zipper protein ATHB-6 (HD-ZIP protein ATHB-6)

AT5G01870   Lipid transfer protein, putative, similar to lipid transfer protein 6 from Arabidopsis thaliana (gi:8571927); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234

AT5G59330   (AT5G59330: hypothetical protein)

AT5G59310   Lipid transfer protein 4 (LTP4), identical to lipid transfer protein 4 from Arabidopsis thaliana (gi:8571923); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234

AT5G59320   Lipid transfer protein 3 (LTP3), identical to lipid transfer protein 3 from Arabidopsis thaliana (gi:8571921); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234

AT1G50570   C2 domain-containing protein, low similarity to cold-regulated gene SRC2 (Glycine max) GI:2055230; contains Pfam profile PF00168: C2 domain

AT2G05380   Glycine-rich protein (GRP3S), identical to cDNA glycine-rich protein 3 short isoform (GRP3S) GI:4206766

AT2G38540   Nonspecific lipid transfer protein 1 (LTP1), identical to SP|Q42589

Transcription: Permutation

• Typical Cistematic output (1-mismatch):


Transcription: Permutation

• Here we have a plot of occurrences versus probabilities of the 15-mer data derived from microarray 17808T7.

• Notice the definite skew in the 15-mer Motifs(X20) graph.

[Figure: 15-mer distribution. x-axis: Probabilities (1.389E-11 to 1.127E-08); y-axis: Occurrences (0 to 5000); series: 15merRef, 15-mer Motifs, 15-mer Motifs(X20)]

Transcription: Permutation Results

• The following motifs have been found thus far:

YTCAYAYCMARYARC CAWCAYCWCSCRCTT CCATMYRAATCCCT

AT5G59310 X X X

AT5G59320 X X X

AT5G59330 X X X

AT2G05380 X X

AT2G38540 X X

AT1G50570 X X X

AT2G22430 X X

AT5G01870 X X X
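The motifs in the table above are written with IUPAC ambiguity codes (for example Y = C or T, R = A or G, M = A or C, W = A or T, S = C or G). A minimal sketch of how such a consensus could be scanned against a promoter sequence is given below; the helper names and the test sequence are hypothetical, and this is not how Cistematic itself performs matching.

```python
import re

# Standard IUPAC nucleotide ambiguity codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Translate an IUPAC consensus string into a regular expression."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_motif(motif, sequence):
    """Return the 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping occurrences are all reported.
    pattern = re.compile(f"(?=({iupac_to_regex(motif)}))")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical sequence containing one instance of the first motif.
example = "GGCTCACATCAAGCAACTT"
print(iupac_to_regex("YTCAYAYCMARYARC"))
print(find_motif("YTCAYAYCMARYARC", example))
```

Only exact matches to the degenerate consensus are reported here; allowing a mismatch, as in the 1-mismatch Cistematic output, would need an additional comparison step.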


Acknowledgements

• Prof. Robert Vellanoweth

• CSULA