Transcription as a Permutation Algorithm By M. Nickenig Mentor: Prof. Robert Vellanoweth.
Transcription as a Permutation Algorithm
By M. Nickenig
Mentor: Prof. Robert Vellanoweth
Transcription: Overview
• Where does transcription fit into the larger genetic schema?
DNA --(Replication)--> DNA --(Transcription)--> mRNA --(Translation)--> Protein
Regulation acts at each of these steps.
• Specifically, regulation at the transcriptional level involves four major modes:
1) Regulation via the combinatorics of components of the basal transcription machinery.
2) Induction of response elements through inducible transcription factors (TFs).
3) Regulation through the action of interfering RNAs.
4) Chromatin remodeling.
Biochemistry, C. Matthews, K.E. van Holde, K.G. Ahern, 3rd ed.
Transcription: Regulation by TFs
• Transcription factor: a protein that binds DNA at a specific promoter or enhancer site, where it regulates transcription.
• Basal transcription factors are involved in the formation of a pre-initiation complex.
http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm
Transcription Factors: Structure
• Regulatory factors: activate or repress transcription.
Note: Not all transcription factors bind to DNA; some bind only other transcription factors.
• Basic structural features of transcription factors:
1) Activation domain. Three different types: acidic domain, glutamine-rich domain, proline-rich domain.
* Activation domains interact with the basal machinery to activate transcription.
http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm
Transcription Factors: Structure
2) DNA binding domain
• Helix-turn-helix (HTH) domains bind the major groove of the DNA. They consist of two anti-parallel alpha-helical regions interrupted by a turn region; DNA binding occurs via the recognition helix.
• Zinc fingers function as structural platforms for DNA binding. Transcription factors of this type have an absolute requirement for zinc for their formation.
• Two types: 2-His, 2-Cys Zn finger and multi-Cys Zn finger.
http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm
Transcription Factors: Structure
• B-Zip: Leucine zippers function in associating transcription factors with each other. B-Zip factors possess a basic DNA binding domain (B-domain) adjacent to a leucine zipper dimerization domain and function as dimers.
• The leucine zipper dimerization domain is found in many transcription factors.
http://homepages.strath.ac.uk/~dfs97113/BB310/Lect15/lect15.htm
Transcription Factors: Regulation
• Consider the following example of transcriptional regulation by an inducible TF:
Transcription Factors: Regulation
• Consider an additional example of transcriptional regulation by an inducible TF:
Bioinformatics Vol. 15, 1999, 563-577
Transcription: Probabilities
Goal:
• We want to find all transcription factor binding sequences in the Arabidopsis thaliana transcriptome using a suitable motif-finding program
Assumptions:
• Functionally related DNA sequences are generally expected to share some common sequence elements.
• The pattern shared by a set of functionally related sequences is commonly identified during the process of aligning the sequences to maximize sequence conservation.
• A good alignment is assumed to be one whose alignment matrix is rarely expected to occur by chance.
• Furthermore, we assume that the letters in a sequence are independent and identically distributed. Thus, the probability of an alignment matrix is determined by the multinomial distribution:
$$P_{\text{matrix}} = \prod_{j=1}^{L} \left[ \frac{N!}{\prod_{i=1}^{A} n_{ij}!} \prod_{i=1}^{A} p_i^{\,n_{ij}} \right]$$
Transcription: Probabilities
Mathematical Terms:
• Here i refers to the rows of the alignment matrix (i.e., the bases A, C, G, T); j refers to the columns of the matrix (i.e., the positions within the alignment pattern); A is the total number of letters in the sequence alphabet; L is the total number of columns in the matrix; p_i is the a priori probability of the letter i; n_ij is the number of occurrences of the letter i at position j; and N is the total number of sequences in the alignment (Reference: Bioinformatics Vol. 15, 1999, 563-577).
• Furthermore, the above formula can be extended to calculate the probabilities associated with cis-regulatory modules, such that the sum is taken over all sequences in a module (L_all) and the factor 1/m is a normalization constant, where m equals the number of sequences of length L comprising the module.
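The multinomial probability of an alignment matrix, as described by the terms above, can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the slides: the function names are hypothetical, and the background frequencies are the Arabidopsis values quoted later in this presentation.

```python
from math import factorial, prod

# Background frequencies as listed on the slides (Arabidopsis genome).
BACKGROUND = {"A": 0.3180185, "T": 0.318015, "G": 0.1819815, "C": 0.1819815}


def column_probability(counts, background=BACKGROUND):
    """Multinomial probability of one alignment column.

    counts maps each letter i to its number of occurrences n_ij in column j.
    """
    n_total = sum(counts.values())            # N, the number of sequences
    coefficient = factorial(n_total)
    for n in counts.values():
        coefficient //= factorial(n)          # N! / (n_1j! * ... * n_Aj!)
    letters = prod(background[base] ** n for base, n in counts.items())
    return coefficient * letters


def matrix_probability(columns, background=BACKGROUND):
    """Product of the column probabilities over all L columns."""
    return prod(column_probability(c, background) for c in columns)


# Region 1 of the LTP module: three identical sequences TGGGAGA.
region1 = [{"T": 3}, {"G": 3}, {"G": 3}, {"G": 3}, {"A": 3}, {"G": 3}, {"A": 3}]
print(f"{matrix_probability(region1):.2e}")
```

For region 1 this agrees with the value 4.39E-14 reported in the sample calculation to within rounding.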
Thesis: A. Mortazavi, 2004 Vellanoweth Lab, CSULA
Transcription: Probabilities
• cis-Regulatory module: a set of motifs that bind transcription factors cooperatively.
• For example, consider the Cistematic-derived sequence data corresponding to the Lipid Transfer Protein (LTP) module (Thesis: A. Mortazavi, 2004, Vellanoweth Lab).
First we calculate the probability associated with this alignment using the method of Hertz and Stormo; we then repeat the calculation with the aligned sequences broken up into blocks, each block being treated as a mutually exclusive event.
Thesis: A. Mortazavi, 2004 Vellanoweth Lab
Transcription: Probabilities
• An alignment matrix can also be formed from a gapped alignment and its probability subsequently calculated, e.g., the T-Coffee-derived gapped alignment of the LTP module.
Bioinformatics Vol.15, 1999, 563-577
Transcription: Probabilities
Sample Calculation: LTP module regions 1 and 3 (Method of Hertz and Stormo)

Region 1 (N = 3, L = 7)
Pos  ltp30  ltp3  ltp4  Mark   A  T  G  C   Prod p^n   Coeff  P(column)
1    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
2    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
3    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
4    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
5    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
6    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
7    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
P matrix (product of the column probabilities) = 4.39E-14

Region 3 (N = 3, L = 12)
Pos  ltp30  ltp3  ltp4  Mark   A  T  G  C   Prod p^n   Coeff  P(column)
1    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
2    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
3    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
4    T      C     T     *      0  2  0  1   1.84E-02   3      5.52E-02
5    G      A     G     *      1  0  2  0   1.05E-02   3      3.16E-02
6    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
7    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
8    C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
9    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
10   G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
11   T      C     T     *      0  2  0  1   1.84E-02   3      5.52E-02
12   A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
Product of all 12 column probabilities = 4.37E-21
Partial products: 3.327E-05 x 7.54E-14 = 2.51E-18
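The alignment matrices in these sample calculations are simply per-column letter counts. The sketch below shows one way to derive such a matrix from aligned sequences; the function name and the optional gap row are illustrative assumptions, not part of Cistematic.

```python
from collections import Counter

ALPHABET = "ATGC"  # row order used on the slides


def alignment_matrix(sequences, alphabet=ALPHABET, gap="-"):
    """Count the occurrences n_ij of each letter i at each position j."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    rows = {letter: [] for letter in alphabet + gap}
    for column in zip(*sequences):           # walk the alignment column by column
        tally = Counter(column)
        for letter in rows:
            rows[letter].append(tally.get(letter, 0))
    return rows


# Region 3 of the LTP module (ltp30, ltp3, ltp4).
region3 = ["AATTGGACGGTA", "AATCAGACGGCA", "AATTGGACGGTA"]
matrix = alignment_matrix(region3)
for letter in ALPHABET:
    print(letter, *matrix[letter])
```

The printed rows match the region 3 alignment matrix above; the extra "-" row stays at zero here because this region contains no gaps.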
Transcription: Probabilities
Sample Calculation: LTP module region 2 (as Mutually Exclusive events)
Region 2 is split into three blocks (N = 3, L = 50 in total).

Block 1 (13 columns)
Pos  ltp30  ltp3  ltp4  Mark   A  T  G  C   Prod p^n   Coeff  P(column)
1    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
2    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
3    G      C     G     *      0  0  2  1   6.03E-03   3      1.81E-02
4    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
5    C      G     G     *      0  0  2  1   6.03E-03   3      1.81E-02
6    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
7    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
8    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
9    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
10   T      C     T     *      0  2  0  1   1.84E-02   3      5.52E-02
11   T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
12   G      A     A     *      2  0  1  0   1.84E-02   3      5.52E-02
13   A      G     G     *      1  0  2  0   1.05E-02   3      3.16E-02
Intermediate values: 2.1809E-15, 1.20E-16

Block 2 (15 columns)
Pos  seq1   seq2  seq3  Mark   A  T  G  C   Prod p^n   Coeff  P(column)
1    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
2    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
3    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
4    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
5    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
6    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
7    A      G     A     *      2  0  1  0   1.84E-02   3      5.52E-02
8    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
9    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
10   C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
11   A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
12   C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
13   G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
14   T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
15   G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
Intermediate value: 2.26E-17

Block 3 (22 columns)
Pos  seq1   seq2  seq3  Mark   A  T  G  C   Prod p^n   Coeff  P(column)
1    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
2    C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
3    C      C     T     *      0  1  0  2   1.05E-02   3      3.16E-02
4    C      G     A     X      1  0  1  1   1.05E-02   6      6.32E-02
5    T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
6    C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
7    A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
8    G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
9    T      T     C     *      0  2  0  1   1.84E-02   3      5.52E-02
10   T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
11   A      A     G     *      2  0  1  0   1.84E-02   3      5.52E-02
12   T      C     C     *      0  1  0  2   1.05E-02   3      3.16E-02
13   A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
14   C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
15   A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
16   T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
17   T      A     C     X      1  1  0  1   1.84E-02   6      1.10E-01
18   G      G     G     *      0  0  3  0   6.03E-03   1      6.03E-03
19   C      C     C     *      0  0  0  3   6.03E-03   1      6.03E-03
20   A      A     A     *      3  0  0  0   3.22E-02   1      3.22E-02
21   C      T     T     *      0  2  0  1   1.84E-02   3      5.52E-02
22   T      T     T     *      0  3  0  0   3.22E-02   1      3.22E-02
Product = 7.94E-84

Intermediate values: 2.74E-19, 3.76E-08, 2.324E-16, 4.59E-17
P matrix (region 2, mutually exclusive) = 4.22E-16
Transcription: Probabilities
Table 1: Probabilities, LTP Module (Hertz/Stormo method; reference: Bioinformatics Vol. 15, 1999, 563-577)
Note: Calculations adjusted according to a background model based on Arabidopsis genome base frequencies:
A: 0.3180185 T: 0.318015 G: 0.1819815 C: 0.1819815
Width  Region  P matrix (Hertz/Stormo)  P matrix (Mutually Exclusive)
7      1       4.39E-14                 4.39E-14
40     2       1.23E-66                 4.22E-16
10     3       2.51E-18                 2.51E-18
71     Module  ?1.35E-97?               2.22E-14
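As a quick arithmetic check on the Module row, the Python sketch below recombines the three region values from Table 1. It rests on two assumptions that the slides do not state outright: that the Hertz/Stormo module value is the product of the region probabilities, and that the mutually exclusive value is their sum scaled by 1/m with m = 2. The value of m is inferred only from the numbers, so treat it as a guess.

```python
from math import isclose, prod

# Region probabilities copied from Table 1.
hertz_stormo = [4.39e-14, 1.23e-66, 2.51e-18]
mutually_exclusive = [4.39e-14, 4.22e-16, 2.51e-18]

# Assumption: independent regions multiply.
module_product = prod(hertz_stormo)

# Assumption: mutually exclusive regions add, normalized by 1/m with m = 2.
m = 2
module_sum = sum(mutually_exclusive) / m

print(f"product of regions: {module_product:.3e}")
print(f"sum of regions / {m}: {module_sum:.3e}")
```

Both results agree with the tabulated 1.35E-97 and 2.22E-14 to within the rounding of the inputs, which is consistent with, but does not prove, this reading of the table.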
Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo
Transcription: P-Value Probabilities
• From these probabilities, statements about statistical significance can be formulated by estimating the P-value using large-deviation statistics.
• In particular, Hertz and Stormo provide a statistical analysis method based upon the observation that when “the information content is small and the number of sequences is large, 2NI tends to a chi-squared distribution…” with L(A-1) degrees of freedom.
• The probability of a sequence alignment containing gaps additionally involves n_{-j}, the number of occurrences of a gap at position j in the alignment; N, L, A and n_ij are as defined previously.
Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo
Transcription: P-Value
• The information content (large-deviation rate function) of the corresponding sequence alignment is then defined in terms of f_ij = n_ij/N.
• To calculate the overall statistical significance, Hertz and Stormo consider a probability distribution, P, together with its large-deviation rate function.
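To make the quantity 2NI concrete, here is a minimal Python sketch of the information content of an ungapped alignment. It assumes the standard relative-entropy form I = sum over j and i of f_ij ln(f_ij / p_i) with f_ij = n_ij/N; the equation on the original slide may contain an additional gap term, and the function name is hypothetical.

```python
from math import log

# Background frequencies as listed on the slides (Arabidopsis genome).
BACKGROUND = {"A": 0.3180185, "T": 0.318015, "G": 0.1819815, "C": 0.1819815}


def information_content(columns, background=BACKGROUND):
    """I = sum_j sum_i f_ij * ln(f_ij / p_i), with f_ij = n_ij / N (ungapped form)."""
    total = 0.0
    for counts in columns:
        n_sequences = sum(counts.values())
        for base, n in counts.items():
            if n == 0:
                continue                      # 0 * ln(0) is taken as 0
            f = n / n_sequences
            total += f * log(f / background[base])
    return total


# Region 1 of the LTP module: three identical sequences TGGGAGA.
region1 = [{"T": 3}, {"G": 3}, {"G": 3}, {"G": 3}, {"A": 3}, {"G": 3}, {"A": 3}]
N, L, A = 3, len(region1), 4

I = information_content(region1)
statistic = 2 * N * I                         # compared against chi-squared
degrees_of_freedom = L * (A - 1)
print(f"I = {I:.4f}, 2NI = {statistic:.2f}, df = {degrees_of_freedom}")
```

For a fully conserved alignment such as region 1, N times I equals the negative logarithm of the alignment-matrix probability, which gives a handy consistency check. With N = 3 the chi-squared approximation itself is not expected to be accurate, since it applies when the number of sequences is large.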
Proceedings of the 3rd International Conference on Bioinformatics and Genome Research, G.Z. Hertz and G. Stormo
Transcription: P-Value
• The overall statistical significance … is equal to the inverse of the product of 2NL and the probability of a large-deviation rate function greater than or equal to (I_gap matrix + L ln 2), based on the probability distribution P above.

Sample Calculation: P-Value, LTP module region 1 (based on the method of Hertz/Stormo)
Region 1 (N = 3, L = 7)
Pos  ltp30  ltp3  ltp4  Mark   A  T  G  C   Column term  Coeff  Term x Coeff
1    T      T     T     *      0  3  0  0   4.02E-03     1      4.02E-03
2    G      G     G     *      0  0  3  0   7.53E-04     1      7.53E-04
3    G      G     G     *      0  0  3  0   7.53E-04     1      7.53E-04
4    G      G     G     *      0  0  3  0   7.53E-04     1      7.53E-04
5    A      A     A     *      3  0  0  0   4.02E-03     1      4.02E-03
6    G      G     G     *      0  0  3  0   7.53E-04     1      7.53E-04
7    A      A     A     *      3  0  0  0   4.02E-03     1      4.02E-03
P = 2.09E-20
P-value (region 1) = 6.8793E-07
Transcription: Probabilities
Table 1: Probabilities, LTP Module (Hertz/Stormo method; reference: Bioinformatics Vol. 15, 1999, 563-577)
Note: Calculations adjusted according to a background model based on Arabidopsis genome base frequencies:
A: 0.3180185 T: 0.318015 G: 0.1819815 C: 0.1819815
Width  Region  Prob matrix (Hertz/Stormo)  Prob matrix (Mutually Exclusive)  P-Value
7      1       4.39E-14                    4.39E-14                          6.88E-07
40     2       1.23E-66                    4.22E-16                          1.01E-45
10     3       2.51E-18                    2.51E-18                          2.10E-11
71     Module  ?1.35E-97?                  2.22E-14                          1.54E-66
Transcription: Permutation
• Furthermore, we want a method for arriving at groupings of genes that are coregulated.
• These coexpressed gene clusters are expected to respond to internal or external stimuli; as a first approximation, this response can be visualized in a microarray.
• This concerted genetic response is presumed to be governed by a conserved set of response elements interacting with a distinct set of transcription factors.
• By focusing on gene clustering, we expect to detect transcription factor binding sites using the motif-finding program Cistematic, augmented with the statistical method described below.
Transcription: Permutation
Statistical Method:
[Figure: 6-mer Reference Distribution. x-axis: Probabilities (0.000036321 to 0.0010345); y-axis: Occurrences (0 to 1400)]
It occurred to the author that a simple plot of occurrences against probabilities would help visualize trends in the data output by Cistematic.
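One plausible reading of the 6-mer reference distribution is a tally of all 4^6 possible 6-mers grouped by their probability under the background model. The Python sketch below follows that reading. It assumes that A and T share the frequency 0.3180185 and that G and C share 0.1819815, so a k-mer's probability depends only on how many A/T bases it contains; the function name is hypothetical.

```python
from math import comb

P_AT = 0.3180185  # assumed common frequency of A and T
P_GC = 0.1819815  # common frequency of G and C


def kmer_reference_distribution(k, p_at=P_AT, p_gc=P_GC):
    """Return (probability, occurrences) for every probability class of k-mers."""
    classes = []
    for n_at in range(k + 1):
        probability = p_at ** n_at * p_gc ** (k - n_at)
        # comb(k, n_at) placements of the A/T positions, 2 letter choices per position.
        occurrences = comb(k, n_at) * 2 ** k
        classes.append((probability, occurrences))
    return classes


for probability, occurrences in kmer_reference_distribution(6):
    print(f"{probability:.4e}  {occurrences}")
```

Under these assumptions the seven probability classes run from about 3.63E-05 to 1.03E-03, with the largest class containing 1280 of the 4096 6-mers, which is in line with the axis ranges of the figure.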
www.arabidopsis.org
Transcription: Permutation
• Microarray 17808T7 was designed to identify gene expression changes that occur during shoot development in Arabidopsis.
• Root explants were incubated on a callus induction medium (CIM) during which time they acquire 'competence' to respond to hormones that induce shoot formation. Explants are then transferred to cytokinin-rich shoot induction medium (SIM) where they organize meristems and undergo shoot morphogenesis.
[Figure panels: Shoot Development Scan 1; Shoot Development Scan 2; Vascular Development; Shoot Development Scan 3; Shoot Development Scan; Vascular Development 2; Shoot Development in tissue culture 1; Shoot Development in tissue culture 2]
www.arabidopsis.org
Transcription: Genes
Gene        Gene Info
AT2G22430   Homeobox-leucine zipper protein 6 (HB-6) / HD-ZIP transcription factor 6, identical to homeobox-leucine zipper protein ATHB-6 (HD-ZIP protein ATHB-6)
AT5G01870   Lipid transfer protein, putative, similar to lipid transfer protein 6 from Arabidopsis thaliana (gi:8571927); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234
AT5G59330   Hypothetical protein
AT5G59310   Lipid transfer protein 4 (LTP4), identical to lipid transfer protein 4 from Arabidopsis thaliana (gi:8571923); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234
AT5G59320   Lipid transfer protein 3 (LTP3), identical to lipid transfer protein 3 from Arabidopsis thaliana (gi:8571921); contains Pfam protease inhibitor/seed storage/LTP family domain PF00234
AT1G50570   C2 domain-containing protein, low similarity to cold-regulated gene SRC2 (Glycine max) GI:2055230; contains Pfam profile PF00168: C2 domain
AT2G05380   Glycine-rich protein (GRP3S), identical to cDNA glycine-rich protein 3 short isoform (GRP3S) GI:4206766
AT2G38540   Nonspecific lipid transfer protein 1 (LTP1), identical to SP|Q42589
Transcription: Permutation
• Typical Cistematic output (1-mismatch):
Transcription: Permutation
• Here we have a plot of occurrences versus probabilities of the 15-mer data derived from microarray 17808T7.
• Notice the definite skew in the 15-mer Motifs(X20) graph.
[Figure: 15-mer. x-axis: Probabilities (1.389E-11 to 1.127E-08); y-axis: Occurrences (0 to 5000); series: 15merRef, 15-mer Motifs, 15-mer Motifs(X20)]
Transcription: Permutation Results
• The following motifs have been found thus far:
YTCAYAYCMARYARC CAWCAYCWCSCRCTT CCATMYRAATCCCT
AT5G59310 X X X
AT5G59320 X X X
AT5G59330 X X X
AT2G05380 X X
AT2G38540 X X
AT1G50570 X X X
AT2G22430 X X
AT5G01870 X X X
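The motifs above are written with IUPAC degenerate nucleotide codes (for example Y = C or T, R = A or G, M = A or C). A minimal Python sketch of scanning a sequence for such a motif follows. It is not how Cistematic performs its search; the helper names and the test sequence are made up for illustration, and only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_pattern(motif):
    """Translate an IUPAC consensus motif into a regular-expression string."""
    return "".join(IUPAC[symbol] for symbol in motif.upper())


def find_motif(motif, sequence):
    """Return the 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead lets re.finditer report overlapping occurrences.
    lookahead = re.compile(f"(?=({motif_to_pattern(motif)}))")
    return [match.start() for match in lookahead.finditer(sequence.upper())]


# Hypothetical sequence containing one instance of the first motif.
sequence = "GGCTCATATCAAGCAACTT"
print(find_motif("YTCAYAYCMARYARC", sequence))
```

Scanning the upstream region of each gene in the cluster with a function like this would reproduce a presence/absence table of the kind shown above, provided the same sequences and strand conventions are used.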
Acknowledgements
• Prof. Robert Vellanoweth
• CSULA