Ways to construct Protein Space Construction of sequence space from (Eigen et al. 1988) illustrating...

Date posted: 20-Dec-2015

Page 1:

Ways to construct Protein Space

Construction of sequence space from (Eigen et al. 1988) illustrating the construction of a high dimensional sequence space. Each additional sequence position adds another dimension, doubling the diagram for the shorter sequence. Shown is the progression from a single sequence position (line) to a tetramer (hypercube). A four (or twenty) letter code can be accommodated either through allowing four (or twenty) values for each dimension (Rechenberg 1973; Casari et al. 1995), or through additional dimensions (Eigen and Winkler-Oswatitsch 1992).

Casari, G., C. Sander and A. Valencia (1995). "A method to predict functional residues in proteins." Nat Struct Biol 2(2): 171-8.

Eigen, M. and R. Winkler-Oswatitsch (1992). Steps Towards Life: A Perspective on Evolution. Oxford; New York, Oxford University Press.

Eigen, M., R. Winkler-Oswatitsch and A. Dress (1988). "Statistical geometry in sequence space: a method of quantitative comparative sequence analysis." Proc Natl Acad Sci U S A 85(16): 5913-7.

Rechenberg, I. (1973). Evolutionsstrategie; Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. Stuttgart-Bad Cannstatt, Frommann-Holzboog.

Page 2:

Diversion: From Multidimensional Sequence Space to Fractals

Page 3:

One symbol -> 1D

coordinate along the single dimension = pattern length

Page 4:

Two symbols -> Dimension = length of pattern

length 1 = 1D:

Page 5:

Two symbols -> Dimension = length of pattern

length 2 = 2D:

Dimensions correspond to positions. For each dimension there are two possibilities.

Note: Here is a possible bifurcation: a larger alphabet could be represented as more choices along the axis of position!

Page 6:

Two symbols -> Dimension = length of pattern

length 3 = 3D:

Page 7:

Two symbols -> Dimension = length of pattern

length 4 = 4D:

aka Hypercube
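
The progression from line to hypercube can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and "0"/"1" stand in for whatever two symbols are used. Each sequence is a vertex, and two vertices are joined when they differ at exactly one position.

```python
from itertools import product

def sequence_space(length, alphabet="01"):
    """All sequences of the given length, i.e. the vertices of sequence space."""
    return ["".join(p) for p in product(alphabet, repeat=length)]

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def edges(vertices):
    """Pairs of sequences that differ at exactly one position."""
    return [(a, b) for i, a in enumerate(vertices)
            for b in vertices[i + 1:] if hamming(a, b) == 1]

# Each added position doubles the number of vertices.
for n in range(1, 5):
    v = sequence_space(n)
    print(n, len(v), len(edges(v)))
```

For lengths 1 to 4 this prints 2, 4, 8 and 16 vertices with 1, 4, 12 and 32 edges: line, square, cube, hypercube.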

Page 8:

Two symbols -> Dimension = length of pattern

Page 9:

Three Symbols (the other fork)

Page 10:

Four Symbols:

I.e.: with an alphabet of 4, we have a hypercube (4D) already with a pattern size of 2, provided we stick to a binary pattern in each dimension.
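
A small sketch of this "binary pattern in each dimension" idea: each of the four letters is written as two bits, so a pattern of size 2 becomes a 4-bit vertex of the hypercube. The letters A, C, G, T and the particular bit assignment are assumptions chosen for illustration; the slides do not specify either.

```python
# Hypothetical two-bit code per letter (not taken from the slides).
BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}

def to_vertex(seq):
    """Map a sequence over the 4-letter alphabet to a binary hypercube vertex."""
    return "".join(BITS[ch] for ch in seq)

# All patterns of size 2 occupy the 16 vertices of a 4D hypercube.
vertices = sorted({to_vertex(a + b) for a in BITS for b in BITS})
print(len(vertices))     # 16
print(to_vertex("GT"))   # 1011
```

One consequence of such a coding is that a single letter change can flip one or two bits (A to T flips both under this assignment), so not every substitution corresponds to a single edge of the hypercube.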

Page 11:

Hypercubes with 2 and 4 character alphabets

2 character alphabet, pattern size 4

4 character alphabet, pattern size 2

Page 12:

Three symbol alphabet suggests a fractal representation

Page 13:

3 symbol fractal

enlarge, fill in

outer pattern repeats inner pattern = self-similar = fractal
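
One way to obtain such a self-similar layout is sketched below. The layout is an assumption, not necessarily the construction used for the slides: each of three letters (here the hypothetical A, B, C) owns one corner of a triangle, and position k of the sequence contributes that corner scaled by 1/2 to the power k+1. The first position then picks one of three half-size triangles, the second position one of three quarter-size triangles inside it, and so on.

```python
import math
from itertools import product

# Assumed layout: one triangle corner per letter (letters are placeholders).
CORNERS = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.5, math.sqrt(3) / 2)}

def fractal_point(seq):
    """2D coordinates of a sequence: position k adds its corner scaled by 2**-(k+1)."""
    x = y = 0.0
    for k, letter in enumerate(seq):
        cx, cy = CORNERS[letter]
        scale = 0.5 ** (k + 1)
        x += cx * scale
        y += cy * scale
    return (x, y)

# All sequences of length 3 give 27 distinct points.
points = {fractal_point("".join(p)) for p in product("ABC", repeat=3)}
print(len(points))           # 27
print(fractal_point("AB"))   # (0.25, 0.0)
```

Under this layout, prefixing a sequence with A halves both coordinates, so the points for length n+1 that start with A are a half-size copy of the whole length n pattern. That is the "outer pattern repeats inner pattern" property.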

Page 14:

3 character alphabet, pattern size 3: fractal

Page 15:

3 character alphabet, pattern size 4: fractal

Conjecture: For n -> infinity, the fractal might fill a 2D triangle

Note: check Mandelbrot

Page 16:

Same for 4 character alphabet

1 position

2 positions

3 positions

Page 17:

4 character alphabet continued

(with cheating: I didn’t actually add beads)

4 positions

Page 18:

4 character alphabet continued

(with cheating: I didn’t actually add beads)

5 positions

Page 19:

4 character alphabet continued

(with cheating: I didn’t actually add beads)

6 positions

Page 20:

4 character alphabet continued

(with cheating: I didn’t actually add beads)

7 positions

Page 21:

Animated GIF: 1-12 positions

Page 22:

Protein Space in Jalview

Page 23:

Alignment of V-, F- and A-ATPase ATP binding subunits (SU), catalytic and non-catalytic

Page 24:

UPGMA tree of V F A ATPase ATP binding SU with line dropped to partition (and colour) the 4 SU types (VA cat and non cat, F cat and non cat). Note that details of the tree $%#&@.

Page 25:

PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree

Page 26:

Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but turned slightly. (Giardia A SU selected in grey.)

Page 27:

Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 5th axis. (Eukaryotic A SU selected in grey.)

Page 28:

Same PCA analysis of V F A ATPase ATP binding SU using colours from the UPGMA tree, but replacing the 1st with the 6th axis. (Eukaryotic B SU selected in grey; forgot rice.)

Page 29:

Problems

• Jalview’s approach requires an alignment: only homologous sequences can be depicted in the same space

• Solution: One could use pattern absence / presence as coordinates
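
A minimal sketch of the pattern presence / absence idea, assuming that "pattern" means every possible string of a fixed length k over the alphabet (the slide does not say which patterns to use). Each sequence becomes a 0/1 vector with one coordinate per pattern, so sequences of different length, aligned or not, land in the same space. The function name and the example sequences are made up for illustration.

```python
from itertools import product

def pattern_coordinates(seq, alphabet, k):
    """0/1 vector with one coordinate per possible pattern of length k."""
    patterns = ["".join(p) for p in product(alphabet, repeat=k)]
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return [1 if p in present else 0 for p in patterns]

# Two unaligned sequences of different length, compared in the same space.
a = pattern_coordinates("ACGTAC", "ACGT", 2)
b = pattern_coordinates("TTGAC", "ACGT", 2)
print(len(a))                               # 16 coordinates (4**2 patterns)
print(sum(x != y for x, y in zip(a, b)))    # 6 coordinates differ
```

The number of coordinates grows as the alphabet size to the power k, so for a 20 letter protein alphabet only short patterns stay manageable with this naive enumeration.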