Finding Protein Folding Funnels in Random Networkskikuchi/presentation/CCP2017.pdfFinding Protein...
Transcript of Finding Protein Folding Funnels in Random Networkskikuchi/presentation/CCP2017.pdfFinding Protein...
Finding Protein Folding Funnels inRandom Networks
Macoto [email protected]
Cybermedia Center, Osaka Univ.
CCP2017 (11 Jul. 2017)paper in preparation
Outline
1 Introduction: Rareness, Fold, Funnel picture,Variation of the funnel structure,Networkrepresentation
2 Question3 Model4 Method5 Results
Probability distribution of the ideal funnelsNetwork size dependence of the probability andS.D.
6 Summary and Discussions
Introduction: rareness
A protein folds into its specific nativeconformation spontaneously under thephysiological conditions.
Anfinsen’s dogma: The native conformation is thethermodynamically stable one determined by theamino acid sequence.
Proteins are not at all typical random polypeptide
If we make a random polypeptide, itslow-energy state will become glassy.
Many different conformations have low energyclose to each other.
ex. The native conformation of lysozyme
Such a special property of proteins have beendeveloped through Darwinian evolution.
Proteins are good examples of the fact that theevolution can create very rare states of matter.
Introduction: funnel picture
Funnel picture of the energy landscape has beenaccepted widely after 1990s as a mechanism behindthe protein folding
Consistency principle (Go)
Minimum frustration principle (Bryngelson andWolynes)
The energy landscape is determined by the nativestructure.
Energy
funnel picture
The number ofconformationsdecreasesmonotonously asenergy lowers andapproaches the nativestate.
It is well known that the GO-like model, in whichthe interaction between the residues are determinedby the native structure, can nicely reproduce theexperimentally observed folding processes,
GO-like model is the simplest model thatrealizes the funnel-like energy landscape.
The reason is not yet so clear. But anyway thefunnel picture well describes the protein folding.
Introduction: folds
Number of ”folds” is very small compared to thenumber of proteins.
Fold: skelton of the native structures.
ex. classification by SCOP reported the number offolds is 1195 in the year 2009.
http://scop.mrc-lmb.cam.ac.uk
Althought the definition of the ”fold” is stillambiguous, it is clear that many proteins have thesame ”fold”.
Why do the folds so scarce?
Introduction: variation of the funnelstructure
Free-energy landscape can have a variety evenwithin the framework of the GO-like model.
Folding pathways are restricted by the nativestructure, but not uniquely determined.
ex. All the experimentally observed variety of thefolding processes of lysozyme family can bereproduced by the extended GO-like model, in whichthe relative interaction strength of two domains aretuned as a parameter.
Kanzaki and MK, Chem. Phys. Lett. 427(2006) 414.
Variety of the free-energy landscape at the foldingtemperature for lysozyme by the extended GO-likemodel
Kanzaki and MK, Chem. Phys. Lett. 427(2006) 414.
Three points to be considered
Proteins are rare states of polypeptedes.
Folds are very scarece.
Free-energy landscape and folding pathway can havea variety within the funnel picture.
Motivation of the study
We consider the following question.
Main question
How rare are the funnel-like energy landscapes?
To answer this question, at least partly, weintroduce a simple and abstract model based on therandom energy model on random networks, whichexpresses the energy landscape of the proteins.
Structural network
Network model has widely been used so far tounderstand the protein folding dynamics.
Markov state model has been used to describethe relationship between many conformationsobtained during the MD simulations.
Hori et al. (PNAS 2009) tried to determineinterconnection between all the conformationsof some proteins including the conformationsthat do not appear in MD. And compare theobtained network with that of randompolypeptide.
Model
conformation network1 Give a random network, which represent the
connections between conformations.Node: metastable ensenble of conformations.Edge: possible transitions between the nodesWe consider that the network structure isdetremined by the native conformations
1 to 1 correspondence between the nativeconformation and the network structure
We assume any random network corresponds tosome native state.
random order model1 Each node is assigned an integer randomly.
Integer represents the energy of the node (RandomEnergy Model)
We need only the order of the nodes according to theenegy.
We consider that the arrangement of the numbersis determined by the amino acid sequence
We assume all the arrangements of the numbers arepossible.
Construction of the model 1
Make a simplerandom graph
N : number ofnodesL: averagenumber of edgesconnected toeach node
2
Select one of thenodes having thelargest number ofedges as U(unfolded state)
Select one of thefarthest nodesfrom U as F(native foldedstate)
3
The network isrejected if it isseparated when Uis deleted,
4
Assign integersfrom 1 to N − 2randomly toremaining nodes.
U and F arefixed to 0 andN − 1,respectively
5Draw arrows fromthe node ofsmaller number tothat of largernumber, if twonode areconnected (DAG).
The arrowsrepresent thedirections oftransitions.We assume onlytheenergy-loweringtransitionsrealize.Obtainednetwork is aDirectedAcycled Graph(DAG)
Ideal Funnel
We introduce a concept of ideal funnel
definitionStarting from U node, if all the energy-lowering pathlead to F node, we call the network is ideal funnel
The energy-lowering transitions between nodeseventually leads to the native state without beingtrapped by misfolded state.
non-ideal funnel(misfolded states 7 and
8 exist) ideal funnel
Question again
In the language of proteins
Given one netive state, how rare are the amino acidsequence that the energy landscape becomes theideal funnel among all the possible sequences.
In terms of the modelGiven a random graph, estimate the appearanceprobalibity of the ideal funnel among all the possiblearrangement of integers.
Method
Rare event sampling
We estimate the appearance probability of idealfunnel among all the possible arrangement ofnumbers using the Multicanonical Ensemble MonteCarlo method with parameters determined byWang-Landau procedure.
The smallest number of the dead-end node that canbe reached from the unfolded state is used as energyof the arrangement for Multicanonical method.
We estimated the appearance probability of largemagic squares. (PlosONE 10(5) e0125062)
Multicanonical ensemble method with theWang-Landau learning is highly suitable forestimating the appearance probability of very rareconformations
Since the total number of conformations is knownfor the present mode, we can estimate ”absolute”appearance probability of the ideal funnels.
Computaion Detail
N = 8 ∼ 27Exact enumeration for N ≤ 14. Multicanonical forN > 14.
L = 3
Generate 1000 random graphs for each N
Estimate the appearance probability of idealfunnel for each graph.
Results
PDF of ideal funnels (N = 16, L = 3)
PDF of ideal funnels (N = 22, L = 3)
PDF of ideal funnels (N = 27, L = 3). The solidline represents the Log-Normal distribution
We expect that the appearance probability of theideal funnels approaches the Log-Normaldistribution for large N
Implication of Log-Normal
Very small number of networks have a largeprobability
Since the different arrangement of the numbercorresponds to the different sequence, suchnetworks are rubost against mutation.Possible explanation for the fact that there all onlyrelatively small number of folds for known proteins.
-14
-12
-10
-8
-6
-4
-2
0
10 15 20 25 30
<lo
g 10P
>
N-2
Ndependence of logP (bars: 3×S.D.)
P decreases exponentially with N (as expected)Number of robust networks decreases moreslowly than typical networks
Evolutionally favorable
Summary and Discussion
We estimated the rareness of the foldingfunnels using the random-energy model on therandom network.PDF of appearance probability of the idealfunnel is close to Log-Normal type.
There are very small number of the nativeconformations that are robust against mutations.
Typecal networks decrease exponentially.The rubust networks decrease also exponentiallybut more slowly.
Multicanonical and Wang-Landau method issuitable for estimating the probability of rareconformations.
Remark
The model is very simple, abstract, and ratherarbitrary.
We did not consider special fearures of strucruralgraphs, such as small-worlk, scale-free, hubstructure etc.The present study, however, will serve as a startpoint to consier rareness of the foldable proteinsand its implication to evolutions from this study.
Study of similar direction is on the way.Gene reguration network, Genetic code etc.
Rareness will be an important keyword inconsidering life related phenomena in the fieldof Biophysics.
The paper is in preparation.
Contact [email protected] if you’reinterested in the rareness of life-relatedphenomena.