Chou-Fasman using MATLAB
Uploaded by rawatpooran05
Experiment 7
Aim: Write a program to implement the Chou-Fasman algorithm.
Equipment Required: Computer with an internet connection and MATLAB installed.
Learning Objectives: To acquaint students with the programming skills needed to write the program.
Theory: The Chou-Fasman method is an empirical technique for the prediction of secondary structures in proteins, originally developed in the 1970s by Peter Y. Chou and Gerald D. Fasman. The method is based on analyses of the relative frequencies of each amino acid in alpha helices, beta sheets, and turns in known protein structures solved with X-ray crystallography. From these frequencies a set of probability parameters was derived for the appearance of each amino acid in each secondary structure type, and these parameters are used to predict the probability that a given sequence of amino acids would form a helix, a beta strand, or a turn in a protein. The method is at most about 50-60% accurate in identifying correct secondary structures, which is significantly less accurate than modern machine learning-based techniques.
The Chou-Fasman method takes into account only the probability that each individual amino acid will appear in a helix, strand, or turn.
Algorithm: The Chou-Fasman method predicts helices and strands in a similar fashion, first searching linearly through the sequence for a "nucleation" region of high helix or strand probability and then extending the region until a subsequent four-residue window carries a probability of less than 1. As originally described, four out of any six contiguous amino acids were sufficient to nucleate a helix, and three out of any five contiguous amino acids were sufficient for a sheet. The probability thresholds for helix and strand nucleation are constant but not necessarily equal; originally 1.03 was set as the helix cutoff and 1.00 as the strand cutoff.
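The nucleation search can be sketched in MATLAB as follows. This is only one reading of the rule, assuming that a six-residue window nucleates a helix when at least four of its residues have a helix parameter above the 1.03 cutoff. The propensity vector Pa holds made-up illustrative values, not the lab's P_table, and the extension step is not covered.

```matlab
% Illustrative helix parameters for a 10-residue stretch (made-up values).
Pa = [0.57 0.57 0.98 1.42 1.06 1.08 1.21 0.57 0.57 0.98];

win    = 6;     % nucleation window length for a helix
need   = 4;     % residues in the window that must favour helix
cutoff = 1.03;  % helix cutoff quoted in the text

nuclei = [];    % start positions of windows that nucleate a helix
for k = 1:(numel(Pa) - win + 1)
    window = Pa(k:k+win-1);
    % Count the residues in this window whose parameter exceeds the cutoff.
    if sum(window > cutoff) >= need
        nuclei(end+1) = k; %#ok<AGROW>
    end
end
disp(nuclei)
```

For a sheet, the same loop would use win = 5, need = 3 and cutoff = 1.00 with the sheet parameters.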
Turns are also evaluated in four-residue windows, but are calculated using a multi-step procedure because many turn regions contain amino acids that could also appear in helix or sheet regions. Four-residue turns also have their own characteristic amino acids; proline and glycine are both common in turns. A turn is predicted only if the turn probability is greater than the helix or sheet probabilities and a probability value based on the positions of particular amino acids in the turn exceeds a predetermined threshold. The turn probability p(t) is determined as:
p(t) = p_t(j) × p_t(j+1) × p_t(j+2) × p_t(j+3)
where j is the position of the amino acid in the four-residue window. If p(t) exceeds an arbitrary cutoff value (originally 7.5e-3), the mean of the p(j)'s exceeds 1, and p(t) exceeds the alpha helix and beta sheet probabilities for that window, then a turn is predicted. If the first two conditions are met but the probability of a beta sheet p(b) exceeds p(t), then a sheet is predicted instead.
Procedure:
1. Make a table of Chou-Fasman parameters, i.e. the derived probability parameters of each amino acid residue.
2. Save the table under the name P_table.
3. Run MATLAB.
4. Call the table by typing its name:
>> P_table
5. Enter the sequence:
seq = 'ala,arg,pro,val,iso,leu,lys,met'
6. Run the program to find the structure.
7. Analyse the result.
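As a small aid to step 5, the comma-separated sequence string can be split into individual residue codes and grouped into six-residue windows. This is a minimal sketch using the standard strsplit and strjoin functions; the variable names residues, win and nWin are chosen here for illustration and do not come from the lab program.

```matlab
seq = 'ala,arg,pro,val,iso,leu,lys,met';
residues = strsplit(seq, ',');        % 1x8 cell array of three-letter codes
win  = 6;                             % window length used by the program
nWin = numel(residues) - win + 1;     % number of six-residue windows

% List the residues that fall in each window.
for k = 1:nWin
    fprintf('window %d: %s\n', k, strjoin(residues(k:k+win-1), ' '));
end
```

For the eight-residue example this gives three windows, which corresponds to the three sums per structure type computed in the program.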
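Program:
% Run P_table first so that Alanine, Arginine, ... are defined.
% Element 1 of each residue vector is used as the helix parameter, element 2 as the sheet parameter.
seq = 'ala,arg,pro,val,iso,leu,lys,met'
% Helix: sum and mean over each six-residue window
s1 = Alanine(1) + Arginine(1) + Proline(1) + Valine(1) + Isoleucine(1) + Leucine(1)
d1 = s1/6
s2 = Arginine(1) + Proline(1) + Valine(1) + Isoleucine(1) + Leucine(1) + Lysine(1)
d2 = s2/6
s3 = Proline(1) + Valine(1) + Isoleucine(1) + Leucine(1) + Lysine(1) + Methionine(1)
d3 = s3/6
% Sheet: sum and mean over the same windows
s4 = Alanine(2) + Arginine(2) + Proline(2) + Valine(2) + Isoleucine(2) + Leucine(2)
d4 = s4/6
s5 = Arginine(2) + Proline(2) + Valine(2) + Isoleucine(2) + Leucine(2) + Lysine(2)
d5 = s5/6
s6 = Proline(2) + Valine(2) + Isoleucine(2) + Leucine(2) + Lysine(2) + Methionine(2)
d6 = s6/6
% Compare every window mean with its cutoff (1.03 for helix, 1.00 for sheet).
if (d1 > 1.03 || d2 > 1.03 || d3 > 1.03)
    Predicted_structure = 'helix'
elseif (d4 > 1.00 || d5 > 1.00 || d6 > 1.00)
    Predicted_structure = 'sheet'
else
    Predicted_structure = 'turns'
end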
Required Results:
1. The program is written.
2. The program is successfully executed.
seq =
ala,arg,pro,val,iso,leu,lys,met
s1 =
6.5200
d1 =
1.0867
s2 =
6.2400
d2 =
1.0400
s3 =
6.7100
d3 =
1.1183
s4 =
6.9100
d4 =
1.1517
s5 =
6.8200
d5 =
1.1367
s6 =
6.9400
d6 =
1.1567
Predicted_structure =
helix
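The window sums and means above can also be computed in a loop rather than written out term by term. The sketch below does this for the sheet parameters only. The values in Pb are an assumption: they are commonly tabulated Chou-Fasman sheet parameters that happen to reproduce s4, s5 and s6 above, but the P_table actually used in this experiment is not shown and may differ.

```matlab
% Assumed sheet parameters for ala,arg,pro,val,iso,leu,lys,met
% (chosen because they reproduce s4..s6 in the output above).
Pb = [0.83 0.93 0.55 1.70 1.60 1.30 0.74 1.05];

win  = 6;                        % window length used by the program
nWin = numel(Pb) - win + 1;      % number of windows (3 for 8 residues)
s = zeros(1, nWin);              % window sums
for k = 1:nWin
    s(k) = sum(Pb(k:k+win-1));
end
d = s / win;                     % window means

% Print one "sum  mean" pair per window.
fprintf('%.4f  %.4f\n', [s; d]);

% Strand cutoff of 1.00, as quoted in the algorithm description.
if any(d > 1.00)
    disp('at least one window exceeds the strand cutoff')
end
```

The same loop applies to the helix parameters with a cutoff of 1.03. Note that the program checks the helix condition first, so a sequence whose windows exceed both cutoffs is reported as a helix.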
Learning outcomes: