Fasta

Post on 10-Dec-2014

An Introduction

FASTA
Fatima Khaliq
Roll #1038, BS (Hons) Botany 3rd (M)
University of Education Lahore, Okara campus, Renalakhurd

Table of Contents
Introduction
Features of FASTA
Uses of FASTA
Data Structure
Conclusion
Future Directions
References

Introduction

Bioinformatics is the interdisciplinary branch of biology that merges computer science, mathematics and engineering to study biological data.

FASTA is one of the software packages that biologists use to study DNA and protein sequences, that is, nucleotide or peptide sequences.

The software was originally developed in 1985 by Lipman and Pearson. Version 35 is now available, and it runs on MS-Windows, UNIX, Linux and Mac.

FASTA also defines a text format in which nucleotide or peptide sequences are represented with single-letter codes. It is known as FASTA format.

FASTA format allows a sequence name and comments to precede each sequence. The format has become a standard for biologists who analyze sequence data.

Each line of text in FASTA format should be no longer than 120 characters.
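The format described above can be read with a few lines of code. The following is a minimal sketch in Python, assuming the usual convention that a header line starts with ">" and that the sequence may be wrapped over several lines. The function name parse_fasta and the two example records are hypothetical and serve only as illustration.

```python
from typing import Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined so that line wrapping is undone.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records, made up for this example.
EXAMPLE = """>seq1 hypothetical peptide
MKTAYIAKQR
QISFVKSHFS
>seq2 hypothetical nucleotide fragment
ACGTACGTGA
"""

records = list(parse_fasta(EXAMPLE))
for name, seq in records:
    print(name, len(seq))
```

Each record comes back as its header text (name plus comment) and one unbroken string of single-letter codes.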

Features of FASTA

Rather than trying to find the single best alignment between your sequences, it finds patches of regional similarity.

It is a rapid program. You can run it locally, or you can send queries to an email server.

FASTA alignments can contain gaps. In sequences that contain a gap, FASTA highlights those codes in red.

Another feature of FASTA is that it gives up complete sensitivity in exchange for speed and provides information about the expected matched alignments.
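The idea of finding patches of regional similarity can be illustrated with a word-matching sketch. The Python below is a simplified illustration and not the FASTA program's actual implementation: it assumes that regions of similarity show up as many short identical words (of length ktup) falling on the same diagonal of the two sequences. The function name diagonal_hits and the example sequences are hypothetical.

```python
from collections import Counter, defaultdict


def diagonal_hits(query: str, subject: str, ktup: int = 2) -> Counter:
    """Count identical words of length ktup on each diagonal (i - j)."""
    # Index every word of the subject by its start position.
    positions = defaultdict(list)
    for j in range(len(subject) - ktup + 1):
        positions[subject[j:j + ktup]].append(j)

    # Look up each query word; a shared word at query offset i and
    # subject offset j lies on diagonal i - j.
    hits = Counter()
    for i in range(len(query) - ktup + 1):
        for j in positions.get(query[i:i + ktup], []):
            hits[i - j] += 1
    return hits


# Hypothetical sequences, made up for this example.
hits = diagonal_hits("ACGTAC", "TTACGTAC", ktup=2)
best_diagonal, count = hits.most_common(1)[0]
print(best_diagonal, count)
```

A diagonal with many hits marks a candidate region that a slower, more careful alignment step can then examine, which is why this kind of search is fast.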

Uses of FASTA

FASTA can be used for the alignment of all types of proteins and DNA.

It can also be used for translated searches, with algorithms that handle frameshift errors.

It can be used to calculate the significance of a similarity, which helps biologists decide whether an alignment occurred by chance or can be used to infer homology.

You can also use FASTA to calculate the optimal score for an alignment.

It can also be used to infer functional and evolutionary relationships between sequences, and it can help identify the members of a gene family.

Data Structure

Data in FASTA is presented as sequences of single-letter codes. It offers different search methods that help in analyzing protein sequences. For example, with a Smith-Waterman type of algorithm, FASTA helps you find potential matches and saves time as well.
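To show what a Smith-Waterman type of algorithm computes, here is a minimal Python sketch of the local alignment score. The scoring values (match = 2, mismatch = -1, gap = -2) and the simple linear gap penalty are assumptions chosen for illustration; they are not the parameters or scoring matrices used by the FASTA package. The function name smith_waterman_score is hypothetical.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, so an alignment can start
            # anywhere; this is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences sharing the patch "ACG".
print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the matrix are kept in memory because the sketch returns the score alone; recovering the alignment itself would need the full matrix and a traceback.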

The results of FASTA are reported in the form of a histogram in which the expected values are compared with a random search set. The lower part of the histogram contains information about the matches of interest.

Conclusion

FASTA has become standard software for biologists who analyze protein and DNA sequences. It is one of the easiest programs to use: it not only helps them understand the nature of sequences but also allows comments to precede the sequences.

Future Directions

The future of FASTA is no doubt very bright, as advancements in the software are enabling it to overcome the limitations that were present in previous versions.

It is also expected that more formats will be introduced in the future to handle the input and output of sequence data.

References

http://link.springer.com/protocol/10.1385%2F0-89603-276-0%3A365#page-1

http://emboss.sourceforge.net/docs/themes/SequenceFormats.html#fut

http://www.ncbi.nlm.nih.gov/blast/blastcgihelp.shtml

http://en.wikipedia.org/wiki/FASTA

http://arep.med.harvard.edu/seqanal/fasta.html