1. INTRODUCTION TO BIOLOGY AND BIOINFORMATICS
2 "Introduction to Bioinformatics"
Bioinformatics Course
LIFE
Is a characteristic that distinguishes objects that have signaling and self-sustaining processes (i.e. living organisms) from those that do not
Is a state of living characterized by the capacity for metabolism, growth, reaction to stimuli, and reproduction
A diversity of life forms is found on Earth, e.g. plants, animals, fungi, protists, archaea and bacteria
6
BIOLOGY
Is the study of life and living organisms
It brings together the structure, function, growth, origin, distribution, adaptation, interactions, taxonomy and evolution of living organisms
AEROBIOLOGY, AGRICULTURE, ANATOMY, ASTROBIOLOGY, BIOCHEMISTRY, BIOENGINEERING, BIOINFORMATICS, BIOMATHEMATICS OR MATHEMATICAL BIOLOGY, BIOMECHANICS, BIOMEDICAL RESEARCH, BIOPHYSICS, BIOTECHNOLOGY, BUILDING BIOLOGY, BOTANY, CELL BIOLOGY, CONSERVATION BIOLOGY, CRYOBIOLOGY, DEVELOPMENTAL BIOLOGY, ECOLOGY, EMBRYOLOGY, ENTOMOLOGY, ENVIRONMENTAL BIOLOGY, EPIDEMIOLOGY, ETHOLOGY, EVOLUTIONARY BIOLOGY, GENETICS, HERPETOLOGY, HISTOLOGY, ICHTHYOLOGY, INTEGRATIVE BIOLOGY, LIMNOLOGY, MAMMALOGY, MARINE BIOLOGY, MICROBIOLOGY, MOLECULAR BIOLOGY, MYCOLOGY, NEUROBIOLOGY, OCEANOGRAPHY, ONCOLOGY, ORNITHOLOGY, POPULATION BIOLOGY, POPULATION ECOLOGY, POPULATION GENETICS, PALEONTOLOGY, PATHOBIOLOGY OR PATHOLOGY, PARASITOLOGY, PHARMACOLOGY, PHYSIOLOGY, PHYTOPATHOLOGY, PSYCHOBIOLOGY, SOCIOBIOLOGY, STRUCTURAL BIOLOGY, VIROLOGY
7
BIOLOGY COMPRISES AREAS OF STUDY THAT FOCUS ON LIFE AT A VARIETY OF LEVELS AND FROM A DIVERSITY OF PERSPECTIVES
BIOLOGY
8
LIVING SYSTEMS
Domain - Eukaryota
Kingdom - Animalia
Phylum - Chordata
Subphylum - Vertebrata
Class - Mammalia
Order - Primates
Suborder - Anthropoidea
Superfamily - Hominoidea
Family - Hominidae
Genus - Homo
Species - sapiens
9
HUMANS Lineage (full): root; cellular organisms; Eukaryota; Opisthokonta; Metazoa; Eumetazoa; Bilateria; Coelomata; Deuterostomia; Chordata; Craniata; Vertebrata; Gnathostomata; Teleostomi; Euteleostomi; Sarcopterygii; Tetrapoda; Amniota; Mammalia; Theria; Eutheria; Euarchontoglires; Primates; Haplorrhini; Simiiformes; Catarrhini; Hominoidea; Hominidae; Homininae; Homo; Homo sapiens
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606
10
SPECIES
Defined as a group of living organisms consisting of similar individuals capable of exchanging genes or interbreeding
http://www.nature.com/news/2011/110823/full/news.2011.498.html
13
NO. OF SPECIES
http://www.iucnredlist.org/documents/summarystatistics/2010_1RL_Stats_Table_1.pdf
14
LEVELS OF ORGANISATION
http://www.nature.com/scitable/topicpage/biological-complexity-and-integrative-levels-of-organization-468
16
BIOLOGICAL QUESTIONS
How are all life-forms related?
What was the first cell like?
How do species adapt to their environment?
Which part of our genome is evolving the fastest?
Are we descendants of Neanderthals?
What genes are responsible for major human disease?
Why do we need new flu vaccines every year?
Introduction to Computational Genomics, Nello Cristianini and Matthew W. Hahn
18
COMPUTER SCIENCE [CS]
STUDIES COMPUTABLE PROCESSES AND STRUCTURES (WITH THE AID OF COMPUTERS)
19
BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
The boundaries between the two disciplines are not well defined, but they can be distinguished by the problems they solve
BIOINFORMATICS – the application of statistics and computer science to the field of molecular biology
COMPUTATIONAL BIOLOGY – the actual process of analyzing and interpreting data
20
DEFINITION OF BIOINFORMATICS
The term bioinformatics was coined in 1978
Bioinformatics is the application of information technology and computer science to the field of molecular biology
The science of using / developing computer software and algorithms to record, analyze and merge biologically related data
Using computer technology to manage large amounts of biological data
Bioinformatics involves the use of techniques including applied mathematics, informatics, statistics, computer science, artificial intelligence, chemistry, and biochemistry to solve biological problems usually on the molecular level
http://www.google.com/search?q=define%3ABioinformatics
21
DEFINITION OF BIOINFORMATICS
The collection, organization, storage, analysis, and integration of large amounts of biological data using networks of computers and databases
Bioinformatics involves the integration of computers, software tools, and databases in an effort to address biological questions
In summary, the use of computer science to solve biological problems
http://www.google.com/search?q=define%3ABioinformatics
22
BIOINFORMATIC FOCUS
http://www.nature.com/scitable/topicpage/biological-complexity-and-integrative-levels-of-organization-468
ORGANISM
ORGAN
TISSUE
CELL
MOLECULES
ANALYSIS AND INTERPRETATION OF VARIOUS TYPES OF BIOLOGICAL DATA INCLUDING: NUCLEOTIDE AND AMINO ACID SEQUENCES, PROTEIN DOMAINS, AND PROTEIN STRUCTURES.
Development of new algorithms and statistics with which to assess biological information, such as relationships among members of large data sets.
24
BIOINFORMATIC FOCUS
http://www.nature.com/msb/journal/v3/n1/images/msb4100163-f4b.jpg
25
BIOINFORMATIC FOCUS
http://www.jofwidata.com/images/database-design-development.jpg http://wolfson.huji.ac.il/expression/detective.jpg
Development and implementation of tools that enable efficient access and management of different types of information, such as various databases, integrated mapping information
26
UNITS OF INFORMATION IN BIOINFORMATICS
DNA Sequence Pathways
RNA Structure Interactions
Protein Evolution Mutations
27
UNITS OF INFORMATION IN COMPUTER SCIENCE
File storage capacity in bits and bytes (a byte is 8 bits; from the kilobyte up, each unit is 1,024 times the previous one)
Byte: 8 bits
Kilobyte (KB): 8,192 bits; 1,024 bytes
Megabyte (MB): 8,388,608 bits; 1,048,576 bytes; 1,024 KB
Gigabyte (GB): 8,589,934,592 bits; 1,073,741,824 bytes; 1,048,576 KB; 1,024 MB
Terabyte (TB): 8,796,093,022,208 bits; 1,099,511,627,776 bytes; 1,073,741,824 KB; 1,048,576 MB; 1,024 GB
Petabyte (PB): 9,007,199,254,740,992 bits; 1,125,899,906,842,624 bytes; 1,099,511,627,776 KB; 1,073,741,824 MB; 1,048,576 GB; 1,024 TB
28
File storage capacity in bits and bytes (a byte is 8 bits; from the kilobyte up, each unit is 1,024 times the previous one)
Petabyte (PB): 9,007,199,254,740,992 bits; 1,125,899,906,842,624 bytes; 1,099,511,627,776 KB; 1,073,741,824 MB; 1,048,576 GB; 1,024 TB
Exabyte (EB): 9,223,372,036,854,775,808 bits; 1,152,921,504,606,846,976 bytes; 1,125,899,906,842,624 KB; 1,099,511,627,776 MB; 1,073,741,824 GB; 1,048,576 TB; 1,024 PB
Zettabyte (ZB): 9,444,732,965,739,290,427,392 bits; 1,180,591,620,717,411,303,424 bytes; 1,152,921,504,606,846,976 KB; 1,125,899,906,842,624 MB; 1,099,511,627,776 GB; 1,073,741,824 TB; 1,048,576 PB; 1,024 EB
Yottabyte (YB): 9,671,406,556,917,033,397,649,408 bits; 1,208,925,819,614,629,174,706,176 bytes; 1,180,591,620,717,411,303,424 KB; 1,152,921,504,606,846,976 MB; 1,125,899,906,842,624 GB; 1,099,511,627,776 TB; 1,073,741,824 PB; 1,048,576 EB; 1,024 ZB
UNITS OF INFORMATION IN COMPUTER SCIENCE
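The storage-unit figures on these slides all follow from two rules: a byte is 8 bits, and each larger unit is 1,024 times the previous one. A minimal Python sketch of that arithmetic follows; the names UNITS, bytes_in and bits_in are illustrative and do not come from the slides.

```python
# Binary storage units: each step up multiplies by 1,024 (2**10).
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte", "terabyte",
         "petabyte", "exabyte", "zettabyte", "yottabyte"]

def bytes_in(unit: str) -> int:
    """Number of bytes in one `unit`, using binary (1,024-based) multiples."""
    return 1024 ** UNITS.index(unit)

def bits_in(unit: str) -> int:
    """Number of bits in one `unit` (8 bits per byte)."""
    return 8 * bytes_in(unit)

if __name__ == "__main__":
    for unit in UNITS:
        print(f"1 {unit} = {bytes_in(unit):,} bytes = {bits_in(unit):,} bits")
```

Python integers have arbitrary precision, so the largest values are computed exactly, with no rounding in the trailing digits.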
29
CELL SIZES
http://learn.genetics.utah.edu/content/begin/cells/scale/
31
EXAMPLES OF BIOLOGICAL DATA
GENOME – DNA TRANSCRIPTOME – RNA PROTEOME – Proteins
The biological information contained in a genome is encoded in deoxyribonucleic acid (DNA) or, for many types of virus, in ribonucleic acid (RNA)
32
NAME THE NUMBERS
[Figure: diagram with five numbered parts (1 to 5) to be matched with the labels NUCLEUS, DNA, GENES, CHROMOSOME, CELL]
34
CENTRAL DOGMA OF MOLECULAR BIOLOGY
http://compbio.pbworks.com/f/central_dogma.jpg
DNA is transcribed into RNA and RNA is translated into proteins
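The two steps of the central dogma can be sketched in a few lines of Python. This is a simplified illustration, not a model of the cellular machinery. It assumes the input is the coding strand, so transcription reduces to replacing T with U. It uses the standard genetic code and stops at the first stop codon. The function names are illustrative.

```python
# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def transcribe(coding_dna: str) -> str:
    """Return the mRNA for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")

def translate(mrna: str) -> str:
    """Translate mRNA codon by codon, stopping at the first stop codon."""
    dna = mrna.upper().replace("U", "T")  # the table above is keyed by DNA letters
    protein = []
    for start in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)

mrna = transcribe("ATGTGGCCACTGTGA")
print(mrna, translate(mrna))
```

Protein letters in the output are the one-letter IUPAC amino acid codes.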
35
CENTRAL DOGMA OF MOLECULAR BIOLOGY
http://www.uic.edu/classes/bios/bios100/lectures/centraldogma.jpg
37
GENOME
Is the entirety of an organism’s hereditary information
The genome includes both the genes and non-coding sequences of DNA/RNA
Haemophilus influenzae was the first free-living organism to have its genome sequenced, in July 1995
1,830,140 base pairs of DNA in a single circular chromosome that contains 1,740 protein-coding genes, 58 transfer RNA genes and 18 other RNA genes
http://www.sciencemag.org/content/269/5223/local/front-matter.pdf http://en.wikipedia.org/wiki/File:Haemophilus_influenzae_01.jpg
39
GENOME SIZES
Introduction to Computational Genomics, Nello Cristianini and Matthew W. Hahn
40
GENOME SIZES
Japanese flower Paris japonica: about 150 billion base pairs, roughly 50 times the human genome
42
HUMAN GENOME
Human body
• 10^14 cells (100 trillion)
One cell
• 23 pairs of chromosomes
DNA
• ≈21,000 to 23,000 genes
RNA
• 3 billion pairs of DNA bases
Protein
• ≈100 000 different proteins
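To connect the genome figures with the storage units covered earlier, here is a back-of-the-envelope sketch of how much space 3 billion bases would occupy. It assumes a 2-bit encoding per base (four possible letters), one strand only, and no ambiguity codes, quality scores or metadata. These assumptions are for illustration and are not stated on the slide.

```python
GENOME_BASE_PAIRS = 3_000_000_000  # approximate human genome size from the slide
BITS_PER_BASE = 2                  # assumption: A, C, G, T encoded as 00, 01, 10, 11

def packed_size_bytes(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Bytes needed to store one strand at `bits_per_base` bits per base."""
    total_bits = base_pairs * bits_per_base
    return (total_bits + 7) // 8   # round up to whole bytes

size = packed_size_bytes(GENOME_BASE_PAIRS)
print(f"{size:,} bytes = {size / 1024**2:.1f} megabytes")
```

Under these assumptions the sequence fits in well under one gigabyte. Text formats that store one character per base need about four times as much.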
43
Relative proportions (%) of bases in DNA
CURRENT SCIENCE, VOL. 85, NO. 11, 10 DECEMBER 2003
44
DNA
DNA with high GC-content is more stable than DNA with low GC-content: each G-C base pair forms 3 hydrogen bonds, whereas each A-T base pair forms 2
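GC-content is simple to compute from a sequence string. The sketch below counts G and C bases and also tallies Watson-Crick hydrogen bonds (3 per G-C pair, 2 per A-T pair) for a duplex formed by the given strand and its complement. The function names are illustrative, and the input is assumed to contain only A, C, G and T.

```python
def gc_content(sequence: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def hydrogen_bonds(sequence: str) -> int:
    """Watson-Crick hydrogen bonds in a duplex: 3 per G-C pair, 2 per A-T pair."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 3 * gc + 2 * at

for seq in ("ATATATAT", "GCGCGCGC"):
    print(seq, gc_content(seq), hydrogen_bonds(seq))
```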
45
DNA vs RNA
DNA – deoxyribonucleic acid
Sugar is deoxyribose
DNA is a polymer of deoxyribonucleotides
Bases are adenine (A), guanine (G), cytosine (C) and thymine (T)
RNA – ribonucleic acid
Sugar is ribose
RNA is a polymer of ribonucleotides
Bases are adenine (A), guanine (G), cytosine (C) and uracil (U)
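Because the two alphabets differ only in T versus U, a sequence string can often be classified by its letters alone. The following sketch is a rough heuristic under that assumption. The function name is illustrative, and real sequence files may also contain ambiguity codes such as N, which this version reports as unknown.

```python
DNA_ALPHABET = set("ACGT")
RNA_ALPHABET = set("ACGU")

def classify_nucleic_acid(sequence: str) -> str:
    """Return 'DNA', 'RNA', 'DNA or RNA' or 'unknown' based only on the letters used."""
    letters = set(sequence.upper())
    if not letters:
        return "unknown"
    is_dna = letters <= DNA_ALPHABET
    is_rna = letters <= RNA_ALPHABET
    if is_dna and is_rna:
        return "DNA or RNA"   # no T or U present, so the alphabet cannot decide
    if is_dna:
        return "DNA"
    if is_rna:
        return "RNA"
    return "unknown"

for seq in ("ATGTGGCCA", "AUGUGGCCA", "AGGCCA"):
    print(seq, classify_nucleic_acid(seq))
```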
http://www2.chemistry.msu.edu/faculty/reusch/VirtTxtJml/Images3/dna_rna1.gif
46
DNA SEQUENCE
Raw DNA sequence
Coding or non-coding
Parses into genes
4 nucleotide bases: A, T, G, C
>ENST00000539570 cdna:known chromosome:GRCh37:15:63889592:63893885:1 gene:ENSG00000259662 gene_biotype:protein_coding transcript_biotype:protein_coding
ATGTGGCCACTGCTCACCATGCACATAACCCAGCTCAACCGGGAGTGCCTGCTGCACCTCTTCTCCTTCCTAGACAAGGACAGCAGGAAGAGCCTTGCCAGGACCTGCTCCCAGCTCCACGACGTGTTTGAGGACCCCGCACTCTGGTCCCTGCTGCACTTCCGTTCCCTCACTGAACTCCAGAAGGACAACTTCCTCCTGGGCCCGGCACTCCGCAGCCTCTCCATCTGCTGGCACTCCAGCCGCGTGCAGGTGTGCAGCATTGAGGACTGGCTCAAGAGTGCCTTCCAGAGAAGCATCTGCAGCCGGCACGAGAGCCTGGTCAATGATTTCCTCCTCCGGGTGTGCGACAGGCTTTCTGCTGTGCGCTCCCCACGGAGGCGGGAGGCGCCTGCACCGTCCTCGGGGACTCCGATCGCCGTTGGACCGAAATCACCTCGGTGGGGAGGACCTGACCACTCGGAGTTCGCCGACTTGCGCTCGGGGGTGACGGGGGCCAGGGCTGCCGCGCGCAGGGGTCTGGGGAGCCTCCGGGCGGAGCGACCCAGCGAGACCCCGCCGGCTCCCGGAGTGTCCTGGGGACCGCCACCTCCAGGAGCCCCGGTGGTGATCTCGGTGAAGCAGGAGGAGGGGAAGCAGGGGCGCACGGGCAGAAGGAGCCACCGAGCCGCTCCTCCTTGCGGTTTTGCCCGCACGCGCGTCTGCCCGCCCACCTTTCCTGGGGCGGATGCGTTCCCGCAGTGA
47
A GENE
http://www.down-syndrome.org/updates/2054/updates-2054-figure1-400w.png
48
GENE EXPRESSION REGULATORS
http://www.nature.com/scitable/topicpage/gene-expression-14121669
49
GENE EXPRESSION REGULATORS - EPIGENETICS
http://scienceblogs.com/pharyngula/2008/07/22/epigenetics/
50
EXAMPLES OF BIOLOGICAL DATA
GENOME – DNA TRANSCRIPTOME – RNA PROTEOME – Proteins
The transcriptome is the set of all RNA molecules, including mRNA, rRNA, tRNA, and other non-coding RNA, produced in one cell or a population of cells
http://www.bio.miami.edu/~cmallery/150/gene/c7.17.7b.transcription.jpg
51
TRANSCRIPTION
http://www.youtube.com/watch?v=ztPkv7wc3yU
52
TRANSCRIPTION
http://www.bio.miami.edu/~cmallery/150/gene/c7.17.7b.transcription.jpg
53
ALTERNATIVE SPLICING
http://www.nature.com/scitable/content/a-schematic-representation-of-alternative-splicing-95777
54
TYPES OF RNA
http://csls-text.c.u-tokyo.ac.jp/images/fig/fig03_4.gif
mRNA – messenger RNA: encodes the amino acid sequence of a polypeptide
tRNA – transfer RNA: brings amino acids to ribosomes during translation
rRNA – ribosomal RNA: with ribosomal proteins makes up the ribosomes, the organelles that translate the mRNA
snRNA – small nuclear RNA: forms complexes with proteins that are used in RNA processing in eukaryotes
55
TYPES OF RNA
http://finchtalk.geospiza.com/2009_05_01_archive.html
56
EXAMPLES OF BIOLOGICAL DATA
GENOME – DNA TRANSCRIPTOME – RNA PROTEOME – Proteins
The proteome is the entire set of proteins expressed by a genome, cell, tissue or organism.
http://artavanis-tsakonas.med.harvard.edu/research_images/figure_harsha_proteome.jpg
57
FROM TRANSCRIPTION TO TRANSLATION
http://www1.cs.columbia.edu/~cleslie/cs4761/microarray/central-dogma.png
58
TRANSLATION
http://0.tqn.com/d/chemistry/1/0/G/m/mrnatranslation.jpg
59
TRANSLATION INITIATION
http://bioap.wikispaces.com/Ch+17+Collaboration
60
TRANSLATION TERMINATION
http://kvhs.nbed.nb.ca/gallant/biology/translation_termination.html
61
UNIVERSAL GENETIC CODE
http://www.biogem.org/codon.jpg
63
PROTEIN
Proteins consist of long chains of amino acids, written in a 20-letter alphabet (IUPAC nomenclature)
Columns: IUPAC amino acid code, three-letter code, amino acid
A Ala Alanine
C Cys Cysteine
D Asp Aspartic Acid
E Glu Glutamic Acid
F Phe Phenylalanine
G Gly Glycine
H His Histidine
I Ile Isoleucine
K Lys Lysine
L Leu Leucine
M Met Methionine
N Asn Asparagine
P Pro Proline
Q Gln Glutamine
R Arg Arginine
S Ser Serine
T Thr Threonine
V Val Valine
W Trp Tryptophan
Y Tyr Tyrosine
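The table above maps directly onto a lookup dictionary. The sketch below converts a one-letter protein string into three-letter codes. The names ONE_TO_THREE and to_three_letter are illustrative, and only the 20 standard amino acids listed on the slide are handled.

```python
# One-letter IUPAC code -> three-letter code, for the 20 standard amino acids.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}

def to_three_letter(protein: str) -> str:
    """Convert a one-letter protein sequence to hyphen-separated three-letter codes."""
    return "-".join(ONE_TO_THREE[aa] for aa in protein.upper())

print(to_three_letter("MYNMM"))
```

Any letter outside the table raises a KeyError, which is a reasonable way for a sketch to flag unexpected input.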
64
PROTEIN SEQUENCE
>sp|P48431|SOX2_HUMAN Transcription factor SOX-2 OS=Homo sapiens GN=SOX2 PE=1 SV=1
MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMVWSRGQRRKMA
QENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLM
KKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVNQRMDSYAHMNGWSNGSYSMMQDQLGY
PQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQGTPGMALGSM
GSVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMISMYLPGAEVPEPAAPSRLHMSQHYQS
GPVPGTAINGTLPLSHM
65
PROTEIN SIZE
http://www.quora.com/Protein-nutrition-1/Whats-the-average-size-of-a-human-protein-in-kDa
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1150220/
67
PROTEIN STRUCTURE
http://upload.wikimedia.org/wikipedia/commons/thumb/0/05/Protein_structure.png/1024px-Protein_structure.png
69
PROTEIN SEQUENCE
>sp|P48431|SOX2_HUMAN Transcription factor SOX-2 OS=Homo sapiens GN=SOX2 PE=1 SV=1
MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMVWSRGQRRKMA
QENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTLM
KKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVNQRMDSYAHMNGWSNGSYSMMQDQLGY
PQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQGTPGMALGSM
GSVVKSEASSSPPVVTSSSHSRAPCQAGDLRDMISMYLPGAEVPEPAAPSRLHMSQHYQS
GPVPGTAINGTLPLSHM
Proteins are divided into domains
DNA BINDING DOMAIN
http://www.uniprot.org/
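The SOX2 record above is in FASTA format: a header line starting with ">" followed by the sequence. Here is a small sketch of how such a record could be read and its UniProt-style header split into fields. The function names are illustrative, and the header parser assumes the "db|accession|entry name" layout followed by KEY=value fields, as seen in this record. Other databases format their headers differently.

```python
import re

def read_fasta_record(text: str) -> tuple[str, str]:
    """Return (header, sequence) for a single FASTA record; sequence lines are joined."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("a FASTA record must start with a '>' header line")
    return lines[0][1:], "".join(lines[1:])

def parse_uniprot_header(header: str) -> dict[str, str]:
    """Split a UniProt-style FASTA header into its labelled parts."""
    header = header.lstrip(">").strip()
    ident, _, rest = header.partition(" ")
    db, accession, entry_name = ident.split("|")
    # The protein name runs up to the first KEY=value field (OS=, GN=, ...).
    parts = re.split(r"\s(?=[A-Z]{2}=)", rest)
    fields = {"db": db, "accession": accession,
              "entry_name": entry_name, "protein_name": parts[0]}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        fields[key] = value
    return fields

# Only the first sequence line of the record is used here, to keep the example short.
record = """>sp|P48431|SOX2_HUMAN Transcription factor SOX-2 OS=Homo sapiens GN=SOX2 PE=1 SV=1
MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQKNSPDRVKRPMNAFMVWSRGQRRKMA
"""
header, sequence = read_fasta_record(record)
info = parse_uniprot_header(header)
print(info["accession"], info["GN"], info["OS"], len(sequence))
```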
70
GENE TRANSCRIPTION, TRANSLATION AND PROTEIN SYNTHESIS
http://compbio.pbworks.com/f/central_dogma.jpg
72
BIOINFORMATIC APPLICATIONS
The integrative approaches are useful and applied in:
Agricultural: higher yield in crops or fruits; disease- or drought-resistant crops
Medical: understanding processes in healthy and diseased individuals; genetic diseases
Pharmaceutical: finding or developing new and better drugs; gene-based drugs; structure-based drug design
73
BIOINFORMATIC QUESTIONS 1
To identify an unknown gene of interest
Sequence matching
Is there a match to a known sequence in the database?
Which protein family does it match?
How do I identify more family members?
I have a similar structure; how do I identify its potential ligands?
How do I identify whether my gene/protein is also present in other species?
How can I identify genes that are inherited together in a specific region?
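As a toy illustration of the "is there a match in the database" question, the sketch below scores each database entry by the fraction of the query's overlapping 3-letter words (k-mers) that it shares. This is a deliberately naive stand-in, since real database search tools use far more refined alignment and scoring. The function names, the toy database and its sequences are all invented for illustration.

```python
def kmers(sequence: str, k: int = 3) -> set[str]:
    """All overlapping substrings of length k in the sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def best_match(query: str, database: dict[str, str], k: int = 3) -> tuple[str, float]:
    """Return the database entry sharing the largest fraction of the query's k-mers."""
    query_kmers = kmers(query, k)
    if not query_kmers:
        raise ValueError("query must be at least k letters long")
    scores = {
        name: len(query_kmers & kmers(seq, k)) / len(query_kmers)
        for name, seq in database.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical toy database; the names and sequences are made up for illustration.
toy_db = {
    "seq_A": "ATGTGGCCACTGCTCACC",
    "seq_B": "TTTTTTAAAAAACCCCCC",
}
print(best_match("TGGCCACTG", toy_db))
```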
74
BIOINFORMATIC QUESTIONS 2
I have to construct an artificial gene: how do I design the primers, and how do I check that I have the right sequence?
To determine the structure of a poorly expressed RNA sequence
To identify the structure and function of a protein sequence
To cluster protein sequences into families of related sequences and develop models
To generate phylogenetic trees to identify evolutionary relationships using similar proteins/DNA
To identify which other proteins interact with the sequence of interest
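Clustering sequences and building phylogenetic trees both start from some measure of how different two sequences are. One of the simplest is the p-distance, the fraction of aligned positions at which two sequences differ. The sketch below computes it for every pair. It assumes the sequences are already aligned to equal length, and the example sequences are invented.

```python
from itertools import combinations

def p_distance(a: str, b: str) -> float:
    """Fraction of positions that differ between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Pairwise p-distances for every unordered pair of named sequences."""
    return {(m, n): p_distance(seqs[m], seqs[n]) for m, n in combinations(seqs, 2)}

# Hypothetical aligned toy sequences, invented for illustration only.
toy = {"sp1": "ATGCATGC", "sp2": "ATGCATGA", "sp3": "TTGCTTGA"}
for pair, d in distance_matrix(toy).items():
    print(pair, round(d, 3))
```

A tree-building or clustering method would take such a matrix as input. Choosing that method, and correcting distances for multiple substitutions, is beyond this sketch.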