Biological Databases
November 30, 2006
Wailap V. Ng
Institute of Biotechnology in Medicine
Institute of Bioinformatics
National Yang Ming University
wvng@ym.edu.tw
Macromolecules Related to Bioinformatics
• DNA (Deoxyribonucleic acid)
• RNA (Ribonucleic acid)
- mRNA (Messenger RNA)
- rRNA (Ribosomal RNA)
- tRNA (Transfer RNA)
• Proteins
- Enzymes
- Structural proteins
- Regulatory proteins
- Transporters
Nucleic acid and protein sequences store the essential bioinformation
DNA (A, C, G, T): A C G T G A A C C T
RNA (A, C, G, U): A C G U G A A C C U
Protein (20 amino acids): G A V L I S T C D M E N Q R K F H Y P W
Central Dogma
DNA → mRNAs → Proteins
Replication: DNA → DNA
Transcription: DNA → mRNA
Translation: mRNA → protein
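The transcription step can be sketched in a few lines of Python. This is a toy model that is not part of the original slides: the function names `transcribe` and `reverse_complement` are illustrative, and the sketch assumes the input is the coding (sense) strand written 5’ to 3’, so the mRNA carries the same sequence with U in place of T.

```python
# Toy model of transcription (assumption: input is the coding strand, 5' to 3').
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def reverse_complement(dna: str) -> str:
    """Return the opposite (template) strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


dna = "ACGTGAACCT"
print(transcribe(dna))          # ACGUGAACCU
print(reverse_complement(dna))  # AGGTTCACGT
```

Translation needs a codon table and a reading frame, so it is left out of this sketch.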
Basic structure of a bacterial gene
[Diagram: DNA (5’ to 3’) with P upstream of the gene; transcription gives mRNA (5’ to 3’); translation gives protein (N-terminus to C-terminus)]
A gene is a segment of DNA, with an upstream start codon and a downstream stop codon, that codes for the sequence of a polypeptide (protein).
Start codon (ATG)
Stop codon (TAA in this example; TAG and TGA are the other two standard stop codons)
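The start-to-stop definition can be illustrated with a small open reading frame (ORF) finder. This is a minimal sketch under stated assumptions, not a gene-finding method from the slides: it scans only the forward strand, uses only the first ATG, and relies on the standard genetic code. The names `first_orf` and `translate` are made up for the example.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return the DNA from the first ATG to the first in-frame stop codon, or None."""
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None  # no in-frame stop codon found


def translate(orf):
    """Translate an ORF codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


orf = first_orf("CCATGGCTAAATAAGG")
print(orf)             # ATGGCTAAATAA
print(translate(orf))  # MAK
```

Real gene prediction also has to consider both strands, all reading frames, and alternative start codons, none of which this sketch handles.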
Information in Biological Databases
• DNA and protein sequences
• Protein structures
• Expression data (microarray, SAGE, etc.)
• Biological pathways
• Subcellular location of proteins
• Protein-protein interactions
• Molecular medicine
• Literature
• etc.
International Union of Pure and Applied Chemistry (IUPAC) codes for nucleotides and amino acids
IUPAC nucleotide code Base
A Adenine
C Cytosine
G Guanine
T (or U) Thymine (or Uracil)
R A or G (purine)
Y C or T (pyrimidine)
S G or C
W A or T
K G or T
M A or C
B C or G or T
D A or G or T
H A or C or T
V A or C or G
N any base
. or - gap
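One practical use of the ambiguity codes is pattern matching. The sketch below turns an IUPAC pattern into a regular expression; it is an illustration only, the helper name `iupac_to_regex` is hypothetical, and mapping U to T (so that patterns are matched against DNA) is an assumption of this example.

```python
import re

# Each IUPAC code mapped to the set of bases it stands for (from the table above).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "GC", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Build a regular expression in which every code becomes a character class."""
    return "".join("[" + IUPAC[code] + "]" for code in pattern.upper())


regex = iupac_to_regex("GAYTN")
print(regex)                                # [G][A][CT][T][ACGT]
print(bool(re.fullmatch(regex, "GACTA")))   # True:  Y matches C
print(bool(re.fullmatch(regex, "GAGTA")))   # False: Y does not match G
```

Gap symbols (. or -) are deliberately left out, since they only make sense inside an alignment.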
IUPAC amino acid code, Three letter code, Amino acid
A Ala Alanine
C Cys Cysteine
D Asp Aspartic Acid
E Glu Glutamic Acid
F Phe Phenylalanine
G Gly Glycine
H His Histidine
I Ile Isoleucine
K Lys Lysine
L Leu Leucine
M Met Methionine
N Asn Asparagine
P Pro Proline
Q Gln Glutamine
R Arg Arginine
S Ser Serine
T Thr Threonine
V Val Valine
W Trp Tryptophan
Y Tyr Tyrosine
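In code, the table is naturally a lookup dictionary. A minimal sketch (the function name `to_three_letter` and the hyphen separator are choices made for this example only):

```python
# One-letter to three-letter amino acid codes, copied from the table above.
THREE_LETTER = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(protein):
    """Convert a one-letter protein sequence into hyphen-separated three-letter codes."""
    return "-".join(THREE_LETTER[residue] for residue in protein.upper())


print(to_three_letter("MAK"))  # Met-Ala-Lys
```

A residue that is not in the table raises a `KeyError`, which is a simple way to catch invalid characters in a sequence.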
The Origins of Protein Sequence Databases
* Protein sequencing (Sanger and Tuppy, 1951)
• Atlas of Protein Sequence and Structure (Margaret Dayhoff and the National Biomedical Research Foundation (NBRF), 1965-1978)
• Protein Information Resource (PIR) (NBRF, 1984 - present)
• PIR-International Protein Sequence Database (NBRF, MIPS, and JIPID, 1988 – present)
The Origins of DNA Sequence Databases
* DNA double-helix structure (James Watson and Francis Crick, 1953)
* Recombinant DNA (Paul Berg et al., 1972)
* DNA sequencing (Maxam and Gilbert; Sanger - 1977)
• GenBank [Walter Goad et al., 1979 (prototype); 1982 -1992, LANL (Los Alamos National Lab.)]
• EMBL Data Library [1982 (1980) – present] – UK
• DDBJ [1986 (1984) – present] - Japan
Number of biological databases in 2005
http://www.infobiogen.fr/services/dbcat/ (site closed)
Major Bioinformation Resources
• NCBI – National Institutes of Health
• EMBL – European Bioinformatics Institute
• DDBJ – National Institute of Genetics (Japan)
• ExPASy – Swiss Institute of Bioinformatics
• GenomeNet – Kyoto University
NCBI molecular databases
Nucleotide Sequence Databases Consist of the Following Sequences:
• DNA fragments
• cDNA [Expressed Sequence Tags (EST) and full length cDNA sequences - partial and complete mRNA]
• Genomes
Nucleic acid sequences provide the fundamental starting point for describing and understanding the structure, function, and development of genetically diverse organisms.
Common Sequence File Formats
• FASTA
• GenBank (DNA) or GenPept (Protein)
Each sequence has at least one unique identifier that lets you retrieve it from the public databases, e.g. accession number, gi_number, gene_ID, protein_ID, locus name, etc.
DNA sequence in FASTA format
>gi|43500|emb|Y00534.1|HHGVPA Halobacterium halobium gvpA gene for major gas vesicle protein
AAGCTTTACACTCTCCGTACTTAGAAGTACGACTCATTACAGGAGACATAACGACTGGTGAAACCATACACATCCTTATGTGATGCCCGAGTATAGTTAGAGATGGGTTAATCCCAGATCACCAATGGCGCAACCAGATTCTTCAGGCTTGGCAGAAGTCCTTGATCGTGTACTAGACAAAGGTGTCGTTGTGGACGTGTGGGCTCGTGTGTCGCTTGTCGGCATCGAAATCCTGACCGTCGAGGCGCGGGTCGTCGCCGCCTCGGTGGACACCTTCCTCCACTACGCAGAAGAAATCGCCAAGATCGAACAAGCCGAACTTACCGCCGGCGCGAGGCGGCACCCGAGGCCTGACGCACAGGCCTCCCTTCGGCCGGCGTAAGGGAGGTGAATCGCTTGCAAACCATACTTTAACACCTTCTCGGGTAC
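A FASTA record is simple enough to parse by hand: a header line starting with ">" followed by one or more sequence lines. The sketch below is illustrative only. `parse_fasta` and `parse_ncbi_header` are hypothetical helper names, and the interpretation of the pipe-delimited fields (gi number, source database, accession.version, locus name) is an assumption based on the example header shown on this slide.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_ncbi_header(header):
    """Split a 'gi|number|db|accession|locus description' style header into fields."""
    ids, _, description = header.partition(" ")
    fields = ids.split("|")
    if len(fields) < 5 or fields[0] != "gi":
        return {"description": header}  # layout not recognised; keep it whole
    return {
        "gi": fields[1],
        "db": fields[2],
        "accession": fields[3],
        "locus": fields[4],
        "description": description,
    }


example = """>gi|43500|emb|Y00534.1|HHGVPA Halobacterium halobium gvpA gene for major gas vesicle protein
AAGCTTTACA
CTCTCCGTAC
"""
header, sequence = parse_fasta(example)[0]
print(parse_ncbi_header(header)["accession"])  # Y00534.1
print(sequence)                                # AAGCTTTACACTCTCCGTAC
```

The example sequence is only the first 20 bases of the record above, split over two lines to show that line breaks inside a sequence are ignored.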
DNA sequence in GenBank format
Nucleotide Sequence Databases
• GenBank – NCBI (National Center for Biotechnology Information)
http://www.ncbi.nlm.nih.gov/
• EMBL (European Molecular Biology Laboratory) – EBI
http://www.ebi.ac.uk/
• DDBJ (DNA Data Bank of Japan) – NIG (National Institute of Genetics)
http://www.ddbj.nig.ac.jp/
INSDC
International Nucleotide Sequence Database Collaboration
When did the collaboration start?
In February 1986, GenBank and EMBL began a collaborative effort (joined by DDBJ in 1987) to devise a common feature table format and common standards for annotation practice.
August 2005
National Center for Biotechnology Information (NCBI)
- Established in 1988
- Part of the National Library of Medicine, NIH, USA
- Creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information
- Host to the GenBank nucleotide sequence database since 1992 (1982 -1992, LANL)
NCBI Nucleotide Databases
• GenBank - DNA sequences collected by the INSDC
• RefSeq - a comprehensive, integrated, non-redundant set of sequences for major research organisms
• dbEST - contains sequence data on "single-pass" cDNA sequences (Expressed Sequence Tags)
• UniGene - a non-redundant set of gene-oriented clusters automatically partitioned from GenBank sequences
• dbSTS - sequence & mapping data on short genomic landmark sequences or Sequence Tagged Sites (PCR primer pairs)
• UniSTS - a comprehensive db of STSs derived from STS-based maps and other experiments
NCBI Nucleotide Databases (continued)
• dbSNP – Single nucleotide polymorphism database
• dbGSS - Genome survey sequence database
• PopSet - a set of DNA sequences collected to analyze the evolutionary relatedness of a population
• TPA - Third party annotation sequences
• Nucleotide - Entrez Nucleotides database of GenBank, RefSeq, and PDB sequences
• Trace Archive – Raw DNA sequence trace files
• HomoloGene – A system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes
• http://www.ncbi.nlm.nih.gov/Database/
NCBI Nucleotide Databases (continued)
• MGC (Mammalian Gene Collection)
cDNA Sequence Related Databases
dbEST
UniGene
TIGR THC
Full-Length cDNA Sequences
What is dbEST?
dbEST (Nature Genetics 4:332-3;1993) is a division of GenBank that contains sequence data and other information on "single-pass" cDNA sequences, or Expressed Sequence Tags, from a number of organisms.
DNA → (transcription) → mRNA → (reverse transcription) → cDNA → (DNA sequencing) → EST
cDNA sequencing is a powerful tool for quick identification of new genes
cDNA Sequence Related Databases
dbEST
UniGene
TIGR THC, Human Gene Index
Full-Length cDNA Sequences
[Diagram: Gene → (transcription) → mRNAs with poly(A) tails → (cDNA cloning) → (cDNA sequencing) → ESTs → (EST clustering) → UniGene]
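The clustering step in this diagram can be illustrated with a toy example. The sketch below groups sequences that share at least one exact k-mer, using a union-find structure. This is an assumption-laden illustration, not the procedure actually used to build UniGene, which the slides do not describe; the function name and the choice of k are made up, and real EST clustering must also cope with sequencing errors and reverse-complement reads.

```python
def cluster_by_shared_kmers(seqs, k=8):
    """Group sequence indices so that sequences sharing any k-mer end up together."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # k-mer -> index of the first sequence that contained it
    for idx, seq in enumerate(seqs):
        for j in range(len(seq) - k + 1):
            kmer = seq[j:j + k]
            if kmer in first_seen:
                parent[find(idx)] = find(first_seen[kmer])  # merge the two clusters
            else:
                first_seen[kmer] = idx

    clusters = {}
    for idx in range(len(seqs)):
        clusters.setdefault(find(idx), []).append(idx)
    return sorted(clusters.values())


# Hypothetical reads: the first two overlap, the third is unrelated.
ests = ["ACGTGAACCTGG", "GAACCTGGTTAC", "TTTTTTCCCCCC"]
print(cluster_by_shared_kmers(ests, k=6))  # [[0, 1], [2]]
```

Each resulting cluster plays the role of one gene-oriented group of ESTs.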
Expression profile
cDNA Sequence Related Databases
dbEST
UniGene
THC (Tentative human consensus sequences) - The Institute for Genomic Research (www.tigr.org)
Full-Length cDNA Sequences
cDNA Sequence Related Databases
dbEST
UniGene
THC (Tentative human consensus sequences) - The Institute for Genomic Research (www.tigr.org)
Full-Length cDNA Sequences
[Diagram: Gene → (transcription) → mRNAs with poly(A) tails → (cDNA cloning) → full-length cDNA clone → (DNA sequencing) → ESTs → (sequence assembly) → full-length cDNA sequence]
http://hinv.ddbj.nig.ac.jp/
Protein Sequence Databases
Origins of Protein Sequences
Derived from:
• DNA fragment sequences
• mRNA sequences
• ESTs
• Genomes
Database name: Full name and/or description
NCBI Protein database: All protein sequences, translated from GenBank and imported from other protein databases
PIR-PSD: Protein Information Resource Protein Sequence Database; has been merged into the UniProt knowledgebase - Georgetown University (GU)
PIR-NREF: PIR's Non-redundant Reference protein database - Georgetown University
PRF: Protein Research Foundation database of peptides: sequences, literature and unnatural amino acids - Japan
Swiss-Prot: Now UniProt/Swiss-Prot; expertly curated protein sequence database, section of the UniProt knowledgebase - Swiss Institute of Bioinformatics (SIB)
TrEMBL: Now UniProt/TrEMBL; computer-annotated translations of EMBL nucleotide sequence entries, section of the UniProt knowledgebase - SIB
UniProt: Universal protein knowledgebase; merged data from Swiss-Prot, TrEMBL and PIR protein sequence databases - GU, SIB, EMBL
UniRef: UniProt non-redundant reference database; clustered sets of related sequences (including splice variants and isoforms) - GU, SIB, EMBL
Protein sequence in FASTA format
Protein sequence in GenPept format – example 1
Protein sequence in GenPept format – example 2
Protein sequence in UniProt/Swiss-Prot format
Other Biological Databases
• Protein-Protein Interaction
• Gene Ontology
• Biological Pathways
• Protein structures
• Orthologs
• Gene expression
• Literature
Protein-Protein Interaction Databases
• Most proteins do not work alone in the cell
• Utilize the concept of ‘guilt by association’ to discover the functions of previously uncharacterized proteins
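The ‘guilt by association’ idea can be written down as a simple neighbor vote on an interaction network. The sketch below is an illustration under stated assumptions, not a method from the slides: the protein names, annotations, and the function `predict_function` are all hypothetical, and the rule used here is a plain majority vote among annotated interaction partners.

```python
from collections import Counter, defaultdict


def predict_function(interactions, annotations, protein):
    """Guess a protein's function from the most common function of its partners."""
    neighbors = defaultdict(set)
    for a, b in interactions:  # interactions are undirected pairs
        neighbors[a].add(b)
        neighbors[b].add(a)
    votes = Counter(
        annotations[partner]
        for partner in neighbors[protein]
        if partner in annotations
    )
    if not votes:
        return None  # no annotated partners, so no prediction
    return votes.most_common(1)[0][0]


# Hypothetical data: X is uncharacterized and interacts with P1, P2, and P3.
interactions = [("P1", "X"), ("P2", "X"), ("P3", "X"), ("P3", "P4")]
annotations = {
    "P1": "membrane fusion",
    "P2": "membrane fusion",
    "P3": "lipid metabolism",
}
print(predict_function(interactions, annotations, "X"))  # membrane fusion
```

Ties between functions are not resolved in any principled way here; a real analysis would weight the evidence, for example by interaction reliability.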
Figure 1: (A) An interaction map of the yeast proteome assembled from published interactions. The map contains 1,548 proteins and 2,358 interactions. Proteins are colored according to their functional role as defined by the Yeast Protein Database [16]; proteins involved in membrane fusion (blue), chromatin structure (gray), cell structure (green), lipid metabolism (yellow), and cytokinesis (red). For other maps with different functional groups highlighted, see <http://depts.washington.edu/sfields/>. On-line maps can also be zoomed and searched for protein names. (B) Section of part A showing the clustering of proteins involved in membrane fusion (blue), lipid metabolism (yellow), and cell structure (green).
Schwikowski et al. 2000. Nat. Biotech.
Protein interaction map of Drosophila melanogaster
7,048 proteins & 20,405 interactions
Giot et al. Science 302:1727-36, 2003
Integrated physical-interaction network. Nodes represent genes and are labeled with their corresponding gene names. Connections between nodes display physical interactions as recorded in the public databases, where a yellow arrow directed from one node to another represents a protein --> DNA interaction, and a blue line between nodes represents a protein-protein interaction. Global changes in mRNA expression (in this case, in response to a deletion of GAL4 in the presence of galactose) are visually superimposed on the network. The grayscale intensity of each node indicates the change in mRNA expression of the corresponding gene, where medium gray represents no change, darker or lighter shades represent an increase or decrease in expression, respectively, and node diameter scales with the overall magnitude of change. GAL4 is colored in red to signify that its expression level has been perturbed by external means. Highly interconnected groups of genes tend to have common biological function and are annotated accordingly (rectangular labels).
Ideker et al. Science 292:929, 2001
• Database of Interacting Proteins (DIP)
(http://dip.doe-mbi.ucla.edu/; UCLA)
• Biomolecular Interaction Network Database (BIND)
(http://bind.ca/; Mount Sinai Hospital, Canada)
• Human Protein Reference Database (HPRD)
(http://www.hprd.org/; Johns Hopkins University and the Institute of Bioinformatics)
• MIPS Mammalian Protein-Protein Interaction Database
(http://mips.gsf.de/proj/ppi/; Munich Information Center for Protein Sequences)
More can be found at http://mips.gsf.de/proj/ppi/
Database of Interacting Proteins (DIP)
The DIP™ database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored within the DIP database are curated both manually, by expert curators, and automatically, using computational approaches that utilize knowledge about protein-protein interaction networks extracted from the most reliable, core subset of the DIP data.
BIND Database
The MIPS Mammalian Protein-Protein Interaction Database (MPPI) is a collection of manually curated, high-quality PPI data gathered from the scientific literature by expert curators. Only data from individually performed experiments are included, since these usually provide the most reliable evidence for physical interactions.
![Page 70: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/70.jpg)
Human Protein Reference Database
A centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome. All the information in HPRD has been manually extracted from the literature by expert biologists who read, interpret and analyze the published data.
![Page 71: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/71.jpg)
Biological Pathway Databases
• KEGG (GenomeNet)
• BioCarta (NCI)
• BioPAX (Biological Pathway Exchange)
* Ingenuity Systems
* GeneGo
![Page 72: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/72.jpg)
![Page 73: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/73.jpg)
![Page 74: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/74.jpg)
![Page 75: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/75.jpg)
![Page 76: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/76.jpg)
ARGININE AND PROLINE METABOLISM
![Page 77: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/77.jpg)
BioCarta Pathways
http://cgap.nci.nih.gov/Pathways/BioCarta_Pathways
http://www.biocarta.com/
![Page 78: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/78.jpg)
![Page 79: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/79.jpg)
![Page 80: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/80.jpg)
![Page 81: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/81.jpg)
![Page 82: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/82.jpg)
http://www.biopax.org/
![Page 83: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/83.jpg)
BioPAX Motivation
A common format will make data more accessible, promoting data sharing and distributed curation efforts
>150 DBs and tools
[Figure: links among Database, Application and User, before BioPAX vs. with BioPAX]
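One way to read the "before vs. with BioPAX" comparison is as a count of format converters. The sketch below is an illustration only: it assumes every resource would otherwise need its own converter to every other resource in each direction, and it uses 150 as a round figure taken from the slide's ">150 DBs and tools". The function names are hypothetical and the slide itself gives no such arithmetic.

```python
def pairwise_converters(n_resources):
    """Converters needed if every resource translates directly to every other.

    Assumption: one converter per ordered pair (A -> B differs from B -> A).
    """
    return n_resources * (n_resources - 1)


def common_format_converters(n_resources):
    """Converters needed with one shared exchange format.

    Assumption: each resource needs one exporter and one importer.
    """
    return 2 * n_resources


if __name__ == "__main__":
    n = 150  # round figure based on the slide's ">150 DBs and tools"
    print("direct, pairwise:", pairwise_converters(n))
    print("via common format:", common_format_converters(n))
```

Under these assumptions the number of converters grows quadratically without a shared format and only linearly with one.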
![Page 84: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/84.jpg)
Ingenuity Systems - Analyze expression/other biological data in pathways/networks
![Page 85: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/85.jpg)
Ingenuity Systems – example
![Page 86: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/86.jpg)
Gene Ontology (GO)
http://www.geneontology.org/
![Page 87: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/87.jpg)
![Page 88: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/88.jpg)
![Page 89: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/89.jpg)
Molecular Pathology and Disease Information Databases
• GeneCards
- Human genes, proteins and diseases db
- http://www.genecards.org/
• OMIM
- Online Mendelian Inheritance in Man
- http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
![Page 90: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/90.jpg)
Human disease database
GeneCards
OMIM
![Page 91: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/91.jpg)
![Page 92: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/92.jpg)
![Page 93: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/93.jpg)
![Page 94: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/94.jpg)
![Page 95: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/95.jpg)
![Page 96: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/96.jpg)
![Page 97: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/97.jpg)
![Page 98: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/98.jpg)
COG/KOG
{COGNITOR/KOGNITOR}
Clusters of Orthologous Groups of proteins (COGs)
![Page 99: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/99.jpg)
![Page 100: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/100.jpg)
![Page 101: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/101.jpg)
SAGE Database
Serial Analysis of Gene Expression
![Page 102: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/102.jpg)
![Page 103: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/103.jpg)
![Page 104: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/104.jpg)
![Page 105: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/105.jpg)
GEO (Gene Expression Omnibus)
http://www.ncbi.nlm.nih.gov/geo/
![Page 106: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/106.jpg)
• GPL: Platform descriptions (submitted by manufacturer*)
• GSM: Raw/processed spot intensities from a single slide/chip (submitted by experimentalists)
• GSE: Grouping of slide/chip data, "a single experiment" (submitted by experimentalists)
• GDS: Grouping of experiments (curated by NCBI)
Entrez GEO
Entrez GEO Datasets
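The four GEO accession prefixes on this slide (GPL, GSM, GSE, GDS) can be told apart mechanically. The following is a minimal sketch, not part of the original slides: the function name is hypothetical, the short labels "platform", "sample", "series" and "dataset" are GEO's usual names for these record types rather than wording from the slide, and the accession used in the usage note is made up for illustration.

```python
import re

# Accession prefix -> record type. The prefixes come from the slide;
# the short labels are GEO's usual names, not wording from the slide.
GEO_RECORD_TYPES = {
    "GPL": "platform",  # platform descriptions
    "GSM": "sample",    # spot intensities from a single slide/chip
    "GSE": "series",    # grouping of slide/chip data, "a single experiment"
    "GDS": "dataset",   # grouping of experiments, curated by NCBI
}

# A GEO accession is one of the four prefixes followed by digits.
_ACCESSION_PATTERN = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_geo_accession(accession):
    """Return (record_type, number) for a GEO accession string.

    Raises ValueError if the string is not a GPL/GSM/GSE/GDS accession.
    """
    match = _ACCESSION_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GEO accession: {accession!r}")
    prefix, digits = match.groups()
    return GEO_RECORD_TYPES[prefix], int(digits)
```

For example, `classify_geo_accession("gse1000")` returns `("series", 1000)`, while a string such as `"Hs.194143"` raises `ValueError` because it is not a GEO accession.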
![Page 107: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/107.jpg)
Submit and update data
Query the database:
• gene identifiers
• field information
• sequence
Browse datasets
Download data
Redesigned with new features
![Page 108: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/108.jpg)
From UniGene: Hs.194143
![Page 109: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/109.jpg)
![Page 110: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/110.jpg)
Sequence and literature Search/Retrieval
• Entrez
• SRS
• ftp
![Page 111: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/111.jpg)
Major sequence databases accessible through the Internet
1. GenBank - National Center for Biotechnology Information (NCBI), USA http://www.ncbi.nih.gov/Entrez/
2. European Molecular Biology Laboratory (EMBL) - European Bioinformatics Institute http://www.ebi.ac.uk/embl/index.html
3. DNA Data Bank of Japan (DDBJ) - Mishima, Japan http://www.ddbj.nig.ac.jp/
4. Protein Information Resource (PIR) - National Biomedical Research Foundation (NBRF), USA http://www-nbrf.georgetown.edu/pirwww/
5. Swiss-Prot - Swiss Institute of Bioinformatics (SIB) http://www.expasy.org/cgi-bin/sprot-search-de
6. Sequence Retrieval System (SRS) - European Bioinformatics Institute http://srs6.ebi.ac.uk
![Page 112: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/112.jpg)
![Page 113: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/113.jpg)
![Page 114: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/114.jpg)
![Page 115: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/115.jpg)
Protein Structure Databases
• PDB (Protein Data Bank)
http://www.rcsb.org/pdb/
• Entrez Structure (NCBI)
![Page 116: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/116.jpg)
![Page 117: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/117.jpg)
![Page 118: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/118.jpg)
ftp ftp.ncbi.nih.gov
ftp ftp.expasy.org
ftp ftp.ebi.ac.uk
ftp ftp.ddbj.nig.ac.jp
Retrieve complete sets of data
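Bulk retrieval from the FTP servers listed above can also be scripted. The sketch below uses Python's standard `ftplib` and is not part of the original slides: the function names and the `ftp_factory` parameter are hypothetical, anonymous login is assumed to be permitted, and the host names from 2006 may have changed since.

```python
from ftplib import FTP


def list_ftp_directory(host, path="/", ftp_factory=FTP, timeout=30):
    """Log in anonymously and return the entry names under `path`.

    `ftp_factory` exists only so the function can be exercised without
    a network connection; by default it is ftplib.FTP.
    """
    ftp = ftp_factory(host, timeout=timeout)
    try:
        ftp.login()            # no arguments means anonymous login
        return ftp.nlst(path)  # plain list of names in the directory
    finally:
        ftp.quit()


def download_ftp_file(host, remote_path, local_path,
                      ftp_factory=FTP, timeout=30):
    """Download one remote file in binary mode to `local_path`."""
    ftp = ftp_factory(host, timeout=timeout)
    try:
        ftp.login()
        with open(local_path, "wb") as handle:
            # retrbinary passes each received block to handle.write
            ftp.retrbinary(f"RETR {remote_path}", handle.write)
    finally:
        ftp.quit()
```

A call such as `list_ftp_directory("ftp.ncbi.nih.gov", "/")` would list the top-level directories of that server, provided it still accepts anonymous FTP connections.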
![Page 119: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/119.jpg)
Retrieve Raw Sequencing Data from
NCBI Trace Archive Database
![Page 120: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/120.jpg)
![Page 121: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/121.jpg)
![Page 122: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/122.jpg)
Literature Searches
Entrez PubMed (NCBI)
Entrez PubMed Central (NCBI)
SRS (EMBL-EBI)
GoPubMed (ontology-based literature search)
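PubMed searches can also be issued programmatically. The sketch below builds a query URL for NCBI's E-utilities `esearch` service, which the slides do not mention; it is included here as an assumption about how one might script an Entrez search. The function name and default values are hypothetical, and the code only constructs the URL without contacting NCBI.

```python
from urllib.parse import urlencode

# Base address of NCBI E-utilities (an addition, not from the slides).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="pubmed", retmax=20):
    """Build an esearch URL that searches Entrez database `db` for `term`.

    `retmax` caps the number of record identifiers returned.
    """
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode escapes spaces and special characters in the query term
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("gene ontology"))
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) returns an XML list of matching record identifiers; passing `db="pmc"` targets PubMed Central instead.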
![Page 123: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/123.jpg)
![Page 124: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/124.jpg)
![Page 125: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/125.jpg)
![Page 126: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/126.jpg)
More biological databases are listed on the Bioinformatics Web site:
http://www.geocities.com/bioinformaticsweb/datalink.html
![Page 127: Biological Databases November 30, 2006 Wailap V. Ng Institute of Biotechnology in Medicine Institute of Bioinformatics National Yang Ming University wvng@ym.edu.tw.](https://reader038.fdocuments.in/reader038/viewer/2022110322/56649d3e5503460f94a172ad/html5/thumbnails/127.jpg)