Class 3 2009
![Page 1: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/1.jpg)
Class 3 2009
European Resources
Protein Focused
![Page 2: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/2.jpg)
Protein Databases
EBI – European Bioinformatics Institute
http://www.ebi.ac.uk/
![Page 3: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/3.jpg)
What is the difference between dealing
with nucleotide DBs and protein DBs?
![Page 4: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/4.jpg)
Protein information
• Name & description
• Gene it is encoded by
• Organism
• Function (only one?)
• Enzyme?
• Ligands?
• PTMs?
• Interactions?
• Biological processes.
• Structure.
• Sequence.
• Localization
• More...
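The bullets on this slide can be read as the fields of a protein record. Below is a minimal sketch of such a record, assuming a hypothetical `ProteinRecord` class; the field names and the example values are invented for illustration and do not come from any real database schema. Fields that the slide marks with a question mark are modelled as optional or as lists, since a protein may have none, one, or several.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProteinRecord:
    """Hypothetical container mirroring the bullets on the slide."""
    name: str
    description: str
    gene: str                      # gene the protein is encoded by
    organism: str
    sequence: str                  # amino-acid sequence, one-letter code
    functions: List[str] = field(default_factory=list)   # "only one?" -> a list
    is_enzyme: Optional[bool] = None                      # unknown until annotated
    ligands: List[str] = field(default_factory=list)
    ptms: List[str] = field(default_factory=list)         # post-translational modifications
    interactions: List[str] = field(default_factory=list)
    biological_processes: List[str] = field(default_factory=list)
    structure_ids: List[str] = field(default_factory=list)
    localization: List[str] = field(default_factory=list)


# Invented example values, for illustration only.
record = ProteinRecord(
    name="Example kinase",
    description="Hypothetical protein used as a placeholder",
    gene="EXK1",
    organism="Homo sapiens",
    sequence="MKVLAAGIVG",
    functions=["kinase activity", "scaffolding"],
    is_enzyme=True,
)
print(record.name, len(record.sequence), record.functions)
```

Compared with a nucleotide record, most of these fields describe what the molecule does rather than where it sits in a genome, which is one way to approach the question asked on the previous slide.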
![Page 5: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/5.jpg)
Protein DBs - short history
Pre-UniProt
Swiss-Prot: created in July 1986; since 1987, a collaboration of the SIB and the EMBL/EBI.
TrEMBL: created at the EBI in 1996 as a computer-annotated protein sequence database supplementing Swiss-Prot.
It was introduced to deal with the increased data flow from genome projects.
![Page 6: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/6.jpg)
PIR
EBI
SIB
![Page 7: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/7.jpg)
The three-layered approach
The UniProt Archive (UniParc)
• UniProtKB + all other protein sequences publicly available
• Completeness
The UniProt Reference Clusters (UniRef)
• Non-redundant views of UniProtKB + selected UniParc sets
• Speed
The UniProt Knowledgebase (UniProtKB)
• Central database of annotated protein sequences and functional information
• UniProtKB/Swiss-Prot + UniProtKB/TrEMBL
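To see why a clustered layer such as UniRef gives "non-redundant views" and speed, here is a toy greedy clustering sketch. It is not the procedure UniRef uses: the `identity` function below is a stand-in based on Python's `difflib` rather than a real sequence alignment, and the function names, sequences and thresholds are assumptions chosen for illustration. Each sequence joins the first cluster whose representative is similar enough, otherwise it starts a new cluster.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude similarity in [0, 1]; a stand-in for alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs, threshold):
    """Cluster sequences greedily.

    seqs: dict mapping an identifier to its sequence.
    Returns a dict mapping each representative id to its member ids.
    """
    clusters = {}
    # Longest sequences first, so representatives tend to be the longest members.
    for seq_id in sorted(seqs, key=lambda i: len(seqs[i]), reverse=True):
        for rep_id in clusters:
            if identity(seqs[rep_id], seqs[seq_id]) >= threshold:
                clusters[rep_id].append(seq_id)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[seq_id] = [seq_id]
    return clusters


toy = {
    "A": "MKVLAAGIVG",
    "B": "MKVLAAGIVG",   # identical to A
    "C": "MKVLAAGIVA",   # one substitution relative to A
    "D": "WWWWWWWWWW",   # unrelated
}
print(greedy_cluster(toy, threshold=1.0))
print(greedy_cluster(toy, threshold=0.85))
```

A search then only has to scan one representative per cluster, and lowering the threshold trades resolution for fewer sequences to scan.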
![Page 8: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/8.jpg)
Protein DBs
• Swiss-Prot – manually annotated.
• TrEMBL – translated EMBL, automatically annotated.
• UniProtKB – The UniProt Knowledgebase
• UniParc – The UniProt Archive
• PIR – Protein Information Resource
• UniRef – The UniProt Reference Clusters
• PDB – Protein Data Bank – structure
• PRIDE – Resource for experimental proteomics (not in this class)
![Page 9: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/9.jpg)
Databases growth
www.genome.jp/en/db_growth.html
![Page 10: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/10.jpg)
Protein DBs
• Swiss-Prot – manually annotated
2005: ~100,000; 2009: ~400,000
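As a quick piece of arithmetic on the two counts above, the sketch below computes the compound annual growth rate they imply. The function name is an assumption, and the inputs are just the rounded figures from the slide, so the result is only a rough estimate.

```python
def annual_growth_rate(start_count, end_count, years):
    """Compound annual growth rate implied by two counts `years` apart."""
    return (end_count / start_count) ** (1 / years) - 1


# Rounded figures from the slide: ~100,000 in 2005, ~400,000 in 2009.
rate = annual_growth_rate(100_000, 400_000, 2009 - 2005)
print(f"{rate:.1%} per year")
```

A fourfold increase over four years corresponds to the count doubling roughly every two years.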
![Page 11: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/11.jpg)
• TrEMBL – translated EMBL, automatically annotated.
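"Translated EMBL" means the protein sequences are derived by translating coding sequences from the nucleotide database. A minimal sketch of that translation step follows, assuming the standard genetic code and an input that is already an in-frame coding sequence; the function name is hypothetical, and a real pipeline also has to handle alternative genetic codes, ambiguous bases and annotation, none of which is shown here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    cds = cds.upper().replace("U", "T")   # accept RNA input as well
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")   # X for codons with unknown bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTGA"))
```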
![Page 12: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/12.jpg)
![Page 13: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/13.jpg)
Protein Names
Different DBs – different accessions

| DB | Accessions |
| --- | --- |
| TrEMBL | P12345 |
| Swiss-Prot (to be changed..) | MAPK_HUMAN |
| RefSeq | NP_123456, XP_123456 |
| UniRef | UniRef100_P99999, UniRef90_P99999, UniRef50_P99999 |
| Ensembl | ENSP00000123456 |
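Because each database uses its own identifier style, the source of an identifier can often be guessed from its shape alone. The sketch below does this with regular expressions. The patterns are approximations written for this example, not official validators, and the labels and function name are assumptions. Note that a `P12345`-style string is a UniProtKB accession (the format shared by Swiss-Prot and TrEMBL), while a `MAPK_HUMAN`-style string is a UniProtKB entry name.

```python
import re

# Approximate patterns, checked in order; the first full match wins.
PATTERNS = [
    ("UniRef cluster", re.compile(r"UniRef(100|90|50)_\w+")),
    ("RefSeq protein", re.compile(r"(NP|XP|YP|WP|AP)_\d+(\.\d+)?")),
    ("Ensembl protein", re.compile(r"ENS[A-Z]*P\d{11}(\.\d+)?")),
    ("UniProtKB accession", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("UniProtKB entry name", re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")),
]


def classify_identifier(identifier):
    """Return a best-guess label for a protein identifier, or 'unknown'."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


for example in ["P12345", "MAPK_HUMAN", "NP_123456", "XP_123456",
                "UniRef90_P99999", "ENSP00000123456"]:
    print(example, "->", classify_identifier(example))
```

A shape check like this only says which database an identifier looks like it belongs to; it cannot confirm that the entry exists.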
![Page 14: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/14.jpg)
Protein DBs
• Swiss-Prot – manually annotated.
• TrEMBL – translated EMBL, automatically annotated.
• UniProtKB – The UniProt Knowledgebase
• UniParc – The UniProt Archive
• PIR – Protein Information Resource
• UniRef – The UniProt Reference Clusters
• PDB – Protein Data Bank – structure
• PRIDE – Resource for experimental proteomics (not in this class)
![Page 15: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/15.jpg)
Principles
![Page 16: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/16.jpg)
More in UniProt: a complete annotated protein sequence database
• UniProt – The Universal Protein Resource for protein sequences.
• UniProt Archive – A non-redundant archive of protein sequences extracted from public databases; it contains only protein sequences.
• UniProt/UniRef – Clusters similar sequences to yield a representative subset of sequences, which makes searches much faster.
• UniProt/UniMES – A repository specifically developed for metagenomic and environmental data.
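The idea of a non-redundant archive can be sketched in a few lines: every distinct sequence is stored once, and each database that contains it is recorded as a cross-reference. This is only a toy model. The function name, the record layout and the use of an MD5 checksum as the key are assumptions made for illustration, and the example identifiers and sequences are invented.

```python
import hashlib


def build_archive(records):
    """Collapse identical sequences into single archive entries.

    records: iterable of (source_db, source_id, sequence) tuples.
    Returns a dict keyed by sequence checksum; each value holds the
    sequence once plus the list of source cross-references.
    """
    archive = {}
    for source_db, source_id, sequence in records:
        seq = sequence.upper()
        # Illustrative key; a production system would also guard against collisions.
        key = hashlib.md5(seq.encode("ascii")).hexdigest()
        entry = archive.setdefault(key, {"sequence": seq, "xrefs": []})
        entry["xrefs"].append((source_db, source_id))
    return archive


# Invented records: the first two carry the same sequence.
records = [
    ("UniProtKB", "P12345", "MKVLAAGIVG"),
    ("RefSeq", "NP_123456", "mkvlaagivg"),
    ("Ensembl", "ENSP00000123456", "MKVLAAGIVGL"),
]
archive = build_archive(records)
for entry in archive.values():
    print(entry["sequence"], entry["xrefs"])
```

Three input records become two archive entries, and the shared sequence keeps both of its source identifiers.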
![Page 17: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/17.jpg)
Protein DBs
• Swiss-Prot – manually annotated.
• TrEMBL – translated EMBL, automatically annotated.
• UniProtKB – The UniProt Knowledgebase
• UniParc – The UniProt Archive
• PIR – Protein Information Resource
• UniRef – The UniProt Reference Clusters
• PDB – Protein Data Bank – structure
• PRIDE – Resource for experimental proteomics (not in this class)
![Page 18: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/18.jpg)
How is it built?
![Page 19: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/19.jpg)
![Page 20: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/20.jpg)
http://beta.uniprot.org/
What’s in UniProt?
![Page 21: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/21.jpg)
EBI interface
![Page 22: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/22.jpg)
PIR – Protein Information Resource
Protein Family Classification System
Integrated Protein Knowledgebase
Integrated Protein Literature, Information and Knowledge
![Page 23: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/23.jpg)
END
If you got lost… (class exercise)
some more slides…
![Page 24: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/24.jpg)
EB-eye search
![Page 25: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/25.jpg)
EB-eye search
![Page 26: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/26.jpg)
![Page 27: Class 3 2009](https://reader036.fdocuments.in/reader036/viewer/2022062408/5681446f550346895db100dd/html5/thumbnails/27.jpg)
NCBI - Entrez