A Prlic - BioJava update
Transcript of A Prlic - BioJava update
How to use BioJava to calculate one billion protein structure alignments at the RCSB PDB website
Andreas Prlić
My Two Hats
RCSB PDB
BioJava
www.pdb.org
Overview
[Figure: number of released entries per year]
Some of the things you can do at the RCSB PDB site
• Advanced queries
• Custom reports
• Visualization
• Education section
• Comparisons across PDB, based on sequence and 3D structure similarities
[Screenshots: Jmol, LigandExplorer, custom report]
Systematic Structural Alignment
Objective: Find novel relationships
Example: Green Fluorescent Protein
• Nidogen-1: similar 11-stranded beta-barrel and internal helices
• 3 Å RMSD, only 9% sequence identity
• Nidogen-1: component of basement membrane, no chromophore
• GFP and NID-1 may share common ancestor
Open Science Grid
Based on the FATCAT (rigid) algorithm:
Yuzhen Ye & Adam Godzik. Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics 19 (suppl. 2): ii246-ii255, 2003.
Systematic comparisons of representative chains from 40% sequence identity clusters
• 22000 sequence clusters
• 33000 representative domains
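To get a feel for the scale of a systematic comparison, the number of unordered pairs among n representatives is n(n-1)/2. The sketch below computes that count for the two figures on this slide. It assumes an all-vs-all comparison in which each pair is aligned once; the slides do not spell out exactly which pairs are compared, and the class and method names are illustrative only.

```java
public class PairCount {
    /** Number of unordered pairs among n items: n * (n - 1) / 2. */
    public static long unorderedPairs(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        // long arithmetic: 33000 * 32999 already overflows a 32-bit int
        return n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println("22000 clusters -> " + unorderedPairs(22_000) + " pairs");
        System.out.println("33000 domains  -> " + unorderedPairs(33_000) + " pairs");
    }
}
```

Either figure yields hundreds of millions of pairs, which is why the work has to be distributed rather than run on a single machine.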
PDB
Custom Job Management
• Sends out instructions to clients
• Writes results to disk
Open Science Grid
• Java clients can run anywhere
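The arrangement on this slide (a server that hands out instructions and collects results, and clients that can run anywhere) can be sketched as a simple batch queue. This is a minimal single-process illustration, not the RCSB PDB implementation: the class `JobServer`, its methods, the batch size, and the in-memory result list standing in for the disk are all assumptions made for the example, and the client loop writes a placeholder instead of running a real alignment.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class JobServer {
    // Batches of pairs still waiting to be handed to a client.
    private final Queue<List<int[]>> pending = new ArrayDeque<>();
    // Stand-in for the results file on disk (hypothetical simplification).
    private final List<String> results = new ArrayList<>();

    /** Splits all unordered pairs (i, j), i < j, among n representatives into batches. */
    public JobServer(int n, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<int[]> batch = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                batch.add(new int[] {i, j});
                if (batch.size() == batchSize) {
                    pending.add(batch);
                    batch = new ArrayList<>();
                }
            }
        }
        if (!batch.isEmpty()) {
            pending.add(batch);
        }
    }

    /** Called by a client asking for instructions; returns null when no work is left. */
    public synchronized List<int[]> nextBatch() {
        return pending.poll();
    }

    /** Called by a client returning one result line per aligned pair. */
    public synchronized void submit(List<String> lines) {
        results.addAll(lines);
    }

    public synchronized int resultCount() {
        return results.size();
    }

    public static void main(String[] args) {
        JobServer server = new JobServer(5, 4); // 5 representatives, 10 pairs
        List<int[]> work;
        while ((work = server.nextBatch()) != null) {
            List<String> lines = new ArrayList<>();
            for (int[] pair : work) {
                // A real client would run the structure alignment here.
                lines.add(pair[0] + "\t" + pair[1] + "\tplaceholder");
            }
            server.submit(lines);
        }
        System.out.println(server.resultCount() + " results collected");
    }
}
```

Handing out batches rather than single pairs keeps the number of server round trips small; a production setup would also need to reissue batches whose client never reports back, which this sketch leaves out.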
Initial calculation of frozen snapshot of PDB: ~170k CPU hours on OSG
Incremental weekly updates (~1-2 million alignments): <1000 CPU hours
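The CPU-hour figures can be turned into a rough average cost per alignment. The sketch below does that arithmetic; it assumes that the ~170k CPU hours of the initial run correspond to roughly one billion alignments (the slides give both numbers but do not explicitly tie them together), and the class and method names are illustrative.

```java
public class CpuCost {
    /** Average CPU seconds per alignment, given total CPU hours and number of alignments. */
    public static double secondsPerAlignment(double cpuHours, double alignments) {
        if (alignments <= 0) {
            throw new IllegalArgumentException("alignments must be positive");
        }
        return cpuHours * 3600.0 / alignments;
    }

    public static void main(String[] args) {
        // Assumption: ~170k CPU hours produced on the order of 1e9 alignments.
        System.out.printf("initial run:   ~%.2f s per alignment%n",
                secondsPerAlignment(170_000, 1e9));
        // Weekly update: fewer than 1000 CPU hours for 1-2 million alignments,
        // so these two values are upper bounds.
        System.out.printf("weekly update: <%.1f s (1 million) or <%.1f s (2 million)%n",
                secondsPerAlignment(1_000, 1e6), secondsPerAlignment(1_000, 2e6));
    }
}
```

Under those assumptions the initial run averages roughly 0.6 CPU seconds per alignment, and the weekly budget allows at most a few CPU seconds per alignment.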
Code: www.biojava.org
1 billion alignments freely available at www.rcsb.org
BioJava
• Major rewrite - BioJava 3
BioJava 1 BioJava 3
core data model
symbols/alphabets, counts, distributions
Genome/sequencing
Multiple sequence alignment
Structure alignment
Modfinder
AA Properties
Protein Disorder
HMMER3 web service
NCBI web service
Parsers: Genbank/Embl/Blast
Acknowledgments
• Spencer Bliven
• Peter Rose
• Phil Bourne
• all contributors
• A. Yates, J. Jacobsen, P. Troshin, M. Chapman, J. Gao, C.H. Koh, S. Foisy, R. Holland, G. Rimsa, M. Heuer, H. Brandstaetter-Mueller, S. Willis
RCSB PDB BioJava
Funding
• RCSB PDB
• Google Summer of Code
• Open Science Grid