INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE Review
WISDOM demonstration
Vincent Bloch, Vincent Breton, Matteo Diarena, Jean Salzemann. LPC Clermont-Ferrand IN2P3/CNRS
Hurng-Chun Lee CERN
J. SALZEMANN, 19/04/07 2
Wide In Silico Docking On Malaria
• Biomedical goal
Find new drugs and/or at least improve the drug discovery process.
• Bioinformatics solution
Use the Grid to analyze potential drugs at a large scale
High Throughput Virtual Docking
• Compounds
– ZINC: 4.3 M
– ChemBridge: 500,000
• Targets
– 3D structures in PDB
• Millions of chemical compounds available
• High Throughput Screening: 1-10 $/compound, several hours
• Molecular docking (FlexX, AutoDock): 20 cents/compound, 1 minute
• Data challenge on EGEE: ~2 months on ~2000 computers
• Hits
– Screening using assays performed on living cells
• Leads
• Clinical testing
• Drug
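The per-compound docking time quoted above can be turned into a rough CPU budget. The sketch below is a back-of-the-envelope estimate only: it assumes a single docking pass over the ZINC set against one target, ideal parallel scaling, and no scheduling overhead or job failures. The function names are illustrative and not part of WISDOM.

```python
# Rough CPU budget from the slide's figures (1 minute per docked compound).
# Assumptions: one docking pass, one target, perfect scaling, no failures.

MINUTES_PER_DAY = 60 * 24
MINUTES_PER_YEAR = MINUTES_PER_DAY * 365


def docking_cpu_years(n_compounds, minutes_per_compound=1.0):
    """Total CPU time, in CPU-years, to dock every compound once."""
    return n_compounds * minutes_per_compound / MINUTES_PER_YEAR


def wall_clock_days(n_compounds, cpus, minutes_per_compound=1.0):
    """Idealized elapsed time, in days, when the work is spread over `cpus`."""
    return n_compounds * minutes_per_compound / cpus / MINUTES_PER_DAY


zinc = 4_300_000  # ZINC compounds, as quoted on the slide
print(f"{docking_cpu_years(zinc):.2f} CPU-years per pass")
print(f"{wall_clock_days(zinc, 2000):.2f} days on 2000 CPUs")
```

Under these assumptions a single pass costs about 8 CPU-years, or roughly a day and a half on 2000 CPUs. The gap between this idealized figure and the quoted ~2 months reflects everything the sketch leaves out.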
Objective of the WISDOM development
• Objective
– Dock a whole compound database in a limited time, with minimal human involvement during the data challenge.
• Need an optimized environment
– Production in limited time
– Performance is important
• Need a fault-tolerant environment
– Grid is heterogeneous and dynamic
– Data produced are important and cannot be easily reproduced
• Need an automatic production environment
– Ease the execution
– User-friendly high-level services
Grid Added Value
• Large number of CPUs available
• Reliable and secure Data Management Services
– Sharing of results
– Replication of the data
– ACLs
• Availability of the resources
Statistics of deployment
• 1st DC
– 80 CPU years
– 1 TB
– 1700 CPUs used in parallel
– July 1st - August 15th 2005
• 2nd DC
– 100 CPU years
– 800 GB
– 1700 CPUs used in parallel
– May 1st - April 15th 2006
• 3rd DC
– 400 CPU years
– 1.6 TB
– Up to 5000 CPUs in parallel
– October 1st - December 15th 2006
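The CPU-year totals and date ranges above imply an average number of CPUs kept busy over each data challenge, which can be compared with the quoted peak parallelism. The sketch below works this out for the first and third data challenges. It assumes the quoted dates bound the whole production period and that a CPU year is 365 CPU days; the helper name is illustrative.

```python
from datetime import date


def average_busy_cpus(cpu_years, start, end):
    """Average number of CPUs busy between start and end (365-day CPU year)."""
    elapsed_days = (end - start).days
    return cpu_years * 365 / elapsed_days


# Figures as quoted on the slide.
dc1 = average_busy_cpus(80, date(2005, 7, 1), date(2005, 8, 15))
dc3 = average_busy_cpus(400, date(2006, 10, 1), date(2006, 12, 15))

print(f"1st DC: ~{dc1:.0f} CPUs busy on average (1700 quoted in parallel)")
print(f"3rd DC: ~{dc3:.0f} CPUs busy on average (up to 5000 quoted in parallel)")
```

With these inputs the averages come out at roughly 650 and 1950 CPUs, well below the quoted figures for CPUs used in parallel, as expected if those figures describe peak rather than sustained usage.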
Production Environment
Grid Statistics Portal
• Real-Time monitoring of the Grid
• Customizable interface
• Drag and drop components
Interactive Web Portal
• User-friendly interface for biologists
• Real-time output of the results
– 3D views of the docking poses and structures
• Resubmission of docking jobs
Conclusion
• Take advantage of the EGEE services, APIs and resources.
• Use of AMGA to store results and statistics immediately.
• Interoperable Web Service interface
– WSDL following the WS-I profile
• Improved flexibility to deploy other bioinformatics applications.