National Center for Supercomputing Applications
Cloud Resources in Production Cyberenvironments for E-Science Virtual Organizations
GridChem/ParamChem Interoperability
NSF Cloud Computing Workshop
Arlington, VA, 17-18 Mar 2011
Sudhakar Pamidighantam, NCSA, University of Illinois at
Acknowledgments
• Jayeeta Ghosh, NCSA, ParamChem
• Suresh Marru, Indiana U., OGCE
• Ye Fan, Indiana U., OGCE
• Kenno Vanommeslaeghe, U. Maryland, ParamChem
• Narendra Polani, UKy, Middleware/ParamChem
• Michael Sheetz, UKy, Application Interfaces/ParamChem
• Vikram Gazula, UKy, Server Administration
• Tom Roney, NCSA, Server and Database Maintenance
• Nikhil Singh, NCSA, ParamChem
• Liu Yang, NCSA, GridChem
• Scott Brozell, OSC, Applications and Testing
• Rion Dooley, TACC, Middleware Infrastructure
• Stelios Kyriacou, OSC, Middleware Scripts
• Chona Guiang, TACC, Databases and Applications
• Kent Milfeld, TACC, Database Integration
• Kailash Kotwani, NCSA, Applications and Middleware
Outline
• Historical Background: Grid Computational Chemistry
• Production Environments
• Current Status Web Services
• Usage: Grid and Science Achievements
• Cloud in Hybrid Environments
• Interoperability
• Future
Motivation
Integrating Services for E-Science and Engineering in Research, Education and Training
Software - Reasonably mature and easy to use to address chemists' questions of interest
Community of Users - Need the software and are capable of using it; some are non-traditional computational chemists
Resources - Various in capacity and capability; distributed and heterogeneous
NSF Petascale Road Map
• Track 1 Scheme: Multi-petaflop single-site system to be deployed by 2011 at NCSA (Blue Waters) http://www.ncsa.illinois.edu/BlueWaters/
• Track 2: Sub-petaflop systems, several to be deployed until Track 1 is online

System                      Architecture  Cores
Dell PowerEdge (NCSA)       EM64T         9600
SGI Altix (PSC)             IA64          768
SGI UV-Ice (NCSA)           EM64T         1568
IBM Power4 Cluster (NCSA)   Pwr4          48
IBM PowerPC (Indiana)       Pwr4          1536
Sun Constellation (TACC)    EM64T         50000

Additional systems to be online soon (currently being allocated):
SGI UV-Ice (PSC)            EM64T         4096
FutureGrid                  Diverse       on demand
Grids and New Opportunities
Alliance to TeraGrid
A homogeneous grid with a predefined, fixed software and system stack was planned (TeraGrid), but it was difficult to keep it homogeneous.
Local preferences and diversity now lead to heterogeneous grids (operating systems, schedulers, policies, software and services).
Openness and standards that lead to interoperability are critical for successful services.
Diagram labels: Grid Hardware; Middleware; Scientific Applications; Interfaces
User Community
Chemistry and Computational Biology

Allocations as of Oct 04
        NRAC       AAB        Small Allocations
#PIs    26         23         64
#SUs    5,953,100  1,374,100  640,000

TeraGrid Allocations in 2010
Discipline                  # PIs  Initial Alloc. SUs
Physics                     125    920,254,700
Molecular Biosciences       308    689,733,465
Chemistry                   264    255,479,494
Chemical, Thermal Systems   143    232,905,769
Materials Research          207    210,602,367

2101 Users using Chemistry Software
230 ASC 30 AST 18 ATM 8 BCS 30 CCR 28 CDA 653 CHE 11 CTS
1 DBS 2 DEB 805 DMR 10 DMS 18 EAR 1 ECS 23 IBN 2 IRI
153 MCB 10 MSS 3 NCR 4 OCE 37 PHY 6 SEE 5 SES 3 STA
Computational Chemistry Grid
This is a Virtual Organization
Integrated Cyberinfrastructure for Computational Chemistry
Integrates Applications, Middleware, HPC resources, Scheduling and Data management
Allocations, User services and Training
Other Resources
Extant HPC resources at various Supercomputer Centers, Cloud resources (Interoperable)
Optionally Other Grids and Hubs/local/personal resources
These may require existing allocations/Authorization
GridChem System
Diagram labels: users; Portal Client; Grid Middleware Proxy Server; Grid Services; Grid; applications; Mass Storage
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0438312
Applications
• GridChem supports some apps already
  – Gaussian, GAMESS, NWChem, Aces3, Molpro, ADF, Quild, QMCPack, Castep, DMol3, Amber, Charmm
• Schedule of integration of additional software
  – Crystal
  – Q-Chem
  – Wien2K
  – MCCCS Towhee
  – Others...
Workflows
GridChem Resources Monitoring
http://portal.gridchem.org:8080/gridsphere/gridsphere?cid=home
Application Software Resources
Currently Supported

Suite        Version    Location
Gaussian 03  C.02/D.01  Many Platforms
MolPro       2006.1     NCSA
NWChem       5.0/4.7    Many Platforms
Gamess       Jan 06     Many Platforms
Amber        8.0        Many Platforms
QMCPack      2.0        NCSA
GridChem Software Resources
New Applications Integration Underway
• ADF: Amsterdam Density Functional Theory
• Wien2K: Linearized Augmented Plane Wave (DFT)
• CPMD: Car-Parrinello Molecular Dynamics
• QChem: Molecular Energetics (Quantum Chemistry)
• Aces3: Parallel Coupled Cluster Quantum Chemistry
• Gromacs: Nano/Bio Simulations (Molecular Dynamics)
• NAMD: Molecular Dynamics
• DMol3: Periodic Molecular Systems (Quantum Chemistry)
• Castep: Quantum Chemistry
• MCCCS-Towhee: Molecular Conformation Sampling (Monte Carlo)
• Crystal98/06: Crystal Optimizations (Quantum Chemistry)
• …
GridChem User Services
• Allocation
  https://www.gridchem.org/allocations/index.shtml
  Community and External Registration Reviews, PI Registration and Access Creation, Community User Norms Established
• Consulting/User Services
  https://www.gridchem.org/consult
  Ticket tracking, Allocation Management
• Documentation, Training and Outreach
  https://www.gridchem.org/doc_train/index.shtml
  FAQ Extraction, Tutorials, Dissemination
Help is integrated into the GridChem client
Users and Usage
• 433 users under 221 projects
• Includes academic PIs, two graduate classes and about 15 training users
• More than 2,000,000 CPU wall hours
• More than 35,500 jobs processed
• 5 dissertations, more than 50 publications
User Research
Diversity of User Research
• NH3 on Si Surfaces
• CytP450 Catalysis
• Zeolite Chemistry
• Phosphinoborane pericyclics
• Semiquinone reactions
• Si Surface IR
• Disulfide cleavage by P-
• Thiolate –SS interchange
• V in photocatalysts
• PES of diphenylbutadienes
• FTIR of Heptanedione on Si
Science Enabled
• Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2 × 1. Semyon Bocharov et al. J. Am. Chem. Soc., 128 (29), 9300-9301, 2006.
• Chemistry of Diffusion Barrier Film Formation: Adsorption and Dissociation of Tetrakis(dimethylamino)titanium on Si(100)-2 × 1. Rodriguez-Reyes, J. C. F.; Teplyakov, A. V. J. Phys. Chem. C 2007, 111(12), 4800-4808.
• Computational Studies of [2+2] and [4+2] Pericyclic Reactions between Phosphinoboranes and Alkenes. Steric and Electronic Effects in Identifying a Reactive Phosphinoborane that Should Avoid Dimerization. Thomas M. Gilbert and Steven M. Bachrach. Organometallics, 26 (10), 2672-2678, 2007.
Science Enabled
• Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D. et al. J. Am. Chem. Soc. 2005, 127, 3140-3155.
• The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(5), 1474-1488.
• The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(14), 4598.
System Wide Usage

HPC System         Usage (SUs)
Tungsten (NCSA)    5507
Copper (NCSA)      86484
CCGcluster (NCSA)  55709
Condor (NCSA)      30
SDX (UKy)          116143
CCGCluster (UKy)   0.5
Longhorn (TACC)    54
CCGCluster (OSC)   62000
TGCluster (OSC)    36936
Cobalt (NCSA)      2485
Champion (TACC)    11
Mike4 (LSU)        14537
Force Field Parameterization
Molecular force fields require constant improvement as:
• New reference data become available that cannot be accommodated easily with existing sets
• New molecular systems become amenable to computational analysis
• New models/potential energy functions/Hamiltonians for force fields are established
Coverage of force fields should constantly be extended to cover new fields of research and new functionality (nanomaterials, biomaterials and medicine, ...)
Cyberenvironments for Molecular Force Fields
• Extension of currently available models, with the resulting parameter sets to be made publicly available
• Databases of experimental and quantum mechanical reference data to be used in the parameterization process
• Integration of computational resources for data acquisition; automation of QM reference data generation
• Automated, extensible infrastructure for parameterization management, for rapid and systematic parameterization of novel Hamiltonians (empirical and semi-empirical)
• Systematic improvement of parameter optimization processes
Accurate Force Fields Are Needed
Fig. 1. Errors (V) in electrostatic potential on a surface at 1.8 times van der Waals radii around N-methyl propanamide for two models. (Left) Point charges; (right) charge, dipole, and quadrupole on C, N, and O; charge and dipole on H. The errors are much reduced in the multipole approach.
A. J. Stone, Science 321, 787-789 (2008). Published by AAAS.
Science Gateways Layer Cake
Diagram layers: User Interfaces; Gateway Services; Resource Middleware; Compute Resources
Diagram components: Web/Gadget Container; Web Enabled Desktop Applications; Web/Gadget Interfaces; Gateway Abstraction Interfaces; User Management; Auditing & Reporting; Fault Tolerance; Application Abstractions; Workflow System; Information Services; Application Monitoring; Registry; Security; Provenance & Metadata Management; Cloud Interfaces; Grid Middleware; SSH & Resource Managers; Computational Clouds; Computational Grids; Local Resources
Color coding: dependent resource provider components; complementary gateway components; OGCE gateway components
GFac Current & Future Features
Diagram components: Input Handlers; Output Handlers; Scheduling Interface; Monitoring Interface; Registry Interface; Data Management Abstraction; Job Management Abstraction; Auditing; Fault Tolerance; Checkpoint Support
Resource providers: Globus; Unicore; Condor; Campus Resources; Amazon Eucalyptus
Color coding: planned/requested features; existing features
OGCE Layered Workflow Architecture: Derived from LEAD Workflow System
Diagram layers: Workflow Interfaces (Design & Definition); Workflow Specification; Workflow Execution & Control Engines
Diagram labels: XBaya GUI (Composition, Deploying, Steering & Monitoring); Gadget Interface for Input Binding; Flex/Web Composition; Python; BPEL 2.0; BPEL 1.0; Java Code; Pegasus DAG; Scufl; Apache ODE; Condor DAGMan; Taverna; Dynamic Enactor; Jython Interpreter; GBPEL
ParamChem-Xbaya-Pegasus
• The input workflow for GridChem/ParamChem is created using the Pegasus Java DAX API.
  -- A DAX can combine tasks (such as a Charmm task and multiple Gaussian tasks), each taking its respective input file.
• The tasks can be mapped to specific applications (such as charmm, amber, g03 or g09) based on a simple configuration.
• Input data (instructions, structure, topology, parameters) will be staged from the middleware to the execute clusters (such as the TeraGrid systems Mercury and Abe at NCSA) using GridFTP.
• Jobs will be distributed across the multiple execute clusters using round-robin or another scheme.
  -- Any heuristics-based scheduling is also possible.
• Output files will be staged back from the execute clusters to the middleware using GridFTP for post-processing/archiving.
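The task-to-application mapping and round-robin distribution described in the bullets above can be illustrated with a minimal Python sketch. It does not use the Pegasus DAX API or GridFTP; the configuration dictionary, the task tuples, the input file names and the function name `assign_round_robin` are all hypothetical and chosen only for illustration. The cluster names are the two mentioned on the slide.

```python
from itertools import cycle

# Hypothetical configuration: logical task type -> application executable.
APP_CONFIG = {"charmm": "charmm", "gaussian": "g09"}

# Execute clusters named on the slide; treating them as a simple list is an assumption.
CLUSTERS = ["Mercury", "Abe"]


def assign_round_robin(tasks, clusters, app_config):
    """Map each task to its application and assign clusters in rotation."""
    rotation = cycle(clusters)
    plan = []
    for task_id, task_type, input_file in tasks:
        plan.append({
            "task": task_id,
            "application": app_config[task_type],  # mapping from the configuration
            "input": input_file,
            "cluster": next(rotation),             # round-robin choice
        })
    return plan


# Hypothetical workflow: one Charmm task and two Gaussian tasks.
tasks = [
    ("t1", "charmm", "mol.inp"),
    ("t2", "gaussian", "scan1.com"),
    ("t3", "gaussian", "scan2.com"),
]
plan = assign_round_robin(tasks, CLUSTERS, APP_CONFIG)
for entry in plan:
    print(entry["task"], entry["application"], entry["cluster"])
```

Replacing `cycle(clusters)` with a function that ranks clusters by some measured quantity (for example queue length) would give the heuristics-based scheduling the slide mentions as an alternative.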
Some New GridChem Infrastructure
• Workflow Editors
• Coupled Application Execution
• Large Scale Computing
• Metadata and Archiving
• Rich Client Platform Refactorization
• Intergrid Interactions
• Open Source Distribution http://cvs.gridchem.org/cvs/
• Open Architecture and Implementation details http://www.gridchem.org/wiki
ParamChem Apache Axis2 Services
• NotificationService
• ResourceService
• TriggerService
• SessionService
• SoftwareService
• JobService
• Workflow Service
• FileService
• UserService
• ProjectService
Cloud HPC Interoperability
• The Cloud in our case is a part of the overall resources for computing and storage.
• Cloud resources have to be usable interoperably along with other HPC and local resources.
• Particular uses will be on-demand computing and high-throughput computing.
• Certain routine, sensor-enabled, data-dependent computing, such as hydrological event monitoring and simulation, could be handled by clouds for rapid on-demand prediction of short-term events.
• The interoperability requirements that enable data and computation movement from one resource to another should be explored.
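One way to picture cloud, HPC and local resources being used interoperably is a single selection step that treats them uniformly. The Python sketch below is only an illustration under stated assumptions: the `Resource` fields, the wait-time estimates, the batch preference order and the resource names are hypothetical and do not describe the GridChem/ParamChem middleware.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    kind: str                 # "cloud", "hpc", or "local"
    cores: int                # cores available to a single job
    expected_wait_min: float  # assumed estimate of time before a job starts


# Hypothetical policy: routine batch work prefers HPC, then local, then cloud.
BATCH_PREFERENCE = {"hpc": 0, "local": 1, "cloud": 2}


def pick_resource(resources, cores_needed, on_demand):
    """Choose a resource; on-demand jobs go wherever they can start soonest."""
    candidates = [r for r in resources if r.cores >= cores_needed]
    if not candidates:
        raise ValueError("no resource can satisfy the core request")
    if on_demand:
        return min(candidates, key=lambda r: r.expected_wait_min)
    return min(candidates, key=lambda r: BATCH_PREFERENCE[r.kind])


# Hypothetical resource pool with made-up capacities and wait times.
resources = [
    Resource("hpc-cluster", "hpc", cores=1024, expected_wait_min=240.0),
    Resource("cloud-pool", "cloud", cores=64, expected_wait_min=2.0),
    Resource("workstation", "local", cores=8, expected_wait_min=0.0),
]
urgent = pick_resource(resources, cores_needed=16, on_demand=True)
routine = pick_resource(resources, cores_needed=16, on_demand=False)
print(urgent.name, routine.name)
```

In this sketch the urgent 16-core job lands on the cloud pool because it can start soonest, while the same job submitted as routine work goes to the HPC cluster.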