GridChem: A Computational Chemistry
Cyber-infrastructure Using Web Services
Sanibel Symposium 23 Feb 07
Sudhakar Pamidighantam
NCSA, University of Illinois at
Outline
• Historical Background Grid Chemistry
• Current Status Web Services Usage
• Brief Demo
• Future
Motivation
Software - Reasonably mature and easy to use to address chemists' questions of interest
Community of Users - Need and are capable of using the software; some are non-traditional computational chemists
Resources - Various in capacity and capability
Background
Quantum Chemistry Remote Job Monitor (Quantum Chemistry Workbench), 1998, NCSA
Chemviz, 1999-2001, NSF
Technologies
Web-Based Client-Server Models
Visual Interfaces
Distributed Computing
GridChem
NCSA Alliance was commissioned in 1998
Diverse HPC systems deployed both at NCSA and Alliance partner sites
Batch schedulers differed between sites
Policies favored different classes and modes of use at different sites/HPC systems
Grid and Gridlock
Alliance led to Physical Grid
Grid led to TeraGrid
A homogeneous Grid was planned, but it was difficult to keep it homogeneous
Things got more complicated and we have heterogeneous grids now!
Interoperability and Standards and Openness Are Critical
User Community
Chemistry and Computational Biology
User Base, Sep 03 – Oct 04
         NRAC        AAB         Small Allocations
-------------------------------------------------------------
#PIs     26          23          64
#SUs     5,953,100   1,374,100   640,000
User Issues
• New systems meant learning new commands
• Porting codes
• Learning new job submission and monitoring protocols
• New proposals for time
• Computational modeling became more popular and users increased
• Batch queues are longer / waiting increased
• Finding resources on which to compute - probably multiple distributed sites
• Multiple proposals/allocations/logins
• Authentication and data security
• Data management
Computational Chemistry Grid
Integrated Cyber Infrastructure for Computational Chemistry
Integrates Applications, Middleware, HPC resources, Scheduling and Data management
Allocations, User Services and Training
Resources
System (Site)                     Procs Avail   Total CPU Hours/Year   Status
Intel Cluster (OSC)               36            315,000                SMP and Cluster nodes
HP Integrity Superdome (UKy)      33            290,000                TB Replaced with an SMP/Cluster nodes
IA32 Linux Cluster (NCSA)         64            560,000
Intel Cluster (LSU)               1024          1,000,000
IBM Power4 (TACC)                 16            140,000
Teragrid (Multiple Institutions)                250,000                New Allocation Expected
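As a rough sanity check on the table, the listed CPU hours for the four smaller dedicated systems look like processors multiplied by the hours in a year and then rounded. That reading is an assumption; the slides do not say how the figures were derived. A minimal sketch of the arithmetic:

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours; leap years ignored

# (site, processors available, CPU hours/year as listed on the slide)
DEDICATED_SYSTEMS = [
    ("OSC", 36, 315_000),
    ("UKy", 33, 290_000),
    ("NCSA", 64, 560_000),
    ("TACC", 16, 140_000),
]


def full_year_cpu_hours(procs):
    """CPU hours delivered if every processor runs all year (assumed model)."""
    return procs * HOURS_PER_YEAR


def relative_gap(procs, listed):
    """Relative difference between the computed and the listed figure."""
    return abs(full_year_cpu_hours(procs) - listed) / listed


for site, procs, listed in DEDICATED_SYSTEMS:
    print(f"{site}: {full_year_cpu_hours(procs):,} computed vs {listed:,} listed")
```

The LSU and TeraGrid rows are left out because they do not fit this pattern: 1024 processors for a full year would be roughly 8.97 million CPU hours, well above the 1,000,000 listed.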
Other Resources
Extant HPC resources at various Supercomputer Centers (interoperable)
Optionally other Grids and Hubs/local/personal resources
These may require existing allocations/authorization
GridChem System
[Architecture diagram: users, Portal Client, Grid Middleware Proxy Server, Grid Services, Grid, applications, Mass Storage]
http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=0438312
Applications
• GridChem supports some apps already
  – Gaussian 98/03, GAMESS, NWChem, Molpro, QMCPack, Amber
• Schedule of integration of additional software
  – ACES-2
  – Crystal
  – Q-Chem
  – WIEN2k
  – MCCCS Towhee
  – More …
Web Services (WS)
XML is used to tag the data, SOAP to transfer the data, WSDL to describe the services available, and UDDI (Universal Description, Discovery, and Integration) to list which services are available.
Web Services differ from web page systems or web servers:
• There is no GUI
• Web Services share business logic, data and processes with each other (not with the user) through an API
• Web Services describe a standard way of interacting with "web-based" applications
A client program connecting to a web service can read the WSDL to determine what functions are available on the server. Any special datatypes used are embedded in the WSDL file in the form of XML Schema.
WSRF standards compliant.
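To illustrate how a client can discover the available functions from a WSDL, here is a minimal sketch that lists the operations of each port type in a WSDL 1.1 document using only the Python standard library. The sample WSDL, the service name and the operation names are hypothetical and are not taken from GridChem.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"  # WSDL 1.1 namespace

# Hypothetical, minimal WSDL document used only for illustration.
SAMPLE_WSDL = """<?xml version="1.0" encoding="UTF-8"?>
<definitions name="ExampleService"
    targetNamespace="http://example.org/ExampleService"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="ExamplePortType">
    <operation name="submitJob"/>
    <operation name="getJobStatus"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text):
    """Return {portType name: [operation names]} for a WSDL 1.1 document."""
    root = ET.fromstring(wsdl_text.encode("utf-8"))
    result = {}
    for port_type in root.findall(f"{{{WSDL_NS}}}portType"):
        operations = port_type.findall(f"{{{WSDL_NS}}}operation")
        result[port_type.get("name")] = [op.get("name") for op in operations]
    return result


print(list_operations(SAMPLE_WSDL))
```

A real client would normally hand the WSDL to a SOAP toolkit that generates stubs; the point here is only that the function list is machine-readable.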
Client Objects Database Interaction
[Diagram: Client, WS Resources, DTO, Business Model, DAO, Objects, Hibernate (hb.xml), Database]
• DTO (Data Transfer Object): serialized for transfer through XML
• DAO (Data Access Object): how to get the DB objects
• hb.xml (Hibernate Data Map): describes object/column data mapping
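The DTO/DAO split can be sketched in a few lines. This is an illustration only: GMS_WS is a Java service using Hibernate, whereas the sketch below uses Python with an in-memory SQLite database, and the `JobDTO` fields, the `jobs` table and the XML layout are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class JobDTO:
    """Data Transfer Object: a plain record that can travel as XML."""
    job_id: int
    job_name: str
    cost: float

    def to_xml(self):
        elem = ET.Element("job", {"id": str(self.job_id)})
        ET.SubElement(elem, "name").text = self.job_name
        ET.SubElement(elem, "cost").text = repr(self.cost)
        return ET.tostring(elem, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        elem = ET.fromstring(text)
        return cls(int(elem.get("id")), elem.findtext("name"),
                   float(elem.findtext("cost")))


class JobDAO:
    """Data Access Object: the only place that knows the table layout."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS jobs "
                     "(job_id INTEGER PRIMARY KEY, job_name TEXT, cost REAL)")

    def save(self, dto):
        self.conn.execute("INSERT INTO jobs VALUES (?, ?, ?)",
                          (dto.job_id, dto.job_name, dto.cost))

    def find(self, job_id):
        row = self.conn.execute(
            "SELECT job_id, job_name, cost FROM jobs WHERE job_id = ?",
            (job_id,)).fetchone()
        return JobDTO(*row) if row else None
```

The service layer would load a record through the DAO, convert it to a DTO and send the XML to the client, so the client never sees the database schema.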
Database Table Relationships
[Diagram: tables Users, Projects, Resources, UserProjectResource and Jobs]
• UserProjectResource: userID, projectID, resourceID, loginName, SUsLocalUserUsed
• Resources: resourceID, Type, hostName, IPAddress, siteID
• Resource types: SoftwareResources, ComputeResources, NetworkResources, StorageResources
• Jobs: jobID, jobName, userID, projID, softID, cost
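A join table such as UserProjectResource answers the question "which resources may this user use under this project, and with which local login?". The sketch below shows that lookup with SQLite. Column names follow the slide where they are legible; the column types, the `name` columns, the sample rows and the host name are assumptions made for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (userID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE projects (projectID INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE resources (
    resourceID INTEGER PRIMARY KEY, type TEXT, hostName TEXT,
    IPAddress TEXT, siteID INTEGER);
CREATE TABLE userProjectResource (
    userID INTEGER REFERENCES users(userID),
    projectID INTEGER REFERENCES projects(projectID),
    resourceID INTEGER REFERENCES resources(resourceID),
    loginName TEXT,
    PRIMARY KEY (userID, projectID, resourceID));
"""


def build_demo_db():
    """Create an in-memory database with one hypothetical user and resource."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO users VALUES (1, 'alice')")
    conn.execute("INSERT INTO projects VALUES (1, 'demo-project')")
    conn.execute("INSERT INTO resources VALUES "
                 "(1, 'compute', 'hpc-a.example.org', '192.0.2.10', 1)")
    conn.execute("INSERT INTO userProjectResource VALUES (1, 1, 1, 'alice01')")
    return conn


def resources_for(conn, user_id, project_id):
    """Return (hostName, loginName) pairs the user may use under the project."""
    rows = conn.execute(
        "SELECT r.hostName, upr.loginName "
        "FROM userProjectResource upr "
        "JOIN resources r ON r.resourceID = upr.resourceID "
        "WHERE upr.userID = ? AND upr.projectID = ? "
        "ORDER BY r.hostName",
        (user_id, project_id))
    return rows.fetchall()
```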
GMS_WS Use Cases
• Authentication
• Job Submission
• Resource Monitoring
• File Retrieval
http://www.gridchem.org:8668/space/GMS/usecase
GMS_WS Authentication
• WSDL (Web Services Description Language) is a language for describing how to interface with XML-based services. It describes network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information.
• The service interface is called the port type
• WSDL file:
  <?xml version="1.0" encoding="UTF-8"?>
  <definitions name="MathService"
      targetNamespace="http://www.globus.org/namespaces/examples/core/MathService_instance"
      xmlns="http://schemas.xmlsoap.org/wsdl/" …
[Sequence diagram between GC Client and GMS]
GC Client: Contact GMS
GMS: Creates Session, Session RP and EPR; sends EPR
GC Client: Login request (username:passwd)
GMS: Validates, loads UserProjects; sends acknowledgement
GC Client: Retrieve UserProjects (GetResourceProperty portType, PT)
GMS_WS Authentication
[Sequence diagram between GC Client and GMS]
GC Client: Selects project; LoadVO portType (with MAC address)
GMS: Verifies user/project/MAC address; loads UserResources RP; sends acknowledgement
GC Client: Retrieve UserResources [as userVO/Profile] (GetResourceProperty portType, PT)
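The authentication exchange can be mimicked with a small in-memory model: contact, login, project retrieval, then project selection checked against a MAC address. This is a toy sketch, not the GMS_WS implementation: the class name `ToyGMS`, its method names and the user record layout are hypothetical, the EPR is reduced to a random string, and passwords are compared in plain text purely for brevity.

```python
import uuid


class ToyGMS:
    """In-memory stand-in for the server side of the handshake."""

    def __init__(self, users):
        # users: {username: {"password": str,
        #                    "projects": {project: set of allowed MACs}}}
        self._users = users
        self._sessions = {}

    def contact(self):
        """Create a session and return its reference (stands in for the EPR)."""
        epr = str(uuid.uuid4())
        self._sessions[epr] = {"user": None, "project": None}
        return epr

    def login(self, epr, username, password):
        """Validate credentials and bind the user to the session."""
        record = self._users.get(username)
        if record is None or record["password"] != password:
            return False
        self._sessions[epr]["user"] = username
        return True

    def get_user_projects(self, epr):
        """Return the projects of the logged-in user."""
        user = self._require_login(epr)
        return sorted(self._users[user]["projects"])

    def load_vo(self, epr, project, mac):
        """Select a project if the user/project/MAC combination is known."""
        user = self._require_login(epr)
        allowed = self._users[user]["projects"].get(project)
        if allowed is None or mac not in allowed:
            return False
        self._sessions[epr]["project"] = project
        return True

    def _require_login(self, epr):
        user = self._sessions[epr]["user"]
        if user is None:
            raise PermissionError("login required")
        return user
```

A client would call `contact()`, then `login()`, then `get_user_projects()`, and finally `load_vo()` with the chosen project, mirroring the order of the messages in the diagram.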
GMS_WS Job Submission
[Sequence diagram between GC Client and GMS]
GC Client: Create Job object; PredictJobStartTime PT + JobDTO
GMS: JobStart Prediction RP
GC Client: If decision OK, SubmitJob PT + JobDTO
GMS: Create Job object; API: Submit; store Job object; send acknowledgement
Submission: CoGKit, GAT, "gsi-ssh"
Completion: Email from batch system to GMS server; cron@GMS -> DB
Note: Need to check to make sure allocation time is available.
PT = portType, RP = Resource Properties, DTO = Data Transfer Object
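The note about checking that allocation time is available can be sketched as a pre-submission guard. The charging model below (one SU per processor-hour) and the names `JobRequest`, `estimated_sus` and `can_submit` are assumptions for illustration; actual charging rules vary by site.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    name: str
    processors: int
    wall_hours: float  # requested wall-clock limit


def estimated_sus(job):
    """Worst-case charge, assuming 1 SU = 1 processor-hour."""
    return job.processors * job.wall_hours


def can_submit(job, remaining_sus):
    """Allow submission only if the worst-case charge fits the balance."""
    return estimated_sus(job) <= remaining_sus


job = JobRequest("geometry_opt", processors=8, wall_hours=12.0)
print(estimated_sus(job), can_submit(job, remaining_sus=100.0))
```

Checking against the requested wall-clock limit rather than the expected run time is the conservative choice, since the batch system may let the job run up to that limit.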
GMS_WS Monitoring
[Sequence diagram between GC Client, GMS and Resources/Kits/DB]
GC Client: Request for job status, resource status, allocation balance
GMS: UserResource RP updated from DB; send info
GC Client: Parse XML, display
Resources/Kits/DB: cron@GMS server, cron@HPC servers, job launcher notifications, VO admin email; parses email -> DB (status + cost)
PT = portType, RP = Resource Properties, DTO = Data Transfer Object, DB = Database
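The "parses email -> DB (status + cost)" step can be sketched as a small parser that turns a notification line into a record ready for a database update. The notification format used here ("Job <id> <STATUS> cost=<n>") is entirely hypothetical; the slides do not show what the batch-system emails look like.

```python
import re

# Hypothetical notification format, e.g. "Job 42 FINISHED cost=12.5".
PATTERN = re.compile(
    r"^Job (?P<job_id>\d+) (?P<status>STARTED|FINISHED|FAILED)"
    r"(?: cost=(?P<cost>\d+(?:\.\d+)?))?$"
)


def parse_notification(subject):
    """Return a dict with job_id, status and cost, or None if unrecognized."""
    match = PATTERN.match(subject.strip())
    if match is None:
        return None
    cost = match.group("cost")
    return {
        "job_id": int(match.group("job_id")),
        "status": match.group("status"),
        "cost": float(cost) if cost is not None else None,
    }


print(parse_notification("Job 42 FINISHED cost=12.5"))
```

Returning `None` for unrecognized messages lets a cron job skip unrelated mail instead of failing.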
GMS_WS File Retrieval
[Sequence diagram between GC Client, GMS and Resources/Kits/DB; labels in reading order]
• GetResourceProperty PT; FileDTO (?); LoadFile PT (project folder + job)
• Validates that the project folder is owned by the user; sends new listing
• Job completion: send output to MSS
• LoadFile PT; MSS query; UserFiles RP + FileDTO object
• Retrieve root directory listing on MSS with CoGKit, GAT or "gsi-ssh"
• API file request; store locally; create FileDTO; load into UserData RP
• RetrieveFiles PT (+ file relative path)
• Retrieve file: CoGKit, GAT or "gsi-ssh"
• GetResourceProperty PT
Open question: Should the whole directory be evaluated (it may be large)? Why not just the files owned by the user?
PT = portType, RP = Resource Properties, DTO = Data Transfer Object, MSS = Mass Storage System
GMS_WS File Retrieval
[Sequence diagram between GC Client, GMS and Resources/Kits/DB; labels in reading order]
• Create FileDTO (?); load into UserData RP
• RetrieveJobOutput PT (+ JobDTO)
• Job record from DB. Running: from Resource; Complete: from MSS
• Retrieve file: CoGKit, GAT or "gsiftp"
• GetResourceProperty PT
PT = portType, RP = Resource Properties, DTO = Data Transfer Object, MSS = Mass Storage System
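Two of the checks named on the file-retrieval slides lend themselves to a short sketch: confining a client-supplied relative path to the user's project folder, and choosing where job output is read from depending on job state. The function names, the status strings and the example paths are hypothetical.

```python
import posixpath


def resolve_under(root, rel_path):
    """Join rel_path onto root and refuse anything that escapes root."""
    root = posixpath.normpath(root)
    candidate = posixpath.normpath(posixpath.join(root, rel_path))
    if candidate != root and not candidate.startswith(root + "/"):
        raise PermissionError(f"{rel_path!r} is outside the project folder")
    return candidate


def output_location(status):
    """Running jobs are read from the compute resource, completed ones from MSS."""
    if status == "RUNNING":
        return "resource"
    if status == "COMPLETE":
        return "mss"
    raise ValueError(f"no output available for status {status!r}")


print(resolve_under("/mss/alice/proj1", "job7/out.log"))
```

Normalizing before the prefix check is what blocks inputs such as `../bob/secret` or an absolute path, both of which would otherwise pass a naive string join.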
Web Services
WSRF (Web Services Resource Framework) compliant
WSRF specifications:
• WS-ResourceProperties (WSRF-RP)
• WS-ResourceLifetime (WSRF-RL)
• WS-ServiceGroup (WSRF-SG)
• WS-BaseFaults (WSRF-BF)
% ps -aux | grep ws
/usr/java/jdk1.5.0_05/bin/java \
  -Dlog4j.configuration=container-log4j.properties \
  -DGLOBUS_LOCATION=/usr/local/globus \
  -Djava.endorsed.dirs=/usr/local/globus/endorsed \
  -DGLOBUS_HOSTNAME=derrick.tacc.utexas.edu \
  -DGLOBUS_TCP_PORT_RANGE=62500,64500 \
  -Djava.security.egd=/dev/urandom \
  -classpath /usr/local/globus/lib/bootstrap.jar:/usr/local/globus/lib/cog-url.jar:/usr/local/globus/lib/axis-url.jar \
  org.globus.bootstrap.Bootstrap org.globus.wsrf.container.ServiceContainer -nosec
Annotations:
• -Dlog4j.configuration: logging configuration
• -DGLOBUS_LOCATION: where to find Globus
• -Djava.security.egd: where to get the random seed for encryption key generation
• -classpath: required jars
GMS_WS
[Package diagram]
Packages: model, dto, credential, job, notification, file, file.task, job.task, user, exceptions, resource, persistence, synch, query, test, util, dao, gpir, crypt, enumerators, gat, proxy, client, audit, gms
Package descriptions (diagram annotations, in reading order):
• gms: Classes for the WSRF service implementation (PT)
• Command-line tests to mimic client requests
• Data Access Objects: query the DB via persistent classes (Hibernate)
• Data Transfer Objects (Job, File, Hardware, Software, User) as XML
• How to handle errors (exceptions)
• CCG service business model (how to interact)
• Contains user's credentials for job submission, file browsing, …
• "Oversees correct" handling of user data (get/put file)
• Defines Job, utilities and enumerations (SubmitTask, KillTask, …)
• CCGResource and utilities, synched by GPIR; abstract classes NetworkRes., ComputeRes., SoftwareRes., StorageRes., VisualizationRes.
• User (has attributes: Preference/Address)
• DB operations (CRUD), O/R maps, pool management, DB session
• Classes that communicate with other web services
• Periodically updates DB with GPIR info (GPIR calls)
• JUnit service tests (gms.properties): authentication, VO retrieval, resource query, synch, job management, file management, notification
• Utility and singleton classes for the service
• Encryption of login password
• Mapping from GMS_WS enumeration classes to DB
• GAT utility classes: GATContext and GAT Preferences generation
• Classes dealing with CoGKit configuration
• Autonomous notification via email, IM, text message
GMS_WS external jars
• Testing
• For XML Parsing
• "Java" Document Object Model
  – Lightweight
  – Reading/writing XML docs
  – Complements SAX (parser) & DOM
  – Uses Collections
Molecular Visualization
Better molecule representations (Ball and Stick/VDW/MS)
In Nanocad Molecular Editor
Third-party visualizer integration: Chime/VMD
Export possibilities to other interfaces
Deliver standard file formats (XML, SDF, MSF, SMILES, etc.)
Eigen Function Visualization
• Molecular Orbital/Fragment Orbital
• MO Density Visualization
• MO Density Properties
• Other functions
Radial distribution functions
Spectra
• IR/Raman Vibrational Spectra
• UV Visible Spectra
• Spectra to Normal Modes
• Spectra to Orbitals
GridChem Use
• Allocation
Community and External Registration
• Consulting/User Services
Ticket tracking, Allocation Management
• Documentation Training and Outreach
FAQ Extraction, Tutorials, Dissemination
Users and Usage
• 170 Users
  Includes academic PIs, two graduate classes and about 15 training users
• NCSA 57,000 SUs + a 7-node dedicated system
• UKy around 106,766 SUs
• OSC 13,820 SUs + a 14-node dedicated system
• Usage at LSU and TACC as well
More than 335,000 CPU wall hours since Jan 06.
Science Enabled
• Chemical Reactivity of the Biradicaloid (HO...ONO) Singlet States of Peroxynitrous Acid. The Oxidation of Hydrocarbons, Sulfides, and Selenides. Bach, R. D.; Dmitrenko, O.; Estévez, C. M. J. Am. Chem. Soc. 2005, 127, 3140-3155.
• The "Somersault" Mechanism for the P-450 Hydroxylation of Hydrocarbons. The Intervention of Transient Inverted Metastable Hydroperoxides. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(5), 1474-1488.
• The Effect of Carbonyl Substitution on the Strain Energy of Small Ring Compounds and their Six-member Ring Reference Compounds. Bach, R. D.; Dmitrenko, O. J. Am. Chem. Soc. 2006, 128(14), 4598.
Science Enabled
• Azide Reactions for Controlling Clean Silicon Surface Chemistry: Benzylazide on Si(100)-2×1. Semyon Bocharov, Olga Dmitrenko, Lucila P. Mendez De Leo, and Andrew V. Teplyakov. Department of Chemistry and Biochemistry, University of Delaware, Newark, Delaware 19716. Received April 13, 2006.
http://pubs.acs.org.proxy2.library.uiuc.edu/cgi-bin/asap.cgi/jacsat/asap/pdf/ja0623663.pdf [May require ACS access]
Third Year Plans
• Post Processing
• New Application Support
• Expansion of Resources
• Extension Plan
Acknowledgments
• Rion Dooley, TACC Middleware Infrastructure
• Stelios Kyriacou, OSC Middleware Scripts
• Chona Guiang, TACC Databases and Applications
• Kent Milfeld, TACC Database Integration
• Kailash Kotwani, NCSA, Applications and Middleware
• Scott Brozell, OSC, Applications and Testing
• Michael Sheetz, UKy, Application Interfaces
• Vikram Gazula, UKy, Server Administration
• Tom Roney, NCSA, Server and Database Maintenance