Cyberinfrastructure & Grid Computing in New York State ... · 6/11/2007 · Grass-Roots...
Transcript of Cyberinfrastructure & Grid Computing in New York State ... · 6/11/2007 · Grass-Roots...
NSF, NIH, DOE, NIMA, NYS, HP www.cse.buffalo.edu/faculty/miller/CI/
Cyberinfrastructure & Grid Computing in New York State
Russ MillerCyberinfrastructure Lab, SUNY-Buffalo
Hauptman-Woodward Med Res Inst
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Cyberinfrastructure
Generic: transparent and ubiquitous application of technologies central to contemporary engineering and scienceNSF: “comprehensive phenomenon that involves creation, dissemination, preservation, and application of knowledge”Foster & Kesselman: “a domain-independent computational infrastructure designed to support science.”NSF Cyberinfrastructure (OCI)
HPC Hardware and SoftwareData CollectionsScience Gateways/Virtual OrganizationsSupport of Next Generation Observing Systems
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
NSF Integrated CI
NSF Integrated Cyberinfrastructure
NSF Director Arden L. Bement: “leadership in cyberinfrastructure may determine America's continued ability to innovate – and thus our ability to compete successfully in the global arena.”
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Cyberinfrastructure in Academia
Empower students to compete in knowledge-based economyEmbrace digital data-driven societyAccelerate discovery and comprehensionEnhance virtual organizationsProvide increased education, outreach, and trainingEnhance and expand relationships between academia and the corporate world
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Critical Academic Initiatives
Create links between enabling technologists and disciplinary usersImprove efficiency of knowledge-driven applications in myriad disciplines
New TechniquesNew AlgorithmsNew Interactions (people & systems)
Support HPC infrastructure, research, and applications Deliver high-end cyberinfrastructure to enable efficient
Collection of data Management/Organization of dataDistribution of dataAnalysis of dataVisualization of data
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Grid Computing
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Grid Computing Overview
Coordinate Computing Resources, People, Instruments in Dynamic Geographically-Distributed Multi-Institutional EnvironmentTreat Computing Resources like Commodities
Compute cycles, data storage, instruments Human communication environments
No Central Control; No Trust
Imaging Instruments Large-Scale Databases
Data Acquisition AnalysisAdvanced Visualization
Computational ResourcesLHC
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Major Grid InitiativesTeraGrid (NSF)
Integrates High-End ResourcesHigh-Performance (Dedicated) Networks9 Sites; 250TF & 30PB100+ Databases Available
OSG (DOE, NSF)High-Throughput Distributed FacilityOpen & HeterogeneousBiology, Computer Science, Astrophysics, LHC57 Compute Sites; 11 Storage Sites; 10K CPUS; 6PB
EGEE: Enabling Grids for E-SciencE (European Commision)Initial Focus on CERN (5PB of Data/Year)
High-Energy Physics and Life SciencesExpanded Focus Includes Virtually All Scientific Domains200 Institutions; 40 Countries20K+ CPUs; 5PB; 25,000 jobs per day!
A Depiction of Cyber-enabled Discover(NSF)
Theory
Experimentation
Insight
Data
RepresentationComputationInterpretation
“The Old Way: Classical Science”
Data
How will CDI (Cyber-Enabled Discover and Innovation) change the way we will think about and practice science?
Where, for the most part, classical
science is practiced in a constrained
domain.
A Depiction of Cyber-enabled Discover
(NSF)
Theory
Experimentation
Insight
Data
RepresentationComputationInterpretation Classical
Science
Data
“The New Way: Cyber-Enabled Science”
Example:Discovering causality of a
phenomenon, such as a disease or weather pattern, by mining massive
Data stores.
For biologists, meteorologists, computational scientists, and others, a “cyber-layer” between their problem space and the way they traditionally engaged in science and may provide new sources of data, new ways of visualizing the data, new tools for extracting meaning from the
data, new ways to collaborate with other scientists, and other enabling mechanisms and computational tools for discovery in new areas of science.
A Depiction of Cyber-enabled Discover
(NSF)
Theory
Experimentation
Insight
Data
RepresentationComputationInterpretation
DiscoveryDiscovery Discovery Discovery
Classical Science
Domain and Computational Filters and New Tools Sets(e.g. data generation, data representation, data mining,
visualization, modeling, simulation, etc.)
Data
Cyber-enabled enhancement of classical
science
Ubiquitous, cyber-enabled discovery across many fields.
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Miller’s Cyberinfrastructure Lab
CI sits at core of modern simulation & modelingCI allows for new methods of investigation to address previously unsolvable problemsFocus on development of
AlgorithmsPortalsInterfaces Middleware
Free end-users to do disciplinary workFunding (2001-pres): NSF ITR, NSF CRI, NSF MRI, NYS, Fed
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
CI Lab CollaborationsHigh-Performance Networking InfrastructureGrid3+ CollaborationiVDGL Member
Only External MemberOpen Science Grid
GRASE VONYSGrid.org
NYS CI InitiativeFndg Executive DirectorVarious WGs
Grid-Lite: Campus GridHP Labs Collaboration
Innovative Laboratory PrototypeDell Collaboration
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Evolution of CI Lab ProjectsACDC-Grid
Experimental Grid: Globus & CondorIntegrate Data & Compute, Monitor, Portal, Node Swapping, Predictive Scheduling/Resource ManagementGRASE VO: Structural Biology, Groundwater Modeling, Earthquake Eng, Comp Chemistry, GIS/BioHazardsBuffalo, Buffalo State, Canisius, Hauptman-Woodward
WNY GridHeterogeneous System: Hardware, Networking, UtilizationBuffalo, Geneseo, Hauptman-Woodward, Niagara
NYS GridExtension to Hardened Production-Level System State-WideAlbany, Binghamton, Buffalo, Geneseo, Canisius, Columbia, HWI, Niagara, [Cornell, NYU, RIT, Rochester, Syracuse, Marist], {Stony Brook, RPI, Iona}VOs: Engage, GADU, GRASE, Nanohub, SDSS, USATLAS, USCMS
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
NYS Grid ResourcesAlbany: 8 Dual-Processor Xeon NodesBinghamton: 15 Dual-Processor Xeon NodesBuffalo: 1050 Dual-Processor Xeon NodesCornell: 30 Dual-Processor Xeon NodesGeneseo State: Sun/AMD with 128 Compute CoresHauptman-Woodward Institute: 50 Dual-Core G5 NodesMarist: 9 P4 NodesNiagara University: 64 Dual-Processor Xeon NodesNYU: 58 Dual-Processor PowerPC NodesRIT: 4 Dual-Processor Xeon NodesSyracuse: 8 Dual-Processor Xeon Nodes
CI Lab Grid Monitor: http://osg.ccr.buffalo.edu/
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Predictive Scheduler
Build profiles based on statistical analysis of logs of past jobs
Per User/Group Per Resource
Use these profiles to predict runtimes of new jobsMake use of these predictions to determine
Resources to be utilizedAvailability of Backfill
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Small number (40) of CPUs were dedicated at nightAn additional 400 CPUs were dynamically allocated during the dayNo human intervention was requiredGrid applications were able to utilize the resources and surpassed the Grid3 goals
ACDC-Grid Dynamic Resource Allocation at SC03 with Grid3
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
ACDC-Grid Data Grid Functionality
Basic file management functions are accessible via a platform-independent web interface.User-friendly menus/interface.File Upload/Download to/from the Data Grid Portal.Simple Web-based file editor.Efficient search utility.Logical display of files (user/ group/ public).Ability to logically display files based on metadata (file name, size, modification date, etc.)
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Structural BiologySnB and BnP for Molecular Structure Determination/Phasing
Groundwater ModelingOstrich: Optimization and Parameter Estimation ToolPOMGL: Princeton Ocean Model Great Lakes for Hydrodynamic CirculationSplit: Modeling Groundwater Flow with Analytic Element Method
Earthquake EngineeringEADR: Evolutionary Aseismic Design and Retrofit; Passive Energy Dissipation System for Designing Earthquake Resilient Structures
Computational ChemistryQ-Chem: Quantum Chemistry Package
Geographic Information Systems & BioHazardsTitan: Computational Modeling of Hazardous Geophysical Mass Flows
Grid-Enabling Application Templates (GATs)
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Grid Enabled Shake-and-Bake(Molecular Structure Determination)
Required Layered Grid ServicesGrid-enabled Application Layer
Shake–and–Bake applicationApache web serverMySQL database
High-level Service LayerGlobus, NWS, PHP, Fortran, and C
Core Service LayerMetacomputing Directory Service, Globus Security Interface,
GRAM, GASSLocal Service Layer
Condor, MPI, PBS, Maui, WINNT, IRIX, Solaris, RedHat Linux
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
NYS Grid Portal
Startup Screen for CI Lab Grid Job Submission
Instructions and Description for Running a Job on ACDC-Grid
Software Package Selection
Full Structure / Substructure Template Selection
Default Parameters Based on Template
SnB Review (Grid job ID: 447)
Graphical Representation of Intermediate Job Status
Histogram of Completed Trial Structures
Status of Jobs
User starts up – default image of structure.
Heterogeneous Back-End Interactive Collaboratory
Molecule scaled, rotated, and labeled.
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Binghamton UniversityGrid Computing Research LaboratoryDrs. Kenneth Chiu, Madhu Govindaraju, and Michael Lewis.Techniques for Web and grid service performance optimization Component frameworks for grids Instruments and sensors for grid environments Adaptive information dissemination protocols across grid overlays Emulation framework for grid computation on multi-core processorsSecure grid data transferwww.grid.cs.binghamton.edu/
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
NYSGrid.org
Grass-Roots Cyberinfrastructure Initiative in NYS.Open to academic and research institutions.Mission Stmt: To create and advance collaborative technological infrastructure that supports and enhances the research and educational missions of institutions in NYS.Enable Research, Scholarship, and Economic Development in NYS.To date, no utilization on any Grid through their VO.www.nysgrid.orgRecently made part of NYSERNet.
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
Acknowledgments
Mark GreenCathy RubyAmin GhadersohiNaimesh ShahSteve GalloJason RappleyeJon BednaszSam GuercioMartins InnusCynthia Cornelius
George DeTittaHerb HauptmanCharles WeeksSteve Potter
Alan RabideauIgor JanckovicMichael Sheridan Abani PatraMatt Jones
NSF ITRNSF CRINSF MRINYSCCR
University at Buffalo The State University of New York CI LabCyberinfrastructure Laboratory
www.cse.buffalo.edu/faculty/miller