Cyberinfrastructure and its Role in Science
![Page 1: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/1.jpg)
Cyberinfrastructure and its Role in Science
Cameron Kiddle
Research Fellow, Grid Research Centre
Adjunct Assistant Professor, Department of Computer Science, University of Calgary
Distributed Systems Architect, WestGrid
![Page 2: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/2.jpg)
Outline
- Challenges
- Cyberinfrastructure
- Cyberinfrastructure Technologies
- Examples
  - ICE Force Project
  - Molecular Dynamics Simulations
  - GT4-based Grid for Canada
  - Fire Dynamics Simulator
  - Rendering on the Cloud
  - GeoChronos
IAI Summer School July 6, 2009
Cyberinfrastructure - 2
![Page 3: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/3.jpg)
Collaboration Challenges
- Familiarity/awareness of collaboration tools
- Keeping all interested parties in the loop
- Finding related work and researchers
- Keeping up to date with current research
- Collaboration while working in the field
![Page 4: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/4.jpg)
Data Challenges
- Acquisition of data
  - Many different data sources
  - Large quantities of data
  - Different regulations/mechanisms for accessing data
  - Lack of automation
  - Finding the right data
  - Bandwidth constraints
- Managing data
  - Scattered and unorganized data
  - Inadequate tools for recording/maintaining metadata
    - Data without metadata is meaningless
    - Lack of suitable metadata standards
    - Validation of metadata
  - Tracking provenance of data
- Pre-processing of data
  - Raw data typically cannot be directly analyzed
  - Significant amount of time spent preparing data for analysis
  - Lack of automation
![Page 5: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/5.jpg)
Application Challenges
- Limited availability of computing resources
- Access to and familiarity with heterogeneous computing resources
- Fault tolerance and reliability
- Access to software available in research lab while in field or other locations
- Installing, configuring and updating software
- System dependencies of software
- Awareness and suitability of available software
- Sharing applications and results
![Page 6: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/6.jpg)
Cyberinfrastructure
“Like the physical infrastructure of roads, bridges, power grids, telephone lines, and water systems that support modern society, "cyberinfrastructure" refers to the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor.”
Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure, 2003.
![Page 7: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/7.jpg)
Cyberinfrastructure Technologies
- Grid Computing
- Cloud Computing
- Virtualization
- Web 2.0 / Social Networking
- Web Portals / Scientific Gateways
- Semantic Web
- …
![Page 8: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/8.jpg)
Grid Computing
- Many different definitions/uses
  - computational grids, data grids, desktop grids, campus grids, sensor grids, access grids
- Coordinated sharing of heterogeneous resources across administrative domains

[Figure: resources in Domain A, Domain B and Domain C, shared by Virtual Organization X and Virtual Organization Y]
![Page 9: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/9.jpg)
Grid Middleware
- The layer between users/applications and grid resources that glues everything together
- Example grid middleware
  - Globus Toolkit
    - GT2 – pre-standards
    - GT4 – Web Services based
  - UNICORE
  - gLite
  - ARC
  - NAREGI
![Page 10: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/10.jpg)
Key Grid Middleware Services
- Security Services
  - Concerned with authentication, authorization, secure communication, …
- Information Services
  - Provide information about resources, policy, services and applications to tools and users
- Data Management Services
  - Manage movement and replication of data as well as metadata about data
- Execution Management Services
  - Handle placement, provisioning and lifetime management of jobs and workflows
![Page 11: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/11.jpg)
Benefits of Grid Computing
- Easier access to more resources
  - Users/organizations can share resources
  - Single sign-on
  - Common interface (hide heterogeneity)
- Improved data management
  - Efficient file transfers
  - Abstraction of physical location of data
- Automated execution of jobs and workflows
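One way to picture "abstraction of physical location of data" is a catalogue that maps a logical file name to the physical copies of that file. The sketch below is a minimal illustration in plain Python, not the API of any real grid middleware; the class name, the logical name and the `gsiftp://` URLs are all hypothetical.

```python
from collections import defaultdict


class ReplicaCatalogue:
    """Toy catalogue mapping logical file names to physical replica URLs."""

    def __init__(self):
        self._replicas = defaultdict(list)

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name is stored at physical_url."""
        if physical_url not in self._replicas[logical_name]:
            self._replicas[logical_name].append(physical_url)

    def lookup(self, logical_name):
        """Return all known physical locations for a logical name."""
        return list(self._replicas.get(logical_name, []))


# Hypothetical names: users refer to the logical name only.
catalogue = ReplicaCatalogue()
catalogue.register("lfn://bridge/forces.dat", "gsiftp://site-a.example.org/data/forces.dat")
catalogue.register("lfn://bridge/forces.dat", "gsiftp://site-b.example.org/archive/forces.dat")
print(catalogue.lookup("lfn://bridge/forces.dat"))
```

A user or workflow asks for the logical name and the catalogue returns whichever replicas exist, so data can be moved or copied without changing how it is referenced.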
![Page 12: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/12.jpg)
Example Grid Projects
| Name | Description |
| --- | --- |
| LHC Computing Grid (http://lcg.web.cern.ch/) | data storage and analysis infrastructure for the high energy physics community using the Large Hadron Collider (LHC) at CERN (ATLAS Tier-1 site at TRIUMF in British Columbia) |
| Network for Earthquake Engineering Simulation (NEES) (http://www.nees.org/) | a US national network of 15 facilities to study the impact of earthquakes on buildings, bridges, etc. |
| Expanding GEOsciences on DEmand (EGEODE) (http://www.egeode.org/) | a virtual organization (VO) associated with EGEE that is dedicated to research in geoscience for both public and private industrial R&D and academic laboratories |
| International Virtual Observatory Alliance (IVOA) (http://www.ivoa.net/) | development of standards and infrastructure to share and analyze astronomical archives from around the world |
![Page 13: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/13.jpg)
Cloud Computing
- Transparent access to scalable and dynamic services over the Internet
- Key features:
  - Everything as a Service (EaaS)
  - Utility/On-demand
  - Accessibility/Transparency
  - Scalability
  - Virtualization
![Page 14: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/14.jpg)
Cloud Computing Solutions
![Page 15: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/15.jpg)
Benefits of Cloud Computing
- Reduce capital, support and maintenance costs
  - Pay only for what you use
  - Get access to more/fewer resources when needed
- Ready to use for users
  - No more downloads, installations or updates
- Simplify and speed up software development
  - Don’t have to support multiple platforms
- Application popularity and lifespan difficult to predict
  - Scale applications according to user demand
![Page 16: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/16.jpg)
Cloud Computing Case Study: Application Popularity on Facebook
- Difficult to predict popularity and lifespan of applications
- Facebook Application Growth
  - Sep. 2007: ~3700
  - Sep. 2008: ~39000
- Facebook Application Popularity (Sep. 12, 2008)
  - 39181 applications
  - Active user data for 37155 apps
  - 3 apps > 10 million active users
  - 80% apps < 1000 active users

[Figure: Monthly Active Users vs. Rank of Facebook Applications (September 12, 2008)]
![Page 17: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/17.jpg)
Cloud Computing Case Study: Shrek (DreamWorks)
- Shrek (2001) – 5 million CPU render hours
- Shrek 2 (2004) – 10 million CPU render hours
- Shrek 3 (2007) – 20 million CPU render hours

Time to Render

| Film | 1 CPU | 100 CPUs | 10000 CPUs |
| --- | --- | --- | --- |
| Shrek | 571 years | 5.7 years | 21 days |
| Shrek 2 | 1142 years | 11.4 years | 42 days |
| Shrek 3 | 2283 years | 22.8 years | 83 days |
(Source: R. Rowe. DreamWorks Animation "Shrek the Third": Linux Feeds an Ogre. Linux Journal. June 5, 2007. (http://www.linuxjournal.com/article/9653))
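The render times on this slide follow from dividing the total CPU render hours by the number of CPUs. The short sketch below reproduces that arithmetic, assuming ideal linear scaling (no parallel overhead) and a 365-day year; the function and constant names are illustrative only.

```python
HOURS_PER_DAY = 24
HOURS_PER_YEAR = 365 * HOURS_PER_DAY  # 8760, ignoring leap years


def render_time(cpu_hours, cpus):
    """Return (value, unit) for wall-clock render time under ideal linear scaling."""
    wall_hours = cpu_hours / cpus
    years = wall_hours / HOURS_PER_YEAR
    if years >= 1:
        return round(years, 1), "years"
    return round(wall_hours / HOURS_PER_DAY), "days"


# CPU render hours as quoted on the slide.
FILMS = {"Shrek": 5_000_000, "Shrek 2": 10_000_000, "Shrek 3": 20_000_000}

for film, cpu_hours in FILMS.items():
    row = [render_time(cpu_hours, cpus) for cpus in (1, 100, 10_000)]
    print(film, row)
```

Under these assumptions the computed values agree with the slide once the single-CPU figures are rounded to whole years.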
![Page 18: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/18.jpg)
Cloud Computing Case Study: Animoto
- Animoto (http://animoto.com)
  - Produces professional quality videos from images
  - Runs on Amazon EC2
- Popularity soared when promoted on Facebook
- During the course of 4 days:
  - Jumped from 8 to 450 renderings per minute
  - ~20000 new users per hour
  - 3500 instances running on Amazon EC2 at peak
(Source: D. Barker. You Need 3,500 Servers by When?! On-demand Enterprise. 2008.07.07)
![Page 19: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/19.jpg)
Virtualization
- Can transform a single physical machine into multiple virtual machines (VMs), each with its own OS and software stack
- Virtualization software
  - Xen, KVM, VMware
  - Support allocation, deallocation, checkpointing and migration of VMs
- Benefits
  - Custom environments (root access)
  - More efficient use of resources (consolidation)
  - System maintenance without disruption
![Page 20: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/20.jpg)
Web 2.0 – The “Social Web”
- Aimed at:
  - Providing feature rich user environments
  - Making it easier for users to generate Web content
  - Improving online social connectivity
- Example Web 2.0 technologies
  - Blogs (WordPress, TypePad)
  - Wikis (Wikipedia)
  - Mashups (HousingMaps, ChicagoCrime)
  - Widgets/Gadgets (iGoogle, Netvibes)
  - Social networks (Facebook, MySpace, YouTube)
![Page 21: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/21.jpg)
Social Networking Sites/Platforms
![Page 22: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/22.jpg)
Web Portals / Scientific Gateways
- Aimed at providing a community of users access to computing resources through a common Web-based interface
- Web portal development tools
  - GridSphere (portlet based)
  - Web 2.0/Social Networking
- Examples
  - TeraGrid Scientific Gateways (over 30 of them)
  - nanoHUB
![Page 23: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/23.jpg)
Semantic Web
- Aimed at representing knowledge, not just information
- Connecting and relating data in a way understandable by machines
- Semantic Web standards
  - Resource Description Framework (RDF)
  - Web Ontology Language (OWL)
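RDF expresses statements as subject-predicate-object triples, which is what lets a machine relate pieces of data rather than just store them. The sketch below illustrates the idea in plain Python without an RDF library. The `rdf:type` and `rdfs:subClassOf` URIs are the standard ones; the `example.org` namespace and the class and resource names are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
EX = "http://example.org/earth-observation#"  # hypothetical namespace

# Each statement is a (subject, predicate, object) triple.
triples = {
    (EX + "scene_42", RDF_TYPE, EX + "SatelliteImage"),
    (EX + "SatelliteImage", RDFS_SUBCLASS_OF, EX + "Observation"),
    (EX + "Observation", RDFS_SUBCLASS_OF, EX + "Dataset"),
}


def superclasses(cls, graph):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if s == current and p == RDFS_SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def types_of(resource, graph):
    """Return asserted types plus those implied by the subclass hierarchy."""
    asserted = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    inferred = set()
    for cls in asserted:
        inferred |= superclasses(cls, graph)
    return asserted | inferred


print(sorted(types_of(EX + "scene_42", triples)))
```

Only one type is stated for `scene_42`, yet the subclass triples let the program conclude it is also an `Observation` and a `Dataset`. That is a small instance of representing knowledge rather than only information.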
![Page 24: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/24.jpg)
Confederation Bridge ICE Force Monitoring Project
- Monitoring of forces on the Confederation Bridge
- Data analyzed by civil engineering groups at University of Calgary and Carleton University
- GRC developed solution to automate data management as part of a CANARIE AAP project

(http://www.confederationbridge.com)
![Page 25: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/25.jpg)
ICE Force - Technologies Used
- Grid Middleware
  - GT4
- Data Management
  - Proactive Data Management Service (PDMS)
  - Data Transfer - GridFTP, RFT
  - Replication Management - RLS
  - Metadata Management - MCS
![Page 26: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/26.jpg)
Molecular Dynamics Simulations (GROMACS)
- GROMACS
  - Parallel molecular dynamics simulation application
  - Can simulate hundreds to millions of particles
  - Simulation runs can take days, weeks or months
- Issues with long running jobs
  - Fault tolerance
  - Scheduler policy constraints

(http://moose.bio.ucalgary.ca/)
![Page 27: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/27.jpg)
GROMACS - Grid Enabled Solution
- Automated grid enabled solution developed by GRC to manage GROMACS simulations as part of a CANARIE AAP project
- Long jobs split into a series of shorter jobs
- Automates checkpointing, migration and reconfiguration of jobs
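The idea of splitting a long job into a series of shorter jobs, each ending at a checkpoint, can be sketched as follows. This is a minimal illustration and not the GRC system or any GROMACS command. It assumes a simulation measured in steps and a scheduler limit expressed as a maximum number of steps per job, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    index: int
    start_step: int  # first simulation step this job computes
    end_step: int    # one past the last step; a checkpoint is written here


def plan_segments(total_steps, max_steps_per_job):
    """Split a long run into consecutive segments that each fit one job."""
    if total_steps < 0 or max_steps_per_job <= 0:
        raise ValueError("total_steps must be >= 0 and max_steps_per_job > 0")
    segments = []
    start = 0
    while start < total_steps:
        end = min(start + max_steps_per_job, total_steps)
        segments.append(Segment(len(segments), start, end))
        start = end
    return segments


def resume_point(segments, completed_checkpoints):
    """Return the first segment whose checkpoint is missing, or None if done."""
    for seg in segments:
        if seg.end_step not in completed_checkpoints:
            return seg
    return None


segments = plan_segments(total_steps=1000, max_steps_per_job=300)
print(segments)
print(resume_point(segments, completed_checkpoints={300, 600}))
```

Because each segment starts where the previous checkpoint ended, a failed or evicted job only loses the work of its own segment, and the next segment can be submitted to a different resource.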
![Page 28: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/28.jpg)
GROMACS - Portal
![Page 29: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/29.jpg)
GROMACS - Technologies Used
- Grid Middleware
  - GT4
- Information Services
  - WS MDS
- Data Management
  - PDMS (GridFTP, RFT, RLS, MCS)
- Execution Management
  - Custom system (Condor-G, WS GRAM)
- Portal
  - GridSphere
![Page 30: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/30.jpg)
Web Service based Grid Environment for Canada
- Established a GT4-based grid environment from resources across Canada (CANARIE CIIP)
![Page 31: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/31.jpg)
GT4-based Grid - Model Schemas
- Models developed to describe systems, applications and scheduler policy (GRC Model Schema)

[Figure: System Model Class Diagram]
![Page 32: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/32.jpg)
GT4-based Grid - Viewing Resource Information
- Used WebMDS, a customizable Web based interface for viewing resource information published by WS MDS
![Page 33: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/33.jpg)
GT4-based Grid - Technologies Used
- Grid Middleware
  - GT4
- Data Management
  - GridFTP, RFT
- Information Services
  - GRC Model Schema, WS MDS, WebMDS
- Execution Management
  - Condor-G, WS GRAM
![Page 34: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/34.jpg)
Example: Fire Simulation
- Developed a comprehensive environment for the Fire Dynamics Simulator (FDS) as part of a collaborative project between GRC and HP Labs
- Deployed on HP Labs Data Centre at University of Calgary
- Initial focus of project
  - Leverage Web 2.0 technologies
  - Explore use of virtualization in a utility/cloud computing environment
![Page 35: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/35.jpg)
Fire Simulation - Technologies Used
- User level
  - Web 2.0/social networking technology (Facebook)
- Service provider level
  - LAMP environment (Linux, Apache, MySQL, Perl/Python/PHP)
  - Simulation (FDS, Condor)
  - Visualization (Smokeview, VNC)
- Resource (utility) provider level
  - Cloud computing technology (ASPEN)
  - Virtual machine technology (Xen)
![Page 36: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/36.jpg)
Example: Rendering on the Cloud
- GRC created an on-demand cloud rendering service for EDM Studio
- Cybera Pilot Project
- Technologies used:
  - Cloud computing technology (ASPEN)
  - Virtual machine technology (Xen)
  - Social networking technology (Ning/Elgg)
![Page 37: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/37.jpg)
GeoChronos: An on-line platform
- For:
  - Earth Observation Scientists
- Facilitating:
  - Collaboration between scientists
  - Data access, management and sharing
  - Application access, management and sharing
- Leveraging:
  - Web 2.0 / social networking technologies (Elgg)
  - Semantic Web technologies (RDF, OWL)
  - Cloud computing and virtualization technologies (ASPEN, Xen)
![Page 38: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/38.jpg)
GeoChronos - Collaboration
- Social networking portal
  - Elgg-based (elgg.org)
- Social networking services
  - Blogs
  - Tags
  - Media/document sharing
  - Wikis
  - Friends/contacts
  - Groups
  - Discussions
  - Message boards
  - Calendars
  - Status
  - News
  - Feeds

http://geochronos.org/
![Page 39: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/39.jpg)
GeoChronos - Data
- Data Acquisition
  - Automated acquisition of data from sensors (ground, airborne, satellite) or third parties
- Data Storage
  - Store, share, browse and search data
    - e.g., spectral library
- Data Processing
  - Automated data workflows
    - e.g., mosaic, reproject and subset MODIS data
![Page 40: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/40.jpg)
GeoChronos - Applications
- Interactive Application Service (IAS)
  - On-line, on-demand access to scientific applications
  - Share application sessions and data with other users
  - Access control to applications
- Batch Processing Service
  - Batch processing environment for longer running data processing tasks or simulations
  - For use directly by individual users or as part of automated data workflows
![Page 41: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/41.jpg)
GeoChronos - Project Team
Principal Investigators
- Dr. Arturo Sanchez-Azofeifa, University of Alberta
- Dr. John Gamon, University of Alberta
- Dr. Benoit Rivard, University of Victoria
- Dr. Rob Simmonds, University of Calgary

Project Coordination
Platform Development
Domain Scientists
![Page 42: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/42.jpg)
GeoChronos - Virtual Organization
![Page 43: Cyberinfrastructure and its Role in Science](https://reader033.fdocuments.in/reader033/viewer/2022052823/55504fc6b4c905b2788b52dd/html5/thumbnails/43.jpg)
Contact Information
Cameron Kiddle
[email protected]
http://pages.cspc.ucalgary.ca/~kiddlec/
http://grid.ucalgary.ca/