WGISS Workshop, September 12, 2006 Cyberinfrastructure for Research and Education and its challenges...



WGISS Workshop, September 12, 2006

Cyberinfrastructure for Research and Education and its challenges

Dr. Sebastien Goasguen
Dr. Carol Song

Rosen Center for Advanced Computing
Purdue University

[email protected]
http://www.rcac.purdue.edu

(Work presented here is supported by OCI-0438246 (NMI nanoHUB) and OCI-0503992 (TeraGrid RP))


Highlights

• Infrastructure building
– Community clusters, cycle harvesting across campus, high speed network links, storage capacity
– Data collections

• System interoperability
– Integrate computing infrastructures
– Integrate services

• Enabling multidisciplinary research and education
– Most design decisions are guided by this principle


Outline

• TeraGrid
– HPC through community resources and interoperability

• NanoHUB Science Gateway
– Online Simulations & Education

• Multidisciplinary Data Management
– Data source aggregation
– Data Management
– Workflow

• Seamless integration of grids and services
– Integration with Education (Sakai, podcast, Merlot)
– NanoHUB & TeraGrid, OSG
– TG & campus infrastructures

(Charlie Catlett, TeraGrid Director, ANL)

$150M


TeraGrid

• Grid Infrastructure Group (U Chicago)
– TG integration, planning, management and coordination

• Resource Partners
– 9 partners
– Provide system resources, user support
– Provide access to resources through policies, software and other mechanisms

• Individual PIs access TG high performance computing resources through unified user support, coordinated software and services, and extensive documentation and training.


TG Internals

• Globus-enabled resources
• GT2 or GT4 WSRF
• ssh using standard Unix practices
• Globus submit
• Condor-G submit
• Wrapper scripts to get work done (compute, store, move data)
• PBS, LSF, etc.: local batch systems and schedulers
• Condor talks to Globus, which talks to the scheduler
• CTSS software stack: HPC + Grid
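
The chain on this slide (Condor-G hands a job to Globus, which hands it to the local batch scheduler) is usually driven by a Condor submit description file. The following is a minimal sketch in Python that renders such a file, assuming a GT2 gatekeeper. The host name `tg-login.example.org`, the `jobmanager-pbs` service name and the wrapper script name are placeholders, not values from the talk.

```python
def condor_g_submit(gatekeeper, jobmanager, executable, arguments=""):
    """Render a Condor-G submit description for a GT2 (pre-WS GRAM) resource.

    All argument values are illustrative; nothing here contacts a grid.
    """
    lines = [
        "universe = grid",
        # "gt2 <host>/<jobmanager>" names the Globus gatekeeper and the
        # local batch system behind it (PBS, LSF, ...).
        f"grid_resource = gt2 {gatekeeper}/{jobmanager}",
        f"executable = {executable}",
    ]
    if arguments:
        lines.append(f"arguments = {arguments}")
    lines += [
        "output = job.out",
        "error = job.err",
        "log = job.log",
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host, job manager and wrapper script.
    print(condor_g_submit("tg-login.example.org", "jobmanager-pbs",
                          "run_simulation.sh", "input.dat"))
```

The rendered text would be saved to a file and passed to `condor_submit`; the wrapper script named as the executable is where the compute, store and data-movement steps would live.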


nanoHUB: online simulation and more


NanoHUB Middleware


nanoHUB infrastructure: remote access to simulators and compute power

[Diagram labels: Browser (VNC); internet; nanoHUB.org web site; “easy” access; Virtual Machine; Physical Machine; Condor-G; Globus; NMI Cluster; Cluster; TeraGrid]


Example: CNT simulation


NanoHUB Learning Modules


nanoHUB – Sakai Integration for Assessment Services

• Assessment of learning impact is a key metric
• Sakai – Service-oriented Assessment Service Integration


SAKAI Integration Architecture

[Diagram labels: nanoHUB Learning Modules; WS Client (sakai_mambo.php); session-based launch; SakaiLogin.jws; SakaiSite.jws; Axis; WS End Point; Service Interface (i.e. API); Presentation; Tool Layout; Tool Code; Application Services; SAF Common Services; SAF Kernel. Sakai layers shown: Framework, Application, Web Svcs]


Workspaces


nanoHUB Internals

Delegated trust

Local Virtual Machines

Migratable

Isolated from Local infrastructure

VIOLIN Virtual Cluster

Virtual Infrastructure over WAN


Purdue TG Data Management System

TeraGrid network
- Provides HPC, storage resources

Multidisciplinary scientific data
- Remote sensing, weather, modeling data

SRB middleware system developed at SDSC
- Provides distributed data management
- Logical and System Attributes

Server-side data processing tools
- OPeNDAP/THREDDS data server

Web Services interface
- File query, File listing, Metadata query, File download

Purdue TG data portal
- JSR-168 compliant portlets, based on Gridsphere
- Uses SRB Jargon API for data access


LARS Dataset (Laboratory for Applications of Remote Sensing)

• Multispectral and Hyperspectral remote sensing images for Indiana

• ERDAS LAN, Leica Geosystems Imagine, GeoTIFF, and HDF formats

• 1972 to 2004

• IndianaView Glovis web access
– Part of the AmericaView initiative
– Funded through USGS
– Graphical Interface for viewing and downloading remote sensing image data
– http://indianaview.envision.purdue.edu/glovis/index.htm


PTO Satellite Data (Purdue Terrestrial Observatory)

• GOES-GVAR sensor (L band), 3.7 m fixed antenna, Feb. 2005.

• Terra-MODIS, Aqua-MODIS, NOAA-AVHRR and FY1-MVISR sensors (L and X band), 4.27 m tracking antenna, April 2006.

• 10-node cluster data processing and visualization server, more than 25 different products.


National Weather Service Data

• Next Generation Radar (NEXRAD) Level II data

• 159 Weather Surveillance Radar-1988 Doppler (WSR-88D) sites

• Real-time streaming, high-resolution data from the national network

• Reflectivity, mean radial velocity, and spectrum width

• One of the four top-level distributors

• THREDDS/OPeNDAP data servers


CCSM Climate Simulation Data

• Community Climate System Model (CCSM) to simulate climate change on Earth

• Ocean, Land, and Atmospheric models

• NetCDF format

• OPeNDAP server provides post-processing functionalities
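
Subsetting is one kind of server-side processing an OPeNDAP server can perform: the client appends a constraint expression to the dataset URL and the server returns only the requested hyperslab. The following is a minimal sketch of building such a URL under DAP2 conventions. The server address, the file name and the variable name `TS` are hypothetical and are not taken from the Purdue server.

```python
def opendap_subset_url(dataset_url, variable, slices, response="ascii"):
    """Build a DAP2 request URL for a hyperslab of one variable.

    slices:   one (start, stride, stop) tuple per dimension; in DAP2
              constraint expressions the stop index is inclusive.
    response: DAP2 suffix such as "dds", "das", "dods" or "ascii".
    """
    hyperslab = "".join(f"[{start}:{stride}:{stop}]"
                        for start, stride, stop in slices)
    return f"{dataset_url}.{response}?{variable}{hyperslab}"


# Hypothetical dataset: first 12 time steps, every 4th grid point.
url = opendap_subset_url(
    "http://example.org/opendap/ccsm/run1.nc",
    "TS",
    [(0, 1, 11), (0, 4, 63), (0, 4, 127)],
)
print(url)
```

Because the subsetting happens on the server, a client studying one variable over a small region does not need to download the whole NetCDF file.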


Architecture

1. Data Capture
– Commercial vendor HW, SW
– Data drivers to
  • Harvest and register meta data
  • Ingest data to SRB server
  • Normalize application data to standards

2. SRB (Storage Resource Broker) - SDSC
– Stores data in logical collections, associated with meta data.
– Stores raw and processed data for access
– Meta data catalog (MCAT) in SDSC, data servers at Purdue.

3. Application layer: Integrates applications for enhanced data access
– THREDDS (Thematic Real-time Environmental Distributed Data Services) for Doppler radar data
– OPeNDAP (Open-source Project for a Network Data Access Protocol) for climate modeling data

4. Presentation layer
– Gridsphere based portlets: browse, search, download data.
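
The core idea in layer 2, logical collections whose entries carry metadata and point to physical storage held elsewhere, can be illustrated without any SRB software. The sketch below is a toy in-memory catalog, not the SRB or MCAT API; every class, path and attribute name in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    logical_path: str   # name users see, independent of storage location
    physical_url: str   # where the bytes actually live
    metadata: dict = field(default_factory=dict)


class ToyCatalog:
    """In-memory stand-in for a metadata catalog over logical collections."""

    def __init__(self):
        self._entries = {}

    def register(self, logical_path, physical_url, **metadata):
        # Data-capture step: store harvested metadata with the object.
        self._entries[logical_path] = CatalogEntry(
            logical_path, physical_url, dict(metadata))

    def list_collection(self, collection):
        # A collection is every logical path under a common prefix.
        prefix = collection.rstrip("/") + "/"
        return sorted(p for p in self._entries if p.startswith(prefix))

    def query(self, **wanted):
        # Logical paths whose metadata match all requested values.
        return sorted(
            e.logical_path for e in self._entries.values()
            if all(e.metadata.get(k) == v for k, v in wanted.items()))


catalog = ToyCatalog()
catalog.register("/purdue/nexrad/site1/scan1", "file:///data/a1",
                 sensor="WSR-88D", level="II")
catalog.register("/purdue/ccsm/run1.nc", "file:///data/b7",
                 format="NetCDF")
print(catalog.query(sensor="WSR-88D"))
print(catalog.list_collection("/purdue/ccsm"))
```

Keeping the logical name separate from the physical location is what allows the catalog and the data servers to sit at different sites, as on the slide.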


Data Access

• Command line (SRB S-commands)
– Sinit, Sls, Sget, Sexit

• Web Interface: MySRB
• Windows GUI Client: inQ
• OPeNDAP/THREDDS clients
• Purdue Environmental Data Portal
• Web Services
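
A typical command-line session with the four S-commands named on this slide opens a session, lists a collection, fetches an object, and closes the session. The following is a minimal sketch that assembles that sequence and executes it only when asked. The collection and object names are hypothetical, and actually running the commands assumes the SRB client tools are installed and configured on the machine.

```python
import subprocess


def srb_fetch_commands(collection, obj_name, local_dir="."):
    """Return the S-command invocations for one list-and-download session."""
    return [
        ["Sinit"],                                        # open the session
        ["Sls", collection],                              # list the collection
        ["Sget", f"{collection}/{obj_name}", local_dir],  # copy object locally
        ["Sexit"],                                        # close the session
    ]


def run_session(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Hypothetical collection and object; dry run only prints the commands.
run_session(srb_fetch_commands("/home/demo.purdue/lars", "scene1.tif"))
```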


Purdue Environmental Data Portal


Climate Data Processing Workflow


Bioscience Data & Applications


Security Challenges of Interoperable Grid Infrastructure

[Diagram labels: Services; SOA; Certificate Delegation; Trust level]


SOA with Authorization

[Diagram labels: Services; Certificate Delegation; Policies; Trust level; Attribute Server; Authorization Policy]


[Diagram labels: Front end; Back end; Apache Web server; Mambo; PHP scripting; nanoShib; Username/Password; LDAP; Shibboleth IdP; AA; SAML authentication assertion; Attribute request; SAML assertion; Virtual Machine; Exec(Condor_Submit); <SAML> grid_proxy_init; nanoHUB Community Credential (Username + Shibboleth IdP Id); Globus Gatekeeper; TG RP]

Numbered steps in the diagram:
(1)-(3) unlabeled
(4) Attribute-based policies
(5) Globus request
(6) Attributes request
(7) SAML authorization assertion

Policy Information Point

SAML-enabled attribute handlers for GT4
- extract SAML assertion from proxy
- query Shib AA based on SAML assertion from proxy
- render access control decision based on attributes from Shib AA
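
The three handler actions listed on this slide (extract the assertion from the proxy, query the Shibboleth attribute authority, decide from the returned attributes) amount to a small attribute-based access control check. The sketch below shows only that decision logic in plain Python. It is not the GT4 or Shibboleth API: the attribute authority is stubbed with a dictionary, and all function names, attribute keys and policy values are hypothetical.

```python
def extract_assertion(proxy):
    """Stand-in for pulling the embedded SAML assertion out of a proxy."""
    return proxy["saml_assertion"]   # here just (username, idp_id)


def query_attribute_authority(assertion, directory):
    """Stand-in for the Shib AA lookup; returns the user's attributes."""
    return directory.get(assertion, {})


def is_permitted(attributes, policy):
    """Permit only if every policy attribute has an allowed value."""
    return all(attributes.get(name) in allowed
               for name, allowed in policy.items())


# Stubbed attribute authority keyed by (username, IdP id).
directory = {
    ("alice", "idp.example.edu"): {"affiliation": "faculty",
                                   "project": "nanohub"},
    ("bob", "idp.example.edu"): {"affiliation": "guest"},
}
# Attribute-based policy held by the resource provider.
policy = {"affiliation": {"faculty", "student"}, "project": {"nanohub"}}

for user in ("alice", "bob"):
    proxy = {"saml_assertion": (user, "idp.example.edu")}
    attrs = query_attribute_authority(extract_assertion(proxy), directory)
    print(user, "permit" if is_permitted(attrs, policy) else "deny")
```

In this sketch an unknown user yields an empty attribute set and is denied, so the check fails closed.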


Rosen Center for Advanced Computing
Purdue University

IT Org supporting 21st Century Science

User Support

Science Gateway

Grid protocols

Systems Group

Security