Data Gateways for Scientific Communities Birds of a Feather (BoF) Tuesday, June 10, 2008 Craig Stewart (Indiana University) [email protected] Chris Jordan (Texas Advanced Computing Center, University of Texas at Austin) [email protected] Stephen Simms (Indiana University) [email protected]

Data Gateways for Scientific Communities

Birds of a Feather (BoF)
Tuesday, June 10, 2008

Craig Stewart (Indiana University) [email protected]

Chris Jordan (Texas Advanced Computing Center, University of Texas at Austin) [email protected]

Stephen Simms (Indiana University) [email protected]

License Terms

• Please cite this presentation as: Stewart, C.A., C. Jordan, S. Simms. 2008. Data gateways for scientific communities – Birds of a Feather Session. (Presentation). 10 June 2008, TeraGrid ’08 Conference, 9-13 June, Las Vegas, NV. Available from: http://hdl.handle.net/2022/14528

• Portions of this document that originated from sources outside IU are shown here and used by permission or under licenses indicated within this document.

• Items indicated with a © are under copyright and used here with permission. Such items may not be reused without permission from the holder of copyright except where license terms noted on a slide permit reuse.

• Except where otherwise noted, the contents of this presentation are copyright 2008 by the Trustees of Indiana University. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – to copy, distribute and transmit the work and to remix – to adapt the work under the following conditions: attribution – you must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). For any reuse or distribution, you must make clear to others the license terms of this work.

Science Gateways

www.nanohub.org

A Science Gateway is a domain-specific computing environment, typically accessed via the Web, that provides a scientific community with end-to-end support for a particular scientific workflow.

TeraGrid and Data Collections

• A long list of the data collections TeraGrid supports is available from the TeraGrid User Portal (https://portal.teragrid.org/)

• What may be much more important is what the TeraGrid can provide as a hosting environment for Data Collections – particularly when provided via some sort of gateway service

U. Chicago SIDGrid (sidgrid.ci.uchicago.edu)

MutDB and CICC (www.mutdb.org, www.chembiogrid.org)

Groups involved in Data activities

• Technology:
  - Data Working Group
  - “Data Architecture” Group

• User-Focused:
  - Data Collections Working Group
  - Science Gateways Working Group

• Maybe a bit too complex…

What does TeraGrid provide?

• Many individual data resources
  - Parallel file systems
  - Archives
  - Global file systems (GPFS-WAN/Lustre)

• Some data management tools
  - Storage Resource Broker
  - Replica Location Services

• Data management is poorly organized

Service-oriented models?

• Theory: Gateways utilize familiar (web) interfaces to ease access to computer-based resources

• Many commercial web services do the same thing

- From Flickr to Amazon S3

• Can TeraGrid (more) effectively use this model?

What should TeraGrid Provide, and what do you need?

• More emphasis on metadata?

• More support for forming new collections?

• More globally accessible file systems (or apparent file systems)?

• Better access mechanisms for organized, tagged metadata?

• Most of what I hear is about access and interfaces?

• Lots of discussions going on around gateways, data collections, long-term preservation, metadata, search mechanisms, etc., etc., etc. We need open discussions and feedback from the community!

• What can the TeraGrid do to provide data-centric services more widely, effectively, and easily for the data providers?
