Research Data Management as a Service
![Page 1: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/1.jpg)
computationinstitute.org www.globusonline.org
Research data management as a service
Ian Foster
![Page 2: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/2.jpg)
High energy physics
Molecular biology
Cosmology
Genetics
Metagenomics
Linguistics
Economics
Climate change
Visual arts
![Page 3: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/3.jpg)
What would a “Dropbox for science” look like?
![Page 4: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/4.jpg)
[Figure: data moving among Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, and Mirror, with failures along the way: “Quota exceeded”, “Expired credentials”, “Network failed. Retry.”, “Permission denied”]
It should be trivial to Collect, Move, Sync, Share, Analyze, Annotate, Publish, Search, Backup, & Archive BIG DATA … but in reality it’s often very challenging
![Page 5: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/5.jpg)
• Collect • Move • Sync • Share • Analyze
• Annotate • Publish • Search • Backup • Archive
BIG DATA …for
![Page 6: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/6.jpg)
• Collect • Move • Sync • Share • Analyze
• Annotate • Publish • Search • Backup • Archive
• Collect • Move • Sync • Share: capabilities delivered using a Software-as-a-Service (SaaS) model
![Page 7: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/7.jpg)
![Page 8: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/8.jpg)
Data Source → Data Destination
1. User initiates transfer request
2. Globus Online moves/syncs files
3. Globus Online notifies user
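The three transfer steps can be sketched as a toy, in-memory model. This is not the Globus Online API: `Endpoint`, `TransferService`, `submit`, and `direct_copy` are hypothetical names, and the retry and checksum-based sync behavior is an assumption about what "moves/syncs" might involve.

```python
import hashlib
from dataclasses import dataclass, field


def checksum(data: bytes) -> str:
    """Content fingerprint used to decide whether a file needs copying."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Endpoint:
    """Hypothetical stand-in for a storage endpoint: file name -> bytes."""
    name: str
    files: dict = field(default_factory=dict)


def direct_copy(destination, name, data):
    """Default copy step; a real service would use a transfer protocol here."""
    destination.files[name] = data


@dataclass
class TransferService:
    """Toy hosted service that runs a transfer on the user's behalf."""
    max_retries: int = 3
    notifications: list = field(default_factory=list)

    def submit(self, user, source, destination, send=direct_copy):
        # Step 1: the user only submits the request; the service does the rest.
        copied, skipped, failed = [], [], []
        # Step 2: move/sync files, skipping ones already identical at the
        # destination and retrying transient network failures.
        for name, data in sorted(source.files.items()):
            existing = destination.files.get(name)
            if existing is not None and checksum(existing) == checksum(data):
                skipped.append(name)
                continue
            for attempt in range(1, self.max_retries + 1):
                try:
                    send(destination, name, data)
                    copied.append(name)
                    break
                except ConnectionError:
                    if attempt == self.max_retries:
                        failed.append(name)
        # Step 3: notify the user with a summary of what happened.
        report = {"user": user, "copied": copied,
                  "skipped": skipped, "failed": failed}
        self.notifications.append(report)
        return report
```

In this sketch the caller never handles a network error directly; the service absorbs it and reports only the outcome, which is the point of the hosted model.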
![Page 9: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/9.jpg)
Data Source
1. User A selects file(s) to share; selects user/group, sets share permissions
2. Globus Online tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus Online and accesses shared file
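A minimal sketch of sharing in place, assuming the service stores only access records while the bytes stay on the source endpoint. `Share`, `SharingService`, and the `"r"`/`"rw"` permission strings are hypothetical and are not the Globus Online data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Share:
    """One access record: who may touch which path on which endpoint."""
    endpoint: str
    path: str
    grantee: str      # a user name or a group name
    permission: str   # "r" (read) or "rw" (read/write)


@dataclass
class SharingService:
    """Tracks shares only; file contents are never copied into the service."""
    groups: dict = field(default_factory=dict)   # group name -> set of users
    shares: list = field(default_factory=list)

    def share(self, endpoint, path, grantee, permission="r"):
        # Step 1: the owner picks a file, a user or group, and a permission.
        if permission not in ("r", "rw"):
            raise ValueError("permission must be 'r' or 'rw'")
        self.shares.append(Share(endpoint, path, grantee, permission))

    def can_access(self, user, endpoint, path, write=False):
        # Step 2: the service consults its records; no data is involved.
        for s in self.shares:
            if s.endpoint != endpoint or s.path != path:
                continue
            members = self.groups.get(s.grantee, set())
            if user != s.grantee and user not in members:
                continue
            if write and s.permission != "rw":
                continue
            return True
        return False

    def read(self, user, endpoint_name, endpoint_files, path):
        # Step 3: the recipient reads straight from the source endpoint.
        if not self.can_access(user, endpoint_name, path):
            raise PermissionError("Permission denied")
        return endpoint_files[path]
```

The sketch does not distinguish a user from a group with the same name; a real system would keep those namespaces separate.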
![Page 10: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/10.jpg)
Early adoption is encouraging
![Page 11: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/11.jpg)
Early adoption is encouraging
• 8,000 registered users; >100 daily
• ~16 PB moved; ~1B files
• 10x (or better) performance vs. scp
• 99.9% availability
• Entirely hosted on Amazon
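Two of these figures are easier to read after a little arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes) and a 365-day year; both are assumptions, since the slide gives only rounded totals.

```python
# Rounded totals taken from the slide; unit conventions are assumptions.
BYTES_MOVED = 16e15          # ~16 PB, assuming 1 PB = 10**15 bytes
FILES_MOVED = 1e9            # ~1B files
AVAILABILITY = 0.999         # 99.9%
HOURS_PER_YEAR = 365 * 24    # ignoring leap years


def mean_file_size_mb(total_bytes, n_files):
    """Average file size in megabytes (10**6 bytes)."""
    return total_bytes / n_files / 1e6


def downtime_hours_per_year(availability):
    """Hours per year a service at this availability may be unreachable."""
    return (1.0 - availability) * HOURS_PER_YEAR


print(f"mean file size: {mean_file_size_mb(BYTES_MOVED, FILES_MOVED):.0f} MB")
print(f"allowed downtime: {downtime_hours_per_year(AVAILABILITY):.2f} h/year")
```

Under these assumptions the average file is about 16 MB, and 99.9% availability corresponds to a little under nine hours of downtime per year.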
![Page 12: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/12.jpg)
Globus Online already does a lot
Globus Toolkit
Sharing Service
Transfer Service
Globus Nexus (Identity, Group, Profile)
Globus Online APIs
Globus Connect
![Page 13: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/13.jpg)
We are also adding capabilities
Globus Toolkit
Sharing Service
Transfer Service
Globus Nexus (Identity, Group, Profile)
Globus Online APIs
Globus Connect
![Page 14: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/14.jpg)
We are also adding capabilities
Globus Toolkit
Sharing Service
Transfer Service
Dataset Services
Globus Nexus (Identity, Group, Profile)
Globus Online APIs
Globus Connect
![Page 15: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/15.jpg)
Expanding Globus Online services
• Ingest and publication
  – Imagine a Dropbox that not only replicates, but also extracts metadata, catalogs, converts
• Cataloging
  – Virtual views of data based on user-defined and/or automatically extracted metadata
• Computation
  – Associate computational procedures, orchestrate application, catalog results, record provenance
![Page 16: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/16.jpg)
Builds on catalog as a service
Approach
• Hosted user-defined catalogs
• Based on tag model <subject, name, value>
• Optional schema constraints
• Integrated with other Globus services
Three REST APIs
/query/
• Retrieve subjects
/tags/
• Create, delete, retrieve tags
/tagdef/
• Create, delete, retrieve tag definitions
Builds on USC Tagfiler project (C. Kesselman et al.)
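The tag model and the three groups of operations can be illustrated with an in-memory sketch. This is not the catalog service's or Tagfiler's actual API: `TagCatalog` and its method names are hypothetical, and treating a "schema constraint" as an optional Python type check is an assumption.

```python
class TagCatalog:
    """Toy catalog of <subject, name, value> tags with optional type checks."""

    def __init__(self):
        self.tagdefs = {}   # tag name -> required Python type, or None
        self.tags = {}      # subject -> {tag name: value}

    # Operations in the spirit of /tagdef/: manage tag definitions.
    def define_tag(self, name, value_type=None):
        self.tagdefs[name] = value_type

    def delete_tagdef(self, name):
        self.tagdefs.pop(name, None)
        for subject_tags in self.tags.values():
            subject_tags.pop(name, None)

    # Operations in the spirit of /tags/: create, delete, retrieve tags.
    def set_tag(self, subject, name, value):
        if name not in self.tagdefs:
            raise KeyError(f"undefined tag: {name}")
        expected = self.tagdefs[name]
        if expected is not None and not isinstance(value, expected):
            raise TypeError(f"tag {name!r} expects {expected.__name__}")
        self.tags.setdefault(subject, {})[name] = value

    def delete_tag(self, subject, name):
        self.tags.get(subject, {}).pop(name, None)

    def get_tags(self, subject):
        return dict(self.tags.get(subject, {}))

    # Operation in the spirit of /query/: retrieve matching subjects.
    def query(self, **criteria):
        return sorted(
            subject
            for subject, subject_tags in self.tags.items()
            if all(subject_tags.get(k) == v for k, v in criteria.items())
        )
```

A query such as `catalog.query(type="3dtomo", format="HDF5")` then acts as a virtual view: it selects subjects by their metadata rather than by where the files live.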
![Page 17: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/17.jpg)
Tomography
mydata42: owner: Francesco; type: 3dtomo; format: HDF5; beamline: 2BM
[Figure: tomography workflow; diagram labels: Orchestration, Organization]
• Define dataset
• Infer type
• Extract metadata
• Populate catalog(s)
• Locate datasets
• Access files
• analyze
• Catalog derived products
• transfer/schedule
• Record provenance
• Annotate, share, browse, search
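The workflow in this figure can be sketched end to end in a few functions. Everything here is hypothetical: the `Dataset` class, the suffix-to-format table, and the idea of recording provenance as a list of input dataset names are assumptions made for illustration, not the Globus Online design.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    files: list
    metadata: dict = field(default_factory=dict)
    derived_from: list = field(default_factory=list)   # provenance record


# Assumed mapping used to infer a format from file names.
FORMAT_BY_SUFFIX = {".h5": "HDF5", ".hdf5": "HDF5", ".tif": "TIFF"}


def infer_format(files):
    """Infer type: guess the format from the first recognizable suffix."""
    for f in files:
        for suffix, fmt in FORMAT_BY_SUFFIX.items():
            if f.lower().endswith(suffix):
                return fmt
    return "unknown"


def ingest(catalog, name, files, **user_tags):
    """Define dataset, extract metadata, and populate the catalog."""
    ds = Dataset(name, list(files))
    ds.metadata["format"] = infer_format(ds.files)
    ds.metadata.update(user_tags)
    catalog[name] = ds
    return ds


def locate(catalog, **criteria):
    """Locate datasets whose metadata matches every criterion."""
    return [d for d in catalog.values()
            if all(d.metadata.get(k) == v for k, v in criteria.items())]


def analyze(catalog, inputs, output_name, procedure):
    """Run a procedure, catalog the derived product, record provenance."""
    out_files = procedure([f for d in inputs for f in d.files])
    out = ingest(catalog, output_name, out_files, procedure=procedure.__name__)
    out.derived_from = [d.name for d in inputs]
    return out
```

For example, ingesting `mydata42` with the tags shown on the slide, locating it by `type="3dtomo"`, and passing it to a reconstruction function through `analyze` leaves a derived dataset in the catalog whose `derived_from` field points back to `mydata42`.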
![Page 18: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/18.jpg)
Our challenge: Sustainability
We are a non-profit service provider to the non-profit research community
![Page 19: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/19.jpg)
Globus Online Provider Plans
Support ongoing operations
Offer value-added capabilities
Engage more closely with users
![Page 20: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/20.jpg)
Provider Plans offer…
• Provider endpoints with sharing
• Multiple GridFTP servers per endpoint
• Branded web sites
• Alternate identity provider
• Usage reporting
• MSS optimizations
• Operations monitoring and management
• Input into and access to product roadmap
Starting at $20k per year
![Page 21: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/21.jpg)
Thanks to great colleagues and collaborators
• Steve Tuecke, Rachana Ananthakrishnan, Kyle Chard, Raj Kettimuthu, Ravi Madduri, Tanu Malik, and many others at Argonne & UChicago
• Carl Kesselman, Karl Czajkowski, Rob Schuler, and others at USC/ISI
• Birali Runesha and others at UChicago Research Computing Center
![Page 22: Research Data Management as a Service](https://reader034.fdocuments.in/reader034/viewer/2022052310/554e9fdcb4c905fb7c8b45d2/html5/thumbnails/22.jpg)
Thank you to our sponsors!