Matt Jones, Dave Vieglais, Bruce Wilson - DataONE
DataONE Cyberinfrastructure
Matt Jones, Dave Vieglais, Bruce Wilson
Foremost a Federation
Member Nodes (MNs)
• Heart of the federation
• Harness the power of local curation
Coordinating Nodes (CNs)
• Services to link Member Nodes
Investigator Toolkit (ITK)
• Tools for the whole data lifecycle
Interoperability
Requirements for DataONE
• Scalable
• Usable by people and agents
• Resilient to technical and institutional change
• Adaptive to evolving standards
• Inclusive of existing communities and tools
• Cognizant of sociological drivers
• Informed by prior and current work
Why a Federation?
Diverse Federation == Resilience
• Failover for temporary outages
• Insurance against project/institutional failure
Diverse Federation == Scalability
• Storage increases with Member Nodes
• Incremental costs to each MN to replicate
• Distributes sustainability costs
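The "failover for temporary outages" point above can be illustrated with a minimal sketch: a client tries each node known to hold a copy of an object and returns the first successful response. The function names, the node names, and the `fetch` callback are hypothetical and are not part of any DataONE API; a real client would make network requests where the stand-in `demo_fetch` is used here.

```python
from typing import Callable, Iterable


class AllReplicasFailed(Exception):
    """Raised when no node holding a copy could serve the object."""


def fetch_with_failover(
    pid: str,
    replica_nodes: Iterable[str],
    fetch: Callable[[str, str], bytes],
) -> bytes:
    """Try each node in order and return the first successful response."""
    errors = []
    for node in replica_nodes:
        try:
            return fetch(node, pid)
        except ConnectionError as exc:
            # Remember the failure and fall through to the next replica.
            errors.append(f"{node}: {exc}")
    raise AllReplicasFailed(f"{pid} unavailable; tried " + "; ".join(errors))


def demo_fetch(node: str, pid: str) -> bytes:
    """Hypothetical stand-in for a network call; 'mn-a' simulates an outage."""
    if node == "mn-a":
        raise ConnectionError("temporary outage")
    return f"{pid} served by {node}".encode()
```

With two replicas, a request that fails on `mn-a` is served by `mn-b`; if every node fails, the caller gets a single error listing each attempt.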
Member Nodes
Authoritative members of the Federation
• Curate their own data holdings
  • Provide unique identifiers for each object
  • Ensure availability, quality, and reliability
• Replicate holdings for other MNs
• Provide access and access control
• Log and report accesses to objects
• Engage with DataONE community
• Deploy a DataONE-compatible software system
Implementation Tiers
Tier 1: Supports publicly readable content without authentication or more specific access control rules.
Tier 2: Tier 1 plus access control support.
Tier 3: Tier 2 plus the ability to add content through the DataONE service interfaces, with full support for interaction with DataONE Investigator Toolkit applications and plugins.
Tier 4: Supports the full set of DataONE APIs and can operate as a replication target, accepting content from compatible (technical and policy) Member Nodes and fully supporting the DataONE content access control rules.
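Because each tier includes everything in the tier below it, the tiers can be modeled as an ordered enumeration, so that checking a capability becomes a simple comparison. This is only a sketch: the enum member names, `supports`, `replication_targets`, and the node names are illustrative assumptions, not DataONE identifiers.

```python
from enum import IntEnum
from typing import Dict, List


class Tier(IntEnum):
    """Cumulative implementation tiers; the member names are illustrative."""
    READ = 1            # publicly readable content, no authentication
    ACCESS_CONTROL = 2  # Tier 1 plus access control
    WRITE = 3           # Tier 2 plus adding content via service interfaces
    REPLICATION = 4     # full API set, can act as a replication target


def supports(node_tier: Tier, required: Tier) -> bool:
    """A node supports a capability if its tier is at least the required one."""
    return node_tier >= required


def replication_targets(nodes: Dict[str, Tier]) -> List[str]:
    """Return the names of nodes that can accept replicas (Tier 4)."""
    return sorted(
        name for name, tier in nodes.items()
        if supports(tier, Tier.REPLICATION)
    )
```

For example, given `{"mn-a": Tier.READ, "mn-b": Tier.REPLICATION}`, only `mn-b` qualifies as a replication target.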
Characterizing Member Nodes
Diverse Contributors
• Individual investigators
• Field stations and networks
• Government agencies
• Non-profit partnerships
• Scientific Societies
• Synthesis centers
[Figure: Data Sizes. Bar chart of percentage (%, axis 0 to 60) by size class in MB: < 1, 1-10, 10-200, > 200]
Data Types
• Ecological
• Environmental
• Demographic
• Social/Legal/Economic
Coordinating Nodes
Provide coordinating services
• Search and Discovery
• Preservation monitoring
• Object tracking and replica management
• User identity management
• Logging and monitoring
Optimized
• High availability
• Performance
• Scalability
The Investigator Toolkit
• Discovery tools
• Data Management tools
• Analysis and modeling tools
• Citation and publication tools
Investigator Toolkit: Web Interface; Analysis, Visualization; Data Management
Client Libraries: Java, Python, Command Line
Data Lifecycle
Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze
[Figure: Data Lifecycle diagram repeated, with Morpho labeled]
Identify objects
Goal: Uniquely identify data or metadata objects
• Support the several identifier types widely used
• Identifiers assigned by Member Nodes
• Uniqueness ensured by Coordinating Nodes
• Resolution through Coordinating Nodes
Examples: LSID, PURL, GUID {3F2504E0-4…
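The division of labor above (Member Nodes assign identifiers, Coordinating Nodes ensure uniqueness and resolve them) can be sketched as a small in-memory registry. The class and method names are hypothetical and do not reflect the actual DataONE service interfaces. Identifiers are treated as opaque strings so that any identifier scheme can be registered.

```python
from typing import Dict, List


class DuplicateIdentifier(Exception):
    """Raised when a Member Node tries to register an identifier already in use."""


class IdentifierRegistry:
    """Hypothetical Coordinating Node view: identifier -> nodes holding a copy."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[str]] = {}

    def register(self, pid: str, member_node: str) -> None:
        """Record a new identifier assigned by a Member Node; reject duplicates."""
        if pid in self._locations:
            raise DuplicateIdentifier(pid)
        self._locations[pid] = [member_node]

    def add_replica(self, pid: str, member_node: str) -> None:
        """Note that another node now holds a copy (KeyError if pid is unknown)."""
        self._locations[pid].append(member_node)

    def resolve(self, pid: str) -> List[str]:
        """Return every node known to hold the object (KeyError if unknown)."""
        return list(self._locations[pid])
```

Keeping the registry at the coordinating layer is what lets a second Member Node be refused when it tries to reuse an identifier that another node has already assigned.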
Identify people
• Identity provider selected by the user
• Member Nodes define access rules
• Rules propagated by Coordinating Nodes
• Identity and access control consistent across entire infrastructure
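One way to picture access rules that are defined once by a Member Node and then evaluated the same way everywhere is a list of (subject, permission) pairs checked by a single function. The sketch below is an assumption for illustration only: the permission levels, the `public` pseudo-subject, and the subject strings are hypothetical, not the DataONE access policy format.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable


class Permission(IntEnum):
    """Ordered so that a higher permission implies the lower ones."""
    READ = 1
    WRITE = 2


@dataclass(frozen=True)
class AccessRule:
    subject: str            # identity as asserted by the user's identity provider
    permission: Permission


PUBLIC = "public"  # hypothetical pseudo-subject that matches every user


def is_allowed(subject: str, requested: Permission,
               rules: Iterable[AccessRule]) -> bool:
    """Grant access if any rule for this subject (or the public) covers the request."""
    return any(
        rule.subject in (subject, PUBLIC) and rule.permission >= requested
        for rule in rules
    )
```

Because every node would evaluate the same propagated rule list with the same function, a request gets the same answer regardless of which copy of the object it reaches.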
Deposit Data and Metadata
Science metadata
• EML, FGDC, DC, ISO, DIF, …
System metadata
• Globally unique IDs for data & metadata (DOI, GUID, Hdl, …)
• Checksums of objects
• Object policies
[Diagram labels: KNB, Generic, Native, Proxy]
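As a rough illustration of the system metadata listed above, the sketch below builds a record holding an identifier, a checksum, and one policy flag for a deposited object. The field names and the choice of SHA-256 are assumptions made for this example; they are not the DataONE system metadata schema.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemMetadata:
    """Illustrative record; field names are assumptions, not the real schema."""
    identifier: str
    checksum_algorithm: str
    checksum: str
    size: int
    replication_allowed: bool  # stands in for an "object policy"


def describe_object(identifier: str, content: bytes,
                    replication_allowed: bool = True) -> SystemMetadata:
    """Compute the checksum and size of an object at deposit time."""
    return SystemMetadata(
        identifier=identifier,
        checksum_algorithm="SHA-256",
        checksum=hashlib.sha256(content).hexdigest(),
        size=len(content),
        replication_allowed=replication_allowed,
    )
```

Recording the checksum at deposit time gives later copies of the object something to be verified against.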
Preserve Data and Metadata
• Metadata mirrored at Coordinating Nodes
• Data replicated between Member Nodes
• CNs manage copies
• Checksums recorded and verified
• Promote quality metadata
Coordinating Nodes
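"Checksums recorded and verified" can be sketched as an audit that recomputes the checksum of each replica and compares it with the value recorded at deposit. The function names, the node names, and the use of SHA-256 are assumptions for illustration; a real audit would read replica content from the nodes rather than from an in-memory dictionary.

```python
import hashlib
from typing import Dict, List


def sha256_hex(content: bytes) -> str:
    """Checksum helper; the algorithm choice is an assumption for this sketch."""
    return hashlib.sha256(content).hexdigest()


def find_corrupt_replicas(recorded_checksum: str,
                          replicas: Dict[str, bytes]) -> List[str]:
    """Return the nodes whose copy no longer matches the recorded checksum."""
    return sorted(
        node for node, content in replicas.items()
        if sha256_hex(content) != recorded_checksum
    )
```

A node reported by the audit could then be refreshed from a copy that still verifies.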
Discover Content
Integrate and Analyze
[Figure: water temperature (bottom, 10m ADCP). x-axis: Time, 01:00 to 17:00; y-axis: Temperature, degrees C, 24.20 to 24.50]
Graphs and derived data can be archived in DataONE
Analysis and Visualization
Spatio-Temporal Exploratory Model identifies factors affecting patterns of migration
Diverse bird observations and environmental data from 300,000 locations in the US integrated and analyzed using High Performance Computing Resources
Land Cover
Meteorology
MODIS – Remote sensing data
Slide from S. Kelling
• Examine patterns of migration
• Infer how climate change may affect bird migration
Model results
Occurrence of Swainson’s Hawk
Jan, Apr, Jun, Sep, Dec
DataONE System Overview
DataONE Activities Through Year 2
Deploy core infrastructure supporting four fundamental services:
• Persistent, unique identifiers
• Bit-level preservation
• Search and retrieval
• Federated identity
Along with:
• Build out and deployment of Member Nodes
• Add ITK functionality
• Test, test, test
• Ramp up R&D on additional features
Software Delivered at Public Release
Investigator Toolkit Software: Search Portal, R Client, Morpho, Zotero, Fuse FS, Excel, Mendeley, …
Client Libraries: Java, Python, Command Line
DataONE Service Programming Interface (SPI)
Member Node Software: Metacat, Dryad, GMN, CUAHSI, Merritt
Coordinating Node Software
• Service Interfaces: Preservation, Monitor, Catalog, Identifiers, Replication, Discovery, Resolution, Registration
• Object Store, Index
DataONE Activities: Years 3-5
• Data sub-setting, transformation
• Visualization
• Workflow support
• Semantic search
• Semantic data integration
• Computational, or specialized nodes
• Investigator Toolkit expansion
DMP-Tool
Cyberinfrastructure Outline
• CI Architecture, Requirements, and Design
• Member Nodes
• Coordinating Nodes
• Investigator Toolkit
• Demonstrations
Demonstrations
Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze
Describing and depositing with Morpho
Data discovery
File system access
R plugin demonstration
Value of DataONE
• Discovery and access: Enabling discovery and universal access to data about life on earth from around the world
• Data integration and synthesis: Providing transformational tools that enable cross-cutting research
• Education and training: Providing essential skills (e.g., data management training, best practices) for scientific enquiry
• Building community: Combining expertise and resources across diverse communities to collectively educate, advocate, and support stewardship of scientific data
• Data Sharing: Providing incentives and infrastructure for sharing data from federally funded researchers