Bonazzi data commons nhgri council feb 2017
![Page 1: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/1.jpg)
The NIH Data Commons
NHGRI Council – February 6, 2017
Vivien Bonazzi Ph.D.
Senior Advisor for Data Science & Data Commons
National Institutes of Health, Bethesda
![Page 2: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/2.jpg)
What’s driving the need for a
Data Commons?
![Page 3: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/3.jpg)
Convergence of factors
Mountains of Data
Increasing need and support for Data sharing
FAIR – Findable, Accessible, Interoperable, Reusable
Availability of digital technologies and
infrastructures that support Data at scale
![Page 4: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/4.jpg)
![Page 5: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/5.jpg)
![Page 6: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/6.jpg)
https://gds.nih.gov/
Went into effect January 25, 2015
NCI guidance:
http://www.cancer.gov/grants-training/grants-management/nci-policies/genomic-data
Requires public sharing of genomic data sets
![Page 7: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/7.jpg)
![Page 8: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/8.jpg)
![Page 9: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/9.jpg)
![Page 10: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/10.jpg)
Data Commons
enabling data-driven science
Enable investigators to leverage all possible data and tools in the effort to accelerate biomedical discoveries, therapies and cures by driving the development of data infrastructure and data science capabilities through collaborative research and robust engineering.
![Page 11: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/11.jpg)
Developing a Data Commons
Treats products of research – data, methods, papers etc.
as digital objects
These digital objects exist in a shared virtual space
Find, Deposit, Manage, Share, and Reuse data,
software, metadata and workflows
Digital object compliance through FAIR principles:
Findable
Accessible (and usable)
Interoperable
Reusable
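The idea of checking a digital object against the four FAIR principles can be sketched in a few lines of Python. This is a minimal illustration only: the `DigitalObject` fields and the `fair_gaps` checks are assumptions made for this example, not the Commons' actual compliance criteria or metrics.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DigitalObject:
    """A research product (data set, tool, workflow, paper) as a digital object.

    All field names are hypothetical and chosen for illustration.
    """
    identifier: str = ""        # persistent ID (Findable)
    access_url: str = ""        # where the object can be retrieved (Accessible)
    metadata_schema: str = ""   # shared schema or vocabulary name (Interoperable)
    license: str = ""           # terms of reuse (Reusable)
    metadata: Dict[str, str] = field(default_factory=dict)


def fair_gaps(obj: DigitalObject) -> List[str]:
    """Return the FAIR principles the object does not yet satisfy."""
    checks = {
        "Findable": bool(obj.identifier) and bool(obj.metadata),
        "Accessible": bool(obj.access_url),
        "Interoperable": bool(obj.metadata_schema),
        "Reusable": bool(obj.license),
    }
    return [name for name, ok in checks.items() if not ok]


# Placeholder values, not real identifiers or endpoints.
workflow = DigitalObject(
    identifier="doi:10.0000/example",
    access_url="https://example.org/workflows/variant-calling",
    metadata={"title": "Variant calling workflow"},
)
print(fair_gaps(workflow))  # ['Interoperable', 'Reusable']
```

A real compliance check would likely validate identifiers and metadata against agreed standards rather than test for non-empty fields.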
![Page 12: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/12.jpg)
The Data Commons
is a platform
that allows transactions to occur
on FAIR data at scale
![Page 13: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/13.jpg)
The Data Commons Platform
Compute Platform: Cloud
Services: APIs, Containers, Indexing
Software: Services & Tools
scientific analysis tools/workflows
Data
“Reference” Data Sets
User defined data
Digital Object Compliance
App store/User Interface/Portal
PaaS
SaaS
IaaS
https://datascience.nih.gov/commons
![Page 14: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/14.jpg)
Commons Architecture
User Interface: Data and Analysis Pipeline Management, Visualization
FAIR Data Access: Search, Indexing, Combine, Extract
Cloud Service Providers: Portability, Interoperability
Data Staging Sandbox: Harmonize, Variant Calling
Researcher Workspaces: Analysis Pipelines and Tools
Access Portal
Nearline Storage: Infrequent Use
Online Storage: Frequent Use
Cost Tracking and Management
Relational Database
Meta-Data
Security: Data Access Rules, Consents
Data
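The split between online storage (frequent use) and nearline storage (infrequent use), together with cost tracking, can be illustrated with a small sketch. The read-count threshold, the per-GB prices, and every name below are assumptions made for the example; the slide does not specify how tiering or cost tracking is implemented.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

ONLINE = "online"      # storage for frequent use
NEARLINE = "nearline"  # storage for infrequent use


@dataclass
class StoredObject:
    name: str
    size_gb: float
    reads_last_30_days: int


def choose_tier(obj: StoredObject, min_reads_for_online: int = 5) -> str:
    """Pick a tier from recent access frequency (the threshold is an assumption)."""
    return ONLINE if obj.reads_last_30_days >= min_reads_for_online else NEARLINE


def monthly_cost(objects: Iterable[StoredObject],
                 price_per_gb: Dict[str, float]) -> Dict[str, float]:
    """Total the monthly storage cost per tier, as a simple form of cost tracking."""
    totals = {ONLINE: 0.0, NEARLINE: 0.0}
    for obj in objects:
        tier = choose_tier(obj)
        totals[tier] += obj.size_gb * price_per_gb[tier]
    return totals


# Placeholder objects and prices; real provider pricing differs.
objects = [
    StoredObject("reference_genome", size_gb=100, reads_last_30_days=40),
    StoredObject("archived_sequencing_run", size_gb=1000, reads_last_30_days=1),
]
prices = {ONLINE: 0.02, NEARLINE: 0.01}
costs = monthly_cost(objects, prices)
print(costs)
```

Because only the price table and the threshold depend on the provider, a function like this could in principle be reused across cloud service providers.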
![Page 15: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/15.jpg)
Other Data Commons
![Page 16: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/16.jpg)
Other Data Commons
![Page 17: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/17.jpg)
Commons Engagement
US Government Agencies & EU groups
![Page 18: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/18.jpg)
Interoperability with other Commons
Common goals – democratizing, collaborating & sharing data
Reuse of currently available open source tools which support interoperability: GA4GH, UCSC, GDC, NYGC
Planned meeting for current major Commons developers/NIH Staff
BioIT Commons Session?
Shared open standard APIs for data access and computing
Ability to deploy and compute across multiple cloud environments
Docker containers – Dockstore/Docker registry
Workflow management, sharing and deployment
Discoverability (indexing) of objects across cloud commons
Globally unique identifiers
NIH Commons Working Groups: BD2K, ELIXIR members & broader community
Commons FAIRness metrics WG:
Interoperable APIs
Docker registry/workflow sharing
Data Object registries
Common user authentication system
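Discoverability across cloud commons through globally unique identifiers can be sketched as a small in-memory index. This is an illustration under stated assumptions: `ObjectIndex`, its methods, and the bucket URIs are hypothetical and do not represent any existing Commons, GA4GH, or cloud provider API.

```python
import uuid
from collections import defaultdict
from typing import Dict, List, Optional


class ObjectIndex:
    """Maps a globally unique identifier to metadata and to every known copy."""

    def __init__(self) -> None:
        self._metadata: Dict[str, Dict[str, str]] = {}
        self._locations: Dict[str, List[str]] = defaultdict(list)

    def register(self, metadata: Dict[str, str], location: str,
                 guid: Optional[str] = None) -> str:
        """Register a copy of an object; mint a new GUID if none is supplied."""
        guid = guid or str(uuid.uuid4())
        self._metadata.setdefault(guid, dict(metadata))
        if location not in self._locations[guid]:
            self._locations[guid].append(location)
        return guid

    def resolve(self, guid: str) -> List[str]:
        """Return every location where the object is stored."""
        if guid not in self._metadata:
            raise KeyError(f"unknown identifier: {guid}")
        return list(self._locations[guid])

    def search(self, term: str) -> List[str]:
        """Return GUIDs whose metadata values contain the search term."""
        term = term.lower()
        return [g for g, meta in self._metadata.items()
                if any(term in str(value).lower() for value in meta.values())]


# The same data set held in two cloud environments (placeholder URIs).
index = ObjectIndex()
meta = {"title": "Exome sample alignment", "format": "BAM"}
guid = index.register(meta, "s3://example-bucket/sample.bam")
index.register(meta, "gs://example-bucket/sample.bam", guid=guid)

print(index.resolve(guid))
print(index.search("exome") == [guid])  # True
```

Keeping the identifier independent of any one storage location is what would let a workflow find and use the nearest copy in whichever cloud it runs.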
![Page 19: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/19.jpg)
Acknowledgments
ADDS Office: Jennie Larkin, Phil Bourne, Michelle Dunn, Mark Guyer, Allen Dearry, Sonynka Ngosso,
Tonya Scott, Lisa Dunneback, Vivek Navale (CIT/ADDS), Ron Margolis
NCBI: George Komatsoulis
NHGRI: Valentina di Francesco, Ajay Pillai, Ken Wiley
NIGMS: Susan Gregurick
CIT: Andrea Norris, Debbie Sinmao
NIH Common Fund: Jim Anderson, Betsy Wilder, Leslie Derr
NCI: Ian Fore, Sean Davis, Warren Kibbe, Tony Kerlavage, Tanja Davidsen
NIAID: Maria Giovanni, Alison Yao, Eric Choi, Claire Schulkey
NHLBI: Weiniu Gan, Alastair Thomson
NIH Clinical Center: Elaine Ayres (BITRIS)
NIBIB: Vinay Pai (DK)
OSP: Dina Paltoo, Kris Langlais, Erin Luetkemeier, Agnes Rooke
Research and Industry: Mathew Trunnell (FHC), Bob Grossman (Chicago), Toby Bloom (NYGC)
![Page 20: Bonazzi data commons nhgri council feb 2017](https://reader033.fdocuments.in/reader033/viewer/2022051710/58f377651a28ab634b8b4597/html5/thumbnails/20.jpg)
Stay in Touch
QR Business Card
@Vivien.Bonazzi
Slideshare
Blog (Coming soon!)