5-14-13 An Introduction to VIVO Presentation Slides

May 14, 2013. Hot Topics: The DuraSpace Community Webinar Series. Series Five: “VIVO: Research Discovery & Networking”. Curated by Dean Krafft

description

Hot Topics: The DuraSpace Community Webinar Series, Series Five: “VIVO: Research Discovery and Networking.” Webinar #1: An Introduction to VIVO, May 14, 2013. Presented by: Dean Krafft, Chief Technology Strategist at Cornell University Library and Chair of the VIVO-DuraSpace Management Committee; Brian Lowe, Semantic Applications Programmer, Cornell; and Jon Corson-Rikert, VIVO Development Lead, Cornell

Transcript of 5-14-13 An Introduction to VIVO Presentation Slides

Page 1: 5-14-13 An Introduction to VIVO Presentation Slides

May 14, 2013 Hot Topics: DuraSpace Community Webinar Series

Hot Topics: The DuraSpace Community Webinar Series

Series Five:

“VIVO: Research Discovery & Networking”

Curated by Dean Krafft

Page 2: 5-14-13 An Introduction to VIVO Presentation Slides

Webinar 1: Overview of VIVO

Presented by:

Brian Lowe, Semantic Applications Programmer, Cornell

Jon Corson-Rikert, VIVO Development Lead, Cornell

Dean Krafft, Chief Technology Strategist at Cornell University Library and Chair of the VIVO-DuraSpace Management Committee

Page 3: 5-14-13 An Introduction to VIVO Presentation Slides

What is VIVO?

• A semantic-web-based researcher and research discovery tool

– People plus much more

• Institution-wide, publicly-visible information

– For external as well as internal audiences

• An open, shared platform for connecting scholars, communities, campuses, and countries using Linked Open Data

Page 4: 5-14-13 An Introduction to VIVO Presentation Slides

How did we get here?

31 authors

6 institutions

Page 5: 5-14-13 An Introduction to VIVO Presentation Slides

A brief VIVO history

2003-2005 First realization for the life sciences at Cornell, as a relational database

2006-2008 Expansion to all disciplines at Cornell, and conversion to Semantic Web

2009-2012 National Institutes of Health-sponsored VIVO: Enabling the National Networking of Scientists project transforms VIVO to a multi-institutional open source platform

2013-2014 VIVO Incubator Project with DuraSpace for open community development

Page 6: 5-14-13 An Introduction to VIVO Presentation Slides

Major opportunity, 2009

NIH … “invites applications designed to develop, enhance, or extend infrastructure for connecting people and resources to facilitate national discovery of individuals and of scientific resources by scientists and students to encourage interdisciplinary collaboration and scientific exchange.”

Page 7: 5-14-13 An Introduction to VIVO Presentation Slides

National partnership

2009

Page 8: 5-14-13 An Introduction to VIVO Presentation Slides

VIVO Collaboration

Cornell University

Dean Krafft (Cornell PI), Manolo Bevia, Jim Blake, Nick Cappadona, Brian Caruso, Jon Corson-Rikert, Elly Cramer, Medha Devare, Elizabeth Hines, Huda Khan, Depak Konidena, Brian Lowe, Joseph McEnerney, Holly Mistlebauer, Stella Mitchell, Anup Sawant, Christopher Westling, Tim Worrall, Rebecca Younes

University of Florida

Mike Conlon (VIVO and UF PI), Beth Auten, Michael Barbieri, Chris Barnes, Kaitlin Blackburn, Cecilia Botero, Kerry Britt, Erin Brooks, Amy Buhler, Ellie Bushhousen, Linda Butson, Chris Case, Christine Cogar, Valrie Davis, Mary Edwards, Nita Ferree, Rolando Garcia-Milan, George Hack, Chris Haines, Sara Henning, Rae Jesano, Margeaux Johnson, Meghan Latorre, Yang Li, Jennifer Lyon, Paula Markes, Hannah Norton, James Pence, Narayan Raum, Nicholas Rejack, Alexander Rockwell, Sara Russell Gonzalez, Nancy Schaefer, Dale Scheppler, Nicholas Skaggs, Matthew Tedder, Michele R. Tennant, Alicia Turner, Stephen Williams

Indiana University

Katy Borner (IU PI), Kavitha Chandrasekar, Bin Chen, Shanshan Chen, Ryan Cobine, Jeni Coffey, Suresh Deivasigamani, Ying Ding, Russell Duhon, Jon Dunn, Poornima Gopinath, Julie Hardesty, Brian Keese, Namrata Lele, Micah Linnemeier, Nianli Ma, Robert H. McDonald, Asik Pradhan Gongaju, Mark Price, Michael Stamper, Yuyin Sun, Chintan Tank, Alan Walsh, Brian Wheeler, Feng Wu, Angela Zoss

Ponce School of Medicine

Richard J. Noel, Jr. (Ponce PI), Ricardo Espada Colon, Damaris Torres Cruz, Michael Vega Negrón

The Scripps Research Institute

Gerald Joyce (Scripps PI), Catherine Dunn, Sam Katkov, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas, Michaeleen Trimarchi

Washington University School of Medicine in St. Louis

Rakesh Nagarajan (WUSTL PI), Kristi L. Holmes, Caerie Houchins, George Joseph, Sunita B. Koul, Leslie D. McIntosh

Weill Cornell Medical College

Curtis Cole (Weill PI), Paul Albert, Victor Brodsky, Mark Bronnimann, Adam Cheriff, Oscar Cruz, Dan Dickinson, Richard Hu, Chris Huang, Itay Klaz, Kenneth Lee, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Vinay Varughese, Virgil Wong

This project is funded by the National Institutes of Health, U24 RR029822, “VIVO: Enabling National Networking of Scientists”

Page 9: 5-14-13 An Introduction to VIVO Presentation Slides

What does VIVO do?

• Integrates multiple sources of data

– Systems of record

– Faculty activity reporting

– External sources (e.g., Scopus, PubMed, NIH RePORTER)

• Provides a review and editing interface

– Single sign-on for self-editing or by proxy

• Provides integrated, filterable feeds to other websites

Page 10: 5-14-13 An Introduction to VIVO Presentation Slides

People

Page 11: 5-14-13 An Introduction to VIVO Presentation Slides

People and what they do

Page 12: 5-14-13 An Introduction to VIVO Presentation Slides

Structured data for visualizations

Page 13: 5-14-13 An Introduction to VIVO Presentation Slides

Enabling an (inter)national network

• Open software

• Open data

• Local control

• Decentralized infrastructure

Page 14: 5-14-13 An Introduction to VIVO Presentation Slides

What does VIVO model?

• People and more

– Organizations, grants, programs, projects, publications, events, facilities, and research resources

• Relationships among the above

– Meaningful

– Bidirectional

– Navigable context

• Links to URIs elsewhere

– Concepts, identifiers

– People, places, organizations, events

Page 15: 5-14-13 An Introduction to VIVO Presentation Slides

Typical data sources

• HR – people, appointments

• Research administration – grants & contracts

• Registrar – courses

• Faculty reporting system(s)

– publications, service, research areas, awards

• Events calendar

• Internal and external news

• External repositories – e.g., PubMed, Scopus

Page 16: 5-14-13 An Introduction to VIVO Presentation Slides

Value for institutions

• Common data substrate

– Public, granular and direct

– Discovery via external and internal search engines

– Available for reuse at many levels

• Distributed curation

– E.g., affiliations beyond what HR system tracks

– Data coordination across functional silos

– Feeding changes back to systems of record

– Direct linking across campuses

• Data that is visible gets fixed

Page 17: 5-14-13 An Introduction to VIVO Presentation Slides

The Semantic Web

• Turn data into a web of simple links

• Use ontology to explain how things are linked

• Use reasoning to add new links automatically

• Be flexible and extensible
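The bullets above can be illustrated with a minimal sketch in plain Python. It assumes a toy data model: links are (subject, property, object) tuples, the "ontology" is a small table of inverse properties, and "reasoning" means adding the inverse of each link. All identifiers (`ex:person1`, `ex:authorOf`, and so on) are hypothetical and are not taken from the VIVO ontology.

```python
# Data as a web of simple links: (subject, property, object) tuples.
# All names are hypothetical placeholders, not real VIVO ontology terms.
data = {
    ("ex:person1", "ex:authorOf", "ex:paper1"),
    ("ex:person1", "ex:memberOf", "ex:dept1"),
}

# Toy "ontology": explains how things are linked by declaring,
# for each property, the property that runs in the other direction.
INVERSE_OF = {
    "ex:authorOf": "ex:hasAuthor",
    "ex:memberOf": "ex:hasMember",
}


def add_inverse_links(triples):
    """Toy reasoning step: add the inverse of every link that has one."""
    inferred = set(triples)
    for subject, prop, obj in triples:
        inverse = INVERSE_OF.get(prop)
        if inverse is not None:
            inferred.add((obj, inverse, subject))
    return inferred


links = add_inverse_links(data)
for triple in sorted(links):
    print(triple)
```

Adding a new kind of link only requires a new tuple and, optionally, a new entry in the inverse table, which is the sense in which this style of data is flexible and extensible. A real deployment would use an RDF store and an OWL reasoner rather than Python sets.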

Page 18: 5-14-13 An Introduction to VIVO Presentation Slides

The VIVO ontology

• Describe people and organizations in the process of doing research

• Stay discipline neutral

• Use existing scientific domain terminology to describe content of research

Page 19: 5-14-13 An Introduction to VIVO Presentation Slides

What is Linked Open Data (LOD)?

• Data

– Structured information, not just documents with text

– A common, simple format

• Open

– Available, visible, mine-able

– Anyone can post, consume, and reuse

• Linked

– Directly by reference

– Indirectly through common references and inference
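As a rough illustration of "linked indirectly through common references", the sketch below merges two small datasets that never mention each other but both refer to the same publication URI. The URIs, the `ex:authorOf` property, and the two datasets are invented for this example; they are assumptions, not data from any VIVO site.

```python
# Two independently published datasets (hypothetical URIs and property).
site_a = {
    ("http://vivo.example.edu/a/person1", "ex:authorOf",
     "http://pubs.example.org/paper42"),
}
site_b = {
    ("http://vivo.example.edu/b/person9", "ex:authorOf",
     "http://pubs.example.org/paper42"),
    ("http://vivo.example.edu/b/person9", "ex:authorOf",
     "http://pubs.example.org/paper77"),
}


def shared_publications(*datasets):
    """Merge datasets and return papers whose authors come from the merge.

    Because both sites use the same URI for the paper, a plain set union
    is enough to connect the two authors; no record matching is needed.
    """
    merged = set().union(*datasets)
    authors_by_paper = {}
    for subject, prop, obj in merged:
        if prop == "ex:authorOf":
            authors_by_paper.setdefault(obj, set()).add(subject)
    return {paper: authors
            for paper, authors in authors_by_paper.items()
            if len(authors) > 1}


linked = shared_publications(site_a, site_b)
print(linked)
```

The design point is that the link between the two people is never stated by either site; it falls out of both sites referencing a common identifier.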

Page 20: 5-14-13 An Introduction to VIVO Presentation Slides

Linked Open Data

Page 21: 5-14-13 An Introduction to VIVO Presentation Slides

Linked data indexed for search

[Diagram. Labels shown: Ponce VIVO, WashU VIVO, IU VIVO, Cornell Ithaca VIVO, Weill Cornell VIVO, UF VIVO, Scripps VIVO, other VIVOs; eagle-i research resources; Harvard Profiles RDF; DigitalVita RDF; Iowa Loki RDF; Linked Open Data; vivosearch.org; Solr search index; another Solr index]
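The idea of indexing linked data from several sites for a single search can be sketched with a tiny in-memory inverted index. This is only a stand-in written in plain Python: the slide names Solr as the search index, and nothing below reflects Solr's API or the actual vivosearch.org pipeline. The records, URIs, and labels are made up.

```python
from collections import defaultdict

# Hypothetical harvested records: (source site, URI, text label).
records = [
    ("Cornell Ithaca VIVO", "http://vivo.example.edu/cornell/p1",
     "Jane Doe soil science"),
    ("UF VIVO", "http://vivo.example.edu/uf/p2",
     "John Roe soil microbiology"),
    ("Scripps VIVO", "http://vivo.example.edu/scripps/p3",
     "Ann Poe structural biology"),
]


def build_index(rows):
    """Map each lowercase token to the set of (site, URI) pairs using it."""
    index = defaultdict(set)
    for site, uri, label in rows:
        for token in label.lower().split():
            index[token].add((site, uri))
    return index


def search(index, query):
    """Return entries matching every token in the query (AND search)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


index = build_index(records)
print(search(index, "soil"))
```

Each hit keeps the URI of the original record, so a result from the shared index can link straight back to the profile at the institution that published it.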

Page 22: 5-14-13 An Introduction to VIVO Presentation Slides
Page 23: 5-14-13 An Introduction to VIVO Presentation Slides
Page 24: 5-14-13 An Introduction to VIVO Presentation Slides
Page 25: 5-14-13 An Introduction to VIVO Presentation Slides
Page 26: 5-14-13 An Introduction to VIVO Presentation Slides

Implementation challenges

• A simple idea – take the basic public information about researchers at Cornell and make it easy to find for academic purposes

• Why is this hard?

Page 27: 5-14-13 An Introduction to VIVO Presentation Slides

Policy issues

• Dirty data

• Lack of common definitions, even for organizations or for who counts as faculty

• Data ownership

• Many dimensions of privacy

• Short-term “go it alone” vs. common good

Page 28: 5-14-13 An Introduction to VIVO Presentation Slides

Enter data once, use it many times

Page 29: 5-14-13 An Introduction to VIVO Presentation Slides
Page 30: 5-14-13 An Introduction to VIVO Presentation Slides
Page 31: 5-14-13 An Introduction to VIVO Presentation Slides
Page 32: 5-14-13 An Introduction to VIVO Presentation Slides

Weill Cornell research reporting

• How has the number of publications co-authored with other institutions changed year to year?
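A question like this reduces to a count per year once each publication carries its year and the institutions of its authors. The sketch below assumes that shape of data; the records are invented purely for illustration and are not Weill Cornell figures, and the field names `year` and `institutions` are hypothetical.

```python
from collections import Counter

HOME = "Weill Cornell"  # assumed label for the home institution

# Invented sample records, not real publication data.
publications = [
    {"year": 2010, "institutions": {"Weill Cornell"}},
    {"year": 2010, "institutions": {"Weill Cornell", "Cornell Ithaca"}},
    {"year": 2011, "institutions": {"Weill Cornell", "UF"}},
    {"year": 2011, "institutions": {"Weill Cornell", "IU", "Scripps"}},
]


def external_coauthored_by_year(pubs, home):
    """Count publications per year with at least one other institution."""
    counts = Counter()
    for pub in pubs:
        if pub["institutions"] - {home}:
            counts[pub["year"]] += 1
    return dict(sorted(counts.items()))


def year_over_year_change(counts):
    """Difference between each year's count and the previous year's."""
    years = sorted(counts)
    return {year: counts[year] - counts[prev]
            for prev, year in zip(years, years[1:])}


by_year = external_coauthored_by_year(publications, HOME)
print(by_year)
print(year_over_year_change(by_year))
```

In a linked-data setting the same count would more likely be expressed as a query over the triple store; the Python version only shows the logic of the report.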

Page 33: 5-14-13 An Introduction to VIVO Presentation Slides
Page 34: 5-14-13 An Introduction to VIVO Presentation Slides

Multi-institutional scenarios for VIVO

• Multiple campuses of one university

• University and federal lab connections

– E.g., Colorado ties with regional federal labs

• Consortia

– 60 CTSAs

• International

– 13 Netherlands universities and the National Library

– AgriVIVO

Page 35: 5-14-13 An Introduction to VIVO Presentation Slides

Benefits across institutions

• Sharing experience provides clarity and new ideas

• Incentives from sharing development, tools, customizations

• Potential data-level connectivity

– Research is happening increasingly in teams that span institutions

– Meeting the needs of short and long-term virtual organizations

Page 36: 5-14-13 An Introduction to VIVO Presentation Slides

From outputs to outcomes

• Outputs like papers and patents can be tracked

– Collaborative ontology effort to adequately represent the humanities

• Outcomes such as economic impact or societal benefit are much harder to identify

• Questions about return on research investment beg for consistent, comparable data

– over time

– across institutions

– across domains

Page 37: 5-14-13 An Introduction to VIVO Presentation Slides

International engagement

Page 38: 5-14-13 An Introduction to VIVO Presentation Slides

International engagement

Page 39: 5-14-13 An Introduction to VIVO Presentation Slides

Partnerships – ORCID

• Open Researcher and Contributor ID

– Attribution for works of any type

• ORCID and VIVO

– ORCID is an attribute in a VIVO profile

– Tools being tested for submission of researcher registrations from VIVO

http://orcid.org
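Since an ORCID iD is stored as a profile attribute, a system holding it may want to check that the value is well formed before saving it. The sketch below applies the ISO 7064 MOD 11-2 check-digit rule that ORCID documents for its identifiers. The `profile` dictionary is a hypothetical stand-in and does not reflect how VIVO stores the attribute; the iD used is the sample identifier from ORCID's own documentation.

```python
def orcid_check_digit(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid):
    """Check length, digits, and check character of a hyphenated iD."""
    compact = orcid.replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


# Hypothetical profile record; the field names are assumptions.
profile = {"name": "Example Researcher", "orcid": "0000-0002-1825-0097"}
print(is_valid_orcid(profile["orcid"]))
```

A passing check only means the identifier is syntactically valid; it does not confirm that the iD is registered or belongs to the person in the profile.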

Page 40: 5-14-13 An Introduction to VIVO Presentation Slides

VIVO/DuraSpace Partnership

• DuraSpace is a not-for-profit organization supporting the DSpace and Fedora repositories

• Serves as the open source community home for future VIVO development

• Provides a legal and financial framework, extensive tools, and proven track record of managing community developed open source projects

• Joint two-year initial governance based on founding sponsors, management team, and dedicated development and leadership effort

Page 41: 5-14-13 An Introduction to VIVO Presentation Slides

The VIVO Community

Page 42: 5-14-13 An Introduction to VIVO Presentation Slides

Meeting about VIVO

• 2nd Australian VIVO Days in February

• CU Boulder hosted 50 attendees for the 3rd VIVO Implementation Fest in April

• May 20th VIVO event for New York City area institutions

• August 2013 will be the 4th Annual VIVO Conference: approximately 200-250 attendees, with workshops, papers, keynotes, invited talks, and posters

Page 43: 5-14-13 An Introduction to VIVO Presentation Slides
Page 44: 5-14-13 An Introduction to VIVO Presentation Slides

Research Informatics Infrastructure

• USDA is adopting VIVO for intramural research, and is also using it to knit together data from its 7 major agencies to fulfill reporting mandates to the Office of Science & Technology Policy and Congress

• National Center for Atmospheric Research (NCAR) is piloting VIVO to coordinate large, multi-year, multi-institutional, multi-instrument research projects

Page 45: 5-14-13 An Introduction to VIVO Presentation Slides

Research Informatics Infrastructure – cont.

• Accurate, structured VIVO data can feed external profiling and discovery systems (ORCID, Google Scholar, Academic Analytics, etc.)

• VIVO extensibility allows it to represent research resources and tie them to research datasets, publications, and researchers, promoting data discovery and reuse

Page 46: 5-14-13 An Introduction to VIVO Presentation Slides
Page 47: 5-14-13 An Introduction to VIVO Presentation Slides
Page 48: 5-14-13 An Introduction to VIVO Presentation Slides

VIVO for atmospheric and space physics

Page 49: 5-14-13 An Introduction to VIVO Presentation Slides
Page 50: 5-14-13 An Introduction to VIVO Presentation Slides
Page 51: 5-14-13 An Introduction to VIVO Presentation Slides

CTSAconnect and the ISF

• VIVO and eagle-i team members won NIH funding in 2012 for a project to unify their ontologies and extend both in the clinical domain

• The unified ontology is known as the Integrated Semantic Framework, or ISF

• VIVO 1.6 and eagle-i’s next release will use the ISF

• This combined ontology is modular to allow selective data population based on local needs

Page 52: 5-14-13 An Introduction to VIVO Presentation Slides

Tying biomedical research to clinical delivery

Page 53: 5-14-13 An Introduction to VIVO Presentation Slides
Page 54: 5-14-13 An Introduction to VIVO Presentation Slides

Challenges

• Communicating VIVO’s goals to faculty, administrators, funders, and other institutions

• Adapting to constant changes in data sources

• Fully exploiting the opportunities provided by VIVO linked open data

• Co-existing in a world where not everyone uses VIVO

• Positioning VIVO on a sustainable path

Page 55: 5-14-13 An Introduction to VIVO Presentation Slides

Next Webinar: Case Studies

• Tuesday, June 4

• Colorado

• Duke

• Brown

• Weill Cornell Medical College

Page 56: 5-14-13 An Introduction to VIVO Presentation Slides

3rd Webinar – Technical Deep Dive

• Tuesday, June 11

• Ontology & Linked Data

• Open source technologies used

• What’s coming in v1.6

• VIVO technical community touch points

• Many ways to participate, benefit, and contribute

Page 57: 5-14-13 An Introduction to VIVO Presentation Slides

Questions?