Taming the Wilderness of Open Research Information

Post on 30-Jun-2015


Results of a project with students of Hochschule Hannover, and current development work on research information (VIVO) at the Open Science Lab of TIB. Talk given by Dr. Ina Blümel and Gabriel Birke at the i-Know conference 2014 (https://i-know.tugraz.at/) on 18 September 2014 in Graz.


Dr. Ina Blümel, Gabriel Birke

i-Know conference, September 18, 2014

Student project at HS Hannover, participants: Wendinda Carine Donessonne, Felix Kommnick, Elena Liventsova, Rahima Medshid, Bengt Olschewski, Anna Petersmeier, Tatiana Walther, Jana Wolf

Research Information: Paradigms

• Institutional
  • research management as driving force: reporting tools, etc.
  • mostly proprietary CRIS implementations at institutional and partly national level, … (Pure, Converis, et al.)
  • “closed world”
• Community based / discovery layer
  • merging & linking research information from various sources
  • supporting scientists in establishing networks; see the success of ResearchGate, academia.edu, etc.


VIVO

• Model for linkable research information with LOD ontologies

• Open source software

• Originally developed at Cornell with NSF funding, now supported by a consortium at DuraSpace

• Numerous implementations, previously primarily in the English-language bio/medical area (CTSA)

• Research profiles, visualisations, …


“feed” VIVO

1. External data sources, esp. websites (harvesting)

2. Internal data sources (Web API or other type of access)

3. Individual customization to suit professional needs

Challenge:
• From the vast array of research information objects on the web to structured research information
• If possible, automatically

Sources

Science 2.0 community
• Websites with publications, projects, information about organizations, persons, ...
• with structured and unstructured information

Identify websites with repetitive, similarly structured content: for these it is worth setting up a harvesting pipeline!


Setting, Task

• 16-week project
• 6th-semester bachelor students of library and information science
• supported by an information scientist and a computer scientist
• identify and document research information items on the websites
• map to the VIVO ontology
• certain steps redefined or split up during the running project according to students' needs / prior knowledge
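The mapping step can be illustrated with a minimal sketch that turns a harvested item into RDF (N-Triples). It is an assumption-laden illustration, not the project's actual pipeline: the record layout, the `to_ntriples` helper and the `http://example.org/individual/` namespace are hypothetical, and only the class assignment and a label are emitted. The classes shown (FOAF and BIBO terms) are ones that the VIVO ontology reuses.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical mapping from a harvested item kind to an ontology class.
CLASS_FOR_KIND = {
    "person": "http://xmlns.com/foaf/0.1/Person",
    "organization": "http://xmlns.com/foaf/0.1/Organization",
    "article": "http://purl.org/ontology/bibo/AcademicArticle",
}


def escape_literal(text):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(record, base="http://example.org/individual/"):
    """Return a type triple and a label triple for one harvested record.

    `record` is an assumed dict with the keys "id", "kind" and "label".
    """
    subject = f"<{base}{record['id']}>"
    cls = CLASS_FOR_KIND[record["kind"]]
    return [
        f"{subject} <{RDF_TYPE}> <{cls}> .",
        f'{subject} <{RDFS_LABEL}> "{escape_literal(record["label"])}" .',
    ]


if __name__ == "__main__":
    for line in to_ntriples({"id": "p1", "kind": "person", "label": "Jane Doe"}):
        print(line)
```

Relationships between items (for example, who authored which publication) would need additional triples that this sketch deliberately leaves out.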


Steps


Challenges


• inconsistent publication data, entered as freeform text in CMS, e.g., up to 13 different versions of journal volume representation

• templates don’t provide research information (RI) in machine-readable formats
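To make the freeform-text problem concrete, here is a small sketch of how a harvester might pull a volume number out of differently written citation strings. The notations covered ("Vol.", "Volume", "Bd.", "Jg.", and "12(3)") are illustrative assumptions, not the variants that were actually found on the websites, and `extract_volume` is a hypothetical helper.

```python
import re

# Assumed notations only; a real site would need its own list of variants.
VOLUME_PATTERN = re.compile(
    # "Vol. 12", "Volume 12", "Bd. 12", "Jg. 12"
    r"(?:\bvol(?:ume)?\.?|\bbd\.?|\bjg\.?)\s*(\d+)"
    # "12(3)" or "12 (3)": volume followed by an issue in parentheses
    r"|\b(\d+)\s*\(\s*\d+\s*\)",
    re.IGNORECASE,
)


def extract_volume(citation):
    """Return the volume number found in a freeform citation, or None."""
    match = VOLUME_PATTERN.search(citation)
    if match is None:
        return None
    return int(match.group(1) or match.group(2))


if __name__ == "__main__":
    for text in ("Journal of X, Vol. 12, No. 3", "Zeitschrift Y, Bd. 7, S. 1-10"):
        print(text, "->", extract_volume(text))
```

A pattern like this is easily fooled, for instance by a year followed by a number in parentheses, which is one reason machine-readable templates would be preferable to parsing after the fact.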

Challenges

• Variable content, stable structures
• Duplicates with different structure (publications, persons, …)

http://www.hiig.de/ausgewahlte-publikationen/
http://www.hiig.de/ausgewaehlte-veroffentlichungen/
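One common way to spot duplicates that appear with different structure is to compare a normalized form of the title. The following is a rough sketch under assumptions: the record layout (dicts with a "title" key) and the helper names are hypothetical, and the normalization only ignores case, diacritics and punctuation.

```python
import re
import unicodedata
from collections import defaultdict


def title_key(title):
    """Reduce a title to lowercase ASCII letters and digits separated by spaces."""
    decomposed = unicodedata.normalize("NFKD", title.casefold())
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", " ", ascii_only).strip()


def group_duplicates(records):
    """Group records (assumed dicts with a "title" key) that share a title key."""
    groups = defaultdict(list)
    for record in records:
        groups[title_key(record["title"])].append(record)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = [
        {"title": "Open Science: Überblick", "url": "a"},
        {"title": "open science uberblick", "url": "b"},
        {"title": "Something else", "url": "c"},
    ]
    print(group_duplicates(sample))
```

Note that this does not reconcile transliterated umlauts ("ae" versus "a" for "ä"); handling those, or matching persons rather than titles, would need further rules.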


Do man and machine draw the same conclusions?

http://www.hiig.de/kooperationen/

Partners are marked with a logo (image). Luckily, “alt” attributes are available.
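Reading partner names from logo images can be sketched with Python's standard `html.parser` module. This is only an illustration: the class and function names are hypothetical, the markup in the usage example is made up, and a real page would usually need the search restricted to the partner section so that unrelated images are not picked up.

```python
from html.parser import HTMLParser


class LogoAltCollector(HTMLParser):
    """Collect the non-empty alt texts of all <img> elements."""

    def __init__(self):
        super().__init__()
        self.partners = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        alt = dict(attrs).get("alt")
        # Skip images without an alt attribute or with an empty one.
        if alt and alt.strip():
            self.partners.append(alt.strip())


def partner_names(html):
    """Return the alt texts found in the given HTML fragment, in page order."""
    collector = LogoAltCollector()
    collector.feed(html)
    collector.close()
    return collector.partners


if __name__ == "__main__":
    page = '<ul><li><img src="logo1.png" alt="Example University"></li></ul>'
    print(partner_names(page))
```

Images without alt text yield nothing, so in that case a human reader recognizes the partner while the machine does not.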


Results


• Discovery layer with aggregated research information

• Also an approach for bootstrapping institutional research information systems from available web sources

• Not a substitute for those systems, but complementary to them


Some activities (besides VIVO implementation)

• Community building
  • VIVOcamp13, first workshop for the EU VIVO community, SWIB13 satellite, November 2013
  • VIVO Bootcamp at ELAG Conference (European Library Automation Group) in Bath, June 2014, “hands-on”
  • euroCRIS LOD group participation
• Policy & standards making: position paper DINI AG FIS
• Supervising a bachelor thesis on extending the VIVO ontology
• DFG application: “German Academic Web”

Thank you for your attention!