Semantic web at Novartis

Semantic Web @Novartis: Experiences in Novartis. Andrea Splendiani, Sr Scientific KE Consultant. Geneve, Dec 2nd 2015

Transcript of Semantic web at Novartis

Page 1: Semantic web at Novartis

Experiences in Novartis

Andrea Splendiani, Sr Scientific KE Consultant

Geneve, Dec 2nd 2015

Semantic Web @Novartis

Page 2: Semantic web at Novartis

Semantic Web @Novartis


Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

| Semantic Web technologies: experiences in Novartis| Andrea Splendiani | 2nd December 2015| Technology | Public use

Page 3: Semantic web at Novartis

Semantic Web uptake in time


Context

[Timeline figure. Tracks shown: Metastore/RDF (prep., production); “Semantic Web in pubmed” (preparation); Query federation (prep); Visualisation; Other semantic technologies; CTMF (p. p.)]

Page 4: Semantic web at Novartis

Semantic Web usage within the organization


Context

Activities of TMS:

§  Text mining

§  Ontology development

§  Ontology provision

§  Data curation


Page 5: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 6: Semantic web at Novartis

Metastore: a central repository for ontologies


Semantic Web in production: Metastore

§  Consists of a semantic data federation layer based on controlled terminologies extracted from scientific data repositories

§  Organized around scientific concepts: Genes, Proteins, Indications, Anatomy etc…; some hierarchically organized and classified

§  Complemented by referential knowledge (cross references to internal and external knowledge repositories)

§  Supports different use cases, including text mining, data curation, data integration, search

§  Accessible through a SPARQL endpoint, a dedicated service layer and reusable widgets; a fully integrated application (MS Viewer) has been released to visualize all Metastore content.

§  Based on an RDF data model
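As an illustration of the SPARQL endpoint access mentioned above, here is a minimal sketch of how a client could build a query URL following the SPARQL 1.1 Protocol. The endpoint address and the use of SKOS labels are assumptions made for the example; the talk does not give the real Metastore URL or vocabulary.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical endpoint URL: the real Metastore address is not given in the talk.
ENDPOINT = "https://metastore.example.org/sparql"

# Generic label lookup; assuming SKOS-style labels purely for illustration.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept skos:prefLabel ?label .
  FILTER(CONTAINS(LCASE(STR(?label)), "kinase"))
} LIMIT 10
"""


def build_sparql_get_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Any HTTP client can then fetch this URL, typically with an `Accept: application/sparql-results+json` header.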


Page 7: Semantic web at Novartis

Metastore: content and usage


Semantic Web in production: Metastore

More than 2M accesses per month (approximate figure)

March 2013


Page 8: Semantic web at Novartis

Metastore data model

Semantic Web in production: Metastore

Page 9: Semantic web at Novartis

Metastore technology I

Semantic Web in production: Metastore

Page 10: Semantic web at Novartis

Metastore technology II

Semantic Web in production: Metastore

[Architecture diagram. Components shown: staging table (T_STABLE); RDF triple store; materialized views; SPARQL endpoint (Joseki); Jena; relational tables (pointers, history, versions, logs, reference tables); query via SQL and PL/SQL APIs; data services; RDF/XML files]

Page 11: Semantic web at Novartis

Metastore Widgets (suggest example)

Semantic Web in production: Metastore

Page 12: Semantic web at Novartis

Metastore applications (Metastore viewer: summary)

Semantic Web in production: Metastore

Page 13: Semantic web at Novartis

Metastore applications (Metastore viewer: links)

Semantic Web in production: Metastore

Page 14: Semantic web at Novartis

Metastore applications (Metastore viewer: explorer)

Semantic Web in production: Metastore

Page 15: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
  - Query federation
  - Visualization/interaction
  - Other projects
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 16: Semantic web at Novartis

Query federation: why and how


Semantic Web in Research: query federation

Why?
•  Internal and external data already in RDF
•  Large datasets in relational systems
•  Proprietary datasets with license restrictions (e.g. one server only)

How?
•  Relational-to-RDF mapping (materialised and virtualised)
•  Bridge ontologies (work in progress)
•  Distributed queries (service)
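To make the "materialised" flavour of relational-to-RDF mapping concrete, here is a minimal sketch that exports rows of a relational table as N-Triples. The table layout, the `example.org` namespace and the `hasTarget` predicate are hypothetical; the talk does not describe the actual mappings used.

```python
import sqlite3

BASE = "http://example.org/assay/"                  # hypothetical namespace
HAS_TARGET = "http://example.org/vocab/hasTarget"   # hypothetical predicate
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples requires in string literals."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def rows_to_ntriples(conn: sqlite3.Connection) -> list:
    """Materialise each assay row as two triples (label and target)."""
    lines = []
    rows = conn.execute("SELECT id, name, target_uri FROM assay ORDER BY id")
    for assay_id, name, target in rows:
        subject = f"<{BASE}{assay_id}>"
        lines.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(name)}" .')
        lines.append(f"{subject} <{HAS_TARGET}> <{target}> .")
    return lines


# Toy in-memory table standing in for a relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assay (id INTEGER PRIMARY KEY, name TEXT, target_uri TEXT)")
conn.execute(
    "INSERT INTO assay VALUES (1, 'Kinase binding', 'http://purl.uniprot.org/uniprot/P00533')"
)
triples = rows_to_ntriples(conn)
print("\n".join(triples))
```

The exported triples can then be loaded into a triple store; the virtualised alternative leaves the data in place and translates queries instead.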


Page 17: Semantic web at Novartis

Data and systems architecture: example


Semantic Web in Research: query federation

Different arrangements possible (with caveats)

[Architecture diagram. Labels shown: Export triples; SERVICE; Dynamic translation; Persist triples; Ontop SPARQL endpoint; Ontop API; NIBR Data Warehouse; Assay Repository; RDBMS; AllegroGraph; Triplestore & endpoint; UniProt/EBI SPARQL endpoint; Metastore; Oracle Spatial and Graph; R2RML + reasoning]

Page 18: Semantic web at Novartis

Federated query example


Semantic Web in Research: query federation

Assays

UNIPROT

Metastore
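The original query on this slide was not preserved. As a rough sketch of what a federated query across the three sources could look like, the snippet below assembles a SPARQL 1.1 query with one `SERVICE` clause per source. Only the UniProt endpoint and the `up:` vocabulary are real; the assay and Metastore endpoints and the `ex:hasTarget` predicate are hypothetical placeholders.

```python
UNIPROT = "https://sparql.uniprot.org/sparql"       # public UniProt endpoint
ASSAYS = "http://assays.example.org/sparql"         # hypothetical internal endpoint
METASTORE = "http://metastore.example.org/sparql"   # hypothetical internal endpoint

PREFIXES = {
    "up": "http://purl.uniprot.org/core/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/vocab/",              # hypothetical vocabulary
}


def service_block(endpoint, patterns):
    """Wrap triple patterns in a SERVICE clause for one remote endpoint."""
    body = "\n".join(f"    {p}" for p in patterns)
    return f"  SERVICE <{endpoint}> {{\n{body}\n  }}"


def federated_query(variables, blocks):
    """Assemble prefixes, the SELECT line and one SERVICE block per source."""
    prefix_lines = [f"PREFIX {name}: <{iri}>" for name, iri in PREFIXES.items()]
    select = "SELECT " + " ".join(variables) + " WHERE {"
    services = [service_block(endpoint, patterns) for endpoint, patterns in blocks]
    return "\n".join(prefix_lines + [select] + services + ["}"])


query = federated_query(
    ["?assay", "?protein", "?mnemonic", "?label"],
    [
        (ASSAYS, ["?assay ex:hasTarget ?protein ."]),
        (UNIPROT, ["?protein a up:Protein ;", "         up:mnemonic ?mnemonic ."]),
        (METASTORE, ["?protein skos:prefLabel ?label ."]),
    ],
)
print(query)
```

The shared variable `?protein` is what joins the three sources; in practice the join only works if all endpoints use the same URIs for proteins, which is where bridge ontologies come in.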


Page 19: Semantic web at Novartis

Federated queries: logical model

Semantic Web in Research: query federation

Page 20: Semantic web at Novartis

RDF virtualization via OnTop


Semantic Web in Research: query federation
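The idea behind RDF virtualization is that data stays in the relational database and each graph query is rewritten into SQL at query time. The toy sketch below illustrates the principle for a single triple pattern. It is not Ontop's API or mapping language; the mapping dictionary, table and predicate are hypothetical.

```python
import sqlite3

# Hypothetical mapping: predicate IRI -> (table, key column, value column, subject template)
MAPPING = {
    "http://example.org/vocab/hasTarget": (
        "assay", "id", "target_uri", "http://example.org/assay/{}",
    ),
}


def pattern_to_sql(predicate: str) -> str:
    """Rewrite the pattern '?s <predicate> ?o' into a SQL query."""
    table, key_col, value_col, _ = MAPPING[predicate]
    return f"SELECT {key_col}, {value_col} FROM {table} ORDER BY {key_col}"


def evaluate(conn: sqlite3.Connection, predicate: str) -> list:
    """Answer the pattern directly from the relational data, with no triples stored."""
    template = MAPPING[predicate][3]
    rows = conn.execute(pattern_to_sql(predicate))
    return [(template.format(key), value) for key, value in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assay (id INTEGER PRIMARY KEY, name TEXT, target_uri TEXT)")
conn.execute(
    "INSERT INTO assay VALUES (1, 'Kinase binding', 'http://purl.uniprot.org/uniprot/P00533')"
)
bindings = evaluate(conn, "http://example.org/vocab/hasTarget")
print(bindings)
```

A real system handles joins, filters and ontology reasoning in the rewriting step; the sketch only shows why no triples need to be persisted.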


Page 21: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
  - Query federation
  - Visualization/interaction
  - Other projects
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 22: Semantic web at Novartis

Visualization: why and how

Semantic Web in research: visualization and interaction

Why?
•  Accessibility of RDF data by end users
•  Complexity of (or unfamiliarity with) SPARQL
•  General lack of knowledge of the structure of data at query time

How?
•  Visual, interactive environment
•  Pre-configuration to optimize interaction styles
•  Combination of tools and exploration paradigms
•  Data access through SPARQL endpoints

Page 23: Semantic web at Novartis

RDF data explorer configuration

Semantic Web in research: visualization and interaction

§  Visualisation features are tuned to the datasets via a semi-automatic configuration.

§  Structure discovery:

•  ontology

•  queries

•  sampling

•  manual specification/overriding

§  Manual tuning of the ontology and other interaction parameters
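As a sketch of the "sampling" route to structure discovery, the snippet below samples typed resources and records which properties are used per class. The triples and names are toy data invented for the example, and the real tool's algorithm is not described in the talk.

```python
import random
from collections import defaultdict

RDF_TYPE = "rdf:type"


def discover_structure(triples, sample_size, seed=0):
    """Sample typed subjects and collect the predicates observed for each class."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
    subjects = sorted(types)
    rng = random.Random(seed)
    sampled = set(rng.sample(subjects, min(sample_size, len(subjects))))
    structure = defaultdict(set)
    for s, p, o in triples:
        if s in sampled and p != RDF_TYPE:
            for cls in types[s]:
                structure[cls].add(p)
    return {cls: sorted(preds) for cls, preds in structure.items()}


# Toy data standing in for the result of a sampling query.
TRIPLES = [
    ("a1", RDF_TYPE, "Assay"),
    ("a1", "hasTarget", "p1"),
    ("p1", RDF_TYPE, "Protein"),
    ("p1", "label", "EGFR"),
]
summary = discover_structure(TRIPLES, sample_size=10)
print(summary)
```

A summary of this kind could then be corrected by hand, matching the manual overriding step listed above.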


Page 24: Semantic web at Novartis

Data overview

Semantic Web in research: visualization and interaction

Page 25: Semantic web at Novartis

Interaction: query builder + suggest

Semantic Web in research: visualization and interaction

Page 26: Semantic web at Novartis

Interaction: path suggestions

Semantic Web in research: visualization and interaction

Assisted query formulation
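One simple way to suggest paths between two classes is a breadth-first search over the schema graph. The sketch below assumes a small, hypothetical schema and follows properties in their declared direction only; it is not necessarily how the tool shown in the talk computes its suggestions.

```python
from collections import deque

# Hypothetical schema: (domain class, property, range class)
SCHEMA = [
    ("Assay", "hasTarget", "Protein"),
    ("Protein", "encodedBy", "Gene"),
    ("Gene", "associatedWith", "Indication"),
    ("Assay", "testsCompound", "Compound"),
]


def suggest_path(schema, start, goal):
    """Return the shortest chain of properties from start to goal, or None."""
    edges = {}
    for domain, prop, range_cls in schema:
        edges.setdefault(domain, []).append((prop, range_cls))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        cls, path = queue.popleft()
        if cls == goal:
            return path
        for prop, nxt in edges.get(cls, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [prop]))
    return None


path = suggest_path(SCHEMA, "Assay", "Indication")
print(path)
```

Each suggested path maps directly to a chain of triple patterns, so the user can pick a path instead of writing SPARQL.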


Page 27: Semantic web at Novartis

Visualization and graph navigation

Semantic Web in research: visualization and interaction

Detail, augmentation, filtering, query re-formulation

Page 28: Semantic web at Novartis

Exploration, layouts, graphic clues

Semantic Web in research: visualization and interaction

Detail, augmentation, filtering, query re-formulation

Page 29: Semantic web at Novartis

Multiple exports, sharing

Semantic Web in research: visualization and interaction

§  “queries” can be saved and shared as files or links

§  Query history

§  Download of partial or total datasets
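A query can be shared as a link by encoding it into the URL itself. The sketch below is one possible scheme, assuming a hypothetical viewer address and a JSON payload; the format actually used by the tool is not described in the talk.

```python
import base64
import json
from urllib.parse import urlencode, urlparse, parse_qs

VIEWER = "https://explorer.example.org/view"   # hypothetical viewer address


def query_to_link(endpoint: str, sparql: str) -> str:
    """Pack endpoint and query text into a single shareable URL."""
    payload = json.dumps({"endpoint": endpoint, "query": sparql}).encode("utf-8")
    token = base64.urlsafe_b64encode(payload).decode("ascii")
    return VIEWER + "?" + urlencode({"q": token})


def link_to_query(link: str) -> dict:
    """Recover the endpoint and query text from a shared link."""
    token = parse_qs(urlparse(link).query)["q"][0]
    return json.loads(base64.urlsafe_b64decode(token))


link = query_to_link(
    "http://metastore.example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
)
print(link)
print(link_to_query(link))
```

The same JSON payload could equally be written to a file, which covers the "saved as files" case.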


Page 30: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
  - Query federation
  - Visualization/interaction
  - Other projects
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 31: Semantic web at Novartis

Example: provision of “phenotype ontologies”

Semantic Web in Research: other projects

<owl:Class rdf:about="http://purl.obolibrary.org/obo/HP_0001636">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Tetralogy of Fallot</rdfs:label>
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
      <owl:someValuesFrom>
        <owl:Class>
          <owl:intersectionOf rdf:parseType="Collection">
            <rdf:Description rdf:about="http://purl.obolibrary.org/obo/PATO_0000001"/>
            <owl:Restriction>
              <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
              <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/HP_0001629"/>
            </owl:Restriction>
            <owl:Restriction>
              <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
              <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/HP_0001642"/>
            </owl:Restriction>

What systems can understand: HP_0001636 hasPart HP_0001629

Page 32: Semantic web at Novartis

Example: provision of “phenotype ontologies”

Semantic Web in Research: other projects

<owl:Class rdf:about="http://purl.obolibrary.org/obo/HP_0001636">
  <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Tetralogy of Fallot</rdfs:label>
  <owl:equivalentClass>
    <owl:Restriction>
      <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
      <owl:someValuesFrom>
        <owl:Class>
          <owl:intersectionOf rdf:parseType="Collection">
            <rdf:Description rdf:about="http://purl.obolibrary.org/obo/PATO_0000001"/>
            <owl:Restriction>
              <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
              <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/HP_0001629"/>
            </owl:Restriction>
            <owl:Restriction>
              <owl:onProperty rdf:resource="http://purl.obolibrary.org/obo/BFO_0000051"/>
              <owl:someValuesFrom rdf:resource="http://purl.obolibrary.org/obo/HP_0001642"/>
            </owl:Restriction>

What systems can understand: HP_0001636 hasPart HP_0001629

Imports closure

Classification

Extraction
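The extraction step can be sketched as follows: walk the OWL restrictions and emit plain "hasPart" triples that simpler systems can consume. This is a purely syntactic shortcut using only the standard library; it stands in for, and does not replace, the imports closure and reasoner-based classification listed above. The XML is the slide's excerpt with closing tags added so that it parses, which is an assumption since the slide truncates the axiom.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
OBO = "http://purl.obolibrary.org/obo/"
HAS_PART = OBO + "BFO_0000051"

# Slide excerpt, closed by hand so it is well-formed (assumption: two parts only).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}">
 <owl:Class rdf:about="{OBO}HP_0001636">
  <owl:equivalentClass>
   <owl:Restriction>
    <owl:onProperty rdf:resource="{HAS_PART}"/>
    <owl:someValuesFrom>
     <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
       <rdf:Description rdf:about="{OBO}PATO_0000001"/>
       <owl:Restriction>
        <owl:onProperty rdf:resource="{HAS_PART}"/>
        <owl:someValuesFrom rdf:resource="{OBO}HP_0001629"/>
       </owl:Restriction>
       <owl:Restriction>
        <owl:onProperty rdf:resource="{HAS_PART}"/>
        <owl:someValuesFrom rdf:resource="{OBO}HP_0001642"/>
       </owl:Restriction>
      </owl:intersectionOf>
     </owl:Class>
    </owl:someValuesFrom>
   </owl:Restriction>
  </owl:equivalentClass>
 </owl:Class>
</rdf:RDF>"""


def local_name(iri: str) -> str:
    """Return the last path segment of an IRI, e.g. HP_0001636."""
    return iri.rsplit("/", 1)[-1]


def extract_has_part(xml_text: str) -> list:
    """Collect (class, 'hasPart', filler) for restrictions with a named filler."""
    root = ET.fromstring(xml_text)
    triples = []
    for cls in root.findall(f"{{{OWL}}}Class"):
        subject = cls.get(f"{{{RDF}}}about")
        for restriction in cls.iter(f"{{{OWL}}}Restriction"):
            prop = restriction.find(f"{{{OWL}}}onProperty")
            filler = restriction.find(f"{{{OWL}}}someValuesFrom")
            if prop is None or filler is None:
                continue
            target = filler.get(f"{{{RDF}}}resource")
            # Anonymous fillers (nested class expressions) have no rdf:resource.
            if prop.get(f"{{{RDF}}}resource") == HAS_PART and target:
                triples.append((local_name(subject), "hasPart", local_name(target)))
    return triples


simple_triples = extract_has_part(DOC)
print(simple_triples)
```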

Page 33: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
  - Query federation
  - Visualization/interaction
  - Other projects
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 34: Semantic web at Novartis

CTMF: Collaborative Terminology Management


Semantic web under the hood: CTMF

§ The CTMF is a system designed to allow a distributed “editing of ontologies”.

§ Users can request new “terms” via a web interface or within an application.

§  “Content owners” can “assess” whether the requested terms are new concepts or synonyms (or errors!) and update the ontologies.

§ Resolution is asynchronous and the term request is non-blocking for applications


Page 35: Semantic web at Novartis

CTMF web application (new request form)

Semantic web under the hood: CTMF

Page 36: Semantic web at Novartis

CTMF: integration in applications

Semantic web under the hood: CTMF

Page 37: Semantic web at Novartis

CTMF: term status page and discussion

Semantic web under the hood: CTMF

Page 38: Semantic web at Novartis

CTMF: process (use of temporary ID)

Semantic web under the hood: CTMF

Page 39: Semantic web at Novartis

Under the hood


Semantic web under the hood: CTMF

§  Basic principle of the Semantic Web: identity comes first.
•  What “people can talk about” is given a URI, and information is built around it.

§  The CTMF adopts the same approach:
•  A “term” request in itself identifies a concept: what the requestor had in mind at the time of the request. We give this idea a URI (the term status page).
•  Information is built around this request (clarification).
•  A “content owner” can assess whether the concept is identical to something already in Metastore (most likely what was requested was a synonym), or whether a new concept should be introduced.
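The "identity comes first" process can be sketched as a small registry: a request immediately receives a temporary URI, and a content owner later resolves it to an existing concept (synonym) or to a new one. Class names, the URI base and the status values are hypothetical and only illustrate the workflow described above, not the CTMF implementation.

```python
import itertools
from dataclasses import dataclass
from typing import Optional

BASE = "http://ctmf.example.org/request/"   # hypothetical URI base


@dataclass
class TermRequest:
    uri: str
    label: str
    status: str = "pending"
    resolved_to: Optional[str] = None


class TermRegistry:
    def __init__(self):
        self._counter = itertools.count(1)
        self.requests = {}

    def request_term(self, label: str) -> str:
        """Return a temporary URI at once; the caller does not wait for curation."""
        uri = f"{BASE}{next(self._counter)}"
        self.requests[uri] = TermRequest(uri, label)
        return uri

    def resolve(self, uri: str, concept_uri: str, is_new: bool) -> TermRequest:
        """Record the content owner's decision: new concept or synonym."""
        request = self.requests[uri]
        request.status = "new concept" if is_new else "synonym"
        request.resolved_to = concept_uri
        return request

    def current_uri(self, uri: str) -> str:
        """Temporary URI until resolved, then the concept it was mapped to."""
        request = self.requests[uri]
        return request.resolved_to or request.uri


registry = TermRegistry()
temp = registry.request_term("Fallot's tetralogy")
before = registry.current_uri(temp)
registry.resolve(temp, "http://purl.obolibrary.org/obo/HP_0001636", is_new=False)
after = registry.current_uri(temp)
print(before, "->", after)
```

Because the application only stores the temporary URI, it can keep working while the request is pending, which is the non-blocking behaviour described for the CTMF.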


Page 40: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 41: Semantic web at Novartis

Semantic Web @Novartis

Topics

§ Semantic Web @Novartis
• Context (Where in Novartis)
• Semantic Web in production
• Semantic Web in research
  - Query federation
  - Visualization/interaction
  - Other projects
• Semantic Web under the hood

§ Semantic Web in “Real Life”: open questions

Page 42: Semantic web at Novartis

Semantic Web in Real Life: Open questions

Data trumps everything

§  If there is a choice between better technology to access data and better data, the latter prevails.
• Corollary: interest is often where there is little data, especially in the public domain.

Page 43: Semantic web at Novartis

Semantic Web in Real Life: Open questions

Industry (or real life) is big

§ Areas that look nearby on paper may be very distant organization-wise.
• Bench-to-bedside data integration

Page 44: Semantic web at Novartis

Semantic Web in Real Life: Open questions

You don’t know the semantics of your data

§ The semantic expressiveness of RDF may be too much for what is represented in your data.
• You don’t always make your data

Page 45: Semantic web at Novartis

Semantic Web in Real Life: Open questions

Is data integration really a shared goal?

§ Not all stakeholders have an interest in “opening” their data.
• When does a data producer gain from making its data more accessible?

Page 46: Semantic web at Novartis

Semantic Web in Real Life: Open questions

Many people are doing SemWeb without knowing it

§  “My project is not based on RDF, it is based on a graph with properties from controlled vocabularies.”
• Why not RDF?
  - Too academic
  - Need something that works
  - URIs are too long
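On the "URIs are too long" objection, prefixed names (CURIEs) are the usual way to shorten URIs without giving them up. The sketch below shows the compaction and expansion with a small prefix table; the prefix choices are illustrative assumptions.

```python
# Illustrative prefix table; the namespaces are real, the prefix names are a choice.
PREFIXES = {
    "obo": "http://purl.obolibrary.org/obo/",
    "up": "http://purl.uniprot.org/core/",
}


def compact(iri: str, prefixes=PREFIXES) -> str:
    """Shorten an IRI using the longest matching namespace, if any."""
    best = None
    for prefix, namespace in prefixes.items():
        if iri.startswith(namespace) and (best is None or len(namespace) > len(best[1])):
            best = (prefix, namespace)
    if best is None:
        return iri
    return f"{best[0]}:{iri[len(best[1]):]}"


def expand(curie: str, prefixes=PREFIXES) -> str:
    """Turn a prefixed name back into the full IRI; unknown prefixes pass through."""
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local if prefix in prefixes else curie


short = compact("http://purl.obolibrary.org/obo/HP_0001636")
print(short, "->", expand(short))
```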


Page 47: Semantic web at Novartis

§ Therese Vachon

§ Pierre Parisot

§ Katia Vella

§ Frederic Sutter

§ Daniel Cronenberger

§ Fatma Oezdemir-Zaech

§ Anosha Siripala

§ Olivier Kreim

§ Gilles Hubert

§ Laurentiu Stanculescu

§ Marc Lieber

§ Martin Rezk (OnTop)

§ Andrea Splendiani

Semantic Web technologies experiences in Novartis