Presentation to the J. Craig Venter Institute, Dec. 2014


Transcript of Presentation to the J. Craig Venter Institute, Dec. 2014

Page 1: Presentation to the J. Craig Venter Institute, Dec. 2014

“Shopping for data should be as easy as

shopping for shoes!”

Dr. Carole Goble

Professor, Dept. of Computer Science

University of Manchester

Page 2: Presentation to the J. Craig Venter Institute, Dec. 2014

“A little bit of semantics goes a long way”

Dr. James Hendler

Artificial Intelligence Researcher

Rensselaer Polytechnic Institute

One of the originators of the Semantic Web

Page 3: Presentation to the J. Craig Venter Institute, Dec. 2014

…but a lot of semantics goes a long, long way!

Mark Wilkinson

Isaac Peral Distinguished Researcher

Director, Fundación BBVA Chair in Biological Informatics

Center for Plant Biotechnology and Genomics

Technical University of Madrid

Page 4: Presentation to the J. Craig Venter Institute, Dec. 2014

Making the Web a

biomedical research platform

from hypothesis through to publication

Page 5: Presentation to the J. Craig Venter Institute, Dec. 2014

Publication

Discourse

Hypothesis

Experiment

Interpretation

Page 6: Presentation to the J. Craig Venter Institute, Dec. 2014

Publication

Discourse

Hypothesis

Experiment

Interpretation

Page 7: Presentation to the J. Craig Venter Institute, Dec. 2014

Motivation:

3 intersecting trends in the Life Sciences

that are now, or soon will be,

extremely problematic

Page 8: Presentation to the J. Craig Venter Institute, Dec. 2014

NON-REPRODUCIBLE SCIENCE & THE FAILURE OF PEER REVIEW

TREND #1

Page 9: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #1

Multiple recent surveys of high-throughput biology

reveal that upwards of 50% of published studies

are not reproducible

- Baggerly, 2009

- Ioannidis, 2009

Page 10: Presentation to the J. Craig Venter Institute, Dec. 2014

Similar (if not worse!) in clinical studies

- Begley & Ellis, Nature, 2012

- Booth, Forbes, 2012

- Huang & Gottardo, Briefings in Bioinformatics, 2012

Trend #1

Page 11: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #1

“the most common errors are simple,

the most simple errors are common”

At least partially because the

analytical methodology was inappropriate

and/or not sufficiently described

- Baggerly, 2009

Page 12: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #1

These errors pass peer review

The researcher is (sometimes) unaware of the error

The process that led to the error is not recorded

Therefore it cannot be detected during peer-review

Page 13: Presentation to the J. Craig Venter Institute, Dec. 2014

Agencies have Noticed!

In March, 2012, the US Institute of Medicine ~said

“Enough is enough!”

Page 14: Presentation to the J. Craig Venter Institute, Dec. 2014

Agencies have Noticed!

Institute of Medicine Recommendations

For Conduct of High-Throughput Research:

Evolution of Translational Omics: Lessons Learned and the Path Forward.

The Institute of Medicine of the National Academies, Report Brief, March 2012.

1. Rigorously-described, -annotated, and -followed data

management and manipulation procedures

2. “Lock down” the computational analysis pipeline once it

has been selected

3. Publish the analytical workflow in a formal manner,

together with the full starting and result datasets

Page 15: Presentation to the J. Craig Venter Institute, Dec. 2014

BIGGER, CHEAPER DATA

TREND #2

Page 16: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #2

High-throughput technologies are becoming

cheaper and easier to use

Page 17: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #2

High-throughput technologies are becoming

cheaper and easier to use

But there are still very few experts trained in

statistical analysis of high-throughput data

Page 18: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #2

The number of job postings for data scientist

positions increased by 15,000% between the

summers of 2011 and 2012

-- Indeed.com job trends data reported by

http://blogs.nature.com/naturejobs/2013/03/18/so-you-want-to-be-a-data-scientist

Page 19: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #2

Therefore

Even small, moderately-funded laboratories

can now afford to produce more data

than they can manage or interpret

Page 20: Presentation to the J. Craig Venter Institute, Dec. 2014

Trend #2

Therefore

Even small, moderately-funded laboratories

can now afford to produce more data

than they can manage or interpret

These labs will likely never be able to afford

a qualified data scientist

Page 21: Presentation to the J. Craig Venter Institute, Dec. 2014

“THE SINGULARITY”

TREND #3

Page 22: Presentation to the J. Craig Venter Institute, Dec. 2014

The Healthcare Singularity and the Age of Semantic Medicine, Michael Gillam, et al, The Fourth Paradigm: Data-Intensive Scientific Discovery, Tony Hey (Editor), 2009

Slide adapted with permission from Joanne Luciano, Presentation at Health Web Science Workshop 2012, Evanston IL, USA, June 22, 2012.

Trend #3

Page 23: Presentation to the J. Craig Venter Institute, Dec. 2014

The Healthcare Singularity and the Age of Semantic Medicine, Michael Gillam, et al, The Fourth Paradigm: Data-Intensive Scientific Discovery Tony Hey (Editor), 2009

Slide Borrowed with Permission from Joanne Luciano, Presentation at Health Web Science Workshop 2012, Evanston IL, USA

June 22, 2012.

“The Singularity”

The X-intercept is where, the moment a discovery is made,

it is immediately put into practice

Page 24: Presentation to the J. Craig Venter Institute, Dec. 2014

Scientific research would have to be

conducted within a medium that

immediately interpreted

and disseminated the results...

You Are

Here

Page 25: Presentation to the J. Craig Venter Institute, Dec. 2014

...in a form that immediately (actively!) affected the

results of other researchers...

You Are

Here

Page 26: Presentation to the J. Craig Venter Institute, Dec. 2014

...without requiring them to be aware

of these new discoveries.

You Are

Here

Page 27: Presentation to the J. Craig Venter Institute, Dec. 2014

3 intersecting and problematic trends

Non-reproducible science that passes peer-review

Cheaper production of larger and more complex datasets

that require specialized expertise to analyze properly

Need to more rapidly disseminate and use new discoveries

Page 28: Presentation to the J. Craig Venter Institute, Dec. 2014

We Want More!

Page 29: Presentation to the J. Craig Venter Institute, Dec. 2014

I don’t just want to reproduce

your experiment...

Page 30: Presentation to the J. Craig Venter Institute, Dec. 2014

I want to re-use your experiment

Page 31: Presentation to the J. Craig Venter Institute, Dec. 2014

In my own laboratory... On MY DATA!

Page 32: Presentation to the J. Craig Venter Institute, Dec. 2014

When I do my analysis

I want to draw on the knowledge

of global domain-experts like

statisticians and pathologists...

...as if they were mentors sitting

in the chair beside me.

Page 33: Presentation to the J. Craig Venter Institute, Dec. 2014

Image from: Mark Smiciklas

Intersection Consulting, cc-nca

Please don’t make me find

all of the data and knowledge

that I require to do my experiment

...it simply isn’t possible anymore...

Page 34: Presentation to the J. Craig Venter Institute, Dec. 2014

Image from AJ Cann

cc-by-a license

I want to support peer review(ers)

so that I do better science.

Page 35: Presentation to the J. Craig Venter Institute, Dec. 2014

How do we get there from here?

Page 36: Presentation to the J. Craig Venter Institute, Dec. 2014

To overcome these intersecting problems

and to achieve the goals of transparent

reproducible research

Page 37: Presentation to the J. Craig Venter Institute, Dec. 2014

We must learn how to

do research IN the Web

Not OVER the Web

Page 38: Presentation to the J. Craig Venter Institute, Dec. 2014

How we use

The Web today

Page 39: Presentation to the J. Craig Venter Institute, Dec. 2014

The Web is not a pigeon!

Page 40: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantic Web Technologies

Page 41: Presentation to the J. Craig Venter Institute, Dec. 2014

The Web

Page 42: Presentation to the J. Craig Venter Institute, Dec. 2014

The Semantic Web

causally related to

Page 43: Presentation to the J. Craig Venter Institute, Dec. 2014

This is the critical bit!

causally related to

The link is explicitly labeled!

???

Page 44: Presentation to the J. Craig Venter Institute, Dec. 2014

http://semanticscience.org/resource/SIO_000243

SIO_000243:

<owl:ObjectProperty rdf:about="&resource;SIO_000243">

<rdfs:label xml:lang="en">is causally related with</rdfs:label>

<rdf:type rdf:resource="&owl;SymmetricProperty"/>

<rdf:type rdf:resource="&owl;TransitiveProperty"/>

<dc:description xml:lang="en"> A transitive, symmetric, temporal relation

in which one entity is causally related with another non-identical entity.

</dc:description>

<rdfs:subPropertyOf rdf:resource="&resource;SIO_000322"/>

</owl:ObjectProperty>

causally related with
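As a rough illustration of what the SymmetricProperty and TransitiveProperty declarations buy you, the sketch below derives the links a reasoner could infer from a few asserted ones. It is plain Python rather than an OWL reasoner, and the entity names are hypothetical. It leaves out self-pairs to follow the "non-identical" wording of the description, although a strict OWL reasoner would also infer them.

```python
from collections import defaultdict


def causal_closure(asserted_pairs):
    """Return every (a, b) pair implied by a symmetric, transitive relation.

    For such a relation, two distinct entities are related exactly when
    they sit in the same connected component of the asserted links.
    """
    neighbours = defaultdict(set)
    for a, b in asserted_pairs:
        neighbours[a].add(b)  # asserted direction
        neighbours[b].add(a)  # symmetric counterpart

    implied = set()
    seen = set()
    for start in list(neighbours):
        if start in seen:
            continue
        # Collect the connected component that contains `start`.
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbours[node] - component)
        seen |= component
        # Relate every pair of distinct members of the component.
        implied |= {(x, y) for x in component for y in component if x != y}
    return implied


# Hypothetical entities, for illustration only.
asserted = [("geneA", "proteinB"), ("proteinB", "phenotypeC")]
inferred = causal_closure(asserted)
```

Only two links were asserted, yet `inferred` also contains `("phenotypeC", "geneA")`, because the label on the link carries machine-readable logic.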

Page 45: Presentation to the J. Craig Venter Institute, Dec. 2014

http://semanticscience.org/resource/SIO_000243

SIO_000243:

<owl:ObjectProperty rdf:about="&resource;SIO_000243">

<rdfs:label xml:lang="en">is causally related with</rdfs:label>

<rdf:type rdf:resource="&owl;SymmetricProperty"/>

<rdf:type rdf:resource="&owl;TransitiveProperty"/>

<dc:description xml:lang="en"> A transitive, symmetric, temporal relation

in which one entity is causally related with another non-identical entity.

</dc:description>

<rdfs:subPropertyOf rdf:resource="&resource;SIO_000322"/>

</owl:ObjectProperty>

causally related with

Page 46: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantic Web Technologies

“deep semantics”

Page 47: Presentation to the J. Craig Venter Institute, Dec. 2014

Deep Semantics?

Page 48: Presentation to the J. Craig Venter Institute, Dec. 2014

Ontology Spectrum

[Figure: the ontology spectrum, in order of increasing formality]

Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (Properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints

Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.

Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

Page 49: Presentation to the J. Craig Venter Institute, Dec. 2014

Ontology Spectrum

[Figure: the ontology spectrum, in order of increasing formality]

Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (Properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints

Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.

Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

Most biomedical ontologies

e.g. Gene Ontology

Page 50: Presentation to the J. Craig Venter Institute, Dec. 2014

Ontology Spectrum

[Figure: the ontology spectrum, in order of increasing formality]

Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (Properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints

Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.

Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

Ontologies being used in today’s talk

Most biomedical ontologies

e.g. Gene Ontology

Page 51: Presentation to the J. Craig Venter Institute, Dec. 2014

Ontology Spectrum

[Figure: the ontology spectrum, in order of increasing formality]

Catalog/ID; Terms/glossary; Thesauri “narrower term” relation; Informal is-a; Formal is-a; Formal instance; Frames (Properties); Value Restrs.; Selected Logical Constraints (disjointness, inverse, …); General Logical constraints

Originally from AAAI 1999 Ontologies Panel by Gruninger, Lehmann, McGuinness, Uschold, Welty; updated by McGuinness.

Description in: www.ksl.stanford.edu/people/dlm/papers/ontologies-come-of-age-abstract.html

Categorization Systems

Like library shelves, inflexible

Discovery & Interpretation systems – flexible!

Page 52: Presentation to the J. Craig Venter Institute, Dec. 2014

Remember, this is the critical bit!

http://semanticscience.org/resource/SIO_000243

causally related with

It’s relationships that make

the Semantic Web “Semantic”

Page 53: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantic Web Technologies

“deep semantics”

Page 54: Presentation to the J. Craig Venter Institute, Dec. 2014

Even with “deep semantics”

a lot of important information cannot be represented

on the Semantic Web

For example, all of the data that results from

analytical algorithms and statistical analyses

Page 55: Presentation to the J. Craig Venter Institute, Dec. 2014
Page 56: Presentation to the J. Craig Venter Institute, Dec. 2014
Page 57: Presentation to the J. Craig Venter Institute, Dec. 2014

Varying estimates

put the size of the

Deep Web between

500 and 800 times

larger than the

surface Web

Page 58: Presentation to the J. Craig Venter Institute, Dec. 2014

On the WWW

“automation” of

access to Deep Web

data happens through

“Web Services”

Page 59: Presentation to the J. Craig Venter Institute, Dec. 2014

There are many suggestions for how to bring the Deep Web

into the Semantic Web using Semantic Web Services (SWS)

Page 60: Presentation to the J. Craig Venter Institute, Dec. 2014

There are many suggestions for how to bring the Deep Web

into the Semantic Web using Semantic Web Services (SWS)

Describe input data

Describe output data

Describe how the system manipulates the data

Describe how the world changes as a result

Page 61: Presentation to the J. Craig Venter Institute, Dec. 2014

There are many suggestions for how to bring the Deep Web

into the Semantic Web using Semantic Web Services (SWS)

Describe input data

Describe output data

Describe how the system manipulates the data

Describe how the world changes as a result

None, so far, has proven to be wildly successful

(in my opinion)

Page 62: Presentation to the J. Craig Venter Institute, Dec. 2014

There are many suggestions for how to bring the Deep Web

into the Semantic Web using Semantic Web Services (SWS)

Describe input data

Describe output data

Describe how the system manipulates the data

Describe how the world changes as a result

None, so far, has proven to be wildly successful

(in my opinion)

…because describing what a Service does is HARD!

Page 63: Presentation to the J. Craig Venter Institute, Dec. 2014

Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.

Page 64: Presentation to the J. Craig Venter Institute, Dec. 2014

Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.

Scientific Web Services are DIFFERENT!

Page 65: Presentation to the J. Craig Venter Institute, Dec. 2014

Lord, Phillip, et al. The Semantic Web–ISWC 2004 (2004): 350-364.

“The service interfaces within bioinformatics are relatively simple. An extensible or constrained interoperability

framework is likely to suffice for current demands: a fully generic framework is currently not necessary.”

Page 66: Presentation to the J. Craig Venter Institute, Dec. 2014

Scientific Web Services are DIFFERENT!

They’re simpler!

So perhaps we can solve the Semantic Web Service problem

as it pertains to this (important!) domain

Page 67: Presentation to the J. Craig Venter Institute, Dec. 2014

With respect to the Semantic Web

What is missing from this list?

Describe input data

Describe output data

Describe how the system manipulates the data

Describe how the world changes as a result

Page 68: Presentation to the J. Craig Venter Institute, Dec. 2014

http://semanticscience.org/resource/SIO_000243

causally related with

Page 69: Presentation to the J. Craig Venter Institute, Dec. 2014

http://semanticscience.org/resource/SIO_000243

The Semantic Web gets its semantics from relationships

causally related with

Page 70: Presentation to the J. Craig Venter Institute, Dec. 2014

http://semanticscience.org/resource/SIO_000243

In 2008 I published a set of design-patterns

for scientific Semantic Web Services

that focuses on the biological relationship that the Service “exposes”

causally related with

The Semantic Web gets its semantics from relationships

Page 71: Presentation to the J. Craig Venter Institute, Dec. 2014

Design Pattern for

Web Services on the Semantic Web

Page 72: Presentation to the J. Craig Venter Institute, Dec. 2014

AACTCTTCGTAGTG...

BLAST

Web Service

Page 73: Presentation to the J. Craig Venter Institute, Dec. 2014

AACTCTTCGTAGTG...

BLAST

SADI

has

homology

to

Terminal Flower

type

gene

species

A. thal.

SADI requires you to explicitly declare

as part of your analytical output,

the biological relationship that your

algorithm “exposed”.

sequence

has_seq_string

AACTCTTCGTAGTG...

sequence

has_seq_string
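The idea on this slide can be sketched in a few lines: the service does not just return hits, it returns triples that attach each hit to the input node through a named relationship. This is a minimal Python sketch and not the SADI API. The function names, the `has_homology_to` spelling, and the stand-in search function are assumptions made for illustration.

```python
def blast_like_service(input_triples, find_homologues):
    """Return output triples that link each input node to what was found."""
    output = []
    for subject, predicate, obj in input_triples:
        if predicate != "has_seq_string":
            continue  # this service only consumes sequence strings
        for hit in find_homologues(obj):
            # The explicit biological relationship the algorithm "exposed".
            output.append((subject, "has_homology_to", hit["id"]))
            # Descriptive triples about the hit itself.
            output.append((hit["id"], "type", hit["type"]))
            output.append((hit["id"], "species", hit["species"]))
    return output


def fake_homology_search(sequence_string):
    # Stand-in for a real BLAST call; always returns the hit from the slide.
    return [{"id": "TerminalFlower", "type": "gene", "species": "A. thal."}]


triples_in = [("sequence1", "has_seq_string", "AACTCTTCGTAGTG...")]
triples_out = blast_like_service(triples_in, fake_homology_search)
```

Because the subject of the new triple is the same node that was sent in, the output merges directly into the client's existing graph.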

Page 74: Presentation to the J. Craig Venter Institute, Dec. 2014

I want to share several stories that demonstrate

the cool things that happen when you use

SADI + deep semantics

Page 75: Presentation to the J. Craig Venter Institute, Dec. 2014

The Semantic Health

and Research Environment

Story #1: SHARE

Page 76: Presentation to the J. Craig Venter Institute, Dec. 2014

A proof-of-concept workflow orchestrator

+ SADI Semantic Web Service registry

Objective: answer biologists’ questions

Page 77: Presentation to the J. Craig Venter Institute, Dec. 2014

The SHARE registry

indexes all of the input/output/relationship

triples that can be generated by all known services

This is how SHARE discovers services
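One way to picture such a registry is as an index keyed by the relationship (predicate) a service can generate. The sketch below is a toy in-memory version in Python. The class name, method names, class labels, and example.org URLs are all hypothetical and are not taken from the SHARE implementation.

```python
from collections import defaultdict


class ServiceRegistry:
    """Toy index of the triples each service can generate (hypothetical design)."""

    def __init__(self):
        # predicate -> list of (input_class, output_class, service_url)
        self._by_predicate = defaultdict(list)

    def register(self, service_url, input_class, predicate, output_class):
        self._by_predicate[predicate].append((input_class, output_class, service_url))

    def discover(self, predicate, input_class=None):
        """Return URLs of services able to generate triples with this predicate."""
        return [
            url
            for in_cls, _out_cls, url in self._by_predicate.get(predicate, [])
            if input_class is None or in_cls == input_class
        ]


registry = ServiceRegistry()
registry.register("http://example.org/services/alleles", "Locus", "genetics:hasVariant", "Allele")
registry.register("http://example.org/services/images", "Allele", "info:visualizedByImage", "Image")

matches = registry.discover("genetics:hasVariant", input_class="Locus")
```

A query clause that mentions `genetics:hasVariant` can then be mapped to a service without the user ever naming one.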

Page 78: Presentation to the J. Craig Venter Institute, Dec. 2014

SHARE demonstrations

with increasing semantic complexity

Page 79: Presentation to the J. Craig Venter Institute, Dec. 2014

What is the phenotype of every allele of the

Antirrhinum majus DEFICIENS gene

SELECT ?allele ?image ?desc
WHERE {
  locus:DEF genetics:hasVariant ?allele .
  ?allele info:visualizedByImage ?image .
  ?image info:hasDescription ?desc
}

Page 80: Presentation to the J. Craig Venter Institute, Dec. 2014

What is the phenotype of every allele of the

Antirrhinum majus DEFICIENS gene

SELECT ?allele ?image ?desc
WHERE {
  locus:DEF genetics:hasVariant ?allele .
  ?allele info:visualizedByImage ?image .
  ?image info:hasDescription ?desc
}

The query language here is SPARQL

The W3C-approved, standard query language for the Semantic Web

Page 81: Presentation to the J. Craig Venter Institute, Dec. 2014

What is the phenotype of every allele of the

Antirrhinum majus DEFICIENS gene

SELECT ?allele ?image ?desc
WHERE {
  locus:DEF genetics:hasVariant ?allele .
  ?allele info:visualizedByImage ?image .
  ?image info:hasDescription ?desc
}

Note that there is no “FROM” clause!

We don’t tell it where it should get the information,

The machine has to figure that out by itself...

Page 82: Presentation to the J. Craig Venter Institute, Dec. 2014

What is the phenotype of every allele of the

Antirrhinum majus DEFICIENS gene

SELECT ?allele ?image ?desc
WHERE {
  locus:DEF genetics:hasVariant ?allele .
  ?allele info:visualizedByImage ?image .
  ?image info:hasDescription ?desc
}

Starting data: the locus “DEF” (Deficiens)

Page 83: Presentation to the J. Craig Venter Institute, Dec. 2014

What is the phenotype of every allele of the

Antirrhinum majus DEFICIENS gene

SELECT ?allele ?image ?desc
WHERE {
  locus:DEF genetics:hasVariant ?allele .
  ?allele info:visualizedByImage ?image .
  ?image info:hasDescription ?desc
}

Query: A series of relationships with respect to DEF

Page 84: Presentation to the J. Craig Venter Institute, Dec. 2014

Enter that query into

SHARE

Page 85: Presentation to the J. Craig Venter Institute, Dec. 2014

Click “Submit”...

Page 86: Presentation to the J. Craig Venter Institute, Dec. 2014

...and in a few seconds you get your answer.

Based on the relationships in your query, SHARE queried its registry

to automatically discover SADI Services capable of generating those triples
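The clause-by-clause resolution described here can be sketched as follows. This is a simplified Python illustration, not SHARE's actual algorithm: it assumes every clause has a variable as its object, looks services up in a plain dictionary, and uses stub services that return made-up identifiers.

```python
def resolve(patterns, services):
    """Answer a chain of triple patterns by calling one service per clause."""
    solutions = [{}]  # start with a single empty set of variable bindings
    for subject, predicate, obj in patterns:
        service = services[predicate]  # the "discovery" step, keyed by relationship
        next_solutions = []
        for binding in solutions:
            # A bound variable is replaced by its value; a constant is kept as-is.
            concrete_subject = binding.get(subject, subject)
            for value in service(concrete_subject):
                next_solutions.append({**binding, obj: value})
        solutions = next_solutions
    return solutions


# Stub services with hypothetical return values, keyed by the predicate they generate.
services = {
    "genetics:hasVariant": lambda locus: ["allele:def-1", "allele:def-2"],
    "info:visualizedByImage": lambda allele: ["image:" + allele.split(":")[1]],
    "info:hasDescription": lambda image: ["description of " + image],
}

query = [
    ("locus:DEF", "genetics:hasVariant", "?allele"),
    ("?allele", "info:visualizedByImage", "?image"),
    ("?image", "info:hasDescription", "?desc"),
]

results = resolve(query, services)
```

Each clause adds one more bound variable to every partial answer, which is why the query needs no "FROM" clause: the relationships themselves select the data sources.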

Page 87: Presentation to the J. Craig Venter Institute, Dec. 2014

Because it is the Semantic Web

The query results are live hyperlinks

to the respective Database or images

(The answer is IN the Web!)

Page 88: Presentation to the J. Craig Venter Institute, Dec. 2014

What pathways does UniProt protein P47989 belong to?

PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}

Page 89: Presentation to the J. Craig Venter Institute, Dec. 2014

What pathways does UniProt protein P47989 belong to?

PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}

Page 90: Presentation to the J. Craig Venter Institute, Dec. 2014

What pathways does UniProt protein P47989 belong to?

PREFIX pred: <http://sadiframework.org/ontologies/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX uniprot: <http://lsrn.org/UniProt:>
SELECT ?gene ?pathway
WHERE {
  uniprot:P47989 pred:isEncodedBy ?gene .
  ?gene ont:isParticipantIn ?pathway .
}

Note again that there is no “From” clause…

I have not told SHARE where to look for the

answer, I am simply asking my question

Page 91: Presentation to the J. Craig Venter Institute, Dec. 2014

Enter that query into

SHARE

Page 92: Presentation to the J. Craig Venter Institute, Dec. 2014
Page 93: Presentation to the J. Craig Venter Institute, Dec. 2014
Page 94: Presentation to the J. Craig Venter Institute, Dec. 2014

Two different providers of gene information (KEGG & NCBI) were found & accessed

Two different providers of pathway information (KEGG and GO) were found & accessed

Page 95: Presentation to the J. Craig Venter Institute, Dec. 2014

The results are all links to the original data (The answer is IN the Web!)

Page 96: Presentation to the J. Craig Venter Institute, Dec. 2014

Show me the latest Blood Urea Nitrogen and Creatinine levels

of patients who appear to be rejecting their transplants

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
  ?patient rdf:type patient:LikelyRejecter .
  ?patient l:latestBUN ?bun .
  ?patient l:latestCreatinine ?creat .
}

Page 97: Presentation to the J. Craig Venter Institute, Dec. 2014

Show me the latest Blood Urea Nitrogen (BUN) and

Creatinine levels of patients who appear to be

rejecting their transplants

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX patient: <http://sadiframework.org/ontologies/patients.owl#>
PREFIX l: <http://sadiframework.org/ontologies/predicates.owl#>
SELECT ?patient ?bun ?creat
FROM <http://sadiframework.org/ontologies/patients.rdf>
WHERE {
  ?patient rdf:type patient:LikelyRejecter .
  ?patient l:latestBUN ?bun .
  ?patient l:latestCreatinine ?creat .
}

Page 98: Presentation to the J. Craig Venter Institute, Dec. 2014

Likely Rejecter:

A patient who has creatinine levels

that are increasing over time

- - Mark D Wilkinson’s definition

Page 99: Presentation to the J. Craig Venter Institute, Dec. 2014

Likely Rejecter:

…but there is no “likely rejecter”

column or table in our database…

only blood chemistry measurements

at various time-points

Page 100: Presentation to the J. Craig Venter Institute, Dec. 2014

Likely Rejecter:

So the data required to answer this question

DOESN’T EXIST!

Page 101: Presentation to the J. Craig Venter Institute, Dec. 2014

My definition of a Likely Rejecter is encoded in

a machine-readable document written in the OWL Ontology language

Basically:

“the regression line over creatinine measurements should have an increasing slope”
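In plain arithmetic, that definition amounts to fitting a least-squares line through (time, creatinine) points and checking the sign of its slope. The Python sketch below shows the calculation only. It is not the OWL class itself, and the measurement values are hypothetical.

```python
def regression_slope(points):
    """Ordinary least-squares slope of y against x for a list of (x, y) pairs."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all measurements share the same time-point")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def is_likely_rejecter(creatinine_series):
    """True when the fitted line over the creatinine series slopes upward."""
    return regression_slope(creatinine_series) > 0


# Hypothetical (day, creatinine) measurements, for illustration only.
rising = [(0, 1.0), (30, 1.2), (60, 1.5), (90, 1.9)]
stable = [(0, 1.1), (30, 1.0), (60, 1.1), (90, 1.0)]
```

Here `is_likely_rejecter(rising)` is true and `is_likely_rejecter(stable)` is false, so class membership is computed from the measurements rather than stored in a column.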

Page 102: Presentation to the J. Craig Venter Institute, Dec. 2014

Our ontology refers to other ontologies (possibly published by other people)

to learn about what the properties of “regression models” are

e.g. that regression models have slopes and intercepts

and that slopes and intercepts have decimal values

Page 103: Presentation to the J. Craig Venter Institute, Dec. 2014

?

Page 104: Presentation to the J. Craig Venter Institute, Dec. 2014

Enter that query into

SHARE

Page 105: Presentation to the J. Craig Venter Institute, Dec. 2014

SHARE examines the query

Burrows around the Web reading the various ontologies

then uses the discovered Class definitions as a template to map a path from what it has, to what it needs, using

SADI services

Page 106: Presentation to the J. Craig Venter Institute, Dec. 2014

Based on the Class definition

SHARE decides that it needs to do a

Linear Regression analysis

on the blood creatinine measurements

Page 107: Presentation to the J. Craig Venter Institute, Dec. 2014

?

Page 108: Presentation to the J. Craig Venter Institute, Dec. 2014

The conversation between SHARE and the registry

reveals the use of “Deep Semantics”

Q: Is there a SADI service that will consume instances of Patient and give

me instances of LikelyRejecter?

A: No

Q: Okay... So LikelyRejecters need a regression model of increasing slope

over their BloodCreatinine, so... Is there a SADI service that will consume

BloodCreatinine over time and give me its linear regression model?

A: No

Q: Okay... Blood Creatinine over time is a subclass of data of type

X/Y coordinate, so is there a service that consumes X/Y data and

returns its regression model?

A: Yes here’s the URL.
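The last step of that conversation, falling back from a specific class to a more general one, can be sketched as a walk up the class hierarchy. This is a minimal Python illustration. The class names mirror the dialogue, while the dictionary layout, the function name, and the example.org URL are assumptions rather than SHARE internals.

```python
# Hypothetical subclass-to-superclass map, as an ontology reasoner might report it.
SUPERCLASS_OF = {
    "BloodCreatinineOverTime": "XYCoordinateData",
    "XYCoordinateData": None,
}

# Hypothetical registry: (input class, output class) -> service URL.
REGISTRY = {
    ("XYCoordinateData", "LinearRegressionModel"): "http://example.org/services/linear-regression",
}


def discover_service(input_class, output_class, registry, superclass_of):
    """Try the input class first, then each superclass, until a service matches."""
    current = input_class
    while current is not None:
        url = registry.get((current, output_class))
        if url is not None:
            return current, url
        current = superclass_of.get(current)  # generalise and ask again
    return None


found = discover_service("BloodCreatinineOverTime", "LinearRegressionModel", REGISTRY, SUPERCLASS_OF)
```

The first lookup fails, the second succeeds at `XYCoordinateData`, and the creatinine data can be sent to a general-purpose regression service that knows nothing about transplants.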

Page 109: Presentation to the J. Craig Venter Institute, Dec. 2014

The SHARE system utilizes SADI to discover

analytical services on the Web that do linear regression analysis

and sends the data to be analyzed

Page 110: Presentation to the J. Craig Venter Institute, Dec. 2014

This happens iteratively (e.g. SHARE also has to examine the slope of the regression line

using another service, find the “latest” in a series of time measurements, etc.)

There is reasoning after every Service invocation

(i.e. after every clause in the query)

Once it is able to find instances (OWL Individuals)

of the LikelyRejecter class, it continues with the

rest of the query

Page 111: Presentation to the J. Craig Venter Institute, Dec. 2014

VOILA!

Page 112: Presentation to the J. Craig Venter Institute, Dec. 2014

The way SHARE “interprets” data varies

depending on the context of the query

(i.e. which ontologies it reads – Mine? Yours?)

and on what part of the query

it is trying to answer at any given moment

(which ontological concept is relevant to that clause)

Page 113: Presentation to the J. Craig Venter Institute, Dec. 2014

Example?

Blood Creatinine measurements

were not dictated to be

Blood Creatinine measurements

Page 114: Presentation to the J. Craig Venter Institute, Dec. 2014

Example?

The data had the ‘qualities/properties’ that

allowed one machine to interpret

that they were Blood Creatinine measurements

(e.g. to determine which patients were rejecting)

Page 115: Presentation to the J. Craig Venter Institute, Dec. 2014

Example?

But the data also had the ‘qualities/properties’ that

allowed another machine to interpret them as

Simple X/Y coordinate data

(e.g. the Linear Regression calculation tool)
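A small sketch may make the point concrete: if classes are defined by the properties a record must have, the same record can satisfy more than one definition at once. The Python below is only an analogy for OWL classification, and the property names and class tests are hypothetical.

```python
def is_blood_creatinine(record):
    # Membership is decided from properties, not from a stored label.
    return record.get("analyte") == "creatinine" and record.get("specimen") == "blood"


def is_xy_coordinate(record):
    # Anything with numeric x and y values qualifies.
    return all(isinstance(record.get(key), (int, float)) for key in ("x", "y"))


CLASS_DEFINITIONS = {
    "BloodCreatinineMeasurement": is_blood_creatinine,
    "XYCoordinate": is_xy_coordinate,
}


def classify(record):
    """Return the name of every class whose definition the record satisfies."""
    return {name for name, test in CLASS_DEFINITIONS.items() if test(record)}


# Hypothetical record: day 30, creatinine value 1.2.
measurement = {"analyte": "creatinine", "specimen": "blood", "x": 30, "y": 1.2}
classes = classify(measurement)
```

The record was never labelled with either class name, yet `classes` contains both, so a clinical tool and a generic regression tool can each read it in their own terms.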

Page 116: Presentation to the J. Craig Venter Institute, Dec. 2014

Benefit

of Deep Semantics

Data is amenable to

constant re-interpretation

Page 117: Presentation to the J. Craig Venter Institute, Dec. 2014

http://www.flickr.com/people/faernworks/

Page 118: Presentation to the J. Craig Venter Institute, Dec. 2014

One example of the “little ways”

that Semantics will help researchers

day-by-day

Story #2: Measurement Units

Page 119: Presentation to the J. Craig Venter Institute, Dec. 2014

Units must be harmonized

Don’t leave this up to the researcher (it’s fiddly, time-consuming, and error-prone)

Page 120: Presentation to the J. Craig Venter Institute, Dec. 2014

NASA Mars Climate Orbiter

Page 121: Presentation to the J. Craig Venter Institute, Dec. 2014

Oops!

Page 122: Presentation to the J. Craig Venter Institute, Dec. 2014

ID   HEIGHT  WEIGHT  SBP   CHOL  HDL  BMI GR  SBP GR  CHOL GR  HDL GR
pt1  1.82    177     128   227   55   0       0       1        0
pt2  179     196     13.4  5.9   1.7  1       0       1        0

The Reality of Clinical Datasets

(this is a small snapshot of a dataset we worked on,

courtesy of Dr. Bruce McManus & Janet McManus, from the PROOF COE)

Height in m and cm

Chol in mmol/l and mg/l

...and other delicious weirdness

The clinical analyses described here were supported in part by the PROOF Center of Excellence for the Prevention of Organ Failure

Page 123: Presentation to the J. Craig Venter Institute, Dec. 2014

GOAL: reduce the likelihood of errors by

getting the clinical researcher

“out of the loop”

(as per the Institute of Medicine Recommendations)

Page 124: Presentation to the J. Craig Venter Institute, Dec. 2014

Experiment:

Reproduce a clinical study

(from >10 years ago)

by logically encoding

the clinical diagnosis guidelines

of the American Heart Association

then ask SHARE to automatically

analyse the patient clinical data

Page 125: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantically defining globally-accepted clinical phenotypes;

Building on the expertise of others

SystolicBloodPressure =

GALEN:SystolicBloodPressure and
("sio:has measurement value" some ("sio:measurement" and
    ("sio:has unit" some "om:unit of measure") and
    ("om:dimension" value "om:pressure or stress dimension") and
    "sio:has value" some rdfs:Literal))

GALEN is a popular biomedical ontology

but it is largely, like GO, a series of

named but undefined Classes

Page 126: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantically defining globally-accepted clinical phenotypes;

Building on the expertise of others

SystolicBloodPressure =

GALEN:SystolicBloodPressure and
("sio:has measurement value" some ("sio:measurement" and
    ("sio:has unit" some "om:unit of measure") and
    ("om:dimension" value "om:pressure or stress dimension") and
    "sio:has value" some rdfs:Literal))

So we use OWL to extend the GALEN

Classes with rich, logical descriptors

that take advantage of rich semantic

relationships like “has measurement value”

and “dimension” and “has unit”

Page 127: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantically defining globally-accepted clinical phenotypes;

Building on the expertise of others

SystolicBloodPressure =

GALEN:SystolicBloodPressure and
("sio:has measurement value" some ("sio:measurement" and
    ("sio:has unit" some "om:unit of measure") and
    ("om:dimension" value "om:pressure or stress dimension") and
    "sio:has value" some rdfs:Literal))

Very general definition

“some kind of pressure unit”

(so that others can build on this as they wish!)

Page 128: Presentation to the J. Craig Venter Institute, Dec. 2014

HighRiskSystolicBloodPressure (as defined by Framingham)

SystolicBloodPressure and
sio:hasMeasurement some
(sio:Measurement and
    ("sio:has unit" value om:kilopascal) and
    (sio:hasValue some double[>= "18.7"^^double]))

Now we are specific to our clinical study (Framingham definitions):

MUST be in kpascal and must be > 18.7

Semantically defining globally-accepted clinical phenotypes;

Building on the expertise of others

Page 129: Presentation to the J. Craig Venter Institute, Dec. 2014

SELECT ?record ?Pressure

FROM <./patient.rdf>

WHERE {

?record rdf:type measure:HighRiskSystolicBloodPressure .

?record sio:hasMeasurement ?measurement .

?measurement sio:hasValue ?Pressure .

}

RecordID   Start Val   Start Unit   Pressure   End Unit
Pt1        15          cmHg         19.998     KiloPascal
Pt2        14.6        cmHg         19.465     KiloPascal
Pt1        148         mmHg         19.731     KiloPascal
Pt2        146         mmHg         19.465     KiloPascal

Running the Clinical Analysis

“Select the patients who are at-risk”

All measurements have now been automatically

harmonized to KiloPascal, because we encoded the

semantics in the model
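The harmonization idea can be illustrated outside of OWL. What follows is a minimal Python sketch, not the SADI/SHARE implementation: it assumes a hand-written table of conversion factors to kilopascal and reuses the 18.7 kPa threshold from the HighRiskSystolicBloodPressure definition. The function names and the third reading are hypothetical.

```python
# Illustrative sketch only: the conversion table and all names are assumptions,
# not part of the SIO/OM ontologies or of the SADI/SHARE code base.

# Kilopascal per unit (1 mmHg = 0.133322387415 kPa; 1 cmHg = 10 mmHg).
KPA_PER_UNIT = {
    "kPa": 1.0,
    "mmHg": 0.133322387415,
    "cmHg": 1.33322387415,
}

HIGH_RISK_SYSTOLIC_KPA = 18.7  # threshold taken from the class definition


def to_kilopascal(value: float, unit: str) -> float:
    """Convert a pressure reading to kilopascal; raise on unknown units."""
    try:
        return value * KPA_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"no conversion known for unit {unit!r}") from None


def is_high_risk_systolic(value: float, unit: str) -> bool:
    """Apply the >= 18.7 kPa restriction after harmonizing the unit."""
    return to_kilopascal(value, unit) >= HIGH_RISK_SYSTOLIC_KPA


# Hypothetical readings recorded in mixed units.
readings = [("Pt1", 15, "cmHg"), ("Pt2", 146, "mmHg"), ("Pt3", 120, "mmHg")]
at_risk = [rid for rid, value, unit in readings
           if is_high_risk_systolic(value, unit)]
```

In the semantic approach this conversion is not hand-coded per query; the sketch only shows why a single threshold can be applied once every value is expressed in the same unit.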

Page 130: Presentation to the J. Craig Venter Institute, Dec. 2014

While doing this experiment, we noticed

some interesting anomalies…

Page 131: Presentation to the J. Craig Venter Institute, Dec. 2014

Visual inspection of our output data and the AHA guidelines

showed that in many cases the clinician

“tweaked” the guidelines when doing their analysis

------------------

AHA BMI risk threshold: BMI=25

In our dataset the clinical researcher used BMI=26

------------------

AHA HDL guideline HDL<=1.03mmol/l

The dataset from our researcher: HDL<=0.89mmol/l

-------------------

Page 132: Presentation to the J. Craig Venter Institute, Dec. 2014

These Alterations Were Not Recorded

in Their Study Notes!

Page 133: Presentation to the J. Craig Venter Institute, Dec. 2014

Adjusting our Semantic definitions and re-running the analysis

resulted in nearly 100% correspondence with the clinical researcher

HighRiskCholesterolRecord (before adjustment) =

PatientRecord and

(sio:hasAttribute some

(cardio:SerumCholesterolConcentration and

sio:hasMeasurement some ( sio:Measurement and

(sio:hasUnit value cardio:mili-mole-per-liter) and

(sio:hasValue some double[>= 5.0]))))

HighRiskCholesterolRecord (after adjustment) =

PatientRecord and

(sio:hasAttribute some

(cardio:SerumCholesterolConcentration and

sio:hasMeasurement some ( sio:Measurement and

(sio:hasUnit value cardio:mili-mole-per-liter) and

(sio:hasValue some double[>= 5.2]))))

Page 134: Presentation to the J. Craig Venter Institute, Dec. 2014

Reflect on this for a second... Because this is important!

1. We semantically encoded clinical guidelines

2. We found that clinical researchers did not follow the official guidelines

3. Their “personalization” of the guidelines was unreported

4. Nevertheless, we were able to create “personalized” Semantic Models

5. These models reflect the opinion of an individual domain-expert

6. These models are shared on the Web

7. They can be automatically re-used by others to interpret their own data using

that clinical expert’s viewpoint

Page 135: Presentation to the J. Craig Venter Institute, Dec. 2014

AHA:HighRiskCholesterolRecord

PatientRecord and

(sio:hasAttribute some

(cardio:SerumCholesterolConcentration and

sio:hasMeasurement some ( sio:Measurement and

(sio:hasUnit value cardio:mili-mole-per-liter) and

(sio:hasValue some double[>= 5.0]))))

McManus:HighRiskCholesterolRecord

PatientRecord and

(sio:hasAttribute some

(cardio:SerumCholesterolConcentration and

sio:hasMeasurement some ( sio:Measurement and

(sio:hasUnit value cardio:mili-mole-per-liter) and

(sio:hasValue some double[>= 5.2]))))

PREFIX AHA: <http://americanheart.org/measurements/>

PREFIX McManus: <http://stpaulshospital.org/researchers/mcmanus/>

Page 136: Presentation to the J. Craig Venter Institute, Dec. 2014

To do the analysis using AHA guidelines

SELECT ?patient ?risk

WHERE {

?patient rdf:type AHA:HighRiskCholesterolRecord .

?patient ex:hasCholesterolProfile ?risk

}

Page 137: Presentation to the J. Craig Venter Institute, Dec. 2014

To do the analysis using McManus’ expert-opinion

SELECT ?patient ?risk

WHERE {

?patient rdf:type McManus:HighRiskCholesterolRecord .

?patient ex:hasCholesterolProfile ?risk

}
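The effect of swapping one namespace for another can be sketched in a few lines of Python. This is only an analogy for the two queries, under the assumption that each viewpoint reduces to a single cholesterol cut-off (5.0 and 5.2 mmol/l, as in the two class definitions); the lookup table, function name and patient values are hypothetical.

```python
# Illustrative sketch: thresholds come from the two class definitions;
# the table layout, function name and patient values are hypothetical.

# Each "viewpoint" maps to its serum cholesterol cut-off in mmol/l.
HIGH_RISK_CHOLESTEROL_MMOL_L = {
    "AHA": 5.0,
    "McManus": 5.2,
}


def high_risk_patients(records: dict[str, float], viewpoint: str) -> list[str]:
    """Return the IDs whose cholesterol meets the chosen viewpoint's threshold."""
    threshold = HIGH_RISK_CHOLESTEROL_MMOL_L[viewpoint]
    return sorted(pid for pid, value in records.items() if value >= threshold)


# Hypothetical serum cholesterol concentrations (mmol/l).
records = {"Pt1": 4.8, "Pt2": 5.1, "Pt3": 5.6}

aha = high_risk_patients(records, "AHA")          # guideline viewpoint
mcmanus = high_risk_patients(records, "McManus")  # expert-opinion viewpoint
```

The data never changes; only the named definition used to interpret it does, which is the point of publishing each definition under its own prefix.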

Page 138: Presentation to the J. Craig Venter Institute, Dec. 2014

Flexibility Transparency

Reproducibility Shareability Comparability

Simplicity Automation

Page 139: Presentation to the J. Craig Venter Institute, Dec. 2014

Personalization

(I’m going to return to this point several times)

Page 140: Presentation to the J. Craig Venter Institute, Dec. 2014

Reproduce a peer-reviewed

scientific publication

by semantically modelling

the problem

Story #3: in silico Science

Page 141: Presentation to the J. Craig Venter Institute, Dec. 2014

The Publication

Discovering Protein Partners of a

Human Tumor Suppressor Protein

Page 142: Presentation to the J. Craig Venter Institute, Dec. 2014

Original Study Simplified

Using what is known about protein interactions

in fly & yeast

predict new interactions with this

Human Tumor Suppressor

Page 143: Presentation to the J. Craig Venter Institute, Dec. 2014

Semantic Model of the Experiment

OWL

Page 144: Presentation to the J. Craig Venter Institute, Dec. 2014

Note that every word in this

diagram is, in reality, a URL

(it’s a Semantic Web model)

i.e. It refers to the expertise of

other researchers, distributed

around the world on the Web

Semantic Model of the Experiment

Page 145: Presentation to the J. Craig Venter Institute, Dec. 2014

In a local data-file

provide the protein we are interested in

and the two species we wish to use in our comparison

taxon:9606 a i:OrganismOfInterest . # human

uniprot:Q9UK53 a i:ProteinOfInterest . # ING1

taxon:4932 a i:ModelOrganism1 . # yeast

taxon:7227 a i:ModelOrganism2 . # fly

Set-up the Experimental Conditions

Page 147: Presentation to the J. Craig Venter Institute, Dec. 2014

SELECT ?protein

FROM <file:/local/workflow.input.n3>

WHERE {

?protein a i:ProbableInteractor .

}

Run the Experiment

This is the URL that leads our computer

to the Semantic model of the problem

Page 148: Presentation to the J. Craig Venter Institute, Dec. 2014

SHARE examines the semantic model of

Probable Interactors

Retrieves third-party expertise from the Web

Discusses with SADI

what analytical tools are necessary

Chooses the right tools for the problem

Solves the problem!
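One way to picture these steps is a toy resolver that looks up, for each property mentioned in a class definition, a service able to produce it. The Python sketch below is an assumption-laden illustration and not the SADI registry API: the Service type, the service names and the property names are all invented for the example.

```python
# Toy illustration of property-driven service discovery. This is NOT the
# SADI registry API: the Service type, service names and properties are
# invented for the example.
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    consumes: str  # class of input the service accepts
    produces: str  # property the service attaches to its input


REGISTRY = [
    Service("findOrthologs", consumes="Protein", produces="hasOrtholog"),
    Service("findInteractions", consumes="Protein", produces="interactsWith"),
    Service("lookupTaxon", consumes="Protein", produces="fromOrganism"),
]


def plan_workflow(required_properties: list[str], input_class: str) -> list[str]:
    """Pick, in order, one service per property the class definition needs."""
    plan = []
    for prop in required_properties:
        matches = [s for s in REGISTRY
                   if s.produces == prop and s.consumes == input_class]
        if not matches:
            raise LookupError(f"no service produces {prop!r} for {input_class}")
        plan.append(matches[0].name)
    return plan


# A hypothetical class definition that mentions two properties.
workflow = plan_workflow(["hasOrtholog", "interactsWith"], "Protein")
```

The sketch leaves out reasoning over the ontology entirely; it only shows how a declarative description of what is needed can be turned into a concrete list of tools.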

Page 149: Presentation to the J. Craig Venter Institute, Dec. 2014

SHARE derives (and executes) the following analysis automatically

Page 150: Presentation to the J. Craig Venter Institute, Dec. 2014

SHARE is aware of the context of the specific question being asked

Page 152: Presentation to the J. Craig Venter Institute, Dec. 2014

There are five very cool things about what you just saw...

Page 153: Presentation to the J. Craig Venter Institute, Dec. 2014

1. SHARE was able to create a workflow based on a semantic model

Page 154: Presentation to the J. Craig Venter Institute, Dec. 2014

2. SHARE was able to create a COMPUTATIONAL workflow based on a BIOLOGICAL model

Page 155: Presentation to the J. Craig Venter Institute, Dec. 2014

(this is important because we want

this system to be used by clinicians and biologists

who don’t speak computerese!)

Page 156: Presentation to the J. Craig Venter Institute, Dec. 2014

3. The workflow it created, and the services it selected, differed depending on the context of the question

taxon:4932 a i:ModelOrganism1 . # yeast

taxon:7227 a i:ModelOrganism2 . # fly

Page 157: Presentation to the J. Craig Venter Institute, Dec. 2014

The machine was contextually “aware of”

BOTH the biological model

AND the data it was analysing

(...remember this... It will be important later!)

Page 158: Presentation to the J. Craig Venter Institute, Dec. 2014

4. The ontological model was abstract (and shareable!), but the workflow generated from that model was explicit and concrete

Page 160: Presentation to the J. Craig Venter Institute, Dec. 2014

This matters because…

Page 161: Presentation to the J. Craig Venter Institute, Dec. 2014

Remember Trend #1

“the most common errors are simple,

the most simple errors are common”

At least partially because the

analytical methodology was inappropriate

and/or not sufficiently described

Page 162: Presentation to the J. Craig Venter Institute, Dec. 2014

Here, the methodology leading to a result is explicit

and automatically constructed from an abstract template

so this is (at least in part) a

Solved Problem

Page 163: Presentation to the J. Craig Venter Institute, Dec. 2014

5. The selection of tools was guided by the knowledge of worldwide domain-experts encoded in globally-distributed ontologies

(e.g. expert high-throughput statisticians, etc.)

Page 164: Presentation to the J. Craig Venter Institute, Dec. 2014

And this matters because…

Page 165: Presentation to the J. Craig Venter Institute, Dec. 2014

Remember Trend #2

Even small, moderately-funded laboratories

can now afford to produce more data

than they can manage or interpret

These labs will likely never be able to afford

a qualified data scientist

Page 166: Presentation to the J. Craig Venter Institute, Dec. 2014

But if the expert knowledge of data scientists is

encoded in ontologies, and can be discovered

in a contextually-aware manner… then this is a

SOLVED PROBLEM

Page 167: Presentation to the J. Craig Venter Institute, Dec. 2014

Can we make the Health information

on the Web

more “personal”?

Story #4: Personalized Health Info

Page 168: Presentation to the J. Craig Venter Institute, Dec. 2014

Remember when I said...

The machine was contextually “aware of”

BOTH the biological model

AND the data it was analysing

Page 169: Presentation to the J. Craig Venter Institute, Dec. 2014

This “dual-awareness” provides some

very interesting opportunities

for personalizing a patient’s Health Research activity

Page 170: Presentation to the J. Craig Venter Institute, Dec. 2014

PROBLEM:

Patients are self-educating

both about their personal medical situation

(e.g. getting themselves sequenced)

and also surfing the Web, getting dubious advice

from sites of dubious authority

and joining social-health groups

to exchange (often anecdotal)

medical “advice” with other patients

Page 171: Presentation to the J. Craig Venter Institute, Dec. 2014

PROBLEM:

Patients are self-educating

The information on any given site

may or may not

be relevant to THAT patient

Information on the Web is, by nature, not personalized

Page 172: Presentation to the J. Craig Venter Institute, Dec. 2014

PROBLEM:

Clinicians often have patients

(especially chronically-ill patients)

on a “trajectory” of treatment

Medicine is complicated!

e.g. the treatment trajectory of the patient can be

multi-step, and a specific sign/symptom might be

perfectly normal at a particular phase in their

“flow” of treatment

Page 173: Presentation to the J. Craig Venter Institute, Dec. 2014

PROBLEM SUMMARY

Patients are reading non-personalized medical text

of dubious quality and relevance

Clinicians have no way to intervene

in this self-education process

explaining to patients how the information they read

relates to their personal “health trajectory”

Page 174: Presentation to the J. Craig Venter Institute, Dec. 2014

Now you might see why this is so relevant!

The machine was contextually “aware of”

BOTH the biological model

AND the data it was analysing

Page 175: Presentation to the J. Craig Venter Institute, Dec. 2014

This is an early prototype of a

Patient-driven Personalized Medicine

Web interface

Page 176: Presentation to the J. Craig Venter Institute, Dec. 2014

Basically, it is a set of SHARE queries

Attached to a local database

of patient information

Running behind a Web bookmarklet

Page 177: Presentation to the J. Craig Venter Institute, Dec. 2014

The queries text-mine a Web page

then compare the concepts in the page

to the patient’s personal data

using a SHARE query

Page 178: Presentation to the J. Craig Venter Institute, Dec. 2014

(that could contain ontologies...

...ontologies designed by their clinician!!)

Page 183: Presentation to the J. Craig Venter Institute, Dec. 2014

Matching based on official name, compound name, brand name, trade name,

or “common name”
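The name-matching step can be approximated with a synonym table. The Python sketch below is a deliberately naive illustration, not the prototype's text-mining code: the synonym table, the patient record and the function names are hypothetical, and real text mining needs far more than whole-word matching.

```python
# Illustrative sketch: the synonym table and patient record are hypothetical,
# and real text mining would need far more than whole-word matching.
import re

# Official drug name mapped to a few brand or common names.
SYNONYMS = {
    "acetylsalicylic acid": {"aspirin", "asa"},
    "ibuprofen": {"advil", "motrin"},
}


def mentioned_drugs(page_text: str) -> set[str]:
    """Return the official names of drugs mentioned under any known name."""
    text = page_text.lower()
    found = set()
    for official, alternatives in SYNONYMS.items():
        for name in {official, *alternatives}:
            # Whole-word match so that "asa" does not fire inside other words.
            if re.search(rf"\b{re.escape(name)}\b", text):
                found.add(official)
                break
    return found


def relevant_to_patient(page_text: str, patient_medications: set[str]) -> set[str]:
    """Keep only the mentioned drugs that appear in the patient's own record."""
    return mentioned_drugs(page_text) & patient_medications


page = "Some people take Aspirin daily; others prefer Motrin for pain."
patient = {"acetylsalicylic acid"}
hits = relevant_to_patient(page, patient)
```

Normalizing every mention to one official name is what lets a page that says "Aspirin" be compared against a record that says "acetylsalicylic acid".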

Page 184: Presentation to the J. Craig Venter Institute, Dec. 2014

Still needs some work...

??!?!?

Page 188: Presentation to the J. Craig Venter Institute, Dec. 2014

Link out to PubMed

Why the alert?

Page 190: Presentation to the J. Craig Venter Institute, Dec. 2014

The SADI+SHARE workflow and reasoning was

personalized to YOUR medical data

Page 191: Presentation to the J. Craig Venter Institute, Dec. 2014

In future iterations, we will enable the workflow

to be further customized through “personalized”

OWL Classes (e.g. Provided by your Clinician!!)

Page 192: Presentation to the J. Craig Venter Institute, Dec. 2014

These OWL Classes might include information about the

current trajectory of your treatment for a chronic disease,

for example, such that what you read on the Web is

placed in the context of your expert Clinical care...

Page 193: Presentation to the J. Craig Venter Institute, Dec. 2014

Frankly, I think it’s quite cool that patients

are creating and running

“personal health-research” workflows

at the touch of a button!

Page 194: Presentation to the J. Craig Venter Institute, Dec. 2014

Almost the end…

Three brief final points....

Page 195: Presentation to the J. Craig Venter Institute, Dec. 2014

Publication

Discourse

Hypothesis

Experiment

Interpretation

??

Page 196: Presentation to the J. Craig Venter Institute, Dec. 2014

The Semantic Model represents

a possible solution to a problem

Page 197: Presentation to the J. Craig Venter Institute, Dec. 2014

By my definition, that is a hypothesis

Page 198: Presentation to the J. Craig Venter Institute, Dec. 2014

That hypothesis is tested by automatically converting it into a workflow;

Page 199: Presentation to the J. Craig Venter Institute, Dec. 2014

the workflow, and the results of the workflow are intimately tied to the hypothesis

Page 200: Presentation to the J. Craig Venter Institute, Dec. 2014

i.e. You (or anyone!) can determine exactly which aspect

of the hypothesis led to which output data element, why, and how

Page 201: Presentation to the J. Craig Venter Institute, Dec. 2014

“Exquisite Provenance”

a perfect record not only of what was done, when, and how

but also WHY

Page 202: Presentation to the J. Craig Venter Institute, Dec. 2014

And this is important because...

Page 203: Presentation to the J. Craig Venter Institute, Dec. 2014

“Exquisite Provenance”

is required

for the output data and knowledge

to be published as...

Page 204: Presentation to the J. Craig Venter Institute, Dec. 2014

Richly annotated, citable, and queryable snippets of

scientific knowledge encoded in Linked Data/OWL

i.e. a way to publish data and knowledge on the Semantic Web

Page 205: Presentation to the J. Craig Venter Institute, Dec. 2014

Publication

Discourse

Hypothesis

Experiment

Interpretation

Page 206: Presentation to the J. Craig Venter Institute, Dec. 2014

A “modest” vision for

pure in silico Science

Page 208: Presentation to the J. Craig Venter Institute, Dec. 2014

Last point… perhaps this is not yet obvious…

Page 209: Presentation to the J. Craig Venter Institute, Dec. 2014

SADI services consume Linked Data on the Web

Page 210: Presentation to the J. Craig Venter Institute, Dec. 2014

The ontologies provided to SHARE are

written in OWL, and are therefore

inherently part of the Web

Page 211: Presentation to the J. Craig Venter Institute, Dec. 2014

SADI services create novel semantic links

between existing data-points on the Web, or

between existing data and new data

Page 212: Presentation to the J. Craig Venter Institute, Dec. 2014

The output of the automatically-generated workflow

is therefore Linked Data

and is therefore inherently part of the Web

Page 213: Presentation to the J. Craig Venter Institute, Dec. 2014

The concluding NanoPublications are a combination

of Linked Data and OWL, and are published directly to the Web
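A nanopublication is conventionally organized in three parts: an assertion, its provenance, and publication information. The Python sketch below only illustrates that shape with plain tuples and is not how SHARE emits its output; the example.org URIs, the ex: property, the date and the helper name are hypothetical.

```python
# Illustrative sketch of a nanopublication's three parts. The example.org
# URIs, the ex: property and the helper name are hypothetical.
from datetime import date


def make_nanopublication(assertion, derived_from, author, created):
    """Bundle one assertion triple with its provenance and publication info."""
    return {
        "assertion": [assertion],
        "provenance": [("assertion", "prov:wasDerivedFrom", derived_from)],
        "pubinfo": [
            ("nanopub", "dcterms:creator", author),
            ("nanopub", "dcterms:created", created.isoformat()),
        ],
    }


nanopub = make_nanopublication(
    assertion=("uniprot:Q9UK53", "ex:probablyInteractsWith", "ex:someProtein"),
    derived_from="http://example.org/workflow/run-1",
    author="http://example.org/people/researcher",
    created=date(2014, 12, 1),
)
```

Keeping the provenance next to the assertion is what makes each published snippet citable and traceable back to the workflow that produced it.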

Page 214: Presentation to the J. Craig Venter Institute, Dec. 2014

The Life Science “Singularity”

The Semantic Web is a cradle-to-grave

biomedical research platform

that can, and will, dramatically improve

how biomedical research is done

We Are Here!

Page 215: Presentation to the J. Craig Venter Institute, Dec. 2014

The important people

Luke McCarthy

(SADI/SHARE)

Benjamin Vandervalk

(SHARE)

Dr. Soroush Samadian

(clinical experiments)

Ian Wood

(Experiment-replication experiment)

Page 216: Presentation to the J. Craig Venter Institute, Dec. 2014

Microsoft Research