J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledge using RDF and OWL

Presentation at BOSC2012 by J Klein - KUPKB: sharing, connecting and exposing kidney and urinary knowledge using RDF and OWL

Page 1

KUPKB: Sharing, Connecting and Exposing Kidney and Urinary Knowledge using RDF and OWL

Julie Klein & Simon Jupp

Bio-health informatics group

University of Manchester

www.kupkb.org

Page 2

The problem domain

Thousands of studies have been conducted by the kidney research community:

• On different species

• On different materials

• On different biological levels

Examples shown on the slide: human, mouse; urine, tissue; gene, protein, cell

Large diversity: integration of the knowledge is complex

Page 3

Where does the data go?

Research Papers

Bespoke kidney laboratory databases

Generalist databases

Scattered, hidden in figures, coming in different formats

Most of the data is lost!

Page 4

The Kidney and Urinary Pathway Knowledge Base:

SHARE AND CONNECT

The iKUP Browser:

EXPOSE

www.kupkb.org

Page 5

Structure

KUP Ontology (schema)

Experimental data

KUP Knowledge Base

RDF triple store

iKUP Browser

Populous

RightField

Page 6

Ontologies provide the schema

What has been observed, where and when?

Disease ontology

Animal model

Gene Ontology

Experimental factors

Cell type ontology

Mouse anatomy ontology

We needed to connect these reference ontologies.

Creation of a specialized Kidney and Urinary Pathway Ontology (KUPO)

http://www.e-lico.org/public/kupo/

Page 7

Ontologies by stealth

Populous generates simple Excel-based templates

The domain experts are the experts, so get them to build it

Anatomy (MAO)

Biological processes (GO)

Cells (CTO)

Spreadsheet → OPPL scripts → Ontology

http://www.e-lico.eu/populous.html
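
The spreadsheet-to-ontology step can be pictured as expanding each filled-in row into axioms through a fixed pattern. The following is a minimal Python sketch of that idea only. Populous itself relies on OPPL scripts, and the column names, the axiom pattern, and the example row below are assumptions made for illustration, not the actual KUPO template.

```python
import csv
import io

# Hypothetical template content: one row per cell type, as a domain expert
# might fill it in. Column names and values are illustrative assumptions.
SHEET = """cell,part_of,participates_in
podocyte,glomerulus,glomerular filtration
"""

# Illustrative axiom pattern written in a Manchester-syntax-like form.
# This is not the pattern used to build KUPO.
PATTERN = (
    "Class: {cell}\n"
    "    SubClassOf: part_of some {part_of},\n"
    "                participates_in some {participates_in}"
)


def to_identifier(label):
    """Replace whitespace in a label with underscores so it can serve as a class name."""
    return "_".join(label.strip().split())


def rows_to_axioms(text):
    """Expand every spreadsheet row into one axiom block using PATTERN."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        PATTERN.format(**{column: to_identifier(value) for column, value in row.items()})
        for row in reader
    ]


if __name__ == "__main__":
    for axiom in rows_to_axioms(SHEET):
        print(axiom)
```

The point of the design is that the expert only ever sees the table; the pattern that turns rows into class definitions is written once and applied to every row.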

Page 8

Describing/Collecting experimental data

Gathering good meta-data AND data, again by stealth, using RightField

Content of the meta-data cells is constrained to the relevant set of KUPO terms

http://www.sysmo-db.org/rightfield
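
Constraining a meta-data cell to a set of ontology terms amounts to a membership check against a controlled vocabulary. The Python sketch below mimics that effect outside the spreadsheet. The field names and the `ALLOWED_TERMS` table are hypothetical (the example values are borrowed from the problem-domain slide), and nothing here reflects how RightField is implemented.

```python
# Hypothetical controlled vocabulary: the terms permitted for each meta-data
# field. In KUPKB these would be KUPO terms; the values here are illustrative.
ALLOWED_TERMS = {
    "species": {"human", "mouse"},
    "material": {"urine", "tissue"},
}


def validate_metadata(record, allowed=ALLOWED_TERMS):
    """Return the (field, value) pairs whose value is not an allowed term.

    Fields without an entry in `allowed` are treated as free text and skipped.
    """
    problems = []
    for field, value in record.items():
        terms = allowed.get(field)
        if terms is not None and value not in terms:
            problems.append((field, value))
    return problems


if __name__ == "__main__":
    print(validate_metadata({"species": "mouse", "material": "urine"}))
    print(validate_metadata({"species": "Mus musculus", "material": "urine"}))
```

Rejecting free-text variants such as "Mus musculus" at entry time is what keeps annotations homogeneous across experiments.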

Page 10

Mashing it all together

Kidney and Urinary Pathway Ontology: ~1800 classes (~40,000 after imports closure)

Experimental data: 220 KUP experiments integrated

OWL reasoning

KUP Knowledge Base

RDF triple store

~35M triples
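
To give a feel for what reasoning adds when ontology and data sit in the same store, here is a small Python sketch of one inference rule only: propagating `rdf:type` along `rdfs:subClassOf`. This is far weaker than full OWL reasoning, and the `ex:` identifiers are made up for illustration rather than taken from KUPO.

```python
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

# Hypothetical ontology and data triples; identifiers are illustrative only.
triples = {
    ("ex:Podocyte", SUBCLASS_OF, "ex:KidneyCell"),
    ("ex:KidneyCell", SUBCLASS_OF, "ex:Cell"),
    ("ex:sample1", TYPE, "ex:Podocyte"),
}


def infer_types(triples):
    """Add the rdf:type triples implied by rdfs:subClassOf until a fixpoint is reached."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = {(s, o) for s, p, o in inferred if p == SUBCLASS_OF}
        # Iterate over a snapshot so the set can grow inside the loop.
        for s, p, o in list(inferred):
            if p != TYPE:
                continue
            for sub, sup in subclass_pairs:
                if sub == o and (s, TYPE, sup) not in inferred:
                    inferred.add((s, TYPE, sup))
                    changed = True
    return inferred


if __name__ == "__main__":
    for triple in sorted(infer_types(triples)):
        print(triple)
```

After inference, a query for everything typed as `ex:Cell` also finds `ex:sample1`, even though the data only said it was a podocyte.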

Page 11

SPARQLing results

We can now ask queries that span several databases

We can exploit OWL semantics for intelligent answers

Make it all RDF/OWL and expose a SPARQL endpoint…

…then we are done, right?

BUT!

An easy-to-use application… this is what biologists really want
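
For readers who have not used a SPARQL endpoint, the Python sketch below shows what asking such a query looks like at the HTTP level: the query text is sent as the `query` parameter of a GET request. The endpoint address is a placeholder and the `ex:` vocabulary is hypothetical; neither is the real KUPKB endpoint or schema. The request is only built, not sent.

```python
from urllib.parse import urlencode

# Placeholder endpoint; the actual KUPKB endpoint address is not given here.
ENDPOINT = "http://example.org/sparql"

# Hypothetical vocabulary (ex:...) used only to illustrate a query that
# combines experimental data with the class hierarchy of an ontology.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex: <http://example.org/kup#>
SELECT ?experiment ?gene WHERE {
  ?experiment ex:observedIn ?site ;
              ex:reportsGene ?gene .
  ?site rdfs:subClassOf* ex:Kidney .
}
"""


def build_request_url(endpoint, query):
    """Encode a query as an HTTP GET URL with the query text in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    print(build_request_url(ENDPOINT, QUERY))
```

Even in this small form, the user has to know the vocabulary and the query language, which is the gap an easy-to-use application is meant to close.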

Page 12

The iKUP browser

Built as an easy-to-use, lightweight Google Web Toolkit application

Page 13

To expose data from the KUPKB

Page 14

Doing some biology

1. A biological question

Can calreticulin be associated with the development of human kidney disease?

2. No answer with classical tools

Searches in PubMed and Google do not return any relevant results!

3. Querying the KUPKB

4. Validation in the wet-lab

KUPKB in silico result confirmed.

5. Publish an innovative result

Accepted for publication in the FASEB J!

Page 15

Reusing and Building

Ontologies provide the schema

Experimental data

OWL reasoning

KUP Knowledge Base

RDF triple store

Page 16

Reusing and Building

Ontologies provide the schema

Experimental data

OWL reasoning

KUP Knowledge Base

RDF triple store

iKUP Browser

Kidney and Urinary Pathway Ontology

Tool to facilitate building of ontologies

Annotations, homogenization

Tool to facilitate data annotation

Page 17

What next

User study and evaluation experiments ongoing with Manchester Web Ergonomics Lab

Application to other biological domains

Change the domain model in the ontologies and we can construct any organ knowledge base in this way

Already interest in gut, liver, heart and metabolic diseases

Page 18

Acknowledgments

• Simon Jupp

• Stuart Owen, Matthew Horridge, Katy Wolstencroft and Carole Goble @ University of Manchester for RightField

• Joost Schanstra, Panagiotis Moulos, Jean-Loup Bascands @ Renal Fibrosis Lab, Toulouse, France

• Aristidis Charonis, Bénédicte Buffin-Meyer, Myriem Fernandez for the CALR example

• e-LICO FP7 project and EuroKUP

• Robert Stevens, ontology development, University of Manchester

Open Source License: GNU Lesser General Public License

Code: http://code.google.com/p/kupkb-dev/

Page 19

Thank you for listening…

www.kupkb.org

Page 20

Some rough stats…

• 195 KUP experiments integrated

• KUPKB RDF store ~35M triples

• KUP Ontology ~1800 classes, ~40,000 after imports closure

Architecture

• Sesame and BigOWLIM for the RDF store

• Web site developed with Google Web Toolkit

• OWL API and HermiT reasoner for classification and faceted browsing

Page 21

Summary

The KUPKB RDF store is a mashup of biological knowledge relating to the KUP domain

Ontologies provide the schema and a consistent data annotation mechanism

We expose this knowledge base through the iKUP browser, a simple web interface that real biologists can use

iKUP and KUPKB provide a faster mechanism for the biologist to survey the data in biological publications and help the hypothesis generation process.

It is a testament to the tools and APIs that such applications are now being delivered at relatively low cost