The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational...

WORKSHOP ON SEMANTICS IN GEOSPATIAL ARCHITECTURES: APPLICATIONS AND IMPLEMENTATION October 2013 DAMIAN GESSLER, Ph.D. SEMANTIC WEB ARCHITECT UNIVERSITY OF ARIZONA dgessler (at) iplantcollaborative (dot) org The iPlant Collaborative Semantic Web Platform

Page 1: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

WORKSHOP ON SEMANTICS IN GEOSPATIAL ARCHITECTURES: APPLICATIONS AND IMPLEMENTATION

October 2013

DAMIAN GESSLER, Ph.D.

SEMANTIC WEB ARCHITECT

UNIVERSITY OF ARIZONA

dgessler (at) iplantcollaborative (dot) org

The iPlant Collaborative Semantic Web Platform

Page 2: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Classicism and Scholasticism

www.iplantcollaborative.org

St. Thomas Aquinas, by Fra Bartolommeo (1472–1517)

Source: http://en.wikipedia.org/wiki/File:Thomas_Aquinas_by_Fra_Bartolommeo.jpg

Think, then look / Look, then think

Page 3: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Empiricism

Source: http://en.wikipedia.org/wiki/File:Pourbus_Francis_Bacon.jpg

Look, then think / Think, then look

Sir Francis Bacon, by Frans Pourbus the Younger (1569–1622)

Page 4: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Scientific Method

Source: http://en.wikipedia.org/wiki/File:Pourbus_Francis_Bacon.jpg

Hypothesis -> Prediction -> Experiment -> Analysis -> Conclusion [ support or refutation ]

Sir Francis Bacon, by Frans Pourbus the Younger (1569–1622)

Page 5: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

The Experiment as the Rate Limiting Step

Charles Darwin, by Leonard Darwin, 1874

Source: http://en.wikipedia.org/wiki/File:1878_Darwin_photo_by_Leonard_from_Woodall_1884_-_cropped_grayed_partially_cleaned.jpg

Hypothesis -> Prediction -> Experiment -> Analysis -> Conclusion [ support or refutation ]

Page 6: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Bio(Geo)logy

Not too long ago: “If you love science and hate math, do Bio(Geo)logy”

Page 7: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

All-in-one Analysis + Manuscript Prep + Data Management Plan

Not too long ago: “What? Me worry? My backups are safe”

Source: http://en.wikipedia.org/wiki/File:IBM_PC_5150.jpg

Page 8: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Quantity has a Quality all of its own

Hypothesis driven

Data Driven

Page 9: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Evidence-based Decision-making

http://blog.lib.umn.edu/ellis271/arch1701/bigstockphoto_Global_Warming_217540%203.jpg http://www.smartpower.org/blog/wp-content/photos/field_turbines.jpg

Decisions have downstream and unintended consequences; analyses and decisions about our natural world that use a scientific approach bias the odds toward viable solutions.

Page 10: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

The Analysis as the Rate Limiting Step

Source: http://en.wikipedia.org/wiki/File:Mapping_Reads.png

Hypothesis -> Prediction -> Experiment -> Analysis -> Conclusion [ support or refutation ]

from climate change to high throughput sequencing

Page 11: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Re-revolutionizing the Revolution

13th century: Think, then look (hypothesis driven)

20th century: Look, then think (data driven)

Page 12: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Re-revolutionizing the Revolution

16th century: Look, then think (data driven)

21st century: Think, then look (hypothesis driven)

Page 13: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Scientific Method, on hyper-cycles

Source: http://en.wikipedia.org/wiki/File:Pourbus_Francis_Bacon.jpg

[Figure: the Hypothesis -> Prediction -> Experiment -> Analysis -> Conclusion cycle repeated four times, with one additional Analysis label]

Page 14: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Bridging HPC, Enterprise, and Web assets

High Performance Computing 500K core, 10 PetaFLOPS*

Petabyte scale storage

* PetaFLOP: 10^15 (million billion) floating point operations per second

Foundational Infrastructure iPlant Data Store, HPC, etc.

The iPlant Collaborative

Mission: To build a cyberinfrastructure for the nation’s plant scientists.

iPlant is a Service/Infrastructure project

Page 15: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Bridging HPC, Enterprise, and Web assets

Foundational Infrastructure iPlant Data Store, HPC, etc.

Enterprise Class Discovery Environment, Atmosphere

Enterprise Class Virtual Work desk,

Cloud, Virtual machines

Discovery Environment: a world-class bioinformatics workstation in your browser

Atmosphere: an “instant” dedicated workstation in the cloud: load, use, discard, repeat

breadth

depth

High Performance Computing 500K core, 10 PetaFLOPS*

Petabyte scale storage

* PetaFLOP: 10^15 (million billion) floating point operations per second

Page 16: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

The Greatest Informatic Asset of all Time

Web

Just a portal and browser ... or an infrastructural asset?

Page 17: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Bridging HPC, Enterprise, and Web assets

Enterprise Class Discovery Environment, Atmosphere

Web

What is the infrastructural role for MODS, CODS, and trillions $$$ in Web assets? How does iPlant engage, leverage, and enhance the Gramenes, the TAIRs, SoyBases, SGNs, MaizeGDBs, PLEXdbs, ..., of the world? How does iPlant engage anything that is not a downloadable, installable Linux/MS program?

MODS: Model Organism Databases

CODS: Clade-Oriented Databases

Foundational Infrastructure iPlant Data Store, HPC, etc.

Page 18: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

iPlant Semantic Web Program

Foundational Infrastructure iPlant Data Store, HPC, etc.

Enterprise Class Discovery Environment, Atmosphere

Web

SSWAP Semantic Integration

Semantic Pipelining

Enterprise Class Virtual Work desk

Cloud, Virtual machines

Distributed Semantic Web Services Logic-driven semantics

High Performance Computing 500K core, 10 PetaFLOPS*

Petabyte scale storage

* PetaFLOP: 10^15 (million billion) floating point operations per second

Page 19: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

iPlant Semantic Web Program

Web

SSWAP Semantic Integration

Semantic Pipelining

Distributed Semantic Web Services Logic-driven semantics

It is the semantic aspect of the Semantic Web that lets us elevate the Web from an external resource to an integrated infrastructural asset.

Page 20: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

The Actors

You

Community MODS (Model Organism Databases)

and CODS (Clade Oriented Databases)

iPlant Computational Resources (e.g., TACC)

The World Your lab

The World

Semantic Mediation Layer

Page 21: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Semantic Mediation

Sidney Harris © 2006

Page 22: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

The Antibody Analogy as the Mediation Layer

Page 23: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Simple Semantic Web Architecture and Protocol (SSWAP)

W3C OWL RDF/XML

• Establish the framework for Web resources to describe themselves and their offerings

• Establish the framework for ontological integration

• Engage first-order, description logic reasoning

• Provide a semantically enabled Discovery Server for service and pipeline coordination

http://sswap.info/protocol

Page 24: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

SSWAP Enables Reasoner-assisted Workflows

• A protocol allows a reasoner to connect chains of resources (services) based on logical (not just lexical) matching of what various resources consume and produce.

• Web Discovery: from (any) Web site -> semantic pipeline

• Semantic Pipeline: single-chain workflows of distributed semantic Web services hosted anywhere on the Web

• Workflows constructed via reasoner-assisted, first-order subsumption matching of the output of one service into the input of another

[Figure: Resource1, Resource2, and Resource3, each with a Subject and an Object, chained by rdfs:subClassOf links]
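The logical (rather than lexical) matching described on this slide can be illustrated with a minimal Python sketch. It is not SSWAP's reasoner, which works over OWL description logic; the class hierarchy, service names, and helper functions below are all hypothetical, and the only idea carried over from the slide is that one service's output can feed another's input when the output class is a (transitive) subclass of the input class.

```python
# Toy class hierarchy, child -> parent. All class names are hypothetical.
SUBCLASS_OF = {
    "DNASequence": "Sequence",
    "Sequence": "Data",
    "Alignment": "Data",
    "Tree": "Data",
}

# Hypothetical services, each declaring what it consumes and produces.
SERVICES = {
    "fetch_sequences": {"consumes": "Query", "produces": "DNASequence"},
    "align": {"consumes": "Sequence", "produces": "Alignment"},
    "build_tree": {"consumes": "Alignment", "produces": "Tree"},
}


def is_subclass_of(cls, ancestor):
    """Return True if cls equals ancestor or is a transitive subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def next_services(output_class):
    """Names of services whose input class subsumes output_class."""
    return sorted(
        name
        for name, svc in SERVICES.items()
        if is_subclass_of(output_class, svc["consumes"])
    )


def build_chain(start_service):
    """Follow the first matching, not-yet-used service to form a single chain."""
    chain = [start_service]
    while True:
        produced = SERVICES[chain[-1]]["produces"]
        candidates = [s for s in next_services(produced) if s not in chain]
        if not candidates:
            return chain
        chain.append(candidates[0])
```

A purely lexical match would not connect `fetch_sequences` to `align`, because "DNASequence" and "Sequence" are different strings; the subsumption check connects them because the toy hierarchy declares one a subclass of the other.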

Page 25: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

iPlant Semantic Architecture

[Architecture diagram. Labels, in extraction order:

Data and Service Providers: Web Resource; Data; RDG, RIG, RRG; Algorithm

Web

Semantic Broker / Discovery Server: INTERPRETER; INDIRECTION LAYER; KB; BROKER; RDG, RQG, RRG; EXPLICIT INTERFACE; INDIRECTION LAYER

Clients: RQG; Client; RIG, RRG

Ontologies: Ontology Servers; Protocol Ontology; OWL RDF/XML

REPOSITORY; RESTful API; steps numbered 1 to 4; EXPLICIT INTERFACE]

Semantic documents described in this talk:

PDG: Provider Description Graph

RDG: Resource Description Graph

RIG: Resource Invocation Graph

RRG: Resource Response Graph

RQG: Resource Query Graph

Page 26: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

sswap.info/example

Page 27: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

From TreeGenes to High Performance Computing

Page 28: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Semantic Integration from Third-party Web sites DiversiTree

JavaScript snippet to launch data for Web Discovery with the press of a button

Page 29: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

HTTP API: sswap.info/api

JSON -> OWL RDF/XML transformation transparent to the user (via the SSWAP HTTP API)
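To give a feel for what a JSON -> RDF/XML transformation involves, here is a minimal Python sketch using only the standard library. The slide does not show the JSON syntax the SSWAP HTTP API accepts, so the JSON shape used here (subject URI mapped to property/value pairs), the `ex` vocabulary, and the function name are assumptions for illustration only; the output is plain RDF/XML without any OWL-specific constructs.

```python
import json
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/terms#"  # hypothetical vocabulary, not SSWAP's


def json_to_rdfxml(json_text):
    """Convert {"subject URI": {"property": value, ...}} into an RDF/XML string."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("ex", EX_NS)
    data = json.loads(json_text)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for subject, props in data.items():
        # One rdf:Description element per subject URI.
        desc = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
        )
        # Each JSON property becomes a literal-valued child element.
        for name, value in props.items():
            ET.SubElement(desc, f"{{{EX_NS}}}{name}").text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The caller only ever writes JSON; the RDF/XML serialization is produced behind the function boundary, which is the sense in which such a transformation can be "transparent to the user".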

Page 30: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Web Discovery into Semantic Pipelines Reasoner-assisted Web workflows

Reasoner uses first-order, description logic to present services and pipelines that can operate on the data at any given step

Page 31: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Just-In-Time Ontology Hosting sswap.info/jit

Page 32: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Web Discovery into Semantic Pipelines Reasoner-assisted Web workflows

Reasoner uses first-order, description logic to present services and pipelines that can operate on the data at any given step

Page 33: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

RESTful Pipeline Execution

Page 34: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Data Tree View “Ontologized” data and metadata

• on-the-fly Data Tree views

• pre-defined renderers

Page 35: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Semantic Integration into Third-party Web sites TreeGenes’ CartograTree

Third-party Web sites can engage as renderers on result sets

Page 36: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

TreeGenes’ Sequenced, Genotyped, and Phenotyped

Geographical browsing into the TreeGenes database

• 1 265 tree species

• 901 113 sequences

• 24 142 786 genotypes

• 19 441 phenotypes

http://dendrome.ucdavis.edu/treegenes

Page 37: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

AmeriFlux Sites CO2, Water, Energy

http://public.ornl.gov/ameriflux

Page 38: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

WorldClim 1 km^2 climate grids

http://www.worldclim.org

Page 39: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

ArcGIS Layers

Soil layers

http://maps2.arcgislonline.com

Page 40: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

TRY-DB Phenotypes

http://www.try-db.org

Page 41: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

CartograTree Custom user interface data selection and analysis

• Select

• Analyze

• Web Discovery

Page 42: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Semantic Integration from Third-party Web sites Data slicing and contextual augmentation

Page 43: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Direct and Indirect Data Referencing URI dereferencing of arbitrarily large data sets

Serialize the data itself, or a URI to where the data is located
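The direct/indirect choice on this slide can be sketched as a small data structure: a reference that carries either the payload itself or a URI to it, and a resolver that dereferences only when needed. This is a sketch under stated assumptions, not SSWAP's serialization; the `DataRef` class, the `resolve` function, and the injected `fetch` callable are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DataRef:
    """Carry either the data itself (direct) or a URI pointing to it (indirect)."""

    inline: Optional[bytes] = None
    uri: Optional[str] = None

    def __post_init__(self):
        # Exactly one of the two fields must be set.
        if (self.inline is None) == (self.uri is None):
            raise ValueError("set exactly one of inline or uri")


def resolve(ref: DataRef, fetch: Callable[[str], bytes]) -> bytes:
    """Return the payload, calling fetch(uri) only for indirect references."""
    if ref.inline is not None:
        return ref.inline
    return fetch(ref.uri)
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular transport: small data can travel inline, while an arbitrarily large data set stays where it is until some service actually needs to read it.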

Page 44: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

High Performance Computing Services engage like any other Web services

Page 45: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Custom Service Parameterization

Page 46: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Publishing Pipelines Private data, shared service parameterization

Page 47: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Manage and Publish your Pipelines

Page 48: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Phylogenetics Pipeline runs are persisted in OWL and can start new pipelines

Page 49: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

TreeViz Multi-location, multi-institution, Web/HPC run

Page 50: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Quick Vitals

• 185,000 lines of code

• 100+ libraries

• Open source

• Free SDK (Software Development Kit) for semantics and reasoning on your servers; run pipeline manager on our servers

• More info: sswap.info/wiki

Page 51: The iPlant Collaborative Semantic Web Platform · iPlant Semantic Web Program Foundational Infrastructure iPlant Data Store, HPC, etc. Enterprise Class Discovery Environment, Atmosphere

Acknowledgements

Special thanks to:

• iPlant Collaborative

• UC Davis Dendrome / TreeGenes

• Semantic Web engineering by Clark and Parsia, LLC

• NSF grants #0943879 and #EF-0735191
