Mark Wilkinson, UBC (Lead PI)
Michel Dumontier, Carleton (Co-PI)
Christopher J. O. Baker, UNBSJ (Co-PI)
C-BRASS: Canadian Bioinformatics Resources as Semantic Services
Mandate
• Expose Canadian bioinformatics Web resources in a unified and automatable manner using a Semantic Web Services framework.
• Bioinformatics data and tools will be easier to discover, use, and integrate, hastening discovery.
• First widespread deployment of a grid-framework where the messages are “meaningful” to the machine, and can be interpreted/re-interpreted under a wide range of scenarios.
Goals
• Utilize novel SWS technologies to expose Canadian informatics resources on the emergent Semantic Web
• Create toolkits for semantically “lifting” legacy resources into a SWS framework
• Create prototype applications demonstrating a variety of ways of constructing, utilizing, visualizing, and interpreting the services, analytical pipelines, and resulting semantically-enriched datasets.
Web Service Adoption
The low uptake of modern Web integration frameworks by the bioinformatics community stems from two primary factors:
• Challenges in implementing these solutions
• A gap between the abilities of existing technologies and the needs and skills of the target end-user.
SOAP
• Simple Object Access Protocol (SOAP) messaging has been successful only within well-defined, often project-specific situations.
• Lack of "semantics" in the Web Service interface descriptions, which precludes the automated discovery of appropriate services and the automated pipelining of data between those services.
Semantic Web Service (SWS)
• SWS frameworks have achieved only a modest level of automated interoperability, due to limitations in the way the semantics of Web Services are modeled:
• SWS frameworks are implemented to support legacy data representation frameworks, in particular XML and XML Schema.
• SWS have annotated XML Schema components describing services based on "meaning" of various input and output fields.
Semantic Web Services (SWS)
• Automating workflow construction and semantically validating the "sensibility" of the connections between services (often referred to as schema mapping)
• XML Schema is semantically opaque; applying semantics to it through annotation is extremely limited:
– a semantically annotated XML tag can have only one interpretation.
SWS Frameworks describe:
• Input and output data structures
• Operations of a Web Service
• BioMoby Service Type ontology
– a vocabulary describing analytical operations.
• OWL-S and WSMO/WSML Process Model
– Before and After
– Transformations during that state-change.
• Single-term semantics: too simplistic
• Process Models: too complex, no adoption
In transition
• Data on the Semantic Web is encoded in RDF, while data in most Web Service frameworks is encoded in XML
• From XML/Schema-based to OWL/RDF-based data representation
• SAWSDL became a W3C Recommendation in 2007
– inputs and outputs of Web Services can be described in terms of ontological models.
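A minimal sketch of what such an annotation looks like in practice. SAWSDL attaches a `sawsdl:modelReference` attribute to schema components; the snippet below reads those attributes from a small schema fragment. The element names and the `http://example.org/onto#` URI are illustrative assumptions, not taken from the slides.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
SAWSDL_NS = "http://www.w3.org/ns/sawsdl"

# Hypothetical schema fragment: element names and the ontology URI are made up.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="sequence" type="xs:string"
              sawsdl:modelReference="http://example.org/onto#ProteinSequence"/>
  <xs:element name="comment" type="xs:string"/>
</xs:schema>
"""


def model_references(schema_xml):
    """Map each annotated element name to its list of ontology concept URIs."""
    root = ET.fromstring(schema_xml)
    refs = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        value = element.get(f"{{{SAWSDL_NS}}}modelReference")
        if value:
            # The attribute value is a whitespace-separated list of URIs.
            refs[element.get("name")] = value.split()
    return refs


annotations = model_references(SCHEMA)
print(annotations)
```

Only the annotated element appears in the result; the unannotated `comment` element stays semantically opaque to a machine reader.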
User Communities (I)
• The end-user community does not usually have a "process model" or "business model" in mind when searching for a Service.
• Biologists execute a BLAST alignment
– NOT because they wish to run a sequence similarity matrix over their input data;
– BUT because they are interested in finding sequences that are related to their input sequence by homology.
• The key is the relationship between the input and output data.
Bioinformatics Community
Needs:
• New metadata, i.e. Bioinformatics Web Service annotations that describe the biological properties between input and output that are generated by that Web Service.
• SADI facilitates novel data discovery, interoperability, and integrative behaviours that closely mirror the needs and expectations of our end-user community simply by indexing services based on this predicate.
• Semantic Web data vs data derived from Web Service.
• SADI simply comprises a set of standards-compliant conventions and suggested best practices for data representation and exchange between Web Services that fully utilize Semantic Web technologies.
• SADI mandates the inclusion of a single required annotation in the Web Service metadata that describes the biological relationship ("predicate") that is created between the input and output data of that Service.
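The idea of indexing services by that one predicate annotation can be sketched as follows. This is an illustration only, not the actual SADI registry: `ServiceDescription`, `PredicateRegistry`, the service names, input classes, and `example.org` endpoints are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceDescription:
    """Hypothetical service metadata; `predicate` is the required annotation."""
    name: str
    input_class: str
    predicate: str
    endpoint: str


class PredicateRegistry:
    """Index services by the relationship they create between input and output."""

    def __init__(self):
        self._by_predicate = defaultdict(list)

    def register(self, service):
        self._by_predicate[service.predicate].append(service)

    def find(self, predicate):
        # Return a copy so callers cannot mutate the index.
        return list(self._by_predicate.get(predicate, []))


registry = PredicateRegistry()
registry.register(ServiceDescription(
    "getProteinSequence", "ex:UniProtRecord",
    "pred:hasProteinSequence", "http://example.org/services/sequence"))
registry.register(ServiceDescription(
    "getGOTerms", "ex:UniProtRecord",
    "pred:hasGOTerm", "http://example.org/services/go"))

matches = registry.find("pred:hasProteinSequence")
print([service.name for service in matches])
```

A client that knows only the relationship it wants can look up candidate services without knowing anything about the operation each one performs.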
SADI Web Service Discovery
hasProteinSequence
Predicate-based web service invocation. Using the hasProteinSequence predicate in a query automatically invokes a web service capable of obtaining the amino acid sequence for UniProt entry P04637.
SADI: Standards-compliant recommendations for implementation
• SADI consists of several bioinformatics services
• SADI Services are stateless and atomic.
• SADI Services consume and provide data via HTTP POST and GET.
• SADI Services consume and produce data in RDF format.
• SADI Service interfaces are defined in terms of OWL-DL classes;
– the property restrictions on these OWL classes define what specific data elements are required by the Service and what data will be provided by the Service, respectively.
• Input RDF data
– data that is compliant with (classifies into) the Input OWL Class is "decorated" or "annotated" by the service provider to include new properties reflecting activities performed by the Web Service.
• Output RDF data
– is an instance of the OWL Class that defines the output of the service.
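The input/output contract above can be sketched with triples held as plain Python tuples. This is a simplified illustration, assuming a single required property stands in for the input class restriction; a real service would use an RDF library and OWL reasoning. All `ex:` names and the sample sequence are hypothetical.

```python
RDF_TYPE = "rdf:type"


def run_service(input_graph, required_property, output_class, compute):
    """Return output triples for every node that satisfies the input class.

    input_graph:       set of (subject, predicate, object) tuples
    required_property: property a node must carry to classify as input
    output_class:      class asserted for each decorated output node
    compute:           function(value) -> list of (predicate, object) pairs
    """
    output_graph = set()
    for subject, predicate, value in sorted(input_graph):
        if predicate != required_property:
            continue  # this triple does not satisfy the input restriction
        # The output node reuses the input node's identifier, so the new
        # properties "decorate" the same resource.
        for new_predicate, new_object in compute(value):
            output_graph.add((subject, new_predicate, new_object))
        output_graph.add((subject, RDF_TYPE, output_class))
    return output_graph


def sequence_length(sequence):
    """Hypothetical analysis: annotate a sequence with its length."""
    return [("ex:hasSequenceLength", len(sequence))]


input_graph = {
    ("ex:protein1", "ex:hasProteinSequence", "MKV"),
    ("ex:protein2", "ex:hasName", "no sequence supplied"),
}
output_graph = run_service(
    input_graph, "ex:hasProteinSequence",
    "ex:LengthAnnotatedProtein", sequence_length)
print(sorted(output_graph, key=str))
```

`ex:protein2` lacks the required property, so it does not classify as input and receives no output triples.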
SADI Registry
Predicate Map
What can it do?
• SADI provides the functionality to automatically and dynamically discover, access, and integrate relevant data from distributed, non-uniform data sources using disparate ontologies. These are key promises of the Semantic Web!
• The SHARE implementation allows users to query over data that might not exist at the time they pose their query. A query-specific database is dynamically generated as a query is being processed; effectively, the database required to answer the question is automatically generated as a result of the question being posed.
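The query-specific database idea can be sketched as a loop that resolves each query pattern by calling a service and storing whatever comes back. This is a toy under stated assumptions, not how SHARE is implemented: the stub services, the `ex:` identifiers, and the term names are invented, and real patterns would be resolved through a registry and a reasoner.

```python
def build_query_database(patterns, services, seeds):
    """Resolve patterns left to right, calling a service for each predicate.

    patterns: list of (subject_variable, predicate, object_variable)
    services: dict mapping predicate -> function(subject) -> iterable of objects
    seeds:    dict mapping variable -> iterable of starting values
    """
    store = set()  # the database that exists only because the query was posed
    bindings = {variable: set(values) for variable, values in seeds.items()}
    for subject_var, predicate, object_var in patterns:
        service = services[predicate]
        found = set()
        for subject in sorted(bindings.get(subject_var, set())):
            for obj in service(subject):
                store.add((subject, predicate, obj))
                found.add(obj)
        bindings[object_var] = found
    return store, bindings


# Hypothetical stub services standing in for remote Web Services.
GO_TERMS = {"ex:proteinA": ["ex:term1", "ex:term2"]}
TERM_NAMES = {"ex:term1": ["first term"], "ex:term2": ["second term"]}
services = {
    "pred:hasGOTerm": lambda protein: GO_TERMS.get(protein, []),
    "pred:hasTermName": lambda term: TERM_NAMES.get(term, []),
}
patterns = [
    ("?protein", "pred:hasGOTerm", "?term"),
    ("?term", "pred:hasTermName", "?name"),
]

store, bindings = build_query_database(
    patterns, services, {"?protein": ["ex:proteinA"]})
print(sorted(store))
print(sorted(bindings["?name"]))
```

The store starts empty and ends up holding exactly the triples needed to answer the query, which is the behaviour the bullet above describes.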
Find Gene Ontology terms (biological process, cellular component, and molecular function annotations) for proteins associated with Parkinson's disease:
PREFIX pred: <http://es-01.chibi.ubc.ca/~benv/predicates.owl#>
PREFIX ont: <http://ontology.dumontierlab.com/>
PREFIX keyword: <http://biordf.net/moby/Global_Keyword/>
SELECT ?term ?name
WHERE {
  ?protein ont:hasTag keyword:parkinson .
  ?protein pred:hasGOTerm ?term .
  ?term pred:hasTermName ?name
}
Semantic Health And Research Environment (SHARE) prototype.
SHARE connects the SADI middleware to the Pellet SPARQL query engine and DL reasoner.
SADI Toolkit
"RDFizing"
• Virtuoso Sponger
• Bio2RDF
Native Service Provision and "Wrapping" legacy CGI and WSDL
• Seahawk
• Dashboard
Core SADI Service Codebase
• SADI::Service::Core
• jSADI
Quality of Service Testing
• myGrid/Moby unit-Test and the Testing Agent
Ontology Development Tools
• Protégé 4 and TopBraid Composer
Client Applications
• Taverna
• SHARE
• IO Informatics Sentient Knowledge Explorer plug-in
SADI Training Course Curriculum
Target Audience: The target audience for the training sessions includes primary or secondary data / service providers as well as the full spectrum of bioinformatics students and professionals from academia and industry.
• Syntactic Web vs. Semantic Web
• Interoperability
• Knowledge Representation Standards
• RDF 101
• OWL 101
• Ontology Editors and Ontology Design
• Inference and Reasoning
• Reasoning Engines
• Web Service Description Languages
• Web Service Registries and Service Discovery
• Service Ontologies
• Workflow composition
• SAWSDL
• MyGrid
• SADI 101
• Bioinformatics Web Service Requirements
• SADI Enabled services
• SADI toolkit
Action Plan
• Tier 1 involves active, hands-on migration of native resources to a Semantically-enabled Service.
• Tier 2 involves “wrapping” resources from non-participating providers via Services hosted on C-BRASS servers.
• Tier 3 involves on-site training of resource providers in Semantic Web Service technologies, and support for their self-directed resource migration.
Success Criteria
• Number of Services created/migrated, and their use by consumers worldwide (minimum 400 in Canada);
• Number of software tools created, and their use by third-parties;
• Number of Canadian HQP trained in construction of Semantic Web Services.
Deliverables
• A fully-documented definition of the SADI Semantic Web Service framework, including submission of this to an appropriate standards body (e.g. OASIS or OMG)
• A set of core ontologies describing properties and relationships for entities in the biomedical domain
• A costing-model, for use by future Semantic Web Service providers, outlining the establishment and maintenance costs for the migration from legacy Web or Web Service resources to a Semantic Web Service framework.