ICIS5 and the new Internet
Benjamin Good
CIHR and MSFHR Strategic Training Program in Bioinformatics
Department of Molecular Biology and Biochemistry
Simon Fraser University
Vancouver, Canada
Objectives
• Briefly introduce the concept of web services
• Provide some reasons for you to be interested in them
• Introduce the problem of translation
• Introduce two approaches to service development that deal with this problem in different ways
  – One uses XML-Schema
  – One uses the Biomoby data-type ontology
• Describe a potential strategy for ICIS development that takes advantage of the strengths of both of these approaches with a minimum duplication of effort
• Conclude with the predicted consequences of this strategy
Web Service Intro: Standards
• Web Services are programs that can be executed by other programs connected via the internet.
• HTTP and TCP/IP make internet communication possible.
• English makes this presentation possible.
• SOAP makes communication between anonymous web services possible.
  – As we will see, additional standards will become necessary to describe the content of communications between web services.
  – But first, time to clean up with SOAP.
SOAP Web Services
• SOAP - a new XML protocol
  – Simple Object Access Protocol.
  – Wraps communications in a consistent XML structure.
  – Results in a "mail-like" system.
[Figure: the Client or "Consumer" applies SOAP packaging to <input>hello</input>; the Service or "Provider" applies SOAP unpackaging, parses and processes the XML, then applies SOAP packaging to <output>world</output>, which the client unpackages.]
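The packaging and unpackaging steps in the diagram can be sketched with the XML parser that ships with the JDK. This is only an illustration under stated assumptions: the class and method names (SoapEnvelopeSketch, wrap, unwrapText) are hypothetical, and in practice a toolkit such as Axis performs this work.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Axis or of any ICIS code.
class SoapEnvelopeSketch {
    static final String SOAP_NS = "http://www.w3.org/2003/05/soap-envelope";

    // "SOAP packaging": wrap an already well-formed XML payload in Envelope/Body.
    static String wrap(String payloadXml) {
        return "<env:Envelope xmlns:env=\"" + SOAP_NS + "\">"
             + "<env:Body>" + payloadXml + "</env:Body>"
             + "</env:Envelope>";
    }

    // "SOAP unpackaging": parse the envelope and return the text content
    // of the first element found inside the Body, or null if there is none.
    static String unwrapText(String envelopeXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(envelopeXml)));
            Node body = doc.getElementsByTagNameNS(SOAP_NS, "Body").item(0);
            if (body == null) {
                return null;
            }
            for (Node n = body.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    return n.getTextContent();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Not a parseable SOAP envelope", e);
        }
    }

    public static void main(String[] args) {
        String request = wrap("<input>hello</input>");
        System.out.println(request);
        System.out.println(unwrapText(request));
    }
}
```

The payload inside the Body is opaque to the packaging step, which is why a separate description of the payload content is still needed.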
A SOAP example
<?xml version='1.0' ?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope">
  <env:Header>
    <m:reservation xmlns:m="http://travelcompany.example.org/reservation"
                   env:mustUnderstand="true">
      <m:reference>uuid:093a2da1-q345-739r-ba5d-pqff98fe8j7d</m:reference>
      <m:dateAndTime>2001-11-29T13:20:00.000-05:00</m:dateAndTime>
    </m:reservation>
    <n:passenger xmlns:n="http://mycompany.example.com/employees"
                 env:role="http://www.w3.org/2003/05/soap-envelope/role/next"
                 env:mustUnderstand="true">
      <n:name>Benjamin Good</n:name>
    </n:passenger>
  </env:Header>
  <env:Body>
    <p:itinerary xmlns:p="http://travelcompany.example.org/reservation/travel">
      <p:departure>
        <p:departing>New York</p:departing>
        <p:arriving>Los Angeles</p:arriving>
        <p:departureDate>2001-12-14</p:departureDate>
        <p:departureTime>late afternoon</p:departureTime>
        <p:seatPreference>aisle</p:seatPreference>
      </p:departure>
      <p:return>
        <p:departing>Los Angeles</p:departing>
        <p:arriving>New York</p:arriving>
        <p:departureDate>2001-12-20</p:departureDate>
        <p:departureTime>mid-morning</p:departureTime>
        <p:seatPreference/>
      </p:return>
    </p:itinerary>
  </env:Body>
</env:Envelope>
Source: http://www.w3.org/
Annotations:
• env:Header: optional extension
• env:mustUnderstand="true": forces processing
• env:Body: primary payload, could be any valid XML
Why Should We Bother?
• Participation in a newly forming global community
  – By incorporating web service clients, ICIS can take advantage of services offered by other institutions. For example, NCBI's Entrez information retrieval system is now provided as web services.
  – In the opposite sense, ICIS services can be shared with other applications at institutes around the world.
    • For example, Gramene could embed calls to ICIS methods directly within their website.
• Inter-ICIS collaboration
  – Provides a protocol for data and software sharing between institutes or applications.
    • Example: a Windows ICIS application written in C makes use of Java methods provided as web services (Gerard Sylvester).
How Should We Proceed?
• Apache Axis
  – Handles all SOAP processing, leaving only the content of the XML payload to worry about.
  – The content of the payload must be described somewhere in order for a transaction to succeed.
    • Typically XML-Schema
    • Alternatively the Moby data-type (object) ontology
XML-Schema
• Defines the allowable structure of XML documents
• Embedded in WSDL
• Can be used to map from XML to other representations
  – Axis can automatically map between XML-Schema and Java classes, if the classes follow the Java Beans protocol or are primitive types like Strings or Integers.
    • No arguments allowed in the constructor.
    • Private attributes manipulated using "get" and "set" methods.
    • For more information see http://java.sun.com/products/javabeans/docs/spec.html
  – Otherwise custom serialization/deserialization routines must be developed.
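The bean conventions above matter because tools can discover a class's properties purely from its "get" and "set" methods. The following minimal sketch uses the JDK's own java.beans.Introspector to show the idea; it is not how Axis is implemented, and SampleBean and BeanIntrospectionSketch are hypothetical names introduced for illustration.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical example class, not part of Axis or ICIS.
class BeanIntrospectionSketch {

    // A class that follows the bean conventions: no-argument constructor,
    // private attributes, public get/set methods.
    public static class SampleBean {
        private String query;
        private int startRow;

        public SampleBean() { }

        public String getQuery() { return query; }
        public void setQuery(String q) { query = q; }
        public int getStartRow() { return startRow; }
        public void setStartRow(int r) { startRow = r; }
    }

    // Returns property name -> simple type name for every property that has
    // both a getter and a setter, which is the information a mapping tool needs.
    static Map<String, String> properties(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, String> result = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    result.put(pd.getName(), pd.getPropertyType().getSimpleName());
                }
            }
            return result;
        } catch (IntrospectionException e) {
            throw new IllegalArgumentException("Cannot introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(properties(SampleBean.class));
    }
}
```

A class that hides its state behind differently named methods would yield no properties here, which is the situation where custom serialization routines become necessary.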
Evaluation of XML-Schema
• Pro
  – Many users and tools; allows infinitely many possible data representations.
• Con
  – Without some shared common vocabulary and data model, collaboration is challenging.
    • Semantically identical things may be represented in different ways by different groups. Thus multiple serialization routines must be constructed to accomplish the same task.
    • Self-describing documents are not sufficient to create a common language.
  – Not sufficient for service discovery by anonymous users. Must employ a registry like the UDDI or Moby Central.
Moby Web Services
• Provides a shared protocol for building and accessing web services.
• This shared protocol is formally encoded in two principal extensible ontologies.
• "Ontology" = a set of related terms that describe some domain.
• Data type ontology
  – Defines the data types used as input and output to services.
    • Sequence, genbank record, blast output …
• Service ontology
  – Defines kinds of services that are available.
    • Data retrieval, analysis …
The Class Ontology
[Figure: the Moby class ontology drawn as a graph. Nodes: Object, String, Integer, VirtualSequence, Generic Sequence, NucleotideSequence, DNASequence, AminoAcidSequence, text/plain, text/html, text/base64, base64_gif. Edges are labelled ISA or HAS-A.]
XML serialization of Classes
<String namespace='' id=''/>
<Object namespace='NCBI_gi' id='83'/>
<Integer namespace='' id=''/>

<VirtualSequence namespace='NCBI_gi' id='163483'>
  <Integer namespace='' id='' articleName='Length'>975</Integer>
</VirtualSequence>

<GenericSequence namespace='NCBI_gi' id='163483'>
  <Integer namespace='' id='' articleName='Length'>975</Integer>
  <String namespace='' id='' articleName='Sequence'>ATGG...</String>
</GenericSequence>

[Figure labels: ISA and HASA arrows relate the classes serialized above.]
Note: the classes contained through HASA and HAS relationships are named; at present, these names are not in a controlled vocabulary, and are thus only human-readable.
Value of the Moby Approach
1. Common language that allows for interoperability
  – Avoids naming problems: DNASequence != dna_sequence
  – Avoids reinventing the wheel: re-use other people's objects.
  – Potential for automatic pipeline generation
2. Service Discovery
  – Moby Central is better than the UDDI for bioinformatics applications because:
    • It is specific to biology.
    • It can traverse the Moby ontologies.
    • For example:
      – A_Service processes GenericSequences
      – I have a DNASequence
      – I can find and use A_Service because DNASequence isa GenericSequence.
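The discovery example above amounts to walking ISA links upward from the data type in hand until a service's declared input type is reached. Here is a minimal sketch of that traversal. It is not the Moby Central API: the class name, the in-memory maps, and the intermediate ISA links (DNASequence to NucleotideSequence to GenericSequence) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a registry that understands ISA links.
class OntologyDiscoverySketch {
    private final Map<String, String> parentOf = new HashMap<>();
    private final Map<String, String> serviceInput = new LinkedHashMap<>();

    void addIsA(String child, String parent) { parentOf.put(child, parent); }

    void registerService(String name, String inputType) { serviceInput.put(name, inputType); }

    // True if 'type' equals 'ancestor' or reaches it by following ISA links.
    // Assumes the ontology has no cycles.
    boolean isA(String type, String ancestor) {
        for (String t = type; t != null; t = parentOf.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // All registered services whose input type is satisfied by 'dataType'.
    List<String> findServicesFor(String dataType) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> e : serviceInput.entrySet()) {
            if (isA(dataType, e.getValue())) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    // Builds a small ontology; the exact chain of links is assumed.
    static OntologyDiscoverySketch example() {
        OntologyDiscoverySketch o = new OntologyDiscoverySketch();
        o.addIsA("Integer", "Object");
        o.addIsA("VirtualSequence", "Object");
        o.addIsA("GenericSequence", "VirtualSequence");
        o.addIsA("NucleotideSequence", "GenericSequence");
        o.addIsA("DNASequence", "NucleotideSequence");
        o.registerService("A_Service", "GenericSequence");
        return o;
    }

    public static void main(String[] args) {
        System.out.println(example().findServicesFor("DNASequence"));
    }
}
```

A registry without the ontology could only match A_Service against an exact type name, so a DNASequence holder would never find it.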
Moby vs. XML-Schema
Measure | XML-Schema | Moby
Popularity: proliferation of tools and support | Clear advantage |
Service discovery | | Clear advantage in the biological domain
Ease of client development | Depends on complexity of Schema and presence of client-side tools | Easy to access the XML strings. Parsing support is lacking but on the way
Moby + XML-Schema
• Services can be deployed using both protocols.
• If the input and output to the exposed methods takes the form of primitives or of Java Beans, this process is greatly simplified for both approaches.
Suggestion:
1. Establish a Bean-based data model for ICIS java data classes.
2. Web services can be deployed for any bean-in, bean-out class by using Axis and writing a deployment descriptor. No code needs to be written. XML-Schema is produced automatically.
3. Establish a data-binding framework for mapping from ICIS beans to Moby XML. "Castor" is designed for this and could probably perform all serialization/deserialization automatically.
4. Use data-binding methods in wrappers for the original ICIS methods.
Germplasm search: Input Class
package org.cgiar.icis.GMSexample;

public class QueryBean {
    /** A query string (may contain wildcard characters) */
    private String query;
    /** The first row to return */
    private int startRow;
    /** The last row to return */
    private int lastRow;
    /** The gms class for limiting the search */
    private int gmsClass;

    // bean accessor methods
    public String getQuery() { return query; }
    public int getStartRow() { return startRow; }
    public int getLastRow() { return lastRow; }
    public int getGmsClass() { return gmsClass; }

    // bean setter methods
    public void setQuery(String s) { query = s; }
    public void setStartRow(int i) { startRow = i; }
    public void setLastRow(int i) { lastRow = i; }
    public void setGmsClass(int i) { gmsClass = i; }
}
Germplasm search: Output Class
package org.cgiar.icis.GMSexample;

public class GMSBean {
    private int nid;
    private int gid;
    private int ntype;
    private int nstat;
    private int nuid;
    private String nval;
    private int nlocn;
    private String ndate;
    private int nref;

    // bean accessor methods: get…

    // bean setter methods: set…
}
Germplasm Search: Method call
public GMSBean[] getList(QueryBean qb) throws java.rmi.RemoteException {
    // Set input parameters
    String germplasmName = qb.getQuery();
    int germplasmType = qb.getGmsClass();
    int batchNum = qb.getStartRow();
    int batchSz = qb.getLastRow();
    // ---> Execute logic to ensure batch number specified is positive.

    // Formulate the query.
    String sqlCondition = prepareSearchString(germplasmName);
    final String queryStmt = "SELECT g.gid, g.methn, g.glocn, n.*"
        + " FROM germplsm g, names n"
        + " WHERE g.gid = n.gid"
        + " AND g.grplce = 0"
        + (germplasmType > 0 ? " AND n.ntype = ?" : "")
        + " AND (" + sqlCondition + ") LIMIT ?, ?";
    // ---> Bind input parameters
    // ---> Execute the query
    // ---> For each row in the result, construct a bean and append it to the output array
    // Return the array of GMSBeans
}
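The "bind input parameters" step depends on how many "?" placeholders the statement ends up with, since the n.ntype clause is optional. The sketch below isolates that logic so it can be checked without a database. It rests on assumptions: the search condition is taken to contain exactly one placeholder (the slides do not show prepareSearchString), the batch number is clamped to zero as one reading of "ensure batch number specified is positive", and all names in the class are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the real ICIS method may bind its parameters differently.
class GermplasmQuerySketch {

    // Same statement text as in the method call slide, with the search
    // condition passed in rather than produced by prepareSearchString.
    static String buildQuery(int germplasmType, String sqlCondition) {
        return "SELECT g.gid, g.methn, g.glocn, n.*"
             + " FROM germplsm g, names n"
             + " WHERE g.gid = n.gid"
             + " AND g.grplce = 0"
             + (germplasmType > 0 ? " AND n.ntype = ?" : "")
             + " AND (" + sqlCondition + ") LIMIT ?, ?";
    }

    // Values in the order their placeholders appear in the statement.
    // Assumes sqlCondition contributes exactly one placeholder (namePattern).
    static List<Object> bindValues(int germplasmType, String namePattern,
                                   int batchNum, int batchSz) {
        List<Object> values = new ArrayList<>();
        if (germplasmType > 0) {
            values.add(germplasmType);
        }
        values.add(namePattern);
        values.add(Math.max(0, batchNum)); // assumed handling of negative batch numbers
        values.add(batchSz);
        return values;
    }

    static int countPlaceholders(String sql) {
        int count = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == '?') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sql = buildQuery(1, "n.nval LIKE ?");
        System.out.println(sql);
        System.out.println(bindValues(1, "IR64%", 0, 50));
    }
}
```

With a java.sql.PreparedStatement, the values would then be bound in list order starting at index 1.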
Germplasm Search: Deployment
<deployment xmlns="http://xml.apache.org/axis/wsdd/"
            xmlns:java="http://xml.apache.org/axis/wsdd/providers/java">
  <service name="MockGMSBeanEater" provider="java:RPC">
    <parameter name="className" value="org.cgiar.icis.GMSexample.MockGMSBeanEater"/>
    <parameter name="allowedMethods" value="getStandardName,getPreferredName,getGermplasmList"/>
    <beanMapping qname="myNS:QueryBean" xmlns:myNS="urn:BeanService"
                 languageSpecificType="java:org.cgiar.icis.GMSexample.QueryBean"/>
    <beanMapping qname="myNS:GMSBean" xmlns:myNS="urn:BeanService"
                 languageSpecificType="java:org.cgiar.icis.GMSexample.GMSBean"/>
  </service>
</deployment>
Germplasm Search: Deploy
• Execute the Axis Admin client on the deployment descriptor.
  – Posts the web service.
  – Generates a WSDL document that describes it, including XML schema for mapping between XML and Java Beans
• WSDL file consists of:
  – "Definitions": XML-Schema for parameters of services
  – A set of descriptions of the deployed services
    • Names of the classes
    • Location
    • Input, Output
Consuming GMS Search: XML-Schema Method
• Axis -> WSDL2Java generates all needed client classes based on the published WSDL document.
  – To execute service:

MockGMSBeanEaterService service = new MockGMSBeanEaterServiceLocator();
java.net.URL u = new java.net.URL("http://iris-genome:8081/TestWeb/services/MockGMSBeanEater");
MockGMSBeanEater m = service.getMockGMSBeanEater(u);
QueryBean qb = new QueryBean();
qb.setQuery("IR64");
qb.setStartRow(0);
…
GMSBean[] gbs = m.getGermplasmList(qb);
Germplasm Search: Moby Input
<moby:IcisQuery namespace="" id="">
  <moby:String attributeName="query">IR 64</moby:String>
  <moby:Integer attributeName="startRow">0</moby:Integer>
  <moby:Integer attributeName="endRow">0</moby:Integer>
  <moby:Integer attributeName="GmsClass">0</moby:Integer>
</moby:IcisQuery>
Germplasm Search: Moby Output
<moby:GMS namespace="IRIS" id="707">
  <moby:String attributeName="gmsId">12</moby:String>
  <moby:Integer attributeName="NID">1210</moby:Integer>
  <moby:Integer attributeName="GID">70</moby:Integer>
  <moby:Integer attributeName="NTYPE">1</moby:Integer>
  .
  .
  .
</moby:GMS>
Germplasm Search: Moby Method
public String mobyGMS(String moby) {
    // Moby XML -> input bean
    QueryBean qb = MobyQueryBinder.Demarshall(moby);
    // Call the original ICIS method
    GMSBean[] gbs = gms.getGermplasmList(qb);
    // Output beans -> Moby XML
    String mobyOut = MobyGmsBinder.Marshall(gbs);
    return mobyOut;
}
• Can be accomplished with XML parsers and constructors.
• Castor (www.castor.org) conceals parsing and maps directly from XML to Bean-like Java classes.
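As the first bullet notes, the binding can also be done by hand with an XML parser. The sketch below shows what a hand-written binder for the query input might look like, using the JDK parser rather than Castor. Everything in it is an assumption for illustration: the class name, the reduced two-field QueryBean, and the choice to parse without namespace awareness so the undeclared moby: prefix from the slides is accepted as-is.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical hand-written binder; Castor would generate or hide this code.
class MobyQueryBindingSketch {

    // Reduced stand-in for the ICIS QueryBean (only two of its fields).
    static class QueryBean {
        private String query;
        private int startRow;

        public String getQuery() { return query; }
        public void setQuery(String s) { query = s; }
        public int getStartRow() { return startRow; }
        public void setStartRow(int i) { startRow = i; }
    }

    // Bean -> Moby-style XML, following the layout of the Moby input slide.
    static String marshall(QueryBean qb) {
        return "<moby:IcisQuery namespace=\"\" id=\"\">"
             + "<moby:String attributeName=\"query\">" + escape(qb.getQuery()) + "</moby:String>"
             + "<moby:Integer attributeName=\"startRow\">" + qb.getStartRow() + "</moby:Integer>"
             + "</moby:IcisQuery>";
    }

    // Moby-style XML -> bean. The factory is left namespace-unaware (the default),
    // so "moby:String" is treated as a plain element name.
    static QueryBean demarshall(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            QueryBean qb = new QueryBean();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                if (!(children.item(i) instanceof Element)) {
                    continue; // skip whitespace text nodes
                }
                Element e = (Element) children.item(i);
                String value = e.getTextContent().trim();
                switch (e.getAttribute("attributeName")) {
                    case "query":
                        qb.setQuery(value);
                        break;
                    case "startRow":
                        qb.setStartRow(Integer.parseInt(value));
                        break;
                    default:
                        break; // ignore fields this sketch does not model
                }
            }
            return qb;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Cannot bind Moby XML to QueryBean", ex);
        }
    }

    // Minimal escaping for element text.
    static String escape(String s) {
        if (s == null) {
            return "";
        }
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        QueryBean qb = new QueryBean();
        qb.setQuery("IR 64");
        qb.setStartRow(0);
        String xml = marshall(qb);
        System.out.println(xml);
        System.out.println(demarshall(xml).getQuery());
    }
}
```

Each new bean would need its own pair of methods like these, which is the repetitive work a data-binding framework is meant to remove.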
Consuming GMS Search: XML-Schema Method
• Service location and intent known
• Axis, .NET or equivalent present
• Access XML-Schema based method
• Axis -> WSDL2Java generates all needed client classes.
  – To execute service:
    Service = ..
    Gms.execute
Consuming GMS Search: Moby Method
• Service discovered using Moby Central
• Accessed via Moby Java API.
  – Find, execute, bind/parse
My Rotation Products
• Code templates that could be used to deploy ICIS web services using two popular paradigms.
• Code templates for consumption of XML-Schema based and Moby Ontology based web services.
• Technical design document describing the suggested approach in greater conceptual and technical detail.
• Many new friends and a truly unique experience.