XML and Web Services

NOTES

1 ANNA UNIVERSITY CHENNAI

UNIT I

XML INTRODUCTION

1.1 INTRODUCTION TO XML

1.1.1 Introduction to Markup Languages

A markup language describes the form of a document, i.e. it explains how the document should be processed or interpreted. HTML may flash in your memory immediately after reading the term markup language. That is no wonder, since HTML is the most “popular” markup language. You will find the differences between XML and HTML in later portions of this text.

You may be aware that a wide variety of markup languages exists. To name a few: HTML (HyperText Markup Language), SGML (Standard Generalized Markup Language), VRML (Virtual Reality Modeling Language) etc. Every markup language has its own specialized purpose. HTML, for example, is the de facto standard for designing web pages. VRML is used in the virtual reality domain.

The forerunner of all these markup languages is SGML. SGML is called the mother of all markup languages. All other markup languages are derivatives of SGML in one way or another. So is XML. XML is undoubtedly one of the most used markup languages today.

1.1.2 What is XML?

XML stands for eXtensible Markup Language. The name itself tells you that it is a markup language and that it is extensible in nature. XML is a World Wide Web Consortium (W3C) specification. XML 1.0 first became a W3C Recommendation in February 1998. The Second Edition of the XML 1.0 Recommendation, the edition these notes treat as the official recommendation, was released on 6th October 2000.

For any technology to become an official W3C standard, it has to go through stages such as Note, Working Draft, Candidate Recommendation and Recommendation. The complete explanation of each stage is outside the scope of this text. XML has gone through all these stages and is now an official W3C Recommendation. If you would like to know these stages in more detail, visit http://www.w3.org. The formal XML recommendation can be accessed at http://www.w3.org/TR/REC-xml.

DMC 1801


XML is a technique to organize information in a specific way. Extensibility is the key to the success of XML. Extensibility refers to the fact that you can create your own tags in XML. This has made the boundaries of XML wider, and you can find applications of XML in almost all domains. Using this extensibility, a large number of applications have been developed in various fields.
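As a small illustration of extensibility, the sketch below builds a document from tags that XML itself does not predefine. It uses Python's standard xml.etree.ElementTree module; the tag names (book, title, price) and the currency attribute are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, user-defined tags: XML itself does not predefine any of them.
book = ET.Element("book")

title = ET.SubElement(book, "title")
title.text = "XML and Web Services"

price = ET.SubElement(book, "price", currency="INR")
price.text = "250"

# Serialize the element tree to a text string.
xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)
```

Any application that agrees on these tag names can read the data back; nothing in XML restricts which names are chosen.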

1.1.3 Roles of XML

XML plays a key role in almost all technologies like J2EE, .NET etc. In this section you will learn about the various roles that XML plays in today’s Information Technology world. The biggest advantage of XML is that it is both powerful and simple. XML is plain text in nature. The list below gives the roles that XML plays:

Structured Data Representation
Separating Data from User Interface
Standard for Transfer of Data
Rule Based Data Representation
Providing Platform Independence
Customized Display of Data

1.1.3.1 Structured Data Representation

The primary role of XML is to represent data in a structured way. Normally, to store data in a structured way you would use databases. But introducing a database adds complexities: you may need a Database Management System (DBMS), and you become bound to a particular vendor’s product. Other ways of representation are CSV (comma separated values) and TSV (tab separated values).

Ram, 10, 25
Raj, 15, 30
John, 12, 40

Figure 1 : Example CSV file

From the representation in Figure 1 alone you may not get any further information. If the same example is represented using XML, it may look as shown in Figure 2. From that listing you can gather that the data is about a group of students and that it contains their name, age and weight. So XML represents data in a structured way, from which certain information can be extracted at first glance.
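The difference between the two representations can be sketched in Python. This is only an illustration using the standard csv and xml.etree.ElementTree modules; the variable names are assumptions, and the data is the student example from the figures. The CSV rows come back as bare positions, while the XML values come back labelled by their tags.

```python
import csv
import io
import xml.etree.ElementTree as ET

CSV_TEXT = "Ram, 10, 25\nRaj, 15, 30\nJohn, 12, 40\n"

XML_TEXT = """<group>
  <student><name> Ram </name><age> 10 </age><weight> 25 </weight></student>
  <student><name> Raj </name><age> 15 </age><weight> 30 </weight></student>
  <student><name> John </name><age> 12 </age><weight> 40 </weight></student>
</group>"""

# CSV: every row is just a list of positions; the meaning of each
# column must be known from somewhere outside the file.
csv_rows = [
    [field.strip() for field in row]
    for row in csv.reader(io.StringIO(CSV_TEXT))
]

# XML: every value carries its own label (the tag name).
students = [
    {child.tag: child.text.strip() for child in student}
    for student in ET.fromstring(XML_TEXT).findall("student")
]

print(csv_rows[0])   # ['Ram', '10', '25']
print(students[0])   # {'name': 'Ram', 'age': '10', 'weight': '25'}
```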


<group>
  <student>
    <name> Ram </name>
    <age> 10 </age>
    <weight> 25 </weight>
  </student>
  <student>
    <name> Raj </name>
    <age> 15 </age>
    <weight> 30 </weight>
  </student>
  <student>
    <name> John </name>
    <age> 12 </age>
    <weight> 40 </weight>
  </student>
</group>

Figure 2: XML data representation example

The idea here is not to suggest that XML will replace databases but to emphasize that XML has the capability to represent data in a meaningful way. Otherwise the domains of databases and XML do not overlap much.

Since XML is simple text, such a representation can be compact compared with proprietary binary representations that carry additional application-specific information.

1.1.3.2 Separating Data from User Interface

One of the major problems with HTML is that it couples data and presentation logic tightly. In HTML, the data to be formatted is given inside tags. Each tag has its own formatting functionality, and it formats the data given in it accordingly.

The major problem with this approach is the tight coupling of presentation with data. It becomes difficult to reformat the same data differently. In XML, by contrast, you can clearly separate the data from its presentation logic. As explained already, no XML tag has built-in functionality to do anything. You attach a separate section that specifies the rendering logic.

For example, you can associate XSLT (Extensible Stylesheet Language Transformations) with XML to render the contents of XML the way you want. You will find more about XSLT in later sections. If you have some idea about CSS (Cascading Style Sheets), you will feel at home with XSLT.

This idea of separating content from formatting gives us the following advantages.

Figure : Data Separation from Content in XML

If you want to render the same content on different devices, for example on a PC, a mobile phone etc., you can associate a different XSLT for each type of device with the same data. The above figure represents this idea of separating content from presentation in XML with the help of XSLT.

1.1.3.3 Standard for Transfer of Data

As specified already, a major role of XML is the transfer of data that is represented in a structured manner. At this point you may ask what is so special about transferring data with XML. After reading this section you will have the answer to this question.

XML is an open standard, i.e. it is not proprietary. This lets XML operate across application boundaries. For example, if you have two applications, one developed using Microsoft .NET and another with J2EE, then nothing blocks you from transferring data from the .NET application to the J2EE application and vice versa. This is possible because XML is an open standard.

[Figure content, Data Separation from Content in XML: one XML Data source linked to XSLT for PC, XSLT for Mobile and XSLT for print]


Figure: XML data transfer between two applications with different technologies

This scenario is depicted in the above figure. You can see that the arrow between the two applications is double sided, i.e. each application can both send data to and receive data from the other.

Another huge advantage of transferring data with XML is that it is purely text. The benefit of XML being text is that it is unlikely to be blocked by firewalls. If you transfer data from one network to another in a binary format, there is a fair chance that it may be blocked by a firewall.

XML is not blocked by firewalls because it is a text standard. Being text, XML has no executable code in it, so it cannot carry self-executing code like viruses. This makes XML a desirable standard for transfer of data from one network to another. This scenario is explained in the following figure.

Figure : Data transfer using proprietary format and XML

In the above figure you can observe that the data in proprietary binary format is blocked by the firewall, whereas the data in XML format is allowed to reach network B. The reason is that the firewall considers an XML file as simply a text file, and it allows text transfer. In the case of proprietary formats it checks for executable malicious code, and if any is found the transfer is blocked.

[Figure content, XML data transfer between two applications with different technologies: Microsoft .NET application, XML, J2EE application]

[Figure content, Data transfer using proprietary format and XML: Network A, Firewall, Network B; data in proprietary format; data in XML format allowed]

In the Internet dominated world of today, security plays a vital role. Binary representations of data may carry malicious code. This may lead to serious problems on the Internet because of the huge number of senders and receivers in it. The Internet cannot assure the genuineness of the parties transferring data. In these circumstances there is a strong need for a standard that is capable of fast and secure transfer of data. XML has both these capabilities, so it has become one of the most used standards for transfer of data on the Internet.

A question may arise in your mind at this point regarding how XML transfer is faster. The answer lies in the fact that it doesn’t contain any content other than the actual data and the tags around it. So the file can be small compared with proprietary binary formats that contain additional information needed to carry out certain operations. The power of XML lies in this simple nature.

1.1.3.4 Rule Based Data representation

Though XML data is simply text, you can associate certain rules with its representation. In real world scenarios you may need such rules. To make it clearer, you can specify what the format of the data should be.

For example, if you are representing employee data, you can make sure that it contains elements like employee id, name, department and salary. By making definitions with a DTD (Document Type Definition) you can enforce which elements the data contains. If the supplied data doesn’t adhere to the rules given by the document type definition, a validating parser raises errors.

This facility of specifying structure information with an XML file is an important feature of XML. Imagine that such a feature did not exist: anybody could enter any data, and it would become nearly impossible to achieve synchronization between applications. As the major purpose of XML is data transfer between different applications, it becomes absolutely necessary to have certain rules governing the data representation.

The key point to understand here is that these rules are not imposed as general conditions for all XML files. In fact these rules are formulated by the system designers for each file individually. The reason for not having generalized conditions is that the power of XML lies in extensibility. If hard rules regarding the data format were introduced, this extensibility would be threatened.

The application of a DTD makes an XML file follow certain rules which are specific to that file alone. This ensures integrity with respect to the format of the data. So the DTD is a cornerstone in the world of XML application development. Another point worth mentioning here is that a DTD is not mandatory. You can have an XML file which doesn’t have a DTD associated with it. So the control is in the hands of the designer, who can decide whether to attach a DTD to an XML file or not.
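The employee example can be sketched as follows. The document below carries a hypothetical internal DTD that declares the four required elements. Note the assumption in this sketch: Python's standard xml.etree.ElementTree parser reads the DTD but does not validate against it, so the check for required elements is a hand-written stand-in for what a validating parser would report. The names EMPLOYEE_XML, REQUIRED and missing_elements are illustrative only.

```python
import xml.etree.ElementTree as ET

# Hypothetical employee document with an internal DTD.
# The <salary> element demanded by the DTD is deliberately left out.
EMPLOYEE_XML = """<!DOCTYPE employee [
  <!ELEMENT employee (id, name, department, salary)>
  <!ELEMENT id (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT department (#PCDATA)>
  <!ELEMENT salary (#PCDATA)>
]>
<employee>
  <id>101</id>
  <name>Ram</name>
  <department>Accounts</department>
</employee>"""

# Mirror of the DTD content model, used by the hand-written check.
REQUIRED = ["id", "name", "department", "salary"]


def missing_elements(xml_text):
    """Return the required child elements that are absent.

    ElementTree only checks well-formedness, so this function stands in
    for the error a validating parser would raise.
    """
    root = ET.fromstring(xml_text)
    present = {child.tag for child in root}
    return [tag for tag in REQUIRED if tag not in present]


print(missing_elements(EMPLOYEE_XML))  # ['salary']
```

A real validating parser would perform this check from the DTD itself, without a separately maintained list.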

1.1.3.5 Providing Platform Independence

Another important role of XML is providing platform independence for data representation. In section 1.1.3.3 you learned that XML can be used to transfer data between two applications developed using different technologies like Microsoft .NET and J2EE.

In fact, XML achieves independence not only from application development technologies; it also supports cross platform data transfer.

For example, imagine two applications, one running on the Linux operating system and another on the Mac operating system. XML can be used to perform effective data transfer between applications running on entirely different operating systems. This is depicted in the following figure.

Figure: Platform Independence with XML

The advantage you achieve with this is the seamless integration of applications running on different operating systems. The Internet technologies are also cross platform, so XML becomes a friendly tool that co-exists with these web technologies.

You may now ask how this becomes possible. The answer lies in the fact that XML parsers are available on all the major operating systems. You will learn more about XML parsers in later sections.

[Figure content, Platform Independence with XML: Application running on Linux, XML, Application running on Mac OS]


1.1.3.6 Customized display of data

In section 1.1.3.2 you learned that XML clearly separates content from the user interface or presentation. This gives us the flexibility to render the same data in different formats.

For example, consider a scenario where you have data about a hundred students in XML format. You can display this data in tabular format or in any other custom format that you wish. How to achieve this will be answered in later portions of this text. For the moment it is enough to understand that customized display of the same data is possible with XML.

The technique that enables this customized display is XSLT. Using XSLT you can display the same data in different formats. In section 1.1.3.2 you learned that the same XML file can be formatted to look different on a personal computer, on a mobile device and in printer output.

The way to achieve this is by attaching different XSLT files to the same XML file. The moment you change the XSLT file, the display format changes as well. The reason is that XML tags by themselves don’t have any display logic in them. So it is the XSLT that changes the look and feel of the XML content.

Now you may ask what happens if you don’t attach any XSLT file to the XML. The answer is browser dependent. Most browsers display the XML file as a tree structure which you can fold and unfold. Mozilla Firefox, for example, explicitly states that no style information is associated with the file and that it is showing the document tree.

This customization feature has been one of the important reasons for the success of XML. If you have some knowledge of HTML and CSS, you will see that the relation between XML and XSLT is similar to the relation between HTML and CSS.
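The idea of one data source with several presentations can be sketched in Python. This is not XSLT: the two render functions below are plain Python stand-ins for two stylesheets, and all names (STUDENTS_XML, load_students, render_plain, render_html) are assumptions made for this illustration. The point is only that the XML data stays untouched while the presentation changes.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """<group>
  <student><name>Ram</name><age>10</age></student>
  <student><name>Raj</name><age>15</age></student>
</group>"""


def load_students(xml_text):
    """Extract (name, age) pairs; no display logic lives in the data."""
    root = ET.fromstring(xml_text)
    return [(s.findtext("name"), s.findtext("age"))
            for s in root.findall("student")]


def render_plain(students):
    """Stand-in for a 'print' stylesheet: aligned text columns."""
    return "\n".join(f"{name:<6}{age:>3}" for name, age in students)


def render_html(students):
    """Stand-in for a 'PC browser' stylesheet: an HTML table."""
    rows = "".join(f"<tr><td>{name}</td><td>{age}</td></tr>"
                   for name, age in students)
    return f"<table>{rows}</table>"


students = load_students(STUDENTS_XML)
print(render_plain(students))
print(render_html(students))
```

Swapping render_plain for render_html changes the output format without touching STUDENTS_XML, which is the role an XSLT file plays for an XML file.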

1.1.4 Web and XML

The World Wide Web has become the biggest collection of information. You can find information about anything, ranging from a child’s toy to nanotechnology, on the web. The size of the web is increasing exponentially. The latest web technologies enable anyone to post information on the web, which causes the phenomenon of web explosion. Reading the lines above, you will have got a feel for the massive size of the web.

The massive size of the web is both an advantage and a disadvantage. On the advantage side, all kinds of information are available on the web. On the disadvantage side, it is hard to retrieve the relevant information and to organize such a massive collection.



You will have observed that the above two paragraphs do not contain the word XML. You may ask where XML fits into this scene. The objective of this section is to answer this interesting question.

1.1.4.1 HTML Vs XML

HyperText Markup Language is the most used markup language on the World Wide Web. Let us explore the similarities and differences between HTML and XML in this section.

The primary role of HTML is the rendering of web content. All browsers understand HTML. HTML is a markup language which consists of a collection of tags whose behavior is predefined, i.e. each and every tag in HTML has a built-in meaning associated with it. The browser interprets it and renders the output. Of course there are subtle differences between browsers in displaying HTML content. The underlying fact is that all browsers can understand HTML and render the output.

If you compare HTML and XML with respect to display behavior, the latter doesn’t have tags with predefined meaning. The tags in XML are defined by the users, so they don’t have any display functionality associated with them. Their main purpose is to organize the data rather than to display it. This is the major difference between XML and HTML from the viewpoint of displaying tags in web browsers.

Another important difference between XML and HTML lies in how strictly rules are followed. The rules in HTML are not strict, i.e. there exist certain rules which can be followed or neglected. For example, closing tags are not always mandatory in HTML, whereas they are compulsory in XML.

<form> <b> Enter your name here <input type = text> </form>

<name>
  <first> Ram </first>
  <last> Kumar
</name>

Figure: Comparing HTML and XML with respect to closing tags

In the above figure, an HTML code snippet is given on one side and an XML snippet on the other. The HTML snippet has tags like <b> and <input> which are not closed. Though the closing tags are missing, browsers still accept the HTML snippet; no browser throws an error message for it. At the same time, if you look at the XML snippet you can observe that the closing tag for <last> is missing. Only one closing tag is missing, but XML will not accept this. Closing tags are compulsory in XML. There is a concept called well-formed XML, which requires many criteria to be satisfied.

Another important difference between HTML and XML is the nesting of tags. HTML doesn’t impose hard rules on the nesting of tags. XML is very strict about nesting: the tag that is opened last must be closed first. This rule cannot be violated in XML, but HTML is lenient regarding it.

<form> <b> Enter your name here <input type = text> </form> </b>

<name>
  <first> Ram </first>
  <last> Kumar
</name>
</last>

Figure: Comparing HTML and XML with respect to nesting tags

The figure compares HTML and XML with respect to the nesting of tags. In the HTML snippet the <b> tag is closed after the </form> tag, which is not proper nesting because the <form> tag was opened before <b>. So the <b> tag has to be closed before the <form> tag. This rule is not followed here, yet HTML doesn’t throw an error message. In the XML snippet, the <last> tag is not closed according to the nesting rules. XML will not accept this as a valid XML snippet, so parsing it fails.

The final difference between HTML and XML is case sensitivity. The former is not case-sensitive but the latter is. HTML tags are not differentiated with respect to case. But in XML two tags with the same letters but different case are not the same. They are considered two different tags. This is depicted in the following figure.

<form> <b> Enter your name here <input type = text> </FORM> </b>

<name>
  <first> Ram </first>
  <last> Kumar </LAST>
</name>

Figure : Comparing HTML and XML with respect to case-sensitivity

In the above figure, the HTML snippet has a tag whose opening is given in one case and whose closing is given in another. The closing tag for <form> is given as </FORM>, i.e. in upper case. Nothing goes wrong here as far as HTML is concerned. In the XML code snippet, however, the tags <last> and </LAST> are not considered a pair. The reason is the case sensitivity of XML tags.
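The three XML rules discussed above (closing tags, nesting and case sensitivity) can be checked with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree module, which raises ParseError for input that is not well-formed. The snippets are the ones from the figures, and the helper name is_well_formed is an assumption of this example.

```python
import xml.etree.ElementTree as ET

SNIPPETS = {
    "missing closing tag": "<name><first> Ram </first><last> Kumar </name>",
    "improper nesting": "<name><first> Ram </first><last> Kumar </name></last>",
    "case mismatch": "<name><first> Ram </first><last> Kumar </LAST></name>",
    "well-formed": "<name><first> Ram </first><last> Kumar </last></name>",
}


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


results = {label: is_well_formed(text) for label, text in SNIPPETS.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the last snippet is accepted; an HTML browser, by contrast, would try to display the equivalent HTML mistakes without complaint.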

So in general, you can conclude that HTML is a markup language where you will not find hard and fast rules. In XML, if there is a rule for something, it cannot be neglected for any reason. On the other hand, XML is flexible with respect to the creation of new tags, i.e. you can create your own tags out of the blue, whereas in HTML this is not possible. HTML has a predefined set of tags for which the rules are not strict. XML has an unlimited set of tags for which the rules are compulsory.

Such a lengthy comparison between XML and HTML is necessary because you must be very clear about the features of each. The reason for emphasizing this is that you should not study XML from an HTML viewpoint. The two tools are different and their functionalities also differ.

The main similarity between XML and HTML is that both are derivatives of SGML (Standard Generalized Markup Language). SGML is the mother of all markup languages.

1.1.4.2 Need for XML in Web

The need for XML on the web is multidimensional. In this section you will learn about the various purposes for which XML is used on the World Wide Web.

As stated earlier in this text, the web consists of a huge collection of users who intend to interchange data among themselves. Their platforms may be different, their technologies may be different and their hardware may be different, but with all these differences there is a need for a technology which works above all these barriers. The answer to this need is XML. XML is platform neutral, technology neutral and device neutral, which suits the World Wide Web scenario.

The WWW is a heterogeneous collection of clients with different technologies, structures and platforms. The glue that holds all these items together is XML. Moreover, XML is an open standard, so there are no hidden gimmicks in it. This encourages the clients of the WWW to go for XML.

XML can be integrated seamlessly with HTML. This interoperability makes XML an ideal choice in the web domain. The point to remember here is that XML is not a technology to replace HTML; the two can co-exist. In places where HTML fails because of inherent problems with its architecture, XML becomes a handy tool. As you learned in the previous section regarding the nature of rules in HTML and XML, if you are in a scenario where the rules need to be followed absolutely, you can go for XML.


On the World Wide Web, different parties work together to achieve certain goals. If there were no technology that works above all their differences, the power of the web would be in question. An example scenario is depicted in the following figure.

Figure: XML on Web

The above figure explains a typical scenario on the web. You can observe from the figure that there is a distributor who receives items from manufacturers. There is a retail store which deals with the distributor. All these parties may be on different platforms and technologies, but they all use XML as the medium of their communication. Without XML it would be very difficult to establish communication among the various parties.

Another advantage of using XML on the web is the ability to update portions of web pages instead of updating the whole page. Recent technologies like AJAX (Asynchronous JavaScript and XML) use XML as a communication medium. There exist communication techniques like XMLHttpRequest to communicate between the client and the web server. XMLHttpRequest enables developers to make web applications that behave like desktop applications. These applications are generally called Rich Internet Applications (RIA). So in RIA, XML becomes a core technology in addition to JavaScript. The seamless integration of Document Object Model (DOM) technology with XML is another important advantage for which you can use XML on the web.

XML also enriches the search capability of web content. XML provides certain context based information, so that a search can retrieve documents which are more relevant. You may have heard of the Semantic Web, where XML is a key technology. The Semantic Web, with the help of XML, adds an entirely new dimension to the World Wide Web. This XML enriched Semantic Web is becoming one of the promising trends in the world of the web. Search engines can retrieve contents which are more relevant in the Semantic Web than in the normal web.

[Figure content, XML on Web: Manufacturer 1, Manufacturer 2, Distributor and Retail Store connected through the Internet, each link labelled XML]

The capability of all browsers to handle XML is another big advantage. There are slight variations between them, but at the core level they all support XML. If you go through any XML book that was published four or five years ago, you will find a sentence saying “XML is the next big thing in the world wide web”. The time has come to strike out the word “next” in those sentences, because XML has already become the big thing on the web.

In terms of importance, you can compare the role of XML on the web with the role of ASCII (American Standard Code for Information Interchange) in the desktop paradigm. In the desktop paradigm ASCII played a vital role as the common representation standard across all applications. Now XML has taken that role on the web. With the added diversity of the web, XML is doing what ASCII did for the desktop paradigm. You should not conclude from this that ASCII and XML are related technologies. The comparison is made only from the perspective of the critical roles they play in those paradigms.

So we can conclude that the role of XML on the web is multi-faceted. On one hand it acts as the glue technology bridging applications running on the web. On the other hand it enriches the search capabilities for web content. In yet another dimension, the extensible nature of XML suits the World Wide Web well.

QUESTIONS

Part A

Objective type Questions

1. Which of the following is considered as predecessor for XML?

a. WML

b. SGML

c. RML

d. None of the above

2. Official XML recommendation was released during

a. Oct 2000

b. Sep 2001

c. Jun 1998



3. XSLT refers to

a. eXtended Secure Language Technique

b. eXclusive Style Language Tool

c. Extensible Stylesheet Language Transformations


4. DTD refers to

a. Document Tool Design

b. Data Tool Design

c. Document Type Definition


5. XML is

a. Not case sensitive

b. Less strict than HTML

c. Not related to Web

d. None of the above


Answers

1. b 2. a 3. c 4. c 5. d

Part B

Short Questions

6. List out the roles of XML.

7. How does XML provide platform independence?

8. List out the needs of XML in Web.

9. How does XML separate data from user interface?

Part C

Descriptive Type Questions

10. Compare and contrast XML and HTML.

1.2 XML BASICS

1.2.1 Introduction

In this chapter you will learn the basics of XML syntax. After going through this chapter you will be able to create your own XML files. In addition, you will get an introductory idea of the nomenclature used in XML. The chapter is arranged in a step by step manner so that you learn one thing at a time and, towards the end, have collective knowledge of these topics.



1.2.2 Setting Up The Environment

To start with, let us explore the essential things that are required to create and test your XML files. To get started with any computer language you require an editor with which you can create and save files on your system. As you have already learned, XML is a simple and powerful technology. To keep this simplicity intact, it doesn’t require any special editor. You can create XML files using any available editor. For example, on Windows you can use the default editor, i.e. Microsoft’s Notepad. If you are in a Linux environment, go for simple editors like GEdit. For that matter, you can even use the default “vi” editor in a UNIX environment.

Of course there are plenty of special purpose XML editors available from proprietary and open source teams. By simply searching on Google you can find plenty of XML editors on the Internet. In this text you will not find a recommendation for any specific XML editor. The reason is that each developer is comfortable with the editor he/she uses for other technologies like PHP, JavaScript etc., and can continue with the same editor. One advantage of special purpose editors is that they have XML specific features built into them. One such feature is syntax highlighting of the XML content. By syntax highlighting we mean that specific colors are given to tags, text, keywords etc. This text leaves the choice of editor to you.

Now you have an editor to work with XML. What next? You will certainly want to test your XML content. Where do you test the XML file that you have just created? The answer is web browsers. You can check the content that you have created with a web browser. Again, the choice of browser is yours.

Other than a text editor and a browser, what do you need to start with XML? The answer is nothing. With these two tools you can start working with XML. Have you realized that these two tools are already available on your computer? Yes, you are correct. You can start working on your first XML file right now. The figure given below shows a simple XML file and its output in a browser.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
  </player>
</team>

Figure : Simple XML file and its display in browser
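Besides opening the file in a browser, you can also test it with a parser. The sketch below is one possible way, using Python's standard xml.etree.ElementTree module; the content is the team file from the figure, held in a byte string instead of a file on disk so that the example is self-contained. The variable names are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# The team file from the figure, as the bytes an editor would save.
TEAM_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
  <player><name> Dhoni </name><age> 26 </age></player>
  <player><name> Rahul </name><age> 33 </age></player>
  <player><name> Sachin </name><age> 35 </age></player>
</team>"""

root = ET.fromstring(TEAM_XML)

# Collect (name, age) for every <player> child of <team>.
players = [
    (p.findtext("name").strip(), int(p.findtext("age")))
    for p in root.findall("player")
]

for name, age in players:
    print(f"{name}: {age}")
```

If the file contained a syntax mistake, ET.fromstring would raise a ParseError instead of returning the tree.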

1.2.3 Anatomy of an XML File

In the previous section you learned how to create a simple XML file. After reading this section you will be familiar with each and every component, i.e. the anatomy of an XML file. You will find a scan of the XML file starting from the first line right up to the end of the file.

1.2.3.1 The Declaration

The first line of the XML file looks like the following.

<?xml version="1.0" encoding="UTF-8"?>

This line is called the XML declaration. It describes important attributes of the file. The line begins with <?xml, which indicates that the file is an XML file. Immediately following this you have the “version” attribute, which has the value 1.0 in the above example. This attribute tells which version of XML you are using. The possible values for this attribute are 1.0 and 1.1. The version determines how the file is parsed by browsers or applications. If you want to be on the safer side, go for version 1.0 because it is supported by almost all popular browsers and applications. Another point to note is that the XML declaration as a whole is optional in XML 1.0, but when it is present the version attribute must be given. Another interesting thing worth mentioning is that the initial drafts of XML used <?XML ?>, which was later changed to the lowercase <?xml ?>. Now you have to use only the latter, i.e. the lowercase form.



The next attribute is “encoding”. It refers to the character set used to represent your file. The default in many Windows-based editors is ASCII, i.e. American Standard Code for Information Interchange. ASCII can represent only pure text content such as A to Z, a to z, 0 to 9 and some punctuation. ASCII is a 7-bit code, so it defines only 128 symbols (its 8-bit extensions allow 256). For example, the ASCII value of “A” is 65, of “B” is 66, and so on. The drawback of ASCII is its inability to express languages other than English, such as Chinese or Hindi. The World Wide Web is not only for English; it supports many other languages. In the previous chapter you read that XML breaks the barriers of technology, platform etc. So it cannot restrict itself to one human language, i.e. English. XML has to support all other languages.

The solution to the above problem is to move to a character set which supports many more characters than ASCII, preferably the characters of most human languages. One such character set is Unicode. Unicode was originally designed as a 2-byte code supporting 65,536 (2^16) characters in total; it has since been extended well beyond that limit. The closely related ISO standard is the Universal Character Set (UCS, also called the Universal Character System in this text), whose original design allowed almost 2 billion symbols. The UTF-8 that you have seen for the encoding attribute stands for “UCS Transformation Format 8”. The specialty of UTF-8 is that it uses a variable number of bytes per symbol. A symbol that can be represented in one byte, such as the letters a to z, takes only one byte. Other symbols take two, three or four bytes.

There is another format called UTF-16, in which the smallest unit is two bytes and less commonly used symbols take four bytes. The point to note here is that you can use the values UTF-8 and UTF-16 for the encoding attribute, and every XML processor is required to support both. If your text is mostly English, UTF-8 is the more space-conscious choice. The default value for the encoding attribute is UTF-8, i.e. even if you omit the encoding attribute, the value UTF-8 is assumed automatically.
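The difference between the two encodings can be checked with a short sketch. The sample characters below are our own choice for illustration and do not come from the text; the sketch only counts how many bytes each character occupies, and it shows that in current UTF-8 a character takes between one and four bytes.

```python
# Compare how many bytes a few characters occupy in UTF-8 and in UTF-16.
samples = ["a", "é", "अ", "中", "😀"]  # illustrative characters (an assumption)


def encoded_lengths(ch):
    """Return (utf8_bytes, utf16_bytes) for a single character."""
    # "utf-16-le" is used so that no byte-order mark is added to the count.
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


for ch in samples:
    utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

For plain English text UTF-8 is therefore half the size of UTF-16, while for scripts such as Devanagari or Chinese UTF-16 can be the more compact of the two.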

There is one more attribute which you can use here, called the “standalone” attribute. An example is given below. Note that the attributes must appear in the order version, encoding, standalone.

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

The standalone attribute indicates whether this XML file is complete by itself or needs support from other files, such as an external DTD. If it does not require support from other files, use the value “yes”. If it requires support from other files, use the value “no”. Both values must be written in lowercase.

In general, this XML declaration line comes under the category called the XML prolog. There exist various other parts of the XML prolog. You can find more information regarding the XML prolog in later sections.



1.2.3.2 The comments

Any code would be incomplete without comments, and XML is no exception. You can insert comments in an XML file. The purpose of XML comments is to explain the file in general, or a particular portion of it, for future reference.

In our example code you can find a comment in the second line itself.

<!-- This XML file lists 3 players -->

This comment is similar to the comments that you would have used in HTML. An XML comment begins with the symbols <!-- . Then you have the actual comment, and it has to end with -->.

Though you can insert comments as you wish, there are certain conditions to be followed while placing them. One such constraint is that a comment cannot break a tag. For example

<name <!-- This is one player name -->> Dhoni </name>

is an invalid comment. The tag <name> has been broken and the comment has been placed inside the tag. XML processors would not accept such comments.

According to the XML 1.0 specification, placing a comment before the XML declaration is invalid. So when a declaration is used it must be the first line of the XML file. Only after that can you use either XML tags or comments.
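The two placement rules above can be tried out with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name parses and the three sample documents are our own illustration, not part of the original example file.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Comment placed after the declaration: accepted.
good = '<?xml version="1.0"?><!-- one player --><team><name>Dhoni</name></team>'
# Comment placed before the declaration: rejected.
comment_first = '<!-- one player --><?xml version="1.0"?><team><name>Dhoni</name></team>'
# Comment placed inside a tag: rejected.
comment_in_tag = "<team><name <!-- one player name -->>Dhoni</name></team>"

for label, text in [("after declaration", good),
                    ("before declaration", comment_first),
                    ("inside a tag", comment_in_tag)]:
    print(label, "->", "accepted" if parses(text) else "rejected")
```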

1.2.3.3 XML Tags

XML tags are the basic building blocks of an XML file. XML tags are similar to HTML tags. Any tag in XML has to start with the symbol <. Then you have the tag name. There are certain rules for XML tag names, as explained below.

An XML tag name can contain letters, digits and certain special characters such as the underscore, hyphen and period. A tag name cannot start with a digit or a punctuation mark such as a hyphen or period; it normally starts with a letter or an underscore. XML tag names cannot contain a space. Another important point is that names beginning with the letters xml (in any combination of upper and lower case) are reserved by the XML specification, so your own tag names should not begin with them.

The following table lists various XML tags and indicates whether they are valid or invalid. It also gives the reasons for which these tags are considered valid or invalid.

Apart from following the strict rules, there are certain best practices which make your XML listing more professional. For example, avoid using ‘.’ in XML tag names, because ‘.’ is reserved for other purposes in many programming languages. Similarly, avoid using ‘:’ in your XML tag names, because the colon is used for namespace prefixes and tends to create misinterpretation among readers. Following these conventions in your tag names will definitely increase the readability of your XML file.

Table XML valid and Invalid Tags

Immediately following the tag name there is the > symbol. As in HTML, any start tag has an end tag. The syntax of the end tag is similar to that of the start tag except for the inclusion of the / symbol. For example, the valid closing tag for <player_name> is </player_name>.
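The naming rules can be checked by handing candidate names to a parser. The sketch below is only an illustration: the helper tag_is_accepted and the candidate names are our own. Names beginning with xml are reserved by the specification rather than being a syntax error, and a parser may still accept them, so that case is not checked here.

```python
import xml.etree.ElementTree as ET


def tag_is_accepted(tag):
    """Try to parse <tag>text</tag> and report whether the parser accepts it."""
    try:
        ET.fromstring(f"<{tag}>text</{tag}>")
        return True
    except ET.ParseError:
        return False


# Candidate names: valid underscore, leading digit, embedded space, valid hyphen.
for tag in ["player_name", "1name", "xml name", "player-name"]:
    print(f"<{tag}>", "accepted" if tag_is_accepted(tag) else "rejected")
```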

1.2.3.4 XML Elements

Having understood XML tags, the next step is to understand XML elements. XML tags are actually part of an XML element. To make things clearer, the components of an XML element are as given below:

XML start tag + Actual Text + XML end Tag

For example

<name> Dhoni </name>

Where

<name> = XML start Tag

Dhoni = Text

</name> = XML end Tag

The purpose of XML elements is to provide additional information about the text. In other words, the tags carry the meta information regarding the text.

For certain elements there may not be any text at all. Such elements are called empty elements. For example

<lastname />

Tag              Description
<xmlname>        Invalid because it starts with xml (reserved)
<xml name>       Invalid because it contains a space
<1name>          Invalid because it starts with a number
<player_name>    Valid because _ is allowed in names



Empty elements do not have a separate closing tag. Instead, you close the tag then and there by putting a ‘/’ before the ‘>’, optionally preceded by a space. This is shown in the <lastname /> example.

You are already aware of the rules about XML tags. Now you may ask whether there are any rules for the text that you place between tags. In a broader perspective the answer is ‘No’. Note the term ‘broader’. You may place almost anything you want as XML text.

For example consider the following XML element.

<name> 1001010010101001010010100101010010100101001010100101 </name>

In the above example you can find the digits 1 and 0 between the <name> tags. There is no restriction that only names may be placed here. XML gives you this freedom. It is in your hands to supply the appropriate data as XML text.

Another point to note is that there is no restriction on the length of XML text. The text can be of any length you wish. XML does not specify any theoretical limit on the length of the text.

If you insert white space within the text, the white space is preserved by XML. It is the responsibility of the target application to keep or discard it. For example, Microsoft Internet Explorer collapses the extra white space and displays the output without it. The following example illustrates this.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
    <player>
        <name> Dhoni </name>
        <age> 26 </age>
    </player>
    <player>
        <name> Rahul </name>
        <age> 33 </age>
    </player>
    <player>
        <name> Sachin        Tendulkar</name>
        <age> 35 </age>
    </player>
</team>




Figure : XML white space stripping example

In the sample file above, look at the name “Sachin Tendulkar”, which has many spaces between the two words. If you look at the output, it does not contain the extra spaces. These spaces are automatically stripped out by the browser when it displays the output.
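That the parser itself keeps the white space, and that it is the consuming application which decides to drop it, can be seen in a small sketch. The use of Python's xml.etree.ElementTree here is our own choice for illustration; the collapsing step merely imitates what a browser does when rendering.

```python
import xml.etree.ElementTree as ET

# Build a name with six spaces between the two words.
spaced_name = "Sachin" + " " * 6 + "Tendulkar"
element = ET.fromstring(f"<name> {spaced_name}</name>")

# The parser hands the text over exactly as written, spaces included.
raw_text = element.text

# An application may then choose to collapse runs of white space.
collapsed = " ".join(raw_text.split())

print(repr(raw_text))
print(repr(collapsed))
```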

Though XML is flexible enough to accept almost anything as text, there is one rule: if you place characters that have a special meaning in XML, such as ‘<’, directly in the text, it will cause errors.

Look at the following example.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
    <player>
        <name> Dhoni </name>
        <age> 26 </age>
        <average> av < 48 </average>
    </player>
    <player>
        <name> Rahul </name>
        <age> 33 </age>
    </player>
    <player>
        <name> Sachin Tendulkar</name>
        <age> 35 </age>
    </player>
</team>



Figure : XML text containing < symbol.

Internet Explorer version 6.0 has shown an error because of the symbol < in the <average> element.

In XML text it is always better to use the entity equivalents for special symbols. The table shows each symbol and its corresponding entity equivalent.

In the following example the ‘<’ symbol is replaced with its entity equivalent, i.e. &lt;. By doing this you can rectify the above error.

Table : Symbols and their Entity Equivalents

Symbol    Entity Equivalent
>         &gt;
<         &lt;
"         &quot;
&         &amp;
'         &apos;
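The replacement of special symbols by their entity equivalents need not be done by hand. As a minimal sketch, Python's standard library offers xml.sax.saxutils.escape for this, and the parser converts the entity back when the file is read. The variable names below are our own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_value = "av < 48"

# escape() replaces &, < and > with their entity equivalents.
escaped = escape(raw_value)
element = ET.fromstring(f"<average>{escaped}</average>")

# The same text without escaping is rejected by the parser.
try:
    ET.fromstring(f"<average>{raw_value}</average>")
    unescaped_is_rejected = False
except ET.ParseError:
    unescaped_is_rejected = True

print(escaped)                 # text as stored in the XML file
print(element.text)            # text as seen by the application
print(unescaped_is_rejected)
```

The application reading the file sees the original symbol again, so the entity is purely a matter of how the text is written in the file.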




Figure : Text Error Rectified

This method of replacing symbols with entity equivalents is good for this kind of scenario. Now imagine a situation where you have a portion of a C program to be displayed. It would be tedious to replace all the special symbols with their equivalent entities, and it would also be a problem to edit the contents later on. The effective solution for this kind of scenario is to use a CDATA section. The CDATA example in this section shows an efficient usage of it.

The players listing with the ‘<’ symbol replaced by its entity equivalent is as given below:

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
    <player>
        <name> Dhoni </name>
        <age> 26 </age>
        <average> av &lt; 48 </average>
    </player>
    <player>
        <name> Rahul </name>
        <age> 33 </age>
    </player>
    <player>
        <name> Sachin Tendulkar</name>
        <age> 35 </age>
    </player>
</team>



Figure: XML – CDATA example

The XML code of the CDATA example contains a portion of C code as text. Imagine having to replace all the special symbols with entity equivalents. To avoid this you can use a CDATA section. Look at the CDATA section syntax: the section begins with the sequence ‘<![CDATA[’ and ends with the sequence ‘]]>’.
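How a parser treats a CDATA section can be sketched as follows. The small document is our own shortened variant of the C code example, and Python's xml.etree.ElementTree is used only for illustration. Everything between the opening and closing sequences is delivered to the application as plain text, including the < symbol.

```python
import xml.etree.ElementTree as ET

# The C fragment contains '<', which would otherwise need the &lt; entity.
document = """<program><![CDATA[
for (index = 1; index < 100; index++) {
    printf("%d\\n", index);
}
]]></program>"""

program = ET.fromstring(document)

# The CDATA markers themselves are not part of the text.
print(program.text)
```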

1.2.4 Well-formed XML

You have already learned in the previous sections that the rules in XML are strict in nature, i.e. they are mandatory. So any XML file has to follow these rules. An XML file which is formed by following these rules is called well-formed. This section summarizes all the rules for an XML file to be called well-formed.

Listing for the CDATA example of section 1.2.3.4:

<?xml version="1.0" encoding="UTF-8"?>
<program>
<![CDATA[
<codesection>
for(index = 1; index < 100; index++)
{
    printf("%d\n", index);
}
</codesection>
]]>
</program>




1.2.4.1 XML Declaration

The first line of an XML file should be the XML declaration. You have already learned the syntax of the XML declaration. The point to note here is that the encoding and standalone attributes are optional. Strictly speaking, the XML 1.0 specification only recommends the declaration, while XML 1.1 requires it. It is good practice to always include it, and this text treats the declaration as a requirement for a well-formed XML file.

1.2.4.2 Root Element

The presence of exactly one root element is mandatory for an XML file to be called well-formed. In the previous example in this chapter the root element is <team>. All the other elements of the file are placed inside this root element. The root element is opened first and closed last.

1.2.4.3 Proper Nesting of Tags

You already know that tags can be nested in XML. This nesting should be proper: the tag that is opened last should be the tag that is closed first. Again you can refer to the previous example in this chapter. There the tag <team> is opened first and it is the same <team> tag which is closed last. Not only this tag but every tag in XML has to follow this nesting rule.
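The root element and nesting rules can be verified with a parser. The sketch below uses Python's xml.etree.ElementTree; the function name is_well_formed and the three sample strings are our own illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses without a well-formedness error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# One root element, tags closed in the reverse order of opening.
nested_ok = "<team><player><name>Dhoni</name></player></team>"
# <player> is closed before <name>, which was opened after it.
crossed = "<team><player><name>Dhoni</player></name></team>"
# Two top-level elements, i.e. no single root element.
two_roots = "<player><name>Dhoni</name></player><player><name>Rahul</name></player>"

for label, text in [("proper nesting", nested_ok),
                    ("crossed tags", crossed),
                    ("two roots", two_roots)]:
    print(label, "->", is_well_formed(text))
```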

1.2.4.4 Quotation for Attribute values

If you use any attribute in your XML file, then the value of that attribute should be given in quotes. Look at the following example.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
    <player>
        <name role="captain"> Dhoni </name>
        <age> 26 </age>
    </player>
    <player>
        <name role="opener"> Sachin </name>
        <age> 35 </age>
    </player>
</team>

The name tag now has an attribute called “role”. The value of this role attribute should be given in quotes for this XML to be well-formed. You may recall that in HTML this rule of quotes is not followed strictly, whereas in XML it is.
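A short sketch of the quoting rule, again using Python's xml.etree.ElementTree for illustration. The two sample strings are our own; the first is accepted and its attribute can be read back, the second is rejected because the value is not quoted.

```python
import xml.etree.ElementTree as ET

quoted = '<name role="captain"> Dhoni </name>'
unquoted = "<name role=captain> Dhoni </name>"

# A quoted attribute value parses and can be read with get().
role = ET.fromstring(quoted).get("role")

# An unquoted value makes the document not well-formed.
try:
    ET.fromstring(unquoted)
    unquoted_accepted = True
except ET.ParseError:
    unquoted_accepted = False

print(role)
print(unquoted_accepted)
```

Single quotes, as in role='captain', are accepted as well; what matters is that the value is quoted.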



1.2.4.5 Pairing of tags

Any tag in XML must have a start tag and an end tag. The only exception to this rule is the empty element. Other than empty elements, all the tags in XML are paired. If you compare with HTML, it has many tags which do not have a closing tag or for which the closing tag is optional. In XML the pairing of tags is mandatory. So any tag other than an empty tag should have a closing tag. You can confirm this from the previous example.

1.2.4.6 Slash for empty elements

In the above paragraph you learned the necessity of the closing tag in XML. There you came across the term empty element. An element is called an empty element if it does not possess any text between its start tag and end tag. Look at the following example.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
    <player>
        <name> Dhoni </name>
        <age> 26 </age>
        <wickets />
    </player>
    <player>
        <name> Rahul </name>
        <age> 33 </age>
        <wickets />
    </player>
    <player>
        <name> Sachin </name>
        <age> 35 </age>
        <wickets> 153 </wickets>
    </player>
</team>

Figure : XML with properly managed empty tags

In this example the first two <wickets> tags do not have a value associated with them, so they have no closing tag. But note the presence of “/” before the symbol “>”. You would have noticed a space between the word “wickets” and “/”. This space is recommended in XHTML (more on XHTML later) to cope with older web browsers. It is not mandatory in XML, because XML processors can recognize empty elements even without the space.
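That the space before the slash makes no difference to an XML processor can be checked with a brief sketch. Python's xml.etree.ElementTree is used for illustration, and the helper wickets_text is our own. An empty element written as a start tag immediately followed by an end tag is treated the same way.

```python
import xml.etree.ElementTree as ET


def wickets_text(player_xml):
    """Parse a <player> fragment and return the text of its <wickets> child."""
    return ET.fromstring(player_xml).find("wickets").text


with_space = wickets_text("<player><wickets /></player>")
without_space = wickets_text("<player><wickets/></player>")
paired = wickets_text("<player><wickets></wickets></player>")
filled = wickets_text("<player><wickets> 153 </wickets></player>")

# All three empty forms carry no text; the filled one keeps its value.
print(with_space, without_space, paired, repr(filled))
```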




1.2.5 XML is not everything

At this point, after reading all the previous text, you may form the opinion that XML is omnipotent, i.e. that XML can do anything. But that is not true. XML does have boundaries. Before proceeding further it is necessary to step out of the hype that has developed around XML.

1.2.5.1 XML Vs Programming Languages

Many of you may have formed the opinion that XML is a programming language. But it is not. It may sound strange, but that is the fact. While introducing XML we clearly mentioned that it is a “markup” language. The primary difference between markup and programming languages is that markup describes data, while a programming language issues logical commands. Moreover, a programming language has conditional and looping statements. XML does not possess any of these conditional or looping statements. So you can conclude that XML is not a programming language.

1.2.5.2 XML Vs DBMS

XML is not a Database Management System. XML does possess certain basic features of a DBMS, like storage and retrieval of data matching certain conditions. With these simple features alone we cannot say that XML is a complete database management system, because a DBMS possesses many advanced features like data clustering etc. XML is not designed to replace a DBMS. Its primary purpose is to represent data that can be transferred across a network. So you should not plan to use XML to store a large database. At the same time, you can use XML in combination with a DBMS to structure the data and to transfer the data to another application which may be using a totally different database. This is depicted in the following figure. Here you have two applications, one using MS Access as its database and another using Oracle DB. XML can still be used to transfer data from one application to the other. You can read data from MS Access, convert it to XML and send the data to Oracle DB through Application 2.

Figure : XML in Combination with DBMS (Application 1 with MS Access DB and Application 2 with Oracle DB, exchanging data as XML)


Here you are not replacing DBMS tools like MS Access and Oracle with XML; rather, you are using XML in combination with them to achieve interoperability between various applications. So the point to remember here is that XML is not a DBMS.
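The idea of using XML as the neutral format between two databases can be sketched in a few lines. The sketch is only an illustration under stated assumptions: two in-memory SQLite databases stand in for MS Access and Oracle, and the table name players and the functions export_players and import_players are hypothetical names of our own.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_players(conn):
    """Serialise the rows of the players table as an XML string."""
    team = ET.Element("team")
    for name, age in conn.execute("SELECT name, age FROM players ORDER BY name"):
        player = ET.SubElement(team, "player")
        ET.SubElement(player, "name").text = name
        ET.SubElement(player, "age").text = str(age)
    return ET.tostring(team, encoding="unicode")


def import_players(conn, xml_text):
    """Load the <player> elements of xml_text into the players table."""
    conn.execute("CREATE TABLE IF NOT EXISTS players (name TEXT, age INTEGER)")
    for player in ET.fromstring(xml_text).findall("player"):
        conn.execute("INSERT INTO players VALUES (?, ?)",
                     (player.findtext("name"), int(player.findtext("age"))))


# Application 1: its database stands in for MS Access (an assumption).
source_db = sqlite3.connect(":memory:")
source_db.execute("CREATE TABLE players (name TEXT, age INTEGER)")
source_db.executemany("INSERT INTO players VALUES (?, ?)",
                      [("Dhoni", 26), ("Rahul", 33), ("Sachin", 35)])

# Application 2: its database stands in for Oracle (an assumption).
target_db = sqlite3.connect(":memory:")

document = export_players(source_db)   # the XML that travels between the applications
import_players(target_db, document)
rows = target_db.execute("SELECT name, age FROM players ORDER BY name").fetchall()
print(document)
print(rows)
```

Neither application needs to know which database the other one uses; both only need to agree on the structure of the XML document.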

1.2.6 Revolutions of XML

From the above discussions you would have understood the basics of XML. This section presents three important revolutions of XML, i.e. the changes effected by XML in various dimensions. The following list gives these dimensions.

1. Data Revolution
2. Architecture Revolution
3. Software Revolution

1.2.6.1 Data Revolution

Prior to XML, data was considered to be application specific. The data associated with an application was in the proprietary format of the application itself. The primary problem with this approach is that data becomes locked within a particular application. If at all you want data to be communicated, it has to be sent as parameters to functions, which are again application specific. This is depicted in the following figure.

Figure : Prior to XML, After XML (top: Application 1 and Application 2 exchange data as parameters to functions; bottom: Application 1 and Application 2 exchange data as a document)

The top portion of the figure indicates how data was communicated prior to XML. The lower half indicates how XML changed data from parameters to documents. These documents are sent across the web. Each application, both sending and receiving, has the capability to understand and parse these documents. After parsing the documents, the data is extracted. The primary advantage achieved through this is application-neutral data. This increases the ease of data transfer between applications.

1.2.6.2 Architecture Revolution

In addition to the data revolution, XML has brought a drastic paradigm shift in the manner in which applications are architected. Prior to XML, applications were tightly coupled, i.e. any change in one application might require one or more changes in other applications. This is illustrated in the following figure.

XML has made loosely coupled applications possible. The advantage here is that you become free of vendor binding and technology binding. You can choose technologies from various vendors suited to specific components and yet achieve interoperability among these applications using XML.

Figure (a): Tightly Coupled Applications

Figure (b): Loosely Coupled Applications (Applications 1, 2 and 3 connected through the Web)

1.2.6.3 Software Revolution

XML has made a huge impact on how applications are developed. Prior to XML, software development was carried out strictly in accordance with well-described requirement specifications. The problem with this approach is that if you would like to make modifications based on real-time requirements, they are very difficult to carry out. With the introduction of XML, software development has become a collaborative process.

Now the designer has to assemble various components based on the present requirements. Whenever there is a change in requirements, the corresponding components can be introduced into the assembly.



This approach makes the software flexible in nature. Another advantage is that you can select existing components which are well tested and yet extensible for your application.

The data, architecture and software revolutions described above have created new ways of developing applications. This three-dimensional impact of XML on software development has given XML a strong position in the information technology industry.

Questions

Part A

Objective Type Questions

1. XML can be edited in
   a. vi editor
   b. Notepad
   c. GEdit
   d. All of the above

2. UCS refers to
   a. Universal Character Set (Universal Character System)
   b. Unicode Compatible System
   c. Useful Characters Sample
   d. None of the above

3. The entity used for “&”
   a. &amp;
   b. &amper;
   c. &amps;
   d. None of the above

4. Which of the following is a requirement for well-formed XML
   a. XML declaration
   b. XML definition
   c. XML comments
   d. None of the above

5. After the introduction of XML, data is sent as
   a. Document
   b. binary format
   c. image





Answers

1. d 2. a 3. a 4. a 5. a

Part B

Short Questions

6. Explain the anatomy of an XML file.
7. Explain the components of an XML element.
8. Explain entities in XML.
9. What are the rules for well-formed XML?

Part C

Descriptive type questions

10. Explain different revolutions of XML.

1.3 WEB SERVICES, SOAP AND SOA

1.3.1 Introduction

This chapter introduces you to technologies like Web Services, SOAP and SOA. In later parts of this text you will find elaborate information on Web Services and SOAP. The objective of this chapter is to provide a basic outline of these technologies. This introduction will be of great help in understanding the concepts given in the later chapters, so a careful reading of this chapter is required to follow the later parts of this text.

1.3.2 Web Services

The term “Web Services” has become a buzzword in the industry. Everyone in the industry is trying to implement something with Web services. What actually are these web services? Answering this question is the objective of this section. After reading this section you will have a basic understanding of Web Services.

1.3.2.1 What Web Services are?

You can easily find many definitions for the term “web services”. Here we give a definition which can be understood even by a person who has no prior idea of web services.

“Web services are program components that reside in some portion of the internet, which can be accessed from a remote place by standard internet technologies like HTTP etc.”

Web services are code sequences that solve a particular problem but do not reside on the same machine where you are executing the program that needs the solution. Basically, web services are business logic residing on a remote machine, which you can easily access using standard protocols like Hyper Text Transfer Protocol (HTTP), Transmission Control Protocol (TCP) etc.

Now you may ask how web services differ from technologies like RMI (Remote Method Invocation), CORBA etc. There is a very basic difference between these technologies and web services. Technologies like RMI, CORBA etc. are either vendor specific or platform specific, whereas web services are independent of platform, vendor etc. To use RMI, CORBA or DCOM (Distributed Component Object Model), the sender and the receiver must have something in common, such as platform, vendor or technology.

But the web that you use is not homogeneous, i.e. there exist many technologies from many vendors. You would still like to establish communication between them. Web services help you achieve this interoperability. They can do so because web services are based on XML. You have already learned that XML is not bound to any specific operating system, technology or vendor. This neutrality bubbles up to web services from XML.

1.3.2.2 Characteristics of Web Services

Apart from the neutrality factor mentioned above, Web Services have various other characteristics. This section explains them.

1.3.2.2.1 Loosely Coupled

The web is client-server based. Normally the client and server in web technologies are tightly coupled, i.e. any modification in the server side interface requires one or more modifications at the client side also. In the case of Web services this is not true. Here the requestor or consumer of the service and the provider of the service are loosely coupled. Because of this loosely coupled nature the application becomes easily maintainable, i.e. you can modify the server interface and the client would still be able to access the services, provided that a certain level of integrity is maintained. More on this is explained in later portions of this text.

1.3.2.2.2 Synchronous or Asynchronous

By synchronous we mean that once the client calls a service, it waits until the server execution completes. In the asynchronous case the client does not wait for the server execution to complete. Web Services can be invoked either synchronously or asynchronously. The choice is left to the user, who can decide based on the situation whether to call a service in synchronous or asynchronous mode.
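The two invocation modes can be sketched without any real web service. In the sketch below the function get_player_age is a hypothetical stand-in for a remote call, with a short sleep simulating network and server time; the thread pool is just one of several ways to make a call asynchronous.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_player_age(name):
    """Hypothetical stand-in for a remote web service call."""
    time.sleep(0.1)  # simulate network and server processing time
    return {"Dhoni": 26, "Rahul": 33, "Sachin": 35}[name]


# Synchronous: the caller blocks here until the result is available.
sync_result = get_player_age("Dhoni")

# Asynchronous: the call is handed off and the caller continues immediately.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(get_player_age, "Dhoni")
    other_work = sum(range(1000))    # the client is free to do other work
    async_result = future.result()   # collect the answer when it is needed

print(sync_result, async_result, other_work)
```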

1.3.2.2.3 Multiple Development Technologies

Web services can be built using a variety of technologies. To name a few, you can develop web services in J2EE, .NET, PHP, Perl etc. So to implement a web service, the developer can choose his/her own technology. This makes the development of web services an easier task.

1.3.2.2.4 Discoverability

Web services can reside anywhere on the internet, so it is essential that they be discoverable. Web Services achieve this discoverability through UDDI (Universal Description, Discovery and Integration). With this, the location of a web service becomes insignificant, because services are discoverable from anywhere on the internet.

1.3.2.3 Model of Web Services

Web services can be modeled with three basic components: the service provider, the service broker and the service requestor. The relationships among these three components are shown in the following figure.

Figure : Web service Model with relationships (the Service Provider registers the service with the Service Broker, the Service Requestor discovers the service through the Service Broker, and the Service Requestor invokes the service on the Service Provider)

The role of the service provider is to develop and deploy the web services. The service provider also defines the services. The role of the service broker is the registration and discovery of services. The primary role of the service requestor is to invoke the web services. The roles are given here very briefly. More on web services is discussed in later portions of this text.


1.3.2.4 Technologies associated with Web Services

As you have already understood, web services are emerging very fast. So there exist many technologies associated with web services. These technologies are also growing rapidly, and many more are being introduced regularly. This section briefly describes a few technologies associated with Web Services:

SOAP
WSDL
UDDI

SOAP (Simple Object Access Protocol) packages the XML for transfer between various clients. The actual XML contents are wrapped in a SOAP structure. Because this structure is general, any SOAP client can easily access the contents.

WSDL is the Web Services Description Language (also expanded as Web Services Definition Language). As the name indicates, it defines the invocation methodology, parameters etc. of a web service. WSDL makes the interaction between the client and the web service smoother.

UDDI is Universal Description, Discovery and Integration. As the name indicates, it facilitates the discoverable nature of web services. It provides a web services repository through which services can be easily discovered.

Other than the above, there exist various other technologies like WSCI (Web Services Choreography Interface), WSFL (Web Services Flow Language), DSML (Directory Services Markup Language) etc.

1.3.3 SOAP

SOAP (Simple Object Access Protocol) is an XML based protocol. During the discussion on Web Services you learned that SOAP acts as a packaging layer. SOAP provides a set of rules for moving data.

Before the development of SOAP, there were similar technologies like Microsoft’s Distributed Component Object Model (DCOM), Java Remote Method Invocation (RMI) etc. The difference between these technologies and SOAP is that SOAP is independent of development technology and platform. Other than these technologies, there are certain XML based protocols similar to SOAP. A few such protocols are listed below:

XMI (XML Metadata Interchange)
XML RPC (XML Remote Procedure Calls)
WDDX (Web Distributed Data Exchange)
JABBER




The complete description of the above protocols is outside the scope of this text. But one thing is sure: SOAP includes the advantages of many of these technologies.

As the name indicates, SOAP is “Simple”. Saying simple does not mean that it lacks other features like security and reliability. To be precise, SOAP is both simple and powerful.

SOAP operates on top of standard internet protocols like HTTP, FTP, SMTP etc. SOAP derives its interoperability from XML. You can surely say that SOAP is one of the most powerful technologies in the XML family. The following diagram indicates the position of SOAP.

Figure : SOAP role in Communication (layers shown: Web Based Network; Internet Protocols like HTTP, FTP, SMTP etc; SOAP; XML)

From the above figure you can understand that SOAP is used in combination with standard internet protocols like HTTP, FTP etc. The advantage you get from this is that SOAP messages can pass through firewalls, because firewalls are normally configured to allow these communications. So with the help of SOAP you can reach services across various networks.


1.3.3.1 SOAP parts

SOAP usually consists of four components. These parts are as given below:

1. The first one is the envelope that describes the message. It also indicates how the message should be processed.

2. The second part is the encoding rules. These encoding rules are for the data types defined by the application.

3. The third component of SOAP is for RPC, i.e. Remote Procedure Calls. It also handles the responses.

4. The fourth component of SOAP is a convention for binding. This is for exchanging messages through standard internet protocols like HTTP, FTP, SMTP etc.
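To give a feel for the envelope and the RPC part, the following sketch builds a small SOAP request with Python's xml.etree.ElementTree. The envelope namespace used is that of SOAP 1.1. The application namespace, the operation name GetPlayerAge and the function build_request are hypothetical and chosen only for illustration; encoding rules and protocol binding are not shown.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "http://www.example.com/players"               # hypothetical application namespace


def build_request(player_name):
    """Wrap a hypothetical GetPlayerAge call in a SOAP envelope."""
    ET.register_namespace("soap", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_ENV}}}Header")      # optional part, left empty
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The RPC-style payload: an element named after the operation,
    # with one child element per parameter.
    request = ET.SubElement(body, f"{{{APP_NS}}}GetPlayerAge")
    ET.SubElement(request, f"{{{APP_NS}}}name").text = player_name
    return ET.tostring(envelope, encoding="unicode")


message = build_request("Dhoni")
print(message)
```

In practice such a message would be sent as the body of an HTTP request or over another bound protocol; that step is omitted here.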

1.3.3.2 SOAP merits and demerits

SOAP provides many advantages. A few of them are listed below:

1. SOAP is powerful enough to pass through firewalls
2. SOAP supports most of the standard internet protocols like HTTP, FTP etc
3. Since SOAP is based on XML it is platform neutral and language neutral
4. SOAP is extensible in nature

No technology is ideal, and neither is SOAP. It does possess a few disadvantages. Since SOAP uses the verbose XML format, it is somewhat slower than similar technologies like CORBA. This is mitigated by applying certain techniques to SOAP. More on SOAP is explained in later parts of this text.

1.3.4 Service Oriented Architecture

Service Oriented Architecture (SOA) has been a popular term recently. This section gives you an overview of Service Oriented Architecture. Sometimes SOAP is confused with Service Oriented Architecture. The reason for this is the similarity between the acronyms SOA and SOAP.

SOA fully exploits the ultimate goal of software engineers, i.e. “reusability”. In SOA, a group of services communicate with each other to achieve an objective. Two important goals of SOA are rapid application development and a low-cost development process.

At the same time, SOA poses a few challenges as well. Some of them are response time, synchronization etc. The services reside at various places and may operate at different paces. So the architect of the application has to achieve synchronization.

There is an important term associated with SOA, called “Orchestration”. Orchestration is the process of linking and sequencing various existing services to achieve the business objectives. So the orchestration of services plays an important role in the Service Oriented Architecture development process.

The services in a Service Oriented Architecture are loosely coupled in nature. Because of this loosely coupled nature, parts of an application can be modified without worrying much about integration issues. This becomes a very big advantage in the development of large enterprise applications.

Interoperability is another key advantage of Service Oriented Architecture. As the popularity of SOA increases, many services are now freely available. For repeatedly performed tasks these services can be used as off-the-shelf utilities.

Questions

Part A


1. RMI stands for
   a. Remote Method Invocation
   b. Root Method Invocation
   c. Rule Modeled Instance
   d. None of the above

2. WSDL is
   a. Web System Data language
   b. Web Services Definitions language
   c. Web System for Data Location
   d. None of the above

3. How many components are there in SOAP?
   a. 5
   b. 4
   c. 3
   d. None of the above

4. SOAP stands for
   a. Simple Object Access Protocol
   b. Sample Objects Access Provision
   c. Source Of Access Provider
   d. None of the above

5. Which of the following is/are advantage(s) of SOA
   a. Reusability
   b. Interoperability
   c. Loose coupling
   d. All of the above

Answers

1. a 2. b 3. b 4. a 5. d

Part B

6. Explain the discoverability feature of web services.
7. What are the advantages of SOA?
8. Explain the components of SOAP.
9. What are the characteristics of Web services?
10. Explain the web services model in detail.




UNIT II

XML TECHNOLOGY

2.1 XML NAMESPACES & STRUCTURING

2.1.1 XML Namespaces

As already stated, XML offers many advantages, such as the use of customized tags. These customized tags lead to a potential problem called tag duplication.

In other words, if more than one developer is working with the same XML file and each of them uses their own tags, the same tag name may be used twice with different meanings. Namespaces are used to avoid this duplication problem.

Namespaces were not part of the original XML specification; they were added at a later point. You can find the XML namespace specification at http://www.w3.org/TR/REC-xml-names/.

Namespaces associate tags with specific groupings. For example, a tag like <firstname> can be used by more than one source. In this situation the namespace identifies which source the tag belongs to.

2.1.1.1 Namespace Usage

To define a namespace, XML provides the xmlns:prefix attribute. Once the namespace has been defined, the prefix is used at every later reference. The xmlns:prefix attribute is assigned a unique value, which is normally a URL. This URL can be used to provide more information about the namespace.

Let us consider the following example.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
  </player>
</team>

Let us assume that you would like to insert your opinions on the players in this XML file, but you don't want to disturb the original flow of the XML document. You can do that by using namespaces. The source file with namespaces inserted into it is given below:

<?xml version="1.0" encoding="UTF-8"?>
<team xmlns:me="http://www.myview.com/view">
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
    <me:comment> Good Captain </me:comment>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
    <me:comment> Consistent Player </me:comment>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
    <me:comment> Committed Player </me:comment>
  </player>
</team>

Figure : Namespace in XML

The output of the above XML listing is shown in the figure. The URL given in the namespace declaration identifies the namespace and can point to its documentation. Although a URL is given as the attribute value, it is not mandatory that it points to an actual page. It is, however, good practice to give a live URL there.
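How a program sees the namespaced tags can be sketched as follows. This is a minimal illustration, assuming Python and its standard xml.etree.ElementTree module (neither is prescribed by this text); the variable names and the query prefix "v" are chosen here only for the demonstration.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the listing above; variable names are illustrative.
ME_NS = "http://www.myview.com/view"

doc = """<team xmlns:me="http://www.myview.com/view">
  <player>
    <name>Dhoni</name>
    <age>26</age>
    <me:comment>Good Captain</me:comment>
  </player>
  <player>
    <name>Rahul</name>
    <age>33</age>
    <me:comment>Consistent Player</me:comment>
  </player>
</team>"""

root = ET.fromstring(doc)

# ElementTree expands the prefix, so the tag is stored as "{URI}comment".
first_comment = root.find("player/{%s}comment" % ME_NS)

# A prefix-to-URI mapping lets the query use a prefix again. The prefix in
# the query ("v") need not be the one used in the document ("me").
comments = [c.text for c in root.findall("player/v:comment", {"v": ME_NS})]

print(first_comment.tag)
print(comments)
```

The point of the sketch is that the parser identifies the tag by its namespace URI and not by the prefix, which is why two sources can use the same tag name without clashing.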

2.1.1.2 Namespace definition at child node

A namespace can also be declared at the child node level, as the example below illustrates. In this example the comment tag has been added only for the first player. Note that the namespace me is declared only at that place.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This XML file lists 3 players -->
<team>
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
    <me:comment xmlns:me="http://www.myview.com/me"> Good Captain </me:comment>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
  </player>
</team>

Figure: Namespace at child node

The output for the above listing is shown in the following figure. The point to emphasize here is that namespaces can be defined at any depth. It is not mandatory to define the namespace at the root level.

Figure: Namespace definition with child node – output

2.1.1.3 Default Namespace

When you are using multiple namespaces, it is also possible to make one particular namespace the default, i.e. its prefix need not be written explicitly. An element that carries no prefix is assigned to the default namespace.


An example is given below:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <body>
    <center>
      <h1> This is a simple html file </h1>
    </center>
  </body>
</html>

Figure: Default namespace

To make a namespace the default, use the xmlns attribute without any prefix. Note that in the above example the xmlns attribute used with the html tag does not have a prefix, so it is considered the default namespace.

2.1.1.4 Rules for namespace prefixes

The namespace prefixes follow certain rules as given below:

A prefix can start with a letter or an underscore.

A prefix cannot contain a colon, because the colon is what separates the prefix from the tag name.

The prefixes xml and xmlns are reserved and cannot be used for your own namespaces.

2.1.1.5 Reasons for using Namespace

Namespaces are not compulsory for all XML files. But there are certain situations where namespaces become an effective tool:

When a particular XML file is likely to be linked with other XML files in future. In such a scenario, if those XML files also use tags with the same names, namespaces become mandatory.

Namespaces are widely used with XML technologies like SOAP and WSDL.

2.1.2 Structuring XML

XML allows you to create your own tags and structure. This freedom is very much the reason for the success of XML. At the same time, processing an XML file with various applications requires it to follow certain rules regarding its structure. This structuring of XML files becomes very important when they are shared among many parties. The structuring of XML can be done through the techniques given below:

Document Type Definitions

XML Schemas

Both of the above techniques allow you to validate an XML file. Validating means checking whether the XML file follows the rules given in its structure definition. Look at the following XML listing.

<?xml version="1.0" encoding="UTF-8"?>
<player>
  <name> Dhoni </name>
  <age> 26 </age>
</player>

Figure : Sample XML file

If your processing application requires the input XML file to adhere strictly to this structure, then the following XML file becomes invalid.

<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : XML file with additional tags

In the above example a new tag <category> has been added which is not present in the previous version. This additional tag can cause problems in the processing applications. To avoid problems of this kind, validation of XML becomes an important task.

This section focuses on Document Type Definitions (DTD). The next section focuses on schemas.

2.1.2.1 Document Type Definitions

As stated above, XML validation can be done by more than one method. Document Type Definition is the earliest method of validating an XML file. Indeed, this method has been derived directly from the ancestor of XML, i.e. SGML (Standard Generalized Markup Language).


Developers who understand SGML can easily cope with Document Type Definitions. There are two ways in which Document Type Definitions can be used with an XML file. They are given in the following figure.

Figure : DTD Types

This classification is based on where the Document Type Definition is located. Whether it is external or internal, a DTD holds certain rules that the attached XML file has to follow.

2.1.2.1.1 Internal DTD

In the case of an internal DTD, the definitions are located in the XML file itself, so the rules that the XML file has to follow travel with it. Generally they are given immediately after the <?xml?> declaration. Internal declarations are not as widely used as external document type definitions.

For example, for the file given at the beginning of this section, the rules can be given in the same file itself, as shown in the figure.

<!DOCTYPE player [
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age, category)>
]>
<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : XML file with Internal DTD

The above code, when viewed in Internet Explorer, gives the following output.

Figure : Output of XML with inline DTD

2.1.2.1.2 External DTD

When the document type definition is given in a separate file that is linked with the XML file, it is called an external DTD. Generally the definition files are stored with the extension .dtd. The following example lists an XML file with an external DTD.

<!DOCTYPE player SYSTEM "playerrule.dtd">
<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : XML file with external DTD

Here the document type rules are given in the file named “playerrule.dtd”. The content of this file is as shown below:

<!ELEMENT name (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT player (name, age, category)>

Figure : External DTD file

The output of the above XML file with the external document type definition is shown in the figure below.

Figure : Output of XML file with external DTD

2.1.2.1.3 Anatomy of Document Type Declaration

You would have noticed that DOCTYPE is used with both internal and external DTDs. This section focuses on the components of this DOCTYPE declaration. Notice that the term DOCTYPE is given in capital letters. If it is given in lower case, the document is rejected as invalid. This is shown in the figure below:

Figure : Error message due to Lower case DOCTYPE
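The case sensitivity of the DOCTYPE keyword can be checked with a few lines of code. The sketch below assumes Python's standard xml.etree.ElementTree module, which checks well-formedness only and does not validate against the DTD; the helper name parses is hypothetical.

```python
import xml.etree.ElementTree as ET

upper = "<!DOCTYPE player [<!ELEMENT player (#PCDATA)>]><player>Dhoni</player>"
lower = upper.replace("<!DOCTYPE", "<!doctype")

def parses(text):
    """Return True if the (non-validating) parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# The upper-case form is accepted; the lower-case form is rejected.
print(parses(upper), parses(lower))
```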


The components of the document type declaration are as given below:

DOCTYPE: This keyword introduces the document type declaration. As stated earlier, it has to be given in upper case letters.

The next item indicates the root element of the XML file for which the DTD is given.

The next item indicates the type of DTD. The types are:

SYSTEM: The keyword SYSTEM makes a DTD private.

PUBLIC: The keyword PUBLIC is generally used when the DTD has been specified by a standards body. You can notice this in many HTML sources.

The next item can be a file name or a URI.

The following example illustrates the above facts.

Figure : Anatomy of Document Type Declaration

2.1.2.1.4 Combining Internal and External DTD

Internal and external DTDs can be combined in a single XML file. An example of this type is given in the figure below.

<!DOCTYPE player SYSTEM "playerrule.dtd" [
  <!ELEMENT zone (#PCDATA)>
]>
<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
  <zone> South </zone>
</player>

Figure : Combination of Internal and External DTD


In the above example the external declarations are given in the file “playerrule.dtd”, and the definition is extended with an additional element, zone, whose declaration is given internally.

Declaring Elements

The declaration of an element in a Document Type Definition has the following syntax.

<!ELEMENT [name of the element] [specification of content] >

Note that the keyword ELEMENT should be given in upper case, like DOCTYPE. Otherwise it would throw an error.

An example ELEMENT type definition is given below:

<!ELEMENT name (#PCDATA)>

Here the keyword PCDATA indicates that this element can hold parsed character data. The keyword PCDATA should be preceded by the character “#”. An element declared as #PCDATA can hold any character data, but it cannot hold child elements. This is illustrated in the following figure.

<!DOCTYPE player [
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age, category)>
]>
<player>
  <name>
    <first_name> M S </first_name>
    <last_name> Dhoni </last_name>
  </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : PCDATA invalid element

In the above example, name has been declared as #PCDATA. However, the name element has been given two child elements, which is invalid because of the PCDATA type.

At the same time, empty elements are allowed with PCDATA. For example the following is valid:

<name></name>

When the content specification is given as ANY, the element can hold any content. The difference between PCDATA and ANY is that the latter accepts child elements. This is illustrated in the following example.

<!DOCTYPE player [
  <!ELEMENT name ANY>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age, category)>
]>
<player>
  <name>
    <first_name> M S </first_name>
    <last_name> Dhoni </last_name>
  </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : Content Specification “ANY”

In the above example the content specification for “name” is given as “ANY”, which means that it can accept child elements as well. “ANY” should be used carefully. Otherwise it can lead to overly liberal rules, which would in turn make the DTD itself less effective.

An example of an element with child elements is given below.

<!ELEMENT player (name, age, category)>

Here the player element holds three child elements, namely name, age and category. The order of the elements is also important.
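What "the order of elements is also important" means in practice can be sketched in code. The helper below is hypothetical and is not a DTD validator (Python's standard library does not include one); assuming the xml.etree.ElementTree module, it merely compares the order of the child tags with the sequence declared for player.

```python
import xml.etree.ElementTree as ET

# Hypothetical helper, not a DTD validator: it only compares the order of
# the child element names with the sequence declared for <player>.
DECLARED_SEQUENCE = ["name", "age", "category"]

def follows_sequence(player_xml):
    children = [child.tag for child in ET.fromstring(player_xml)]
    return children == DECLARED_SEQUENCE

in_order = ("<player><name>Dhoni</name><age>26</age>"
            "<category>Wicket Keeper</category></player>")
swapped = ("<player><age>26</age><name>Dhoni</name>"
           "<category>Wicket Keeper</category></player>")

# Same three children, but only the first document keeps the declared order.
print(follows_sequence(in_order), follows_sequence(swapped))
```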

Various Quantifiers

There are various quantifiers which can be combined with elements. For example, the quantifier “+” means that the particular element can appear one or more times. An example is given below:

<!DOCTYPE player [
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age, category+)>
]>
<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
  <category> Aggressive Batsman </category>
</player>

Figure : Usage of “+” quantifier

In the above example “category” is given with the “+” quantifier, meaning that it must appear at least once and can be repeated any number of times.

The “?” quantifier is used in those situations where the element appears zero times or at most once. If it appears more than once, the document becomes invalid.

An example for the “?” quantifier is given in the following figure.

<!DOCTYPE player [
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age?, category)>
]>
<player>
  <name> Dhoni </name>
  <age> 26 </age>
  <category> Wicket Keeper </category>
</player>

Figure : “?” Quantifier

Another quantifier which can be used is “*”. The “*” quantifier allows the child element to appear zero or more times. For example

<!ELEMENT player (name, age, category, zones*)>

When there are several alternatives, they are separated by the character “|”. For example

<!ELEMENT matches (odi | test | t20)>

Attribute Declarations

One of the important components of any XML file is “attributes”. A DTD can be used to structure the attributes as well. To accomplish this, <!ATTLIST> is used. The general syntax of ATTLIST is as given below:

<!ATTLIST [name of the element] [name of attribute] [type of attribute] [default value] >

<!DOCTYPE player [
  <!ELEMENT name EMPTY>
  <!ATTLIST name first_name CDATA "">
  <!ATTLIST name last_name CDATA "">
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age?, category)>
]>

Here the data type used for the attributes is CDATA (character data). There are various other types as well, such as ID (requires a value that is unique within the document), IDREF (refers to the ID of another element) and ENTITY (allows an entity to be provided).

Apart from these, the values of the attributes can be controlled using the following keywords.

#REQUIRED: It makes the attribute mandatory. If the attribute is missing, the XML becomes invalid. For example:

<!ELEMENT name EMPTY>
<!ATTLIST name first_name CDATA "">
<!ATTLIST name last_name CDATA #REQUIRED>

Now if the last_name attribute is missing the document becomes invalid, whereas first_name is not mandatory.

#IMPLIED: If the attribute is specified as #IMPLIED, then the value is not mandatory. If the attribute is omitted, no default value is supplied for it. For example

<!ELEMENT name EMPTY>
<!ATTLIST name first_name CDATA #IMPLIED>
<!ATTLIST name last_name CDATA #REQUIRED>

#FIXED: Sometimes the values of attributes do not change. For those kinds of attributes #FIXED can be given. For example

<!ELEMENT match EMPTY>
<!ATTLIST match overs_count CDATA #FIXED "50">

Here the overs_count attribute has been given the fixed value of 50. The difference between a default and a fixed value is as follows: in the case of a default, the value specified in the DTD is used only if the value is missing in the document, whereas in the case of FIXED no other value can be supplied.

2.1.2.1.5 Declaration of Entities

An XML entity replaces a symbol with a character string. XML predefines several entities, like &amp; for & and &lt; for <. Apart from these you can define your own entities. For example

<!DOCTYPE team [
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT age (#PCDATA)>
  <!ELEMENT category (#PCDATA)>
  <!ELEMENT player (name, age, category)>
  <!ELEMENT team (player+)>
  <!ENTITY WC "Wicket Keeper">
  <!ENTITY RHB "Right Handed Batsman">
  <!ENTITY SB "Swing Bowler">
]>
<team>
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
    <category> &WC; </category>
  </player>
  <player>
    <name> Dravid </name>
    <age> 30 </age>
    <category> &RHB; </category>
  </player>
  <player>
    <name> Lee </name>
    <age> 26 </age>
    <category> &SB; </category>
  </player>
</team>

In the above example three entities are defined:

<!ENTITY WC "Wicket Keeper">

<!ENTITY RHB "Right Handed Batsman">

<!ENTITY SB "Swing Bowler">

The output of the above XML listing is shown in the figure below. In the figure you can see that the entities are replaced with the actual string values.

Figure : Entities Replaced with Strings
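The same replacement can be observed from a program. The following is a small sketch, assuming Python's standard xml.etree.ElementTree module, whose underlying parser expands entities declared in the internal DTD subset; the shortened two-player document is used only for illustration.

```python
import xml.etree.ElementTree as ET

doc = """<!DOCTYPE team [
  <!ENTITY WC "Wicket Keeper">
  <!ENTITY RHB "Right Handed Batsman">
]>
<team>
  <player><name>Dhoni</name><category>&WC;</category></player>
  <player><name>Dravid</name><category>&RHB;</category></player>
</team>"""

root = ET.fromstring(doc)

# By the time the tree is built, each entity reference has already been
# replaced by the string given in its declaration.
categories = [p.findtext("category") for p in root.findall("player")]
print(categories)
```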

2.1.3 XML Schemas

As described in the previous section, DTDs are used to validate the structure of an XML listing. Apart from DTD there is another technique for doing this, called XML schema. Before discussing XML schema, let us first look into the drawbacks of the DTD method.

2.1.3.1 DTD drawbacks

The Document Type Definition has many disadvantages as listed below:

The syntax of DTD is different from XML syntax. So the XML programmer has to learn a completely new set of syntactical elements to use DTD. This is a significant disadvantage and reduces the speed of development.

Though DTD can be used to specify the nature of data, it is very limited. In other words, you cannot precisely set the data type as a number etc. For example, consider the following DTD and example:

<!ELEMENT Runs (#PCDATA)>

<Runs> no run </Runs>

Here the DTD doesn’t restrict the <Runs> element to numeric values.

DTD doesn’t allow specifying the exact number of times an element has to appear. Though there are quantifiers like + and *, it has no provision to specify an exact frequency of occurrence.

Another disadvantage of DTD is that it has no provision to reuse a set of elements already defined.

All these drawbacks lead to the usage of XML schema instead of DTD. XML schema provides much finer control over the document.

2.1.3.2 XML Schema Introduction

W3C published XML Schema as a recommendation in 2001, and a second edition followed in 2004. The biggest advantage of XML schema is that it follows the XML syntax. Indeed, a schema document is itself an XML file, normally saved with the “.xsd” extension.

An XML schema document has an XML declaration. The simple form of this XML declaration is as shown below:

<?xml version="1.0"?>

Each XML schema document has a root element, <xs:schema>. The simple form of this root element is as follows:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

</xs:schema>

Apart from this, <xs:schema> can have other attributes as listed below:

Table : Root element attributes

Attribute               Usage

attributeFormDefault    Indicates whether the “attributes” of the instance document need to be prefixed. The possible values are qualified and unqualified. If the value is qualified then the attribute’s prefix is necessary; otherwise it is not.

elementFormDefault      Indicates whether the “elements” of the instance document need to be prefixed. The possible values are qualified and unqualified. If the value is qualified then the element’s prefix is necessary; otherwise it is not.

version                 Used to indicate the version of the schema document.

xml:lang                Indicates the language used in the XML schema document.

A sample root element is as shown below:

2.1.3.3 Declaration of Elements

XML Schema provides much finer control over the type of data that is valid for a particular element. This is depicted in the following example.

<Runs xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="schema.xsd"> 100 </Runs>

XML sample file for simple element

The XML Schema document for the above is as shown below:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Runs" type="xs:decimal" />
</xs:schema>

XML Schema Document for simple element

In the above XML schema document you can notice the <xs:element> with the type attribute. The “type” attribute indicates which type of value is valid for the particular element. This type of data restriction is not possible with DTD.

A sample list of data types is given in the following table.

Examples for Data types

Data Type    Description

boolean      Used to indicate binary valued elements
date         Indicates a date value
decimal      A number (+ or -)
double       A floating point number with double precision
float        A floating point number with single precision
string       Text value
time         Indicates a time instance

Apart from defining these simple elements, XML schema allows you to define complex element types, to restrict the number of occurrences of such elements and to reuse existing groups.
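The kind of check that type="xs:decimal" adds to the Runs example can be approximated in a few lines. This is only a sketch: Python's standard library has no XML Schema validator, so the hypothetical helper is_decimal below tests the lexical form of xs:decimal (optional sign, digits, optional fraction, no exponent) with a regular expression.

```python
import re

# Lexical form of xs:decimal: an optional sign, then digits with an optional
# fractional part (or a fraction alone). Exponents and text are not allowed.
DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")

def is_decimal(text):
    """Approximate the xs:decimal check after trimming surrounding whitespace."""
    return DECIMAL.fullmatch(text.strip()) is not None

# " 100 " would be accepted for <Runs>; " no run " would be rejected.
print(is_decimal(" 100 "), is_decimal(" no run "))
```

A real schema-aware parser performs this check, together with all the structural checks, when it validates the instance document against schema.xsd.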


XML schema documents are generally verbose. There are many tools that ease the developer's work, for example XMLSpy and Microsoft Visual Studio. These tools simplify the developer’s task by providing various options through graphical editors and an Integrated Development Environment.

XML Schema documents play a vital role in the development of Web Services, RSS Feeds etc.

Questions

Part A

Objective Type Questions

1. Which of the following terms is used frequently in combination with XML namespaces?
a. prefix
b. url
c. html
d. none of the above

2. Which of the following is an XML structuring technique?
a. DTD
b. HTML
c. SGML
d. None of the above

3. Which of the following keywords makes a DTD private?
a. private
b. system
c. port
d. none of the above

4. Which of the following is not a quantifier?
a. *
b. ?
c. /
d. None of the above

5. Which of the following is used to specify the nature of data in XML schema?
a. data
b. type
c. variable
d. None of the above

Answers

1. a 2. a 3. b 4. c 5. b


Part B

Short Questions

6. Explain the process of Namespace definition

7. What are all the rules for namespace prefixes?

8. Explain the steps involved in external DTD.

9. What does PCDATA refer to?

10. List out the demerits of DTD.

PART C

Descriptive type Questions

11. Explain the XML schema usage with a clear case study.

12. Create a DTD specification which would specify rules for maintaining student information.

2.2 XML PRESENTATION

Throughout this text, it has been repeatedly mentioned that XML is a data representation language. If you view an XML file in a browser like Internet Explorer or Firefox, it is simply displayed as a tree structure which can be folded or unfolded at its elements.

At the same time, XML can be presented in a more polished manner by using techniques like Cascading Style Sheets (CSS). This section explains how to render an XML file with style specifications in CSS. The following steps are involved in this process.

Creating the Source XML file

Creating the CSS file

Linking XML and CSS

Let us consider the following example:

Source XML File



The output of the above file is as shown below:

Output of the XML listing

2.2.1 Defining the CSS

The CSS for the above XML listing is as shown below:

CSS Definition

In the above CSS definition you can notice that style has been defined for each typeof element specified in the XML file.

2.2.2 Linking the CSS and XML

The most important step in this process is the linking of CSS and XML. The CSS-linked XML file is as shown in the figure.


Linking XML and CSS

After linking CSS and XML, if the XML file is viewed using the same browser, the output is as shown below:

XML output with CSS

You can notice that all the elements are presented with the styles given in the CSS.
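The link between an XML file and a style sheet is normally expressed with an xml-stylesheet processing instruction placed before the root element. The sketch below assumes Python's standard xml.dom.minidom module, and the file name players.css is hypothetical; it only shows where the linking line sits and how a parser reports it.

```python
from xml.dom import minidom

# "players.css" is a hypothetical file name used only for illustration.
doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/css" href="players.css"?>'
    '<team><player><name>Dhoni</name></player></team>'
)

# The linking line is a processing instruction that precedes the root element.
link = doc.firstChild
print(link.target)
print(link.data)
```

A browser reads the href value from this instruction and applies the rules in that file to the elements of the document.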

2.2.3 Specifying Selectors in Stylesheets

You can add your own selectors to the style sheet. An example is shown in the following figure.

Specifying Selectors in CSS

The selector can be used in the XML file using the “class” attribute. This is shown in the following figure.


Using CSS style with “class” attribute

The output of above XML file is as shown in the following figure.

XML output

2.2.4 Using Class specific Selector

The selectors can be specified for a particular class. They can be used in the XML file with the “class” attribute.

The example for this is as shown in the following figure.

Class Specific Selector

It can be used in the XML file as shown below:




The output of this XML file is as shown in the following figure:

Output of XML with class specific Selector

Apart from this, you can also use “in-line style”. In this case the style information is given directly in the XML file itself. This method is best avoided because it creates maintenance problems: if a style has to be modified, changes are required in too many locations.

2.2.5 XML Transformation

In the previous section, the presentation of XML with the help of CSS was explained. This section focuses on a more advanced tool called XSLT. XSLT stands for eXtensible Stylesheet Language Transformations.

XSLT can be defined as a transformation language for XML. The output of an XSLT transformation can be HTML, text or even XML itself.

2.2.5.1 Performing XML Transformation

To perform XML transformation you need at least two files. They are

The source XML file

XSLT style sheet

Consider the following XML file and XSLT style sheet.



Source XML File

XSLT File

2.2.5.2 Linking XML and XSLT file

Linking of XML and XSLT can be done by adding a line in the source XML file, as shown in the following figure.


Linking XML and XSLT

The output of above listing is as shown in the following figure:

Output of XML file with XSLT

How does this work?

The XSLT file shown in the above example has a tag called <xsl:template>. The XSLT processor scans the source XML file for elements matching the name given in the “match” attribute. The matching elements are transformed with the style specified in <xsl:template>.

In the example illustrated above, two matches have been given: one for <team> and another for <player>. If you look at the output file, you can see that the XML file has been transformed with the style given in the XSLT.

xsl:value-of

The <xsl:value-of> tag has an attribute called “select” which indicates the value to be selected for display. Here this attribute has the value “name”, so the “name” appears in the output. If you change it to “age”, the output would look as shown in the figure:

xsl:for-each

With <xsl:value-of>, the “select” attribute picks up only the first match. If there is more than one match, xsl:for-each can be used effectively; it gathers all the matches.

The following example shows the usage of xsl:for-each.

Source XML file with two “name” tags inside <player>




The XSLT style transformation with <xsl:for-each> is as shown in figure.

XSLT Style Information

The output of above XSLT Transformation is as shown in the following figure.
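The difference between taking the first match and gathering all matches can be mimicked outside XSLT. The sketch below is only an analogy and not XSLT itself, since Python's standard library has no XSLT processor; it assumes the xml.etree.ElementTree module, and the player with two name tags is an illustrative document.

```python
import xml.etree.ElementTree as ET

# Illustrative document: one player carrying two <name> tags.
player = ET.fromstring(
    "<player><name>M S Dhoni</name><name>Dhoni</name><age>26</age></player>"
)

# find() stops at the first match, much like xsl:value-of select="name".
first_only = player.find("name").text

# findall() gathers every match, which is what xsl:for-each iterates over.
all_names = [n.text for n in player.findall("name")]

print(first_only)
print(all_names)
```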

2.2.5.3 Other methods of XSLT Transformation

Apart from performing the client-side transformation in the web browser, XSLT transformation can also be performed using the other methods listed below:

Server Side Transformation: Using a server-side scripting language like JSP, the XSLT transformation can be performed. A sample JSP program is shown in the figure.

Server Side XSLT Transformation

Standalone Programs: You can also write standalone programs in languages like Java to perform XSLT transformation. A sample Java program is shown in the figure.

Standalone program for XSLT Transformation

The above examples are written using Java, but the implementation is not restricted to Java; other languages can be used as well. This is one of the important advantages of XML: it is supported in almost all popular languages.

2.2.6 XML Infrastructure

This section highlights one of the important pieces of XML infrastructure, the XML parser.

2.2.6.1 XML parsing

Another important aspect of XML is the ability to parse an XML document using programming languages like Java, .NET etc. XML parsing is the process of breaking an XML document into components so that they can be handled programmatically.

There are two methods of parsing XML. They are shown in the following figure.

XML parsing methods

DOM

DOM stands for Document Object Model. DOM breaks an XML document into a tree of nodes. This DOM tree can be manipulated effectively using programming languages like Java. The nodes in the DOM tree represent the elements in the original XML document.

W3C has recommended several levels of DOM, such as DOM Level 1 and DOM Level 2.
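Working with a DOM tree can be sketched as follows. The example assumes Python's standard xml.dom.minidom module instead of the Java parsers mentioned in this text; the country attribute is invented purely to show that the in-memory tree can be modified.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<team>"
    "<player><name>Dhoni</name><age>26</age></player>"
    "<player><name>Rahul</name><age>33</age></player>"
    "</team>"
)

# Walk the tree: every <player> element node has <name> and <age> child nodes,
# and the text of an element is itself a child (text) node.
names = [
    player.getElementsByTagName("name")[0].firstChild.data
    for player in doc.getElementsByTagName("player")
]

# The tree lives in memory, so it can be changed and serialized again.
doc.documentElement.setAttribute("country", "India")  # attribute invented for the demo

print(names)
print(doc.documentElement.toxml())
```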

SAX Parser

SAX stands for Simple API for XML. A SAX parser breaks the XML document into a set of events. Examples of events are listed below:

StartDocument,

StartElement,

EndElement,

SAXWarning,

SAXError,

EndDocument.

SAX is not an official W3C recommendation. Some of the useful concepts of SAX are incorporated into later versions of DOM. A SAX parser is generally faster than a DOM parser on large documents because it does not build the whole tree in memory, while DOM is convenient and efficient for smaller XML documents.
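The event-driven style of SAX can be sketched as shown below. The example assumes Python's standard xml.sax module, whose callback names (startDocument, startElement and so on) correspond to the events listed above; the EventLogger class is a hypothetical handler written for this illustration.

```python
import xml.sax

class EventLogger(xml.sax.ContentHandler):
    """Records the name of each SAX event as the parser reports it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("startElement:" + name)

    def endElement(self, name):
        self.events.append("endElement:" + name)

    def endDocument(self):
        self.events.append("endDocument")

handler = EventLogger()
xml.sax.parseString(b"<player><name>Dhoni</name></player>", handler)

# The events arrive in document order; no tree is kept in memory.
print(handler.events)
```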

Types of Parsers

XML parsers are classified in to two types. They are as shown in the following figure.

XML parser Classification

Non-Validating XML Parser

Parsers of this type check only that the XML file obeys the common rules of XML, i.e. that it is well-formed; they don’t validate it against an attached DTD or Schema.

Validating XML Parser

Parsers of this type validate the document against the attached DTD or Schema. At the same time they check that the XML is well-formed.

A few of the XML parsers available in the market are listed below:

Apache Xerces

XML4J from IBM

MSXML from Microsoft.


2.3 XPATH, XLINK AND XQUERY

XPath is used to access specific parts of an XML document. An XML document is basically a tree structure, and XPath addresses specific parts (nodes) of this tree. As far as XPath is concerned, the nodes in an XML document belong to one of the following types:

Root Node

Element Nodes

Text Nodes

Attribute Nodes

Comment Nodes

Namespace Nodes

Processing instruction nodes

The XPath data model represents the XML document. The root node of the XPath data model contains the entire XML document, including any comments and processing instructions that occur before or after the root element.

The XPath data model doesn’t include the Document Type Definition, so the parts of a DTD are not accessible through XPath.

Location Path

A “location path” is used to identify a set of nodes in the XML document. The set returned by a location path may consist of a single node or a collection of nodes. The set may even be empty.

A location path consists of consecutive “location steps”. Each location step is specified relative to a particular node called the “context node”.

The simplest location path is the one that identifies the root node of the XML document. The root location path is written simply as “/”. (You can easily see the similarity between the path syntax of XPath and the UNIX path syntax, where the root directory is also represented by “/”.) An example is shown below:

<xsl:template match="/">
  <html><xsl:apply-templates/></html>
</xsl:template>

Figure : Location Path representing “root” node


The other elements can also be represented with location paths. The following sample XML file is used to show how inner nodes are addressed:

<?xml version="1.0" encoding="UTF-8"?>
<team>
  <player>
    <name> Dhoni </name>
    <age> 26 </age>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
  </player>
</team>

Figure : Sample XML File

XPath is often used in combination with XSLT. The XSLT document given below extracts all the player names from the above sample XML file.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <HTML>
      <xsl:apply-templates/>
    </HTML>
  </xsl:template>
  <xsl:template match="player">
    <P>
      <xsl:value-of select="name"/>
    </P>
  </xsl:template>
</xsl:stylesheet>

Figure : Extracting all the player names from XML document

The output of the above code is as shown in the figure.

Figure : Nodes extracted with XPATH

The comments in the XML document can also be matched with the syntax given below.

<xsl:template match="comment()">
  <i>That was a good comment</i>
</xsl:template>

Figure : Comment Handling with XPath

Similarly, compound location paths can be specified by combining node names with “/”, for example “/team/player/age”.

Apart from this, predicates can be used to filter out the nodes matching a particular condition.

<xsl:apply-templates select="//specialization[.= 'WicketKeeper']" />

Figure : Filtering the player with specialization Wicket Keeper

An example that filters the nodes of an XML document matching a particular condition is shown below. In this example, when the value of the node “specialization” is “Captain” it is shown in bold letters; otherwise normal formatting is applied.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="player.xsl"?>
<team>
  <player>
    <name>Dhoni</name>
    <age> 26 </age>
    <specialization>Captain</specialization>
  </player>
  <player>
    <name> Rahul </name>
    <age> 33 </age>
    <specialization>Defensive Batsman</specialization>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
    <specialization>Opening Batsman</specialization>
  </player>
</team>

Figure : Source file used for filtering

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="specialization">
    <xsl:choose>
      <xsl:when test=".='Captain'">
        <br />
        <B><xsl:value-of select="."/></B>
      </xsl:when>
      <xsl:otherwise>
        <br />
        <xsl:value-of select="."/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

The output for the above listing is as shown below:

Figure: Condition matching node is given special formatting
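The location paths and predicates discussed above can also be tried outside XSLT. The following is a minimal sketch in Python, assuming the standard library module xml.etree.ElementTree, which supports only a subset of XPath (the predicate on "." needs Python 3.7 or later). The data is modelled on the source file used for filtering; the variable names are chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Sample data modelled on the "Source file used for filtering" figure.
TEAM_XML = """<team>
  <player><name>Dhoni</name><age>26</age>
    <specialization>Captain</specialization></player>
  <player><name>Rahul</name><age>33</age>
    <specialization>Defensive Batsman</specialization></player>
  <player><name>Sachin</name><age>35</age>
    <specialization>Opening Batsman</specialization></player>
</team>"""

root = ET.fromstring(TEAM_XML)  # root is the <team> element

# Compound location path /team/player/name, written relative to <team>.
names = [n.text for n in root.findall("./player/name")]

# Compound location path /team/player/age.
ages = [int(a.text) for a in root.findall("./player/age")]

# Predicate: keep only players whose <specialization> child is 'Captain'.
captains = [p.findtext("name")
            for p in root.findall("./player[specialization='Captain']")]

# Predicate on the node's own text, as in //specialization[.='Captain'].
matches = root.findall(".//specialization[.='Captain']")

print(names, ages, captains, len(matches))
```

Since ElementTree evaluates paths relative to an element, the absolute path “/team/player/name” is written here as “./player/name” starting from the team element.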

XPath Expressions handling numbers

Apart from location paths, there exist other types of XPath expressions which return numbers or strings as output. The arithmetic operators supported by XPath are as listed below:

+ - * mod div

For example, the expression 10 div 4 evaluates to 2.5 and 10 mod 4 evaluates to 2.

2.4 XLINKS

XLink is used to provide links in an XML document. The XLink target is not restricted to XML documents; the link target can be other documents as well.

An example for XLink is as shown below:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="player.xsl"?>
<team>
  <player>
    <name xmlns:xlink="http://www.w3.org/1999/xlink"
          xlink:type="simple"
          xlink:href="http://www.teamindia.com/cap.htm"> Dhoni</name>
    <age> 26 </age>
    <specialization>Captain</specialization>
  </player>
  <player>
    <name> Sachin </name>
    <age> 35 </age>
    <specialization>Opening Batsman</specialization>
  </player>
</team>

Figure : XML document with XLink

XLink has various attributes, for example type and href, as shown in the above example. Some of them are optional and others are mandatory.

The “type” attribute has six possible values:

simple
extended
locator
arc
title
resource

There exists another attribute called “show”. The possible values for this attribute are as shown below:

new
replace
embed
other
none

XPointers

XPointers are used to locate specific portions of an XML document. XPointers are used in combination with XLink. This makes the connection between two documents more precise and clear. An XPointer specifies the link to a specific portion using the location path syntax described earlier.


An example is as shown below:

http://www.teamindia.com/player.xml#xpointer(//name[position()=1])

Figure : A Sample XPointer

The above example points to the name element selected by the XPointer expression, here the one whose position equals 1.

2.5 XQUERY

XQuery is more powerful than XPath, XLink and XPointers. Indeed, XQuery provides many of the constructs found in a programming language.

XQuery treats the XML document similar to a relational database. XQuery has syntax features similar to SQL. XQuery is defined by W3C as given below:

“XML is a versatile markup language, capable of labeling the information content of diverse data sources including structured and semi-structured documents, relational databases, and object repositories.

A query language that uses the structure of XML intelligently can express queries across all these kinds of data, whether physically stored in XML or viewed as XML via middleware.

This specification describes a query language called XQuery, which is designed to be broadly applicable across many types of XML data sources”

XQuery provides sets of operators and functions which facilitate the extraction of data matching a particular condition. Various XQuery processors are available. For example, “Galax” is one of the popular XQuery processors.

Questions

Part A

1. CSS stands for
a. Cascading Style Sheets
b. Coding Style Sections
c. Color Selection Scheme
d. None of the above

2. Which of the following attributes is used to provide a selector?
a. select
b. attrib
c. class
d. none of the above

3. SAX stands for
a. Simple API for XML
b. Structured API for XML
c. Select Argument extension
d. None of the above

4. Which of the following is an XML parser?
a. XCode
b. Xerces
c. M-XML
d. None of the above

5. Linking of XML and XSLT is done by
a. <?xml-stylesheet>
b. <?xml-style>
c. <?xml-source>
d. none of the above

Answers

1. a  2. b  3. a  4. b  5. a

Part B

Short Questions

6. Write short notes on in-line CSS.

7. Explain about <xsl:template>

8. Explain about xsl:for-each

9. Write short notes on DOM.

Part C

Descriptive Type Question

10. Explain the XSLT presentation technique in detail.

11. Create an XML document representing information regarding books, like ISBN, name etc. Using XML technologies, filter out the book with a given ISBN.



UNIT III

SOAP (Simple Object Access Protocol)

Objectives

Providing an overview of SOAP.
Introducing XML-RPC and its format.
Explaining the anatomy of SOAP.
Introducing the various actors in SOAP.
Introducing the fault handling techniques in SOAP.
Explaining attachments with respect to SOAP.

3.1 INTRODUCTION

As has been consistently mentioned in this text, XML is a platform-neutral language. This platform neutrality of XML can be used effectively in communication between applications running on various platforms.

SOAP is an important keyword in the XML domain. It plays a critical role in message communication between applications running on various platforms. An understanding of SOAP is crucial to becoming a web service developer. The power of SOAP is that it is totally based on XML.

SOAP is supported by W3C and by many vendors in the industry like Sun Microsystems, IBM, HP etc. SOAP can be easily used with technologies like J2EE, Microsoft .NET etc.

This chapter introduces the fundamentals of the Simple Object Access Protocol (SOAP) and its applications.

3.2 HTTP (HYPER TEXT TRANSFER PROTOCOL)

The Simple Object Access Protocol (SOAP) can use any of the networking protocols for communication. It doesn’t specify one single protocol for doing this. Initially, the SOAP 1.0 specification had indicated HTTP as the transport protocol. The later specifications had support for most of the widely used protocols (as listed below):

SMTP (Simple Mail Transfer Protocol)
FTP (File Transfer Protocol)
POP3 etc.

Advantages of using HTTP

There exist various advantages of using the HTTP protocol (apart from its being the simplest broadly supported protocol).

Support in Languages: HTTP has wide support across various languages like JAVA, C, PHP etc. Various libraries exist in these languages to support HTTP. This is a crucial factor in making application development easier and faster. Because of this support in the programming languages, the programmer is not required to build the code from scratch.

Support across platforms: HTTP has support across the major platforms like Windows, Solaris, Linux etc. So the distributed nature of an application would not be restricted to a single domain. The communication between applications running on these platforms can happen without issues.

Firewall Pass-through: HTTP connections can pass through most firewalls. This is a major advantage when deciding on HTTP as a transport protocol.

Text Based: The HTTP protocol is text based. So for testing purposes, telnet can be used to check the servers.

Usage of HTTP header: The header associated with HTTP can be used effectively to gather information like the document encoding etc. This is shown in the following figure 3.1.

Figure 3.1: Http header Information
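To illustrate how header information such as the document encoding can be gathered, here is a small sketch in Python. It assumes the standard library email parser, which understands the “Name: value” header format that HTTP shares with e-mail. The header text itself (including the server name) is hypothetical and made up for illustration.

```python
from email.parser import Parser

# Hypothetical response header, made up for illustration.
RAW_HEADER = (
    "Content-Type: text/xml; charset=utf-8\r\n"
    "Content-Length: 156\r\n"
    "Server: ExampleServer/1.0\r\n"
    "\r\n"
)

# Parse only the header section; there is no body in this sketch.
headers = Parser().parsestr(RAW_HEADER, headersonly=True)

media_type = headers.get_content_type()    # MIME type without parameters
charset = headers.get_content_charset()    # value of the charset parameter
length = int(headers["Content-Length"])    # announced body size in bytes

print(media_type, charset, length)
```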

Apart from the above mentioned advantages, there is another important factor to be considered with respect to HTTP, i.e. the understanding that developers already have of HTTP.



3.3 XML – RPC

To start with, RPC stands for Remote Procedure Call. Normally programs are developed as a group of procedures. The following attributes characterize a procedure:

A procedure is nothing but a chunk of code written to perform a specific task.
Procedures can be invoked from the locations in the program where they are required (and if the permission is there to invoke them).
Procedures can have zero or more inputs in the form of arguments or parameters.
Similarly, they can return zero or one value.
In some languages like JAVA, a procedure is called a “Method”.

3.3.1 What RPC is?

“A remote procedure call is a technique in which the calling and called programs are not necessarily present in the same machine”.

In other words, a program running on machine M1 can call a procedure which resides on another machine, say M2, and receive the result of the computation from that machine.

There exist various remote procedure call techniques as listed below:

CORBA (Common Object Request Broker Architecture)
Java RMI (Remote Method Invocation)
XML-RPC etc.

Technologies like CORBA and RMI, which were developed prior to XML-RPC, had some issues as listed below:

Complexity: These techniques were more complex compared with XML-RPC.
Binary Nature: These techniques were binary in nature. So there were problems while passing through firewalls etc.
Platform Dependency: Another issue related to these techniques is their dependency on a particular platform. So it was not very easy to establish communication between applications running on various platforms.

3.3.2 What XML-RPC is?

As stated earlier, XML-RPC is a remote procedure call technique. XML-RPC has successfully attempted to solve the above mentioned problems of complexity, binary nature and platform dependency.

The core concept of XML-RPC is as listed below:

1. There would be an XML document which contains a procedure name and arguments (parameters).

2. This document would be sent to the web server using a transport protocol like HTTP.

3. The web server would identify the procedure name and arguments given in the source document and invoke that procedure on the server side.

4. The result of the procedure is constructed as an XML document and it would be sent back to the client from where the original request came in.

These simple steps are carried out to make an XML-RPC call. An example XML-RPC document is as shown in the following figure 3.2.

<?xml version="1.0"?>
<methodCall>
  <methodName>getScore</methodName>
  <params>
    <param>
      <value><string>India</string></value>
    </param>
  </params>
</methodCall>

Figure 3.2: An example RPC Document

By having a closer look at the above XML code snippet, you can observe the following facts.

The root element here is <methodCall>.
This <methodCall> element has two child elements. They are
<methodName> : Indicates the name of the method to be called.
<params> : Indicates the arguments associated with this method. The <params> element can have various child elements called <param>, each indicating an argument. The values associated with the arguments are supplied through the <value> tag and the data type is also specified. In the above example the data type is specified as <string>. There are other data types like
int
double
boolean etc.

These requests would be sent with an HTTP header. This HTTP header would have the following data.

POST /target HTTP/1.0
User-Agent: Identifier
Host: host.making.request
Content-Type: text/xml
Content-Length: length of request in bytes

Figure 3.3 : XML RPC header
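The request document and its header can be produced programmatically. The sketch below assumes Python's standard library module xmlrpc.client: its dumps() function serializes a method call to an XML document like the one in figure 3.2, and loads() does the reverse, as a server would. The header values for User-Agent and Host are hypothetical placeholders, and the header is only assembled as text here, not sent over a network.

```python
import xmlrpc.client

# Serialize a call to the remote method getScore("India").
request_xml = xmlrpc.client.dumps(("India",), methodname="getScore")
print(request_xml)

# What a server would do on receipt: recover the arguments and method name.
params, method = xmlrpc.client.loads(request_xml)

# The HTTP header announces the size of the XML body in bytes.
body = request_xml.encode("utf-8")
header = (
    "POST /target HTTP/1.0\r\n"
    "User-Agent: example-client/0.1\r\n"   # hypothetical client identifier
    "Host: example.invalid\r\n"            # hypothetical host name
    "Content-Type: text/xml\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
)
print(header)
```

In practice a client would normally use xmlrpc.client.ServerProxy, which builds the document and the header and performs the HTTP exchange in one step.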


The complete XML-RPC request for the above given example may look like the following.

POST /xmlrpc HTTP/1.0
User-Agent: testXMLRPCClient/1.0
Host: 172.16.12.66
Content-Type: text/xml
Content-Length: 168

<?xml version="1.0"?>
<methodCall>
  <methodName>getScore</methodName>
  <params>
    <param>
      <value><string>India</string></value>
    </param>
  </params>
</methodCall>

Figure 3.4 : A Complete XML RPC Request

The response for the above XML-RPC request would be generated by the server. An example is provided below in figure 3.5.

<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><string> 345 For 7 </string></value>
    </param>
  </params>
</methodResponse>

Figure 3.5: XML-RPC response

By having a closer look at the response you can observe the following facts:

The XML-RPC response is very similar to the XML-RPC request.
The methodCall element is now replaced by the methodResponse element.
The XML-RPC response contains only one parameter.
Similar to the XML-RPC request, the XML-RPC response also has associated HTTP header information. An example is shown in figure 3.6. (This example shows both the XML-RPC response message and the header.)
It supports HTTP 1.0. Compatibility with HTTP 1.1 is there as well.
The content type would be indicated as text/xml.
The XML-RPC response uses “200 OK” as the response code.
The length of the response would also be indicated in the response header so that it can be used where required.

HTTP/1.1 200 OK
Date: Mon, 23 Feb 2009 11:30:04 GMT
Server: Apache/1.3.12 (Unix)
Connection: close
Content-Type: text/xml
Content-Length: 156

<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value><string> 345 For 7 </string></value>
    </param>
  </params>
</methodResponse>

Figure 3.6: XML-RPC with Header Information

XML-RPC Faults

When the execution of the method specified in the XML-RPC request fails, an XML-RPC fault occurs. An XML-RPC fault response is very similar to the normal XML-RPC response, except that it has a <fault> tag instead of the <params> tag.

At the same time, the <fault> element can contain a maximum of only one value, similar to the <params> element.

The XML-RPC fault response may contain an error code. An XML-RPC fault response is as shown in the figure 3.7.

<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value><string>No such method!</string></value>
  </fault>
</methodResponse>

Figure 3.6: XML-RPC Fault

The example above is simplified. In the XML-RPC specification the single value inside <fault> is a struct with two members, faultCode (an int) and faultString (a string).

XML-RPC data structure

XML-RPC supports data structures like arrays, structs etc. It has no support for pointers.

Arrays:

To represent an array the “array” element is used. The array element has only one “data” element and in turn this data element can hold zero or more value elements. Each value element contains the corresponding data type element and the actual parameter value.

<array>
  <data>
    <value><string>Dhoni</string></value>
    <value><string>Rahul</string></value>
    <value><string>Sachin</string></value>
  </data>
</array>

Figure 3.7 - XML RPC Array

Unlike arrays in many programming languages, an XML-RPC array can contain values of different data types. An example is shown below:

<array>
  <data>
    <value><string>Dhoni</string></value>
    <value><string>Rahul</string></value>
    <value><int>125</int></value>
  </data>
</array>

Figure 3.8 : XML RPC array with values from different data types

An XML-RPC request with array usage is shown below in figure 3.9.

<?xml version="1.0"?>
<methodCall>
  <methodName>getScore</methodName>
  <params>
    <param>
      <value>
        <array>
          <data>
            <value><string>Dhoni</string></value>
            <value><string>Rahul</string></value>
            <value><string>Sachin</string></value>
          </data>
        </array>
      </value>
    </param>
  </params>
</methodCall>

The XML-RPC response for the above request would look as shown in figure 3.10.

<?xml version="1.0"?>
<methodResponse>
  <params>
    <param>
      <value>
        <array>
          <data>
            <value><int>123</int></value>
            <value><int>120</int></value>
            <value><int>144</int></value>
          </data>
        </array>
      </value>
    </param>
  </params>
</methodResponse>

Figure 3.10 : XML RPC response with Arrays

XML-RPC Structs

XML-RPC supports the concept of structs. A struct is nothing but a collection of logically related variables.

A struct is represented by the <struct> element.
Each member of the struct is represented by a <member> element.
Each member element has two child elements, namely <name> and <value>.

An example is shown in figure 3.11.

<struct>
  <member>
    <name>sachin</name>
    <value><int>145</int></value>
  </member>
  <member>
    <name>Dhoni</name>
    <value><int>158</int></value>
  </member>
</struct>

Figure 3.11 : An XML-RPC Struct
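The arrays, structs and faults described above map naturally onto the data types of a programming language. As a hedged sketch, assuming Python's standard xmlrpc.client module, a list is marshalled as an <array>, a dict as a <struct>, and a fault response is turned into an exception when it is decoded. The fault code 1 and the score values are made up for illustration.

```python
import xmlrpc.client

# A Python list is marshalled as <array>, a dict as <struct>.
players = ["Dhoni", "Rahul", "Sachin"]
request_xml = xmlrpc.client.dumps((players,), methodname="getScore")

# A response carrying a struct of per-player scores (values are made up).
scores = {"Sachin": 145, "Dhoni": 158}
response_xml = xmlrpc.client.dumps((scores,), methodresponse=True)

# Decoding a response yields the parameter tuple and no method name.
(decoded_scores,), _ = xmlrpc.client.loads(response_xml)

# A fault response; loads() raises xmlrpc.client.Fault when it sees one.
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(1, "No such method!"), methodresponse=True)
try:
    xmlrpc.client.loads(fault_xml)
    fault_text = None
except xmlrpc.client.Fault as err:
    fault_text = err.faultString

print(decoded_scores, fault_text)
```

Note that the fault generated here follows the XML-RPC specification, so its value is a struct holding faultCode and faultString rather than a bare string.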



Validating XML – RPC

As we mentioned in the previous chapters, validation of an XML document is important for many reasons.

There is no standard validation technique available for XML-RPC. But, at the same time, it can be validated with either of the techniques given below.

DTD method
XML Schema method

Though you can use either of the above mentioned techniques for validating XML-RPC, there are certain advantages in using the XML Schema method for validation, as given in the list below.

In XML-RPC, only methodCall and methodResponse are legal root elements. An XML Schema can clearly specify this restriction, whereas in a DTD it is not so.
The values for data types (their ranges) can be easily specified by the schema method.
The method names and strings should contain only ASCII values. This restriction can also be clearly imposed by the schema method.

3.4 SOAP (SIMPLE OBJECT ACCESS PROTOCOL)

As the name suggests, SOAP is a simple protocol. SOAP is based on XML. The purpose of the Simple Object Access Protocol is to allow applications to exchange data over HTTP.

SOAP plays a major role in the development of Web Services. In that view you can call SOAP a protocol for accessing web services. With the rise of web services, SOAP has become the de facto standard for accessing applications in a network.

The list given below explains the attributes of the Simple Object Access Protocol.

SOAP is a protocol primarily designed for communication between applications in a networking scenario.

SOAP gives a format for sending messages over a network. Since this format has been accepted by the majority of software vendors, it becomes very easy to establish communication between applications.

The biggest advantage of SOAP is its platform independence. It enables seamless integration of applications running across various platforms. For example, an application running on the Microsoft Windows operating system can communicate with another application running under some other operating system like Solaris or Linux etc.

SOAP doesn’t propose any language constructs. So it is independent of the language as well. It becomes easier to establish communication between applications developed using different languages.

SOAP is completely based on XML. So it inherits all the advantages that XML possesses.

Because of its text based nature, it is very easy for SOAP messages to pass through firewalls. This becomes an important attribute of SOAP in a secure network scenario.

SOAP is endorsed by W3C. This is another advantage in terms of standards.

SOAP has been designed keeping in mind communication through the internet. So it becomes easier to communicate through the internet using SOAP. SOAP can use many standard internet protocols like HTTP or SMTP etc.

SOAP is extensible in nature. This feature is also a gift from XML, because extensibility is a major advantage of XML too.

3.4.1 Origin of SOAP

DevelopMentor Inc. is the organization that initially developed SOAP. The primary goal was to access services and objects among various applications.

In 1999, the SOAP 1.0 specification became publicly available. This became possible because of the joint effort among various organizations like UserLand, Microsoft etc.

During 2000, the SOAP 1.1 specification was released as a W3C Note. To this release there were contributions from IBM and Lotus Corporation.

3.4.2 What SOAP is not?

There exist many misconceptions regarding SOAP. This section gives an idea of what SOAP is not.

The important point to be noted about SOAP is that it is not a programming language.

It is not a business application component which can be directly used in business application development.

SOAP doesn’t provide any Garbage Collection feature.

SOAP doesn’t support Object activation and Object by reference

It doesn’t have support for message batching.

3.4.3 Components of a SOAP message

A SOAP message is basically an XML document with certain specific elements. The following figure 3.12 depicts the various components of a SOAP message.


Figure 3.12 : SOAP message Components

In the above listed components, Header and Attachments are optional. The Envelope and Body are mandatory.

SOAP Envelope

As specified earlier, the envelope is a mandatory component in a SOAP message. The primary role of the envelope is to identify the XML document as a SOAP message.

The SOAP envelope is the primary container of a SOAP message. It is the root element of the message.

As per the SOAP 1.1 specification, SOAP messages which don’t have this envelope as container would be considered invalid.

Encoding styles can also be present in the envelope. The encoding style attribute is used to represent the data types used in the document. The encoding style can be specified on any element. It applies to that element and its child elements.

By default a SOAP message doesn’t have any encoding.

It would have the namespace as indicated in figure 3.13.

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  ... Message information goes here ...
</soap:Envelope>

Figure 3.13 : SOAP Envelope


SOAP Header

As specified earlier, the SOAP header is an optional component. If present, it is the first child element of the envelope.

The header can contain optional child elements.
If child elements are present, they should be qualified with a namespace.
An example for a SOAP header is provided in figure 3.14.

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Header>
    <m:Trans xmlns:m="http://www.test.com/transaction/"
             soap:mustUnderstand="1">888</m:Trans>
  </soap:Header>
  ...
  ...
</soap:Envelope>

Figure 3.14 : SOAP Header

SOAP has defined three attributes in the default namespace. The attributes are as listed below:

actor
mustUnderstand
encodingStyle

The role of these attributes is to tell the recipient how it should process the message.

Actor attribute

As stated earlier, the purpose of SOAP is accessing a remote application. So a message has a source and a destination (or sender and receiver). When the message is sent from source to destination, it may pass through various endpoints on the way.

At times, the complete message may not be meant for a single endpoint. Parts of it may be meant for other endpoints on the path. The actor attribute is used to address a particular endpoint.

The general format for specifying the actor attribute is as shown below:

soap:actor = “URI”


The URI indicates an accessible location on the internet. The following example in figure 3.15 depicts the usage of the “actor” attribute. You can locate the “actor” attribute pointing to the URI http://www.test.com/test/.

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Header>
    <m:Trans xmlns:m="http://www.test.com/transaction/"
             soap:actor="http://www.test.com/test/">888</m:Trans>
  </soap:Header>
  ...
  ...
</soap:Envelope>

Figure 3.15 : Actor attribute

mustUnderstand attribute

The next attribute in the SOAP header is the “mustUnderstand” attribute. The possible values for this attribute are “0” and “1”. If the mustUnderstand attribute is set to “0”, the recipient can carry on processing even if it doesn’t recognize the header element. On the other hand, when mustUnderstand is set to “1”, the recipient has to recognize the element. Otherwise it fails while processing the header. The general syntax for mustUnderstand is as shown below:

soap:mustUnderstand = “0|1”

An example for the mustUnderstand attribute is shown in figure 3.16.

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Header>
    <m:Trans xmlns:m="http://www.test.com/transaction/"
             soap:actor="http://www.test.com/test/"
             soap:mustUnderstand="1">888</m:Trans>
  </soap:Header>
  ...
  ...
</soap:Envelope>

Figure 3.16 : SOAP mustUnderstand attribute

encodingStyle attribute

The encodingStyle attribute is used to define the encoding of the data types used in the header element entries. As specified earlier, a SOAP message has no encoding by default.

The general format of the encodingStyle attribute is as shown below:

soap:encodingStyle = “URI”

SOAP Body

As indicated earlier, the SOAP body is a mandatory component in a SOAP message. The SOAP body is a child element of the SOAP envelope. The body contains the processing information which is used by the destination. A SOAP body may contain the following:

A remote method and its parameters
The target application specific data
A SOAP fault for error conditions

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body>
    <m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team>India</m:Team>
    </m:GetRuns>
  </soap:Body>
</soap:Envelope>

Figure 3.17: SOAP Body
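A SOAP body like the one above is ordinary namespace-qualified XML, so it can be built and read with a general XML library. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module and the namespace URIs used in the figures of this chapter. The function names build_request and read_team are hypothetical helpers introduced only for this illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"   # envelope namespace from the figures
APP_NS = "http://www.onedaycricket.com/scores"        # application namespace from the figures

def build_request(team):
    """Return a SOAP envelope whose Body asks for the runs of `team`."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    get_runs = ET.SubElement(body, f"{{{APP_NS}}}GetRuns")
    ET.SubElement(get_runs, f"{{{APP_NS}}}Team").text = team
    return ET.tostring(envelope, encoding="unicode")

def read_team(envelope_xml):
    """Extract the text of the m:Team element from a SOAP envelope."""
    ns = {"soap": SOAP_NS, "m": APP_NS}
    root = ET.fromstring(envelope_xml)
    # The envelope must be the root element, otherwise it is not a SOAP message.
    if root.tag != f"{{{SOAP_NS}}}Envelope":
        raise ValueError("root element is not a SOAP Envelope")
    return root.findtext("soap:Body/m:GetRuns/m:Team", namespaces=ns)

request_xml = build_request("India")
print(read_team(request_xml))
```

The prefixes written by the library may differ from “soap” and “m”; this does not matter, because elements are identified by their namespace URI and not by the prefix.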


If you observe the example in figure 3.17, you can notice the elements m:GetRuns and m:Team. These elements are specific to a particular application. So they are not part of the original SOAP standard.

The above example requests the score of the Indian team in a cricket match. The response to this message can be similar to the one shown below in figure 3.18.

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body>
    <m:GetRunsResponse xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team> 350 For 7</m:Team>
    </m:GetRunsResponse>
  </soap:Body>
</soap:Envelope>

Figure 3.18 : SOAP Response message

In the response message you can observe that “m:GetRuns” is replaced by “m:GetRunsResponse”. The value of “m:Team” is replaced by “350 For 7”.

SOAP Fault

The SOAP fault element is used to handle errors. It is used to carry error and status information.

The SOAP fault element appears as a child element of the Body element.
The SOAP fault element can appear only once in a SOAP message.

The SOAP fault element can have the following sub elements. They are as listed below:

faultcode: The faultcode contains a standard value which can be used for identifying errors (or the status information). The fault code values are as given below:

VersionMismatch: Indicates that an invalid namespace is defined or the version is not supported.

MustUnderstand: A header element with the mustUnderstand value set to “1” is not understood.

Client: This faultcode is indicated when the problem originates from the client, for example when the message was incorrectly formed.

Server: This faultcode is indicated when the problem arises during processing on the server side.


The fault code values are as shown in the following figure 3.19:

Figure 3.19: Fault Code Values

faultstring: The fault string provides a human readable description of the SOAP fault.

faultactor: This provides information about which actor caused the fault to happen.

detail: This provides the application specific error or status information.
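The interplay between the mustUnderstand attribute and the MustUnderstand fault code can be sketched in code. The example below is an illustration only, assuming Python's standard xml.etree.ElementTree module, the envelope namespace used in this chapter's figures, and the fault sub elements listed above. The functions build_fault, check_headers and read_fault are hypothetical names, not part of any SOAP library.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2001/12/soap-envelope"
NS = {"soap": SOAP_NS}
ET.register_namespace("soap", SOAP_NS)  # serialize with the soap: prefix

def build_fault(code, text):
    """Return an envelope whose Body holds a single Fault element."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    fault = ET.SubElement(body, f"{{{SOAP_NS}}}Fault")
    ET.SubElement(fault, "faultcode").text = f"soap:{code}"
    ET.SubElement(fault, "faultstring").text = text
    return ET.tostring(envelope, encoding="unicode")

def check_headers(envelope_xml, understood):
    """Return None if processing may continue, else a fault envelope.

    `understood` is the set of header element tags, in {namespace}name
    form, that this hypothetical node knows how to process.
    """
    root = ET.fromstring(envelope_xml)
    header = root.find("soap:Header", NS)
    for block in (header if header is not None else []):
        must = block.get(f"{{{SOAP_NS}}}mustUnderstand", "0")
        if must == "1" and block.tag not in understood:
            return build_fault("MustUnderstand",
                               f"header {block.tag} not understood")
    return None

def read_fault(envelope_xml):
    """Return (faultcode, faultstring), or None if there is no fault."""
    fault = ET.fromstring(envelope_xml).find("soap:Body/soap:Fault", NS)
    if fault is None:
        return None
    return fault.findtext("faultcode"), fault.findtext("faultstring")

MESSAGE = """<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">
  <soap:Header>
    <m:Trans xmlns:m="http://www.test.com/transaction/"
             soap:mustUnderstand="1">888</m:Trans>
  </soap:Header>
  <soap:Body/>
</soap:Envelope>"""

TRANS = "{http://www.test.com/transaction/}Trans"
reply = check_headers(MESSAGE, understood=set())   # node knows no headers
ok = check_headers(MESSAGE, understood={TRANS})    # node knows m:Trans
print(read_fault(reply), ok)
```

A node that recognizes the m:Trans header continues processing, while a node that does not recognize it answers with a fault instead of silently ignoring the header.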

Comparison between XML RPC and SOAP

In the earlier part of this chapter XML-RPC has been discussed in detail. In the previous section SOAP has been discussed. This section compares both of these techniques, i.e. XML-RPC and SOAP.

In some aspects, SOAP has extended the features from the place where XML-RPC left off.

Both techniques have their own advantages and disadvantages. The advantages of XML-RPC are as listed below:

Advantages of XML RPC

XML-RPC has been in the industry for a considerable period of time. So in some aspects it is more stable.

The greatest advantage of XML-RPC is “simplicity”. Learning XML-RPC is less complicated. It has a short learning curve.

Advantages of SOAP

The advantages of SOAP are as listed below:

Though XML-RPC supports arrays and structs, they are unnamed. SOAP structs and arrays, on the other hand, can be named.

Customization is the greatest advantage of SOAP. This makes the developer feel comfortable while creating customized messages.


The support from industry leaders is another big advantage of SOAP. For example, Microsoft has given importance to SOAP in their .NET framework. Similarly, technologies like J2EE also support SOAP.

It supports developer specified character sets.
It supports developer defined data types.
It has support for message specific processing instructions.

Disadvantages of XML RPC

XML-RPC imposes constraints on the names of methods. It is mandatory that the method name should contain only certain limited characters.

The structs and arrays in XML-RPC don’t have any name associated with them.

XML-RPC doesn’t support developer specified character sets.
It doesn’t support developer defined data types.
It has no support for message specific processing instructions.

Disadvantages of SOAP

The documentation support associated with SOAP is limited.

Considering all the factors, SOAP is certainly more powerful than XML-RPC.

SOAP HTTP Binding

A SOAP method is nothing but an HTTP request or response with the special characteristic that it has to comply with the encoding rules of SOAP. This can be expressed as shown below in figure 3.20.

Figure 3.20 : SOAP = HTTP + XML

A SOAP request can be an HTTP POST or an HTTP GET request. The HTTP POST request shall have at least two headers. They are as listed below:

Content-Type

Content-Length
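How these two headers frame a SOAP message can be sketched in a few lines of Python. The function build_http_request below is a hypothetical helper written for this illustration; it only assembles the bytes of an HTTP POST request and does not send anything. The path, host and envelope follow the cricket score example used in this chapter.

```python
def build_http_request(path, host, envelope_xml):
    """Frame a SOAP envelope as a raw HTTP POST request (bytes)."""
    body = envelope_xml.encode("utf-8")
    header_lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/soap+xml; charset=utf-8",
        f"Content-Length: {len(body)}",   # counted in bytes, not characters
    ]
    # An empty line separates the header from the body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body

ENVELOPE = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">'
    '<soap:Body>'
    '<m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">'
    '<m:Team>India</m:Team>'
    '</m:GetRuns>'
    '</soap:Body>'
    '</soap:Envelope>'
)

raw = build_http_request("/Scores", "www.onedaycricket.com", ENVELOPE)
head, _, payload = raw.partition(b"\r\n\r\n")
print(head.decode("ascii"))
```

The Content-Length value is computed from the encoded body, which is why the envelope is converted to bytes before the header is written.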

Content-Type: Content-Type is used to specify the MIME type of the message. It may carry an optional item called charset, which indicates the character set associated with the body of the request or response. An example is as shown below:

Content-Type: application/soap+xml; charset=utf-8

Content-Length: The content length indicates the number of bytes in the body of the request or response. An example is as shown below:

Content-Length: 300

POST /Scores HTTP/1.1
Host: www.onedaycricket.com
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 290

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body>
    <m:GetRuns xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team>India</m:Team>
    </m:GetRuns>
  </soap:Body>
</soap:Envelope>

Figure 3.21 : SOAP Request with HTTP header Information

In the above figure, you can notice that the HTTP header holds information like Host, Content-Type and Content-Length.

A possible response for the above SOAP request is as shown in figure 3.22.

HTTP/1.1 200 OK
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 327

<?xml version="1.0"?>
<soap:Envelope
    xmlns:soap="http://www.w3.org/2001/12/soap-envelope"
    soap:encodingStyle="http://www.w3.org/2001/12/soap-encoding">
  <soap:Body>
    <m:GetRunsResponse xmlns:m="http://www.onedaycricket.com/scores">
      <m:Team> 350 For 7</m:Team>
    </m:GetRunsResponse>
  </soap:Body>
</soap:Envelope>

Figure 3.22 : SOAP response with HTTP header information

If you look at the above response message, you can notice the presence of “200 OK”, which indicates a successful HTTP response.

The HTTP header has additional information like Content-Type, charset and Content-Length.

SOAP Intermediary

As stated earlier, SOAP is a communication protocol between applications.

SOAP is a stateless protocol.
The applications involved in SOAP communication are called SOAP nodes.
A SOAP message may pass through various SOAP nodes on its way.
The SOAP nodes are represented as endpoint URIs.
There exist three different types of SOAP nodes. They are as listed below:

SOAP Sender: The node from which the message starts its journey.

SOAP Receiver: The node for which the message is intended. The SOAP receiver processes the message received. It can respond with the processing output or with a SOAP Fault.

SOAP Intermediary: SOAP intermediary nodes can both receive and send SOAP messages. SOAP intermediaries are optional. There may be a communication in which the message from the sender reaches the receiver without going through any intermediary.

The SOAP message exchange model is as shown in figure 3.23.

Figure 3.23 SOAP Intermediary



The SOAP intermediaries can be classified into two types, as shown in Figure 3.24.

Figure 3.24: Intermediary Types

Forwarding Intermediaries: The primary role of this type of intermediary is to forward the messages it receives from other nodes.

Active Intermediaries: This type of intermediary performs additional processing beyond simply forwarding the message.
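The difference between the two intermediary types can be illustrated with a small simulation. This is only a sketch under stated assumptions: SoapMessage, deliver and the node functions are hypothetical names, and a message is reduced to a body string plus a tuple of header labels rather than real XML.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class SoapMessage:
    """Toy stand-in for a SOAP message (hypothetical, not real XML)."""
    body: str
    header_blocks: Tuple[str, ...] = ()


# Every node on the message path maps a message to a message.
Node = Callable[[SoapMessage], SoapMessage]


def forwarding_intermediary(message: SoapMessage) -> SoapMessage:
    # A forwarding intermediary passes the message on unchanged.
    return message


def logging_intermediary(message: SoapMessage) -> SoapMessage:
    # An active intermediary does extra work; here it adds a header label.
    return SoapMessage(message.body, message.header_blocks + ("logged",))


def receiver(message: SoapMessage) -> str:
    # The ultimate receiver processes the message and produces the output.
    return f"processed {message.body} with headers {list(message.header_blocks)}"


def deliver(message: SoapMessage, path: Sequence[Node],
            final: Callable[[SoapMessage], str]) -> str:
    """Send a message through zero or more intermediaries to the receiver."""
    for node in path:
        message = node(message)
    return final(message)


if __name__ == "__main__":
    print(deliver(SoapMessage("GetRuns"), [], receiver))
    print(deliver(SoapMessage("GetRuns"),
                  [forwarding_intermediary, logging_intermediary], receiver))
```

An empty path corresponds to the case where the sender reaches the receiver directly, which is why intermediaries are described as optional.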

SOAP Intermediary advantages

The SOAP intermediaries provide the following advantages:

Store and Forward

Intelligent routing

Transactions

Security and Logging

Intermediaries may also be used to provide value additions to SOAP.

SOAP with Attachments

As stated earlier, the root element of a SOAP message is the “Envelope”. The contents inside the SOAP envelope must strictly follow the rules of XML. A message can also carry a part outside the SOAP envelope, called “SOAP attachments”.

SOAP attachments can contain data in ASCII or binary format.

SOAP attachments are not part of the SOAP envelope.

Though the attachments are outside the envelope, they are related to the message sent.

Each attachment of the message can be identified with an ID called Content-ID. An attachment can be identified by its content location as well.

Attachments allow any kind of data to be associated with a SOAP message, which is very helpful in scenarios where you would like to send an image or some other file with the SOAP message.




The structure of a SOAP message with attachments is shown in Figure 3.25.

Figure 3.25: SOAP message with attachments
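A message of this shape can be approximated with the MIME classes in the Python standard library. The sketch below is an illustration under assumptions, not a SOAP toolkit: the function name build_message, the Content-ID values and the sample payload are all hypothetical. It packs an envelope and one binary attachment into a multipart/related package, with each part labelled by a Content-ID.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical envelope; a real one would refer to the attachment by its ID.
ENVELOPE = (
    '<soap:Envelope xmlns:soap="http://www.w3.org/2001/12/soap-envelope">'
    "<soap:Body><m:Scorecard xmlns:m=\"http://www.onedaycricket.com/scores\"/>"
    "</soap:Body></soap:Envelope>"
)


def build_message(envelope_xml: str, attachment: bytes, content_id: str) -> MIMEMultipart:
    """Wrap a SOAP envelope and one attachment in a multipart/related package."""
    package = MIMEMultipart("related", type="text/xml")

    # First part: the SOAP envelope itself (pure XML).
    root = MIMEText(envelope_xml, "xml", "utf-8")
    root.add_header("Content-ID", "<envelope@example.org>")
    package.attach(root)

    # Second part: the attachment, outside the envelope, identified by Content-ID.
    binary = MIMEApplication(attachment, "octet-stream")
    binary.add_header("Content-ID", f"<{content_id}>")
    package.attach(binary)
    return package


if __name__ == "__main__":
    message = build_message(ENVELOPE, b"\x89PNG...", "scorecard@example.org")
    print(message.as_string()[:400])
```

Keeping the binary data in its own MIME part means it does not have to obey XML rules, which is the point of placing attachments outside the envelope.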

Questions

Part A

Multiple Choice Questions

1. SOAP stands for
a. Service Oriented Architecture Protocol
b. Simple Object Access Protocol
c. Serial Object Access Protocol
d. All of the above
e. None of the above

2. Find the odd item
a. XML RPC
b. JAVA RMI
c. SOAP
d. AJAX

3. HTTP
a. is independent of the platform
b. text based
c. can pass through firewalls
d. all of the above
e. none of the above

4. CORBA is
a. Complex
b. Platform dependent
c. binary
d. all of the above
e. none of the above



5. SOAP is
a. Platform dependent
b. complex
c. binary
d. all of the above
e. none of the above

6. Which of the following played a key role in SOAP origin?
a. Developmentor
b. DevelopSoft
c. DSoft
d. All of the above
e. None of the above

7. Which of the following is not a component of SOAP message?
a. Envelope
b. Header
c. Body
d. All of the above
e. None of the above

8. Which of the following is not a valid “mustUnderstand” value?
a. 1
b. 0
c. -1
d. all of the above
e. none of the above

9. Find the odd item
a. VersionMismatch
b. MustUnderstand
c. Client
d. Version

10. Which of the following is not there in the HTTP header?
a. Content-Type
b. Content-Length
c. Content-Code
d. All of the above
e. None of the above




Answers

1. b 2. d 3. a 4. d 5. e

6. a 7. e 8. c 9. d 10. c

Section B

Short Answer Questions

11. List out various communication protocols.

12. Explain the structure of an XML-RPC response.

13. List out the various components of a SOAP message.

14. Write short notes on the SOAP Header.

15. Write short notes on SOAP Attachments.

16. List out the components of a SOAP Fault.

17. Write short notes on SOAP intermediaries.

Section C

18. Explain the working mechanism of XML RPC.

19. Explain the SOAP components in detail.

20. Compare XML RPC with SOAP.



UNIT IV

WEB SERVICES

4.1 INTRODUCTION

As we learnt in UNIT I, a web service is a piece of code that solves a particular problem but does not reside on the machine where the calling program executes. In other words, web services can be perceived as program components that reside somewhere on the internet and can be accessed from a remote place through standard internet technologies such as HTTP. The objective here is to establish communication between various technologies irrespective of platform or product. Is it possible? Yes, it is possible through web services. The characteristics of web services allow businesses to use the internet to publish, discover and aggregate other web services using the SOAP protocol.

4.2 WHAT IS A WEB SERVICE?

Let us see the formal definition of a web service again:

A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialization in conjunction with other Web-related standards.

Is it confusing? Don’t worry. Let us make it clear. Over the years, three primary technologies have emerged as worldwide standards that make up the core of today’s web services technology. These technologies are SOAP, WSDL and UDDI. The definitions of each of them follow.

4.2.1 Simple Object Access Protocol (SOAP)

SMTP, HTTP and FTP are some of the standard internet technologies for transporting content or documents. SOAP, in contrast, acts as a standard cover over an XML document, wrapping it and making it ready for transport. Is that the only job done by SOAP? Not so; it also defines encoding and binding standards. What is the use of them? These standards are used for encoding non-XML RPC invocations in XML for transport. SOAP provides a simple structure for doing RPC: document exchange. As a result, heterogeneous clients and servers can easily become interoperable by sharing a standard transport mechanism. For example,

.NET clients can invoke EJBs exposed through SOAP.

Java clients can invoke .NET components exposed through SOAP.

4.2.2 Web Service Description Language (WSDL)

Now what is the immediate requirement? The answer is a standard way for a web service to represent the following externally:

the input and output parameters of an invocation

the function’s structure

the nature of the invocation (in only, in/out, etc.)

the service’s protocol binding

This is achieved through WSDL, an XML technology that describes the interface of a web service in a standardized way. WSDL allows clients to automatically understand how to interact with a web service.

4.2.3 Universal Description, Discovery, and Integration (UDDI)

We have a protocol for transfer and a way to describe the contents. Now there is a need for a place that holds an inventory of the web services deployed for various tasks. UDDI provides a worldwide registry of web services for advertisement, discovery, and integration purposes. Everybody, including business analysts and technologists, can use UDDI to discover available web services by searching for names, identifiers, categories, or the specifications implemented by the web service. Hence UDDI provides a structure for representing:

Businesses

Business relationships

Web services

Specification metadata

Web service access points

Individually, any one of these technologies is only evolutionary. Each provides a standard for the next step in the advancement of web services, their description, or their discovery. However, one of the big promises of web services is seamless, automatic business integration: a piece of software will discover, access, integrate, and invoke new services from unknown companies dynamically, without the need for human intervention. Dynamic integration of this nature requires the combined involvement of SOAP, WSDL, and UDDI to provide a dynamic, standard infrastructure for enabling the dynamic business of tomorrow. Combined, these technologies are revolutionary because they are the first standard technologies to offer the promise of a dynamic business.

Now the question may arise whether equivalent features were available in the past. Yes, they were available, but they were not supported by every major corporation and did not have a core language as flexible as XML.

Now look at Figure 4.1, which makes the connection between the three technologies and their interactions clear. Figure 4.1 depicts communication across the web using web services, XML and SOAP, with UDDI as the repository. It shows that web services built on SOAP can be exposed to interested parties over the internet from any web-connected device. This paradigm is based on the approach of “assembly of constituent parts”. SOAP is not a stand-alone technology, but the result of synergies between XML and HTTP.

Figure 4.1 : Communication across web using web services

How exactly do these technologies work together to solve a problem using services? The following is the sequence of steps a client application follows to locate another application or a piece of business logic located somewhere on the network:

The client queries a UDDI registry for the service by name, category, identifier, or specification supported.

Once the service is located, the client obtains information about the location of a WSDL document from the UDDI registry.

The WSDL document contains information about how to contact the web service and the format of the request messages, expressed in XML Schema.

The client creates a SOAP message in accordance with the XML Schema found in the WSDL and sends a request to the host (where the service resides).

In other words, the same process may be explained in terms of four functions: Describing, Exposing, Being invoked, and Returning a Response.

Describing (WSDL)

A web service describes its functionality and attributes so that other applications can locate it and use it.

Exposing (UDDI)

Web services have to register with a repository, which may contain White pages, Yellow pages and Green pages. The White pages hold the basic information about each service provider, the Yellow pages list the services by category, and the Green pages contain the information about how to connect to and use the services.

Being invoked

Whenever a web service has been located, a remote application can invoke the service.

Returning & Response

When a service has been invoked, results are returned to the requesting application.
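The four functions above can be imitated in a few lines of code. The sketch below is a toy model under loud assumptions: Description stands in for a WSDL document, Registry for a UDDI registry, and the ENDPOINTS table for the network; none of these names belong to a real web services library, and the returned score is made-up sample data.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Description:
    """Hypothetical stand-in for a WSDL document: where and what to call."""
    endpoint: str
    operation: str


class Registry:
    """Hypothetical stand-in for a UDDI registry."""

    def __init__(self) -> None:
        self._entries: Dict[str, Description] = {}

    def publish(self, name: str, description: Description) -> None:
        # Exposing: the provider registers its description under a name.
        self._entries[name] = description

    def find(self, name: str) -> Description:
        # Discovery: the requestor looks the service up by name.
        return self._entries[name]


def get_runs(team: str) -> str:
    # Stand-in for the remote business logic; the score is sample data.
    return f"{team}: 350 for 7"


# Stand-in for the network: endpoint address -> operation name -> code.
ENDPOINTS: Dict[str, Dict[str, Callable[[str], str]]] = {
    "http://www.onedaycricket.com/Scores": {"GetRuns": get_runs},
}


def invoke(description: Description, argument: str) -> str:
    # Being invoked and returning a response.
    handler = ENDPOINTS[description.endpoint][description.operation]
    return handler(argument)


if __name__ == "__main__":
    registry = Registry()
    registry.publish("scores", Description("http://www.onedaycricket.com/Scores", "GetRuns"))
    print(invoke(registry.find("scores"), "India"))
```

The requestor never hard-codes the endpoint: it learns it from the description found in the registry, which is what keeps the two sides loosely coupled.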

As a whole, what is achieved with these functions? Industry needs a flexible and efficient environment for business collaboration, and these concepts come in handy as a solution. Technically, web services are a way to link loosely coupled systems using technology that does not bind them to a particular programming language. This supports the idea of component assembly, which also promises improved collaboration with customers, partners and suppliers.

4.2.4 What qualifies as Web Services?

Any application or piece of software capable of doing a predefined task can be identified as a web service if it can be discovered and used by another software component or application. This application or piece of software may be as simple as a sports review or a weather forecast, or as complex as a complete air travel package that includes travel booking, hotel booking and restaurant reservations. The idea of web services is that such software can bring together services, even from different vendors, and make them interoperate without requiring advance knowledge of how the services fit together. Thus web services represent a new model of software availability, distribution and interconnection, based on the notion of services globally available over the web rather than object-to-object connections over limited networks. Web services also promise improved collaboration with customers, partners and suppliers.

We know that significant additional services rarely come free of cost. So how will web services play out on a large scale? Delivery of simple services is straightforward. But until the technology matures, an up-front human element is required to solidify agreements. Without worrying about that, let us concentrate on the details of the applications and web service protocols.

4.2.5 Practical Applications for Web Services

Imagine a person requires a currency conversion service that converts dollars to euros or rupees to dollars. Another person requires a natural language translation service that converts English to French. With today’s technology, such components can be built on the cross-platform interoperability promised by SOAP and web services. Today, some web sites, such as www.xmethods.com, are available to host simple web services.

This scenario becomes more exciting when we see real companies using web services to automate and streamline their business processes.

Let’s use the concept of a Business-to-Consumer (B2C) portal. Take a closer look at web-based portals, such as those used by the travel industry. They often combine the offerings of multiple companies’ products and services and present them with a unified look and feel to the consumer accessing the portal. One thing we have to keep in mind is how difficult it is to integrate the backend systems of each business so that the advertised portal services are provided reliably and quickly.

For example, assume there are two companies, MAP CAR SYSTEMS and PAM AIRLINES COMPANY, and that web services technology is already being used to integrate them. MAP CAR SYSTEMS uses the Microsoft SOAP Toolkit to integrate its online booking system with PAM AIRLINES COMPANY’s site. The MAP CAR SYSTEMS booking system runs on a Sun Solaris server, and PAM AIRLINES COMPANY’s site runs on a Compaq OpenVMS server. The net result is that a person booking a flight on the PAM AIRLINES web site can reserve a car from MAP CAR SYSTEMS without leaving the airline’s site. The resulting saving for MAP CAR SYSTEMS is a lower cost per transaction. If the booking is done online through PAM AIRLINES and other airline sites, the cost per transaction is about $1.00. When booking through traditional travel agent networks, this cost can be up to $5.00 per transaction.

Let us look at another application area, the healthcare industry, in which web services can be put to use effectively. A doctor carrying a handheld device can access your records, health history, and your preferred pharmacy using a web service. The doctor can also write you an electronic prescription and send it directly to your preferred pharmacy via another web service. Imagine a situation where all pharmacies in the world use a standardized communication protocol for accepting prescriptions: the doctor could write you a prescription for any pharmacy that you selected. The pharmacy would be able to fulfill the prescription immediately and have it prepared for you when you arrive, or couriered to your residence.

This model can be extended further in the same application domain. If the interfaces used between doctors and pharmacies are standardized using web services, a portal broker could act as an intermediary between doctors and pharmacies, providing routing information for requests. In addition, it can better meet the needs of individual consumers. For example, consider a situation where a patient registers with an intermediary and specifies that he wants to use generic drugs instead of expensive brand names. In this situation, the intermediary can intercept the pharmaceutical web service request and transform it into a similar one for the generic drug equivalent. In this process the intermediary exposes web services to doctors and pharmacies (in both directions) and handles issues such as security, privacy, and non-repudiation.

In all these applications, the minimum requirement is that each participant in the multiparty collaboration should know how to construct and deconstruct SOAP messages and how to send and receive HTTP transmissions. It should now be clear in which situations these technologies can be used. You can also imagine similar applications where these technologies can be used to achieve the task.

4.2.6 Web Service Architecture

What are the major aspects of Web Service Architecture? There are three major aspects: service provider, service requestor and broker. Let us explore each one in detail:

A Service Provider provides the software pieces that can carry out a specified set of tasks.

A Service Requestor discovers and invokes a software service to provide a business solution. Generally the requestor invokes a remote procedure call on the service provider, passing the parameter data, and receives a result in reply.

A Broker manages and publishes the services provided by the various service providers.

How can we manage all three aspects? Are there any underlying key technologies available to readily handle this? Let us recall the definitions we gave for UDDI, WSDL and SOAP:

UDDI is a protocol for describing Web Services components that allows providers to register with an Internet Directory to advertise their services.

WSDL is the proposed standard for describing Web Services. WSDL provides features for defining service interfaces and their implementation. WSDL syntax is based on XML syntax.

SOAP is a protocol for communicating with a UDDI service. The advantage of SOAP is that it can use universal HTTP to make a request and to receive a response.

As the explanation indicates, these technologies can be used to realize the three aspects (service provider, service requestor and broker), as depicted in Figure 4.2. Let us look at the technologies in detail.

Figure 4.2 Service requestor, Broker and provider

Figure 4.3 : Communication Involving UDDI, SOAP and Web Service Repository

[Figure labels: Web, XML/SOAP, Web services client, Web services provider, Web services repository, WSDL, White Pages, Yellow Pages, Green Pages, UDDI, SOAP, HTTP, XML]



4.3 UDDI

UDDI provides a uniform way to describe services; the descriptions are stored in a directory and can be used by any service. UDDI originates from a cooperative agreement among IBM, Microsoft and Ariba on an XML-based specification for establishing a registry of businesses and services on the internet. In a nutshell, UDDI defines an XML-based infrastructure that lets software automatically discover available services on the web, using SOAP as the protocol to invoke services.

4.3.1 UDDI Registries

UDDI registries are the focal points for registering and locating services. The registered services may serve the internal requirements of an organization or may be open to everyone. Accordingly, a registry may be private or public. Microsoft, IBM and HP have agreed to provide a public UDDI registry that can be used for search and connection across the entire internet. Many private registries are also available, which can be used either internally within a company or among a closely knit family of trusted partners and collaborators.

Hence a UDDI-compliant registry should provide an information framework for describing both public and private web service registries. Because of this openness, many IT companies have started using web service technologies behind their firewalls for application-to-application integration. This encourages managers and developers to gain experience by starting with less critical projects and then migrating to more ambitious ones. Let us now see the specifications used to describe a service in the registry.

What is to be specified?

API specifications for a UDDI-compliant business registry:

To perform inquiries

To publish

These specifications outline the details of the XML structures associated with the functions.

The UDDI data structure specifications are:

businessEntity

businessService

bindingTemplate

tModel

All four data structures specify the structure of a service and can be used to define the sequence of procedures to be included in the UDDI registry.
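The way these four structures nest can be sketched with plain data classes. This is a simplified model, not the UDDI XML schema: the field names are hypothetical snake_case renderings, and the keys and URLs in the usage example are invented. A businessEntity holds businessServices, each businessService holds bindingTemplates, and each bindingTemplate refers to tModels by key.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TModel:
    """Names an interface or specification that a service may implement."""
    tmodel_key: str
    name: str


@dataclass
class BindingTemplate:
    """Technical details for reaching one service endpoint."""
    binding_key: str
    access_point: str
    tmodel_keys: List[str] = field(default_factory=list)


@dataclass
class BusinessService:
    """One service offered by a business."""
    service_key: str
    name: str
    binding_templates: List[BindingTemplate] = field(default_factory=list)


@dataclass
class BusinessEntity:
    """The business itself: the top of the containment hierarchy."""
    business_key: str
    name: str
    services: List[BusinessService] = field(default_factory=list)


def access_points(entity: BusinessEntity, tmodel_key: str) -> List[str]:
    """Drill down from a businessEntity to the endpoints implementing a tModel."""
    return [
        binding.access_point
        for service in entity.services
        for binding in service.binding_templates
        if tmodel_key in binding.tmodel_keys
    ]


if __name__ == "__main__":
    price_query = TModel("tm-1", "BookPriceQuery")  # invented sample interface
    shop = BusinessEntity("be-1", "Sample Books", [
        BusinessService("bs-1", "Catalogue", [
            BindingTemplate("bt-1", "http://books.example.org/price", [price_query.tmodel_key]),
        ]),
    ])
    print(access_points(shop, price_query.tmodel_key))
```

Referring to tModels by key rather than embedding them lets many services from different businesses point to the same published interface.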




Example Scenario:

The following is a scenario of interaction for connecting to a server using UDDI discovery. The domain assumed to explain the concepts is a Books Service.

A company requires software that connects to several book service providers. According to the requirement, the software has to compare prices, delivery times, additional charges depending on the place of delivery, etc.

This requires a connection to be established to the UDDI business registry. It may be through the Web Interface or the Inquiry API.

After establishing the connection, a lookup based on an appropriate yellow pages listing is required, and finally the company obtains the businessEntity.

Using this businessEntity, the information can be used as such or drilled down for more detail, depending on the requirement. The objective here is to obtain a bindingTemplate for connecting to the particular server that provides the service.

Based on the details of the specification provided by the bindingTemplate, the company sets up its program to interact with the particular web service. The semantics of the service are obtained by accessing the tModel contained in the bindingTemplate for the service.

At runtime, the program invokes the web service based on the connection details provided in the bindingTemplate.

If the required interface connections as specified in the tModel exist, calls to the remote service will be successful. On the other hand, if there is a problem with the interaction between the client and the web service, UDDI specifies details for failure and recovery.

It is important for clients to detect and recover from failures that occur during interaction with remote partners. The UDDI calling convention relies on cached bindingTemplates. This cached information is refreshed with the current information from a UDDI web registry when a failure occurs. The following sequence of steps indicates how error recovery fits into web services:

The program developed by a programmer to use a web service also caches the appropriate bindingTemplate for use at runtime.

At the time of executing the program, the cached bindingTemplate that was earlier obtained from the UDDI web registry is utilized by the program.

In case of any failure, a new copy of the bindingTemplate for this unique web service is obtained through the bindingKey value and the get_bindingTemplate API call.

The program compares the new bindingTemplate information with the cached version. If they are different, the program retries the failed call using the new bindingTemplate.



This approach is known as “retry on failure”: the program (client) fetches fresh binding data and retries the call only after a failure. It is more efficient than acquiring a new copy of the bindingTemplate data before every call. The approach is also useful when a business needs to redirect traffic to a new site: the business only needs to activate the new site and change the published location information for the affected bindingTemplates.
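The recovery steps above can be sketched as a small function. This is an illustration under assumptions: a bindingTemplate is reduced to its access-point string, get_binding_template is a caller-supplied stand-in for the get_bindingTemplate API call, and a failure is signalled with ConnectionError. What to do when the fresh copy equals the cached one is not spelled out in the steps, so re-raising the error in that case is a choice made for this sketch.

```python
from typing import Callable, Dict


def call_with_recovery(
    cache: Dict[str, str],
    binding_key: str,
    get_binding_template: Callable[[str], str],
    invoke: Callable[[str], str],
) -> str:
    """Invoke a service through its cached binding; refresh and retry on failure."""
    cached = cache[binding_key]
    try:
        # Normal case: use the cached binding, no registry round trip.
        return invoke(cached)
    except ConnectionError:
        # Failure: ask the registry for a fresh copy (stand-in for get_bindingTemplate).
        fresh = get_binding_template(binding_key)
        if fresh == cached:
            # Assumption of this sketch: nothing changed, so report the failure.
            raise
        # The binding moved: update the cache and retry the failed call once.
        cache[binding_key] = fresh
        return invoke(fresh)


if __name__ == "__main__":
    cache = {"bk-1": "http://old.example.org/books"}

    def lookup(key: str) -> str:
        return "http://new.example.org/books"

    def invoke(url: str) -> str:
        if "old" in url:
            raise ConnectionError(url)
        return "ok from " + url

    print(call_with_recovery(cache, "bk-1", lookup, invoke))
```

The registry is contacted only on the failure path, which is where the efficiency of the cached approach comes from.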

4.3.2 UDDI tModel

What exactly is a tModel? Why do we need it? How do we use it? It turns out the tModel concept is just like the XML namespace concept: it is not complex at all, yet it can be very confusing.

tModel Is Used to Represent Interfaces:

UDDI is an online “yellow book” that is used by both service providers and service consumers. Service providers register their Web services in UDDI, and service consumers search this online registry for service descriptions, which finally lead them to the services they desire. The idea of an “interface” in the world of UDDI is more or less similar to the concept of an interface in the world of COM/DCOM, i.e., it is the “contract” that both the service provider and the service consumer will honor: the service provider promises to implement the Web service in such a way that if the consumer invokes the service by following this contract, his/her application will get what it expects.

Notice that the interface a Web service implements may or may not be defined by the service provider itself. For instance, some major airlines may get together and form a committee that works out and publishes (registers) an interface in UDDI for querying the ticket price for a given date, time, and city pair. This published interface becomes the industry standard, and the implementation work is left to each specific airline. Each airline then develops a Web service that implements this interface and also registers the service with UDDI. In this case, the interface is not defined by the airline that implements it. Also, it is quite obvious that the life of a travel agent is now quite easy: although there are quite a few different airlines, there is only one querying interface he/she needs to worry about.

Now suppose the Web service a given provider wants to register has no standard interface at all. In that case, the provider first has to create and register an interface with UDDI. After this interface is registered, the service that implements it can be developed and registered.

In what kind of “language” is the interface described? The answer gives the first big role of the tModel: every single interface in UDDI is represented by a tModel.

An example seems to be appropriate at this point. Let us say that we want to create a Web service for CodeProject.com which will accept a String representing a person’s name, and will return a non-negative integer indicating how many articles this person has submitted to CodeProject.com. This seems to be a fairly “special” service, so we assume




there is no current “standard” for this service, i.e., there is no existing interface we can register our service against. Therefore, we need to create our own interface first.

So, what can we conclude about UDDI? As a whole, it can be described as a project. The Universal Description, Discovery, and Integration (UDDI) project provides a standardized method for publishing and discovering information about web services. It is an industry initiative that attempts to create a platform-independent, open framework for describing services, discovering businesses, and integrating business services. With this, let us explore what WSDL is.

4.4 WSDL

WSDL is an XML format for describing how one software system can connect to and utilize the services of another software system over the internet. WSDL is an altogether different being, offering a degree of extensibility. This extensibility allows WSDL to be used to:

Describe endpoints and their messages, regardless of the message format or network protocol used to exchange them.

Treat messages as abstract descriptions of the data being exchanged.

Treat port types as abstract collections of web services’ operations. A port type can then be mapped to a concrete protocol and data format.

If you are not comfortable with these items, don’t worry. We will see fewer “scientific” definitions as we go along; don’t let the terms scare you away from this technology. Are you ready? Let’s start.

4.4.1 What Is WSDL?

Right now, the need of the hour is a standard way of describing how two machines interact with each other. Do you agree? Why is this important? Because the number of communication formats and protocols used on the internet continues to increase. WSDL provides a simple answer. WSDL describes:

What a service does

How to invoke its operations

Where to find it

Hence WSDL has created separate definitions and terminology for defining a web service, the communication endpoint where that web service exists, the legal format for input and output messages for the web service, and an abstract way to declare a binding to a concrete protocol and data format. But everything defined within a WSDL file is abstract: it’s just the definition of parameters and constraints for how communication should occur at runtime. Then whose responsibility is it to provide the exact service implementation specifications? Keep this question in mind.

The web service implementation has to adhere to the guidelines defined in the WSDL file but has some flexibility over specifics. WSDL also provides the ability to define a binding that attaches an abstract set of message definitions to a concrete protocol or data format. A binding extension is a type of binding defined for a major protocol. WSDL defines out-of-the-box binding extensions for SOAP 1.1, HTTP GET, HTTP POST, and MIME.

4.4.2 The abstract Structure of a WSDL document

WSDL defines services as collections of network endpoints, or ports. The abstract definitions are separated from their concrete network-based data bindings. The following figure lists the information that a WSDL file should contain in order to describe a web service.

Port: contains the web address at which the service is available as provider/client

Messages: the structure of the data that should be communicated between ports

Operations: the information about which message goes with what data

portType: contains the various operations

dataType, bindings: contain the data description and the information about the data structure used

4.4.3 Anatomy of a WSDL Document

Now let us see the individual parts of a WSDL document. To keep the segment simple, the elements from WSDL binding extensions (i.e., SOAP, HTTP, etc.) are not included. The following segment lists the major elements that may appear in a WSDL document. The asterisk (*) specifies that more than one of the indicated elements may appear.

<definitions>
  <import>*
  <types>
    <schema></schema>*
  </types>
  <message>*
    <part></part>*
  </message>
  <portType>*
    <operation>*
      <input></input>
      <output></output>
      <fault></fault>*
    </operation>
  </portType>
  <binding>*
    <operation>*
      <input></input>
      <output></output>
    </operation>
  </binding>
  <service>*
    <port></port>*
  </service>
</definitions>
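To see how a program can read this structure, the following sketch parses a small WSDL-like document with the Python standard library. The sample document is an assumption made up for illustration (a hypothetical Scores service with one GetRuns operation); only the element names and the WSDL 1.1 namespace URI come from the WSDL format itself.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Namespace used by WSDL 1.1 elements.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical sample: bindings and services are left out for brevity.
SAMPLE_WSDL = """<definitions name="Scores"
    targetNamespace="http://www.onedaycricket.com/scores"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://www.onedaycricket.com/scores"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetRunsRequest">
    <part name="Team" type="xsd:string"/>
  </message>
  <message name="GetRunsResponse">
    <part name="Runs" type="xsd:string"/>
  </message>
  <portType name="ScoresPortType">
    <operation name="GetRuns">
      <input message="tns:GetRunsRequest"/>
      <output message="tns:GetRunsResponse"/>
    </operation>
  </portType>
</definitions>"""


def operations_by_port_type(wsdl_text: str) -> Dict[str, List[str]]:
    """Map each <portType> name to the names of its <operation> children."""
    root = ET.fromstring(wsdl_text)
    return {
        port_type.get("name"): [
            op.get("name") for op in port_type.findall("wsdl:operation", WSDL_NS)
        ]
        for port_type in root.findall("wsdl:portType", WSDL_NS)
    }


def parts_by_message(wsdl_text: str) -> Dict[str, List[str]]:
    """Map each <message> name to the names of its <part> children."""
    root = ET.fromstring(wsdl_text)
    return {
        message.get("name"): [
            part.get("name") for part in message.findall("wsdl:part", WSDL_NS)
        ]
        for message in root.findall("wsdl:message", WSDL_NS)
    }


if __name__ == "__main__":
    print(operations_by_port_type(SAMPLE_WSDL))
    print(parts_by_message(SAMPLE_WSDL))
```

Because the sample declares the WSDL namespace as its default namespace, every lookup has to be namespace-qualified; searching for a bare "portType" would find nothing.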

The <definitions> element

The description of the service is placed between the <definitions> and </definitions> tags in a WSDL document. The global namespace declarations, which are intended to be visible throughout the rest of the document, are made here. Do you recall XML namespaces? As a refresher: a namespace is nothing but a name that qualifies element and attribute names.

A namespace provides an alias (code name) to use within the current XML document for referring to the rules defined in a separate XML Schema document. In other words, it is used as a qualifier for tags/elements. For example, if two XML Schema documents each define the <name> tag with different subelements, how would an XML file that uses both schemas know which <name> definition to refer to? The namespace alias is used as a prefix to qualify an XML tag as coming from a particular XML Schema document. Is it okay? Let us move on to the other elements in the service description.
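The <name> clash described above can be tried out directly. In this sketch the two namespace URIs, the prefixes cust and prod, and the sample values are all invented for illustration; the parsing calls are from the Python standard library.

```python
import xml.etree.ElementTree as ET

# Two vocabularies both define <name>; the prefixes keep them apart.
DOCUMENT = (
    '<order xmlns:cust="http://example.org/customer"'
    ' xmlns:prod="http://example.org/product">'
    "<cust:name><cust:first>Asha</cust:first></cust:name>"
    "<prod:name>XML Handbook</prod:name>"
    "</order>"
)

# Prefix-to-URI map used when searching; the prefixes here are local aliases.
NAMESPACES = {
    "cust": "http://example.org/customer",
    "prod": "http://example.org/product",
}


def customer_first_name(xml_text: str) -> str:
    """Read <first> inside the customer vocabulary's <name>."""
    root = ET.fromstring(xml_text)
    return root.find("cust:name/cust:first", NAMESPACES).text


def product_name(xml_text: str) -> str:
    """Read the product vocabulary's <name>, which holds plain text."""
    root = ET.fromstring(xml_text)
    return root.find("prod:name", NAMESPACES).text


if __name__ == "__main__":
    print(customer_first_name(DOCUMENT))
    print(product_name(DOCUMENT))
    # The parser expands each prefix to its full namespace URI.
    print([child.tag for child in ET.fromstring(DOCUMENT)])
```

Internally the parser stores each tag as the namespace URI plus the local name, so the prefix is only an alias and the URI is what actually distinguishes the two <name> elements.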

The <import> element

Do you remember the purpose of the #include directive in the C language? The <import> element serves a similar purpose. You can have zero or more <import> elements. It allows you to separate the elements of a service definition into independent documents and include them in the main document wherever appropriate. It enhances the modularization of WSDL documents and creates an environment of reuse that can produce clear service definitions. What is the advantage of having this element? It allows us to keep WSDL documents well structured, so that they are easier to use and maintain. The cost is that any WSDL parsing engine must perform additional I/O operations to import each externally referenced resource.

For example, look at the following statement, which gives the namespace and the address where its definitions can be found:

<import namespace="http://abcd.efgh.net/xer" location="http://abcd.efgh.net/xer/ij.xsd"/>

Thus the <import> element imports the namespace of another file, not the file itself. When an <import> statement is used, all elements for that given namespace are included at the location of the <import> element in the parent document.

<types> Element

The <types> element in a WSDL document acts as a container for defining the data types used in <message> elements. The <message> element, in turn, defines the format of the messages interchanged between a client and a web service. Currently, XML Schema Definition (XSD) is the most widely used data typing method, although WSDL accepts other typing approaches as well. Generally, a study of the <types> element is a study of XML Schema.

The <types> element can have zero or more <schema> subelements, which must follow the rules for XML Schema documents. An advantage here is that a <types> element need not be included directly. Instead, the schema definitions required for the WSDL document may be included via the <import> element. The WSDL parser then understands automatically that the included elements are to be treated as part of the <types> definition.

<message> Element

The data to be communicated or exchanged as part of a web service has to be modeled. The data described by a <message> element is an abstract, typed definition. A message may be a single unit or may be divided into sub-messages. If it is to be represented as sub-messages, the <part> tag comes into the picture. A <part> subelement identifies an individual piece of data that is part of the message and the data type that the piece adheres to. Look at the following piece of code containing a <message> element and its <part> elements:

<message name="s_Header">
    <part type="xsd:string" name="id"/>
    <part type="xsd:string" name="timeout"/>
</message>




In the previous code, the <message> element is uniquely identified by the name attribute. This message is named s_Header; it has two <part> subelements, of which the first is named id and the second is named timeout. Here, each part is typed as an XML Schema string (xsd:string). But the types used in part definitions aren't required to come from XML Schema; they could just as well be defined in the <types> element of the existing WSDL document itself.
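To see how a tool might read such a definition, here is a minimal Python sketch that lists the parts of the s_Header message. The xmlns declarations are added as an assumption so that the fragment parses on its own (the WSDL 1.1 and XML Schema namespace URIs), and list_parts is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# The s_Header message from the text; namespace declarations are an added assumption.
MESSAGE = f"""
<message xmlns="{WSDL_NS}" xmlns:xsd="{XSD_NS}" name="s_Header">
  <part type="xsd:string" name="id"/>
  <part type="xsd:string" name="timeout"/>
</message>
""".strip()

def list_parts(message_xml):
    """Return the message name and a list of (part name, part type) pairs."""
    message = ET.fromstring(message_xml)
    parts = [(part.get("name"), part.get("type"))
             for part in message.findall(f"{{{WSDL_NS}}}part")]
    return message.get("name"), parts

name, parts = list_parts(MESSAGE)
print(name, parts)
```

The sketch only reads the name and type attributes; it does not resolve xsd:string against a schema.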

<portType> Element

The <portType> element defines a subset of the operations supported at an endpoint of a web service. In other words, it is a uniquely named identifier that lets us specify a group of actions to be carried out at a single point.

How, then, are the operations defined? This is done with another element, <operation>. Each <operation> element represents one operation: an abstract definition of an action supported by a web service. A WSDL operation can have input and output messages as part of its action. The <operation> tag defines the name of the action by using a name attribute, defines the input message by the <input> subelement, and defines the output message by the <output> subelement. The <input> and <output> elements reference <message> elements defined in the same WSDL document or an imported one. A <message> element can represent a request, a response, or a fault.

<portType name="abce1111PortTypes">

This element declares that this endpoint has a set of operations that are jointly referenced as abce1111PortTypes. The following lines define the <operation> elements for this <portType>:

<!-- Request-response Operations (client initiated) -->
<operation name="init">
    <input message="initRequest"/>
    <output message="initResponse"/>
</operation>
<operation name="search">
    <input message="searchRequest"/>
    <output message="searchResponse"/>
</operation>

These operations are grouped according to their behavior. When an operation is defined in a WSDL document, it is made to be abstract; it is purely an operation definition, and how that operation is mapped to a real function is defined later (i.e., the operation can behave in a number of different ways depending on the actual definition). The WSDL specification defines the following behavioral patterns as transmission primitives:



Request-response
Solicit-response
One-way
Notification

First, the operation can follow a request-response model, in which a web service client invokes a request and expects to receive a synchronous response message. This model is defined by the presence of both <input> and <output> elements. The <input> element must appear before the <output> element. This order indicates that the operation first accepts an input message (request) and then sends an output message (response). This model is similar to a normal procedure call, in which the calling method blocks until the called method returns its result.

Second, the operation can follow a solicit-response model, in which the web service sends a solicitation to the client and expects to receive a response. This model is also defined as having both <input> and <output> elements, but the <output> element must appear before the <input> element. This order indicates that the operation first sends an output message (solicit) and then receives an input message (response).

Third, the operation can be a one-way invocation, in which the web service client sends a message to the web service without expecting to receive a response. This model is defined by a single <input> message with no <output> message. It indicates that the operation receives input messages (one-way invocation), but doesn't deliver a response to the client.

Fourth, the operation can be a notification, in which the web service sends a one-way message to the client without expecting a response. This model is defined by a single <output> message and no <input> message. It indicates that the operation sends output messages asynchronously; i.e., the messages are not in response to a request, but can be sent at any time. The operation doesn't expect a response to the messages it sends.

The value assigned to the name attribute of each <operation> element must be unique within the scope of the <portType>. The names of the input and output messages must be unique within the <portType>, not just the <operation>. The value assigned to the message attribute of an <input> or <output> element must match one of the names of the <message> elements defined in the same WSDL document or in an imported one.
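The four transmission primitives differ only in which of <input> and <output> are present and in what order, so they can be told apart mechanically. The following Python sketch does this for un-namespaced fragments like the ones shown earlier; classify_operation, the "heartbeat" operation and its message name are hypothetical additions for illustration.

```python
import xml.etree.ElementTree as ET

def classify_operation(operation):
    """Return the transmission primitive implied by the <input>/<output> children."""
    order = tuple(child.tag for child in operation
                  if child.tag in ("input", "output"))
    patterns = {
        ("input", "output"): "request-response",
        ("output", "input"): "solicit-response",
        ("input",): "one-way",
        ("output",): "notification",
    }
    return patterns.get(order, "unknown")

# "init" is taken from the text; "heartbeat" is a made-up notification.
PORT_TYPE = """
<portType name="abce1111PortTypes">
  <operation name="init">
    <input message="initRequest"/>
    <output message="initResponse"/>
  </operation>
  <operation name="heartbeat">
    <output message="heartbeatNotice"/>
  </operation>
</portType>
""".strip()

port_type = ET.fromstring(PORT_TYPE)
results = {op.get("name"): classify_operation(op)
           for op in port_type.findall("operation")}
print(results)
```

A <fault> child is ignored by the sketch, since it does not affect which primitive an operation follows.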

<binding> Element

A <binding> element is a concrete protocol and data format specification for a <portType> element. It is where you would use one of the standard binding extensions (HTTP, SOAP, or MIME) or create one of your own. Each protocol has its own wire format. For example, HTTP has a simple header/body format. SOAP, which can exist inside of HTTP and other protocols, has its own header and body. A SOAP message can also have attachments included as part of the message.

The WSDL document has already defined the <operation> elements for this web service. A <binding> element takes the abstract definition of the operations and their input/output messages and maps them to the concrete protocol that the web service uses. Should the <input> element defined in a WSDL document be located in the SOAP header? Should it be in the SOAP body? Should it be in an attachment? Also, how should the data be encoded? Should the supplied schema be used for encoding rules, or should literal encoding be used? The <binding> element answers these questions by providing this mapping.

4.4.4 WSDL Illustration

This portion illustrates a Web Service example: a web service, built with ASP.NET, that converts temperatures from Fahrenheit to Celsius and from Celsius to Fahrenheit (www.w3schools.com/webservices).
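Independently of ASP.NET, the arithmetic that such a service exposes can be sketched in a few lines of Python. This is only an illustration of the two conversions, not the service's own code: the function names, the string-in/string-out signatures (chosen to mirror the string parameters in the SOAP messages) and the "Error" return value for non-numeric input are assumptions.

```python
def _parse(text):
    """Parse a numeric string, accepting a comma as decimal separator (assumed behavior)."""
    return float(text.strip().replace(",", "."))

def fahrenheit_to_celsius(fahrenheit: str) -> str:
    try:
        value = _parse(fahrenheit)
    except ValueError:
        return "Error"  # assumed error convention for non-numeric input
    return str((value - 32) * 5 / 9)

def celsius_to_fahrenheit(celsius: str) -> str:
    try:
        value = _parse(celsius)
    except ValueError:
        return "Error"
    return str(value * 9 / 5 + 32)

print(fahrenheit_to_celsius("212"))
print(celsius_to_fahrenheit("100"))
```

Keeping the conversion logic separate from the transport makes it easy to see what the web service layer adds: only the exposure of these two functions over SOAP or HTTP.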

The general assumption is that any application can have a Web Service component. Web Services can be created regardless of programming language. This document is saved as an .asmx file, which is the ASP.NET file extension for XML Web Services.

To run this example, a .NET server is required.

Explanation

The first line of the example states that this is a Web Service, written in VBScript, and that it has the class name "TempConvert":

<%@ WebService Language="VBScript" Class="TempConvert" %>

If you are familiar with the .NET framework, you will readily see that the next lines import the namespace "System.Web.Services" from the .NET framework:

Imports System
Imports System.Web.Services

The next line defines that the "TempConvert" class is a WebService class type:

Public Class TempConvert :Inherits WebService

Then comes the VB programming. This application has two functions: one to convert from Fahrenheit to Celsius, and one to convert from Celsius to Fahrenheit. The only difference from a normal application is that each function is defined as a "WebMethod()"; look for "WebMethod()" in the code sequence to see this.

Then, end the class:

end class

Publish this .asmx file on a server with .NET support, and you have your first working example of a web service. You may now ask where the WSDL and SOAP documents are. Don't panic. With ASP.NET, you do not have to write your own WSDL and SOAP documents; ASP.NET creates the WSDL and the SOAP request automatically.

The following is a sample SOAP 1.2 request and response.

POST /webservices/tempconvert.asmx HTTP/1.1
Host: www.w3schools.com
Content-Length: length

<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
    <soap12:Body>
        <CelsiusToFahrenheit xmlns="http://tempuri.org/">
            <Celsius>string</Celsius>
        </CelsiusToFahrenheit>
    </soap12:Body>
</soap12:Envelope>

HTTP/1.1 200 OK
Content-Length: length

<?xml version="1.0" encoding="utf-8"?>
<soap12:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:soap12="http://www.w3.org/2003/05/soap-envelope">
    <soap12:Body>
        <CelsiusToFahrenheitResponse xmlns="http://tempuri.org/">
            <CelsiusToFahrenheitResult>string</CelsiusToFahrenheitResult>
        </CelsiusToFahrenheitResponse>
    </soap12:Body>
</soap12:Envelope>

This example makes the structure of a SOAP 1.2 message clear. The code segment illustrates only the Celsius to Fahrenheit conversion; the Fahrenheit to Celsius exchange can be tried on your own.
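As a starting point for that exercise, the request side can be sketched in Python. The sketch assumes that the Fahrenheit to Celsius request mirrors the Celsius to Fahrenheit one, with FahrenheitToCelsius as the operation element and Fahrenheit as its parameter (the names used in the generated WSDL); build_request is a hypothetical helper. The serializer emits prefixed element names rather than a default namespace, which is equivalent XML.

```python
import xml.etree.ElementTree as ET

SOAP12_NS = "http://www.w3.org/2003/05/soap-envelope"
SERVICE_NS = "http://tempuri.org/"

def build_request(operation, parameter, value):
    """Build a SOAP 1.2 envelope whose body carries one operation with one parameter."""
    ET.register_namespace("soap12", SOAP12_NS)
    envelope = ET.Element(f"{{{SOAP12_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP12_NS}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    argument = ET.SubElement(call, f"{{{SERVICE_NS}}}{parameter}")
    argument.text = str(value)
    return ET.tostring(envelope, encoding="unicode")

request_xml = build_request("FahrenheitToCelsius", "Fahrenheit", "98.6")
print(request_xml)
```

The sketch builds only the XML payload; sending it would additionally require the HTTP POST headers shown in the sample request.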

Now look at the automatically created XML file containing the WSDL code.

<wsdl:definitions targetNamespace="http://tempuri.org/">
  <wsdl:types>
    <s:schema elementFormDefault="qualified" targetNamespace="http://tempuri.org/">
      <s:element name="FahrenheitToCelsius">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="Fahrenheit" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="FahrenheitToCelsiusResponse">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="FahrenheitToCelsiusResult" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="CelsiusToFahrenheit">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="Celsius" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="CelsiusToFahrenheitResponse">
        <s:complexType>
          <s:sequence>
            <s:element minOccurs="0" maxOccurs="1" name="CelsiusToFahrenheitResult" type="s:string"/>
          </s:sequence>
        </s:complexType>
      </s:element>
      <s:element name="string" nillable="true" type="s:string"/>
    </s:schema>
  </wsdl:types>
  <wsdl:message name="FahrenheitToCelsiusSoapIn">
    <wsdl:part name="parameters" element="tns:FahrenheitToCelsius"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusSoapOut">
    <wsdl:part name="parameters" element="tns:FahrenheitToCelsiusResponse"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitSoapIn">
    <wsdl:part name="parameters" element="tns:CelsiusToFahrenheit"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitSoapOut">
    <wsdl:part name="parameters" element="tns:CelsiusToFahrenheitResponse"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusHttpPostIn">
    <wsdl:part name="Fahrenheit" type="s:string"/>
  </wsdl:message>
  <wsdl:message name="FahrenheitToCelsiusHttpPostOut">
    <wsdl:part name="Body" element="tns:string"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitHttpPostIn">
    <wsdl:part name="Celsius" type="s:string"/>
  </wsdl:message>
  <wsdl:message name="CelsiusToFahrenheitHttpPostOut">
    <wsdl:part name="Body" element="tns:string"/>
  </wsdl:message>
  <wsdl:portType name="TempConvertSoap">
    <wsdl:operation name="FahrenheitToCelsius">
      <wsdl:input message="tns:FahrenheitToCelsiusSoapIn"/>
      <wsdl:output message="tns:FahrenheitToCelsiusSoapOut"/>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <wsdl:input message="tns:CelsiusToFahrenheitSoapIn"/>
      <wsdl:output message="tns:CelsiusToFahrenheitSoapOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:portType name="TempConvertHttpPost">
    <wsdl:operation name="FahrenheitToCelsius">
      <wsdl:input message="tns:FahrenheitToCelsiusHttpPostIn"/>
      <wsdl:output message="tns:FahrenheitToCelsiusHttpPostOut"/>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <wsdl:input message="tns:CelsiusToFahrenheitHttpPostIn"/>
      <wsdl:output message="tns:CelsiusToFahrenheitHttpPostOut"/>
    </wsdl:operation>
  </wsdl:portType>
  <wsdl:binding name="TempConvertSoap" type="tns:TempConvertSoap">
    <soap:binding transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <soap:operation soapAction="http://tempuri.org/FahrenheitToCelsius" style="document"/>
      <wsdl:input><soap:body use="literal"/></wsdl:input>
      <wsdl:output><soap:body use="literal"/></wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <soap:operation soapAction="http://tempuri.org/CelsiusToFahrenheit" style="document"/>
      <wsdl:input><soap:body use="literal"/></wsdl:input>
      <wsdl:output><soap:body use="literal"/></wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:binding name="TempConvertSoap12" type="tns:TempConvertSoap">
    <soap12:binding transport="http://schemas.xmlsoap.org/soap/http"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <soap12:operation soapAction="http://tempuri.org/FahrenheitToCelsius" style="document"/>
      <wsdl:input><soap12:body use="literal"/></wsdl:input>
      <wsdl:output><soap12:body use="literal"/></wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <soap12:operation soapAction="http://tempuri.org/CelsiusToFahrenheit" style="document"/>
      <wsdl:input><soap12:body use="literal"/></wsdl:input>
      <wsdl:output><soap12:body use="literal"/></wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:binding name="TempConvertHttpPost" type="tns:TempConvertHttpPost">
    <http:binding verb="POST"/>
    <wsdl:operation name="FahrenheitToCelsius">
      <http:operation location="/FahrenheitToCelsius"/>
      <wsdl:input><mime:content type="application/x-www-form-urlencoded"/></wsdl:input>
      <wsdl:output><mime:mimeXml part="Body"/></wsdl:output>
    </wsdl:operation>
    <wsdl:operation name="CelsiusToFahrenheit">
      <http:operation location="/CelsiusToFahrenheit"/>
      <wsdl:input><mime:content type="application/x-www-form-urlencoded"/></wsdl:input>
      <wsdl:output><mime:mimeXml part="Body"/></wsdl:output>
    </wsdl:operation>
  </wsdl:binding>
  <wsdl:service name="TempConvert">
    <wsdl:port name="TempConvertSoap" binding="tns:TempConvertSoap">
      <soap:address location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
    <wsdl:port name="TempConvertSoap12" binding="tns:TempConvertSoap12">
      <soap12:address location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
    <wsdl:port name="TempConvertHttpPost" binding="tns:TempConvertHttpPost">
      <http:address location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>

These listings show the WSDL document and the SOAP messages for a simple web service that converts Celsius to Fahrenheit and vice versa.
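To see how the pieces of the WSDL refer to one another (service, port, binding, portType, operation), the chain can be followed programmatically. The Python sketch below works on a trimmed excerpt of the listing; the xmlns declarations are an added assumption (the standard WSDL 1.1 and SOAP binding namespace URIs), since the listing omits them, and describe and local are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Trimmed excerpt of the generated WSDL; xmlns declarations added so it parses.
EXCERPT = """
<wsdl:definitions xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/"
                  xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
                  xmlns:tns="http://tempuri.org/"
                  targetNamespace="http://tempuri.org/">
  <wsdl:portType name="TempConvertSoap">
    <wsdl:operation name="FahrenheitToCelsius"/>
    <wsdl:operation name="CelsiusToFahrenheit"/>
  </wsdl:portType>
  <wsdl:binding name="TempConvertSoap" type="tns:TempConvertSoap"/>
  <wsdl:service name="TempConvert">
    <wsdl:port name="TempConvertSoap" binding="tns:TempConvertSoap">
      <soap:address location="http://www.w3schools.com/webservices/tempconvert.asmx"/>
    </wsdl:port>
  </wsdl:service>
</wsdl:definitions>
""".strip()

def local(qname):
    """Drop the namespace prefix from a qualified name such as tns:TempConvertSoap."""
    return qname.split(":")[-1]

def describe(wsdl_xml):
    """Return (port name, address, operation names) for every port of every service."""
    root = ET.fromstring(wsdl_xml)
    bindings = {b.get("name"): local(b.get("type"))
                for b in root.findall("wsdl:binding", NS)}
    port_types = {pt.get("name"): [op.get("name") for op in pt.findall("wsdl:operation", NS)]
                  for pt in root.findall("wsdl:portType", NS)}
    summary = []
    for port in root.findall("wsdl:service/wsdl:port", NS):
        address = next(iter(port))  # first child is the protocol-specific address element
        port_type = bindings[local(port.get("binding"))]
        summary.append((port.get("name"), address.get("location"), port_types[port_type]))
    return summary

ports = describe(EXCERPT)
print(ports)
```

The lookup mirrors the abstract-to-concrete layering described earlier: the port gives the address, the binding names the portType, and the portType lists the operations.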

4.4.5 Web Services stack

When we talk about the web services stack, two items need to be described: the web services protocol stack and the web services stack. A web service protocol stack (Wikipedia definition) is a stack of computer networking protocols that are used to define, locate, implement, and make Web services interact with each other. A web service protocol stack typically stacks four types of protocols:

(Service) Transport Protocol: This is responsible for transporting messages between network applications and includes protocols such as HTTP, SMTP, FTP, as well as the more recent Blocks Extensible Exchange Protocol (BEEP).

(XML) Messaging Protocol: This protocol is responsible for encoding messages in a common XML format so that they can be understood at either end of a network connection. Currently, this area includes such protocols as XML-RPC, WS-Addressing, and SOAP.

(Service) Description Protocol: This protocol is used for describing the public interface to a specific web service. The WSDL interface format is typically used for this purpose.

(Service) Discovery Protocol: This protocol centralizes services into a common registry such that network web services can publish their location and description, and makes it easy to discover what services are available on the network. At present, the UDDI API is normally used for service discovery.

The web service protocol stack also includes a whole range of recently defined protocols such as BPEL and SOAP-DSIG.

On the other hand, a Web services stack is a rather limited thing. It is software that supports the Web services standards, so that you can send and receive SOAP messages and work with UDDI and WSDL. The ultimate aim is to get everyone (vendors like IBM) to agree on a single Web services stack, that is, the protocols used to define, locate, implement and make Web services interact. We have studied J2EE and the .NET framework in the previous section. The question now is how to achieve interoperability between these technologies. The web services stack is a step towards a solution.

Metro

The Metro Web Services stack delivers secure, reliable, transactional interoperability between Java EE and .NET 3.0 to help you build, deploy, and maintain Composite Applications for your Service Oriented Architecture. Metro provides ease-of-development features, support for W3C and WS-I standards such as SOAP and WSDL, asynchronous client and server, and data binding through JAXB 2.0.

GlassFish

GlassFish is an open source, production-quality, Java EE 5 compatible application server. GlassFish focuses on ease of development, with enhanced web services via Metro. Constructing web applications is made easier with JavaServer Faces (JSF) technology and the JSP Standard Tag Library (JSTL). Java EE 5 supports rich thin-client technologies such as AJAX, which are crucial for building applications for Web 2.0.

Web services are Web based applications that use open, XML-based standards and transport protocols to exchange data with clients. Web services are developed using Java Technology APIs and tools provided by an integrated Web Services Stack called Metro. The Metro stack, consisting of JAX-WS, JAXB, and WSIT, enables you to create and deploy secure, reliable, transactional, interoperable Web services and clients. The Metro stack is part of Project Metro and is included in GlassFish and in Java Platform, Enterprise Edition (Java EE), and partially in Java Platform, Standard Edition (Java SE). GlassFish and Java EE also support the legacy JAX-RPC APIs.

Axis 2.0 runs on WebSphere, as well as on WebLogic from BEA Systems Inc. and Apache's own Tomcat, and has demonstrated interoperability with the Microsoft .NET framework. BEA and JBoss, a division of Red Hat Inc., have chosen to develop their own Web services stacks. BEA offers SALT 1.1, a native Tuxedo Web service stack built on an open-standard SOAP implementation. JBossWS is a JAX-WS compliant Web services stack developed to be part of JBoss' Java EE 5 support. It is nice to have a single stack that runs on WebSphere, Tomcat and WebLogic.



WSBL

Web Service Business Library (WSBL) is a solution for any company that offers financial services; it combines agent theory, web services and grid computing. This approach would enable a bank to have only one library for pricing all products, running in one grid and serving all of the trading rooms that the bank may have around the world. These services could also be sold to third-party users with the appropriate security services.

4.5 EBXML

The central point of the web services architecture is the repository, which allows businesses to find each other and use the services each provides. Finding the required services over the web from a centralized information source is an effective approach for businesses. However, there are other approaches to achieving the same goal. One of them is ebXML, which stands for Electronic Business XML. It represents a global initiative to define processes around which businesses can interact over the web. Hence, the vision of ebXML is to create a single global electronic marketplace where enterprises of any size and in any geographical location can meet and conduct business with each other through the exchange of XML based messages. To facilitate this, ebXML provides

an infrastructure for data communication interoperability,
a semantic framework for commercial interoperability,
a mechanism that allows enterprises to find each other, establish a relationship, and conduct business with each other.

Data communication interoperability is ensured by a standard message transport mechanism, and commercial interoperability is provided by means of a specification schema for defining business processes, together with core components and a context model for defining business documents. ebXML also recommends a methodology and provides a set of worksheets and guidelines for creating those models. For the actual conduct of business to take place, ebXML provides a shared repository where businesses can discover each other's business offerings by means of

partner profile information,
a process for establishing an agreement to do business (Collaboration Protocol Agreement, or CPA),
a shared repository for company profiles, business process specifications, and relevant business messages.

The brains behind this project are UN/CEFACT (United Nations Center for Trade Facilitation and Electronic Business) and OASIS (the Organization for the Advancement of Structured Information Standards). The Wikipedia definition for ebXML is as follows:

"Electronic Business using eXtensible Markup Language, commonly known as e-business XML, or ebXML as it is typically referred to, is a family of XML based standards sponsored by OASIS and UN/CEFACT whose mission is to provide an open, XML-based infrastructure that enables the global use of electronic business information in an interoperable, secure, and consistent manner by all trading partners"

The original project envisioned five layers of data specification, including XML standards for:

Business processes
Collaboration protocol agreements
Core data components
Messaging
Registries and repositories

This initiative continued to gain support from a variety of sources and other standards organizations. Some of these organizations are RosettaNet (a consortium of more than 400 companies), the Global Commerce Initiative (representing manufacturers and retailers), the Open Applications Group Inc., the Automotive Industry Action Group, Health Level Seven and the Open Travel Alliance.

4.5.1 ebXML Technologies

This initiative is based on a set of building blocks that make use of existing standards wherever possible. The ebXML Technical Architecture comprises two basic components: Design Time and Run Time. Business Process and Business Information Analysis is part of the Design Time component. The Design Time component deals with the procedures for creating an application of the ebXML infrastructure, as well as the actual discovery and enablement of the ebXML-related resources required for business transactions to take place. The Run Time component covers the execution of an ebXML scenario with the actual associated ebXML transactions.

The following are some of the components of the technical architecture:

Messaging

ISO 15000-2 is the ebXML Messaging Service Specification standard. Basically, it uses SOAP to send messages.

Business Process

ebXML distinguishes itself from other XML frameworks by the emphasis it gives to the business process. The overall process includes

Process Definition (utilizing Business Process and Business Document Analysis), with logical progress to
Partner Discovery
Partner Sign-Up
Electronic Plug-in
Process Execution
Process Management
Process Evolution

Here modeling languages and charting tools are used to standardize and capture the flow of business data among the trading partners.

Trading partner profiles and agreements:

Each trading partner has its own Collaboration Protocol Profile (CPP) document, which describes its abilities in an XML format. For example, it may include the messaging protocols or the security capabilities the partner supports. A CPA document is the intersection of two CPP documents and describes the formal relationship between the two parties. The following information will typically be contained in a CPA document:

Identification information
Security information
Communication information
Endpoint locations
Rules to follow when acknowledgments are not received for messages, including how long to wait before resending and how many times to resend
Whether duplicate messages should be ignored
Whether acknowledgments are required for all messages
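The idea of a CPA as the intersection of two CPPs can be illustrated with a small Python sketch. It is a deliberate simplification: real CPP and CPA documents are XML, whereas here each profile is modeled as a dictionary of capability sets, and negotiate_cpa, the capability names and the sample values are all hypothetical.

```python
def negotiate_cpa(cpp_a, cpp_b):
    """Keep, for every capability both partners list, only the options both support."""
    cpa = {}
    for capability in sorted(cpp_a.keys() & cpp_b.keys()):
        common = cpp_a[capability] & cpp_b[capability]
        if not common:
            raise ValueError(f"no common option for {capability}")
        cpa[capability] = common
    return cpa

# Hypothetical profiles for two trading partners.
buyer_cpp = {"transport": {"HTTP", "SMTP"}, "security": {"TLS", "XML-DSIG"}}
seller_cpp = {"transport": {"HTTP"}, "security": {"TLS"}}

cpa = negotiate_cpa(buyer_cpp, seller_cpp)
print(cpa)
```

If the two profiles share no option for some capability, the sketch raises an error, which corresponds to the case where partnering is not feasible without further negotiation.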

Registries and repositories

ISO 15000-4 is the standard for the ebXML Registry Services Specification. The registry contains the industry processes, messages and vocabularies that define the transactions that occur between trading partners.

Core Components

The ebXML Methodology for the Discovery and Analysis of Core Components describes the process for identifying information components that are re-usable across industries. Core components are used to define domain components and business information objects. Business libraries, which contain libraries of business process specifications, are instrumental in the discovery and analysis of core components and domain components.




The steps in an ebXML driven business process:

The industry concerned reviews its needs to determine the requirements for an ebXML implementation
The relevant transaction definitions available in the registry are found
The industry decides whether to use the registry transactions or to generate internally the software needed to support them
A CPP is created and registered with the ebXML registry
Other companies may query the repository to determine compatibility
If partnering is feasible, negotiation can proceed based on the CPP
Once agreement is reached, the two companies can begin doing business and engaging in transactions

Thus ebXML adds process to e-business interaction.

Thus web services can be looked upon as a bundle of technologies that takes the web from a content delivery network to server-to-server interaction. We explored UDDI for registering and storing service information and WSDL for figuring out how to connect to existing services. We also saw how SOAP makes a decentralized, distributed space possible, and we examined ebXML in detail.

4.6 .NET, J2EE AND BEYOND – INTRODUCTION

So far we have studied various protocols and standards for Web service implementations, such as SOAP, UDDI and WSDL. What functionality do they provide? They facilitate the transport of services, the discovery of services and the establishment of connections. This may look sufficient, but it is not. What else is required? These protocols do not provide any functionality for the following critical requirements of the electronic enterprise:

Transactions
Security
Identity

Transactions allow multiple interactions to be treated as a single atomic, all-or-none operation; security enables privacy and authentication for those transactions; identity provides a way of verifying who is who over the internet.

How far will these technologies cater to the dynamic needs of the industry? Web service battle lines are shaping up along two fronts: Microsoft with its .NET initiative and Sun with its J2EE architecture. In this section let us look into the Web services related strategies of both .NET and J2EE.



Generally, the interaction between servers may be based on any available protocol. The traditional enterprise computing model was based on middleware and application servers tied to tightly coupled networks, but the introduction of loosely coupled, message-based architectures has changed the computing landscape of server-to-server interaction. However, making this loosely coupled Web space commercially viable for service based interaction requires transactional capabilities to ensure the following:

Stability and regularity across networks
Security to protect transactions
Management of identity in open networks

These points clearly indicate the significance of transactions, security and identity in the emerging world of SOAP and Web Services for the success of the new web environment. Let us look into transactions, security and identity in detail in the next section.

Transactions

Transactions are sets of software operations that form the basic units of an electronic commerce venture. They should possess the properties of Atomicity, Consistency, Isolation and Durability, also known as the ACID properties.

Atomicity requires that all operations of a transaction be performed successfully for the transaction to be considered complete. If any operation fails, the transaction is considered incomplete and has no effect at all.

Consistency means that the transaction should preserve the consistent state of the data while performing its operations.

Isolation indicates that data being manipulated by a transaction should be available only to that transaction. In other words, other transactions running concurrently cannot see the data until the transaction has completed successfully and committed its work.

Durability means that the updates made by committed transactions persist in the database in spite of failures that occur after the commit operation. This means that the data changes are recoverable even after a failure or crash.
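Atomicity in particular is easy to observe in practice. The following Python sketch uses the standard sqlite3 module with an in-memory database; the accounts table, the account names and the transfer function are hypothetical and serve only to show that a failed transaction leaves no partial effect.

```python
import sqlite3

def transfer(conn, source, target, amount):
    """Move amount between accounts atomically: both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, source))
            (balance,) = conn.execute("SELECT balance FROM accounts WHERE name = ?",
                                      (source,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, target))
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("buyer", 100), ("seller", 0)])
conn.commit()

ok = transfer(conn, "buyer", "seller", 60)       # succeeds
failed = transfer(conn, "buyer", "seller", 60)   # would overdraw, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the second call the debit has already been executed when the check fails, yet the rollback undoes it, so the balances are exactly what the first transfer left behind.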

Hence transactions are vital for any architecture that handles web based e-commerce applications. This has led software vendors to concentrate on developing software such as transaction monitors and standard interfaces to a variety of back-end databases.




Security

The internet relies on several security protocols. For web based e-commerce, verifying the authenticity of web sites and encrypting the data to be transferred are some of the security measures required. Even though the Secure Sockets Layer and Transport Layer Security protocols have been successful in achieving this, they are not enough. This is explained in detail in Unit V.

Identity

Nowadays, in the web environment, user identity is the center of attention. The focus has shifted from the machine to the user. Initially, machine details were used as the key for licensing and installing software on that machine. Without such a license, the software could not be installed or, if installed, it would run illegitimately. However, consider what happens when the hardware is validated rather than the user. Users connecting via the web may be in any geographical location. In this environment, a user cannot use the licensed software unless he carries the hardware along with him. Hence a new model is required in which user authentication becomes the key issue. Here the prime question is whether the software package is licensed to run for a particular user. This is achieved by validating the user against permissions stored in some database to determine what the user can and can’t do.

Let us discuss the two technologies for managing user identity: Microsoft’s Passport and the Sun-supported Liberty Alliance.

Passport

How does Microsoft handle user identity through the facility called Passport? Passport can store credit card and address information as part of the user’s account. It is also used as an entry point to .NET My Services, which is a way to bring Web services to consumer applications. Now the question is: if the user has a Passport, what facilities are available to him, and how are they useful? With access to Passport the user can

Participate in express purchasing over the web without manually entering their address information and payment information

Hence Passport is a single-sign-on authentication facility that lets users take part in e-purchase and e-payment. It has been integrated with Microsoft’s Hotmail email service.



Liberty Alliance

This is an alternative technology to Passport, promoted by Sun Microsystems, for single-sign-on authentication. Its objective is to create a universal digital identity service based on open standards. Users can log in once on a given web site and are then authenticated for all online services supporting the Liberty standard.

4.6.1 .NET and J2EE

On one side there is the loosely coupled environment of the web. On the other side there are tightly coupled, object based environments. A great many technologies are on hand to work in each of these two environments. The challenge now is to bridge the gap between them. For example, assume that the transaction engines are running on tightly coupled networks, while the SOAP based data is available across the web, which is based on loosely coupled systems. The question is how to bring transactional integrity between the web, with its promise of global connectivity, and the more conventional middleware that holds the key to transactions, security and identity. We can say that the tightly coupled object-based frameworks have been subsumed by .NET and J2EE. While these two technologies are often compared with each other, they have elemental differences that make direct comparison difficult. Even so, the following points help to make the distinction clear:

.NET represents the implementation of a complete enterprise architecture tuned to the Windows platform

J2EE is the specification of architectural components designed to work together to define a complete architecture. Since it is a specification, implementations from various vendors are required to provide the functionality.

Let us see the implementations on the J2EE and .NET platforms in detail in the following sections.

4.6.2 .NET: a Microsoft framework

The Microsoft .NET Framework provides the following:

The infrastructure for developing highly distributed applications

The common language runtime, a highly secure and fault-tolerant execution environment

The tools for creating, deploying, and managing applications

The .NET initiative focuses on a development framework that integrates all the earlier Microsoft technologies with newer technologies built around XML. What can we do with this .NET framework? Here is the answer:




Microsoft .NET software enables us to develop applications for different environments and devices. For example, the following tasks can be performed:

You can build eXtensible Markup Language (XML) Web services and Web applications for a highly distributed environment such as the Internet.

You can also create traditional Windows-based applications, server components, and applications that can run on any device, such as a PC or a mobile device.

.NET enables seamless data exchange between various applications and devices.

In addition, the .NET Framework infrastructure provides the execution engine and run-time services to applications.

Closer look at .NET

.NET consists of

Development tools

Run-time environments

Server infrastructure

Intelligent software to build applications for various platforms and devices

.NET integrates various applications and devices by using standards such as

Hypertext Transfer Protocol (HTTP)

XML

Simple Object Access Protocol (SOAP)

The tools .NET provides are

Smart Client software. Using XML web services, a client, a PC, or a mobile device can access data from any location or device.

.NET Server infrastructure. The .NET Server infrastructure includes Windows 2000 Servers, Windows .NET Servers, and .NET Enterprise Servers. This provides a highly secure and scalable platform for deploying .NET applications.

XML Web services. These are core to application integration in the .NET environment.

Microsoft Visual Studio .NET and the .NET Framework. A complete solution for building, hosting, and consuming XML Web services is provided by Visual Studio .NET and the .NET Framework. Visual Studio .NET also supports a variety of programming environments and languages. In addition, it provides single-point access to all the tools that we require, making it one of the most productive tools available.



XML Web services comprise the core components that enable a client application to exchange data with another client or server application, as shown in Figure 4.4. Server applications can also exchange data with one another with the help of Web services. Also, applications running on any device can exchange data with applications running on any other device.

Figure 4.4 : Components of .NET [figure labels: Mobile devices, Desktop Computers, XML Web Services, Servers]

Figure 4.5 : .NET Framework Components [figure labels: VB.NET, C++, VJ++, VC++.NET, ...; Web forms, XML Web services; ASP.NET, Windows forms; .NET Framework Class Library; Common Language Runtime; Win32]




This Framework provides a consistent object-oriented programming model, which has the advantage that it can be used to build all types of applications. The same object-oriented approach is used to create applications as different as Windows-based applications and XML Web services.

The procedure to create a .NET application is as follows:

Create a class and define the functionality of the application in terms of the properties, events, and methods of the class.

For Web applications, the code that controls the behavior of the Web page is encapsulated within a class.

Classes support object-oriented features such as inheritance, encapsulation, and polymorphism. Therefore, classes are fundamental to programming in the .NET environment.

Classes can be created in any language supported by the .NET Framework.

A class written in one language is reusable by classes written in other languages.

Classes inherit across language boundaries because the .NET Framework allows language interoperability and supports cross-language inheritance.

Now it is important to understand what the Common Language Infrastructure (CLI) is. It defines the specifications for the infrastructure that Intermediate Language (IL) code needs for execution. The CLI provides a common type system (CTS) and services such as type safety and managed code execution. The .NET Framework provides the infrastructure and services according to the CLI specifications. They are

Common language runtime.

This implements the CLI and provides the execution environment for .NET applications.

Common type system.

This component provides the necessary data types, both value types and object types, to develop applications in different languages. All the .NET languages share a Common Type System.

For example, a String in Visual Basic .NET is the same as a String in Visual C# or in Visual C++ .NET, since all the .NET languages have access to the same class libraries.

Type safety.

The .NET Framework ensures that operations to be performed on one value or object are performed on that value or object only.



Managed code execution.

The .NET Framework loads and executes .NET applications, and manages the state of objects during program execution. Here, the framework automatically allocates memory and provides an automatic garbage collection mechanism.

Side-by-side execution.

Here an entity called an “assembly” is used; it contains the IL code and metadata. The metadata contains information such as the version of the assembly and the names and versions of the other assemblies on which the assembly depends. What is the use of this “assembly”? Assemblies are the deployment units in the .NET Framework, and they allow you to deploy multiple versions of an application on a system. The common language runtime uses the version information in the metadata to determine application dependencies and enables you to execute multiple versions of an application side-by-side.

In addition, this framework takes care of transactions through a Transaction Engine, which serves as a container for components running in the middle tier of a three tier application. It allows the programmer to add a transaction as a simple attribute of the class, instead of writing complex transaction processing code.

4.6.3 J2EE

J2EE is the Java-centric enterprise platform specification. Even though J2EE originated with Sun, the complete specification and changes to the specification are under the collaborative umbrella of the Java Community Process. In J2EE, Web services use standards-based frameworks to extend an application’s reach. However, a web service isn’t the application itself. The web service must still be implemented on a proven application infrastructure, one that supports reliability, availability, serviceability, transactions, security, and other critical enterprise needs, which the J2EE infrastructure provides. It includes the following APIs.

JAXP - Java API for XML Processing

JAXB - Java Architecture for XML Binding

JAXM - Java API for XML Messaging

JAX-RPC - Java API for XML-based Remote Procedure Calls

JAXR - Java API for XML Registries

The Java API for XML Messaging (JAXM) and the Java API for XML-based RPC (JAX-RPC) are both part of the Java Web Services Developer Pack. What are these APIs used for? They are a key part of Sun’s plans to integrate web services interfaces into future versions of the J2EE platform.




JAXM defines a common set of Java APIs for creating, consuming, and exchanging SOAP envelopes over various transport mechanisms. It is mainly used for document-style exchange of information. It requires the use of low-level APIs to manipulate the SOAP envelope directly.

Put simply, JAX-RPC provides a way of performing RMI-like Remote Procedure Calls over SOAP. It also defines rules for such things as client code generation, SOAP bindings, WSDL-to-Java and Java-to-WSDL mappings, and data mappings between Java and SOAP.

Difference between JAXM and JAX-RPC

The real difference between JAXM and JAX-RPC is that JAXM forces the developer to work directly with the SOAP envelope constructs, whereas JAX-RPC provides a high-level, WSDL-based framework that hides the details of the SOAP envelope from the developer.

JAX-RPC uses WSDL to generate the messages and provides an object-oriented interface to the developer. Does JAXM use WSDL? No, it doesn’t. Then how is a message constructed? The developer must construct messages by hand and send or process them explicitly.

The javax.xml.soap package includes the APIs for constructing and deconstructing a SOAP envelope directly, including a MIME-encoded multipart SwA (SOAP with Attachments) message. Both JAXM and JAX-RPC share this package.
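To get a feel for what "constructing the message by hand" involves, here is a minimal sketch. It is not JAXM code: it uses Python's standard xml.etree.ElementTree module as a stand-in for a low-level envelope API, and the getPrice operation, the item parameter and the urn:example:catalog namespace are hypothetical names chosen only for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:catalog"                          # hypothetical application namespace

def build_envelope(item):
    """Assemble a SOAP envelope element by element, the document-style way."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{APP_NS}}}getPrice")   # hypothetical operation
    ET.SubElement(request, f"{{{APP_NS}}}item").text = item
    return ET.tostring(envelope, encoding="unicode")

message = build_envelope("XML handbook")
```

Every element of the envelope is created explicitly by the developer. A WSDL-driven framework in the style of JAX-RPC would instead generate this plumbing and expose only a method call.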

As noted earlier, web services extend an application’s reach, but a web service isn’t the application itself. There are different approaches available to integrate J2EE and web services.

The SOAP-J2EE Interaction

SOAP is the basis of interoperability between J2EE and web services. Understanding how J2EE and web services work together comes down to analyzing how SOAP and J2EE can work together.

SOAP is a wire protocol that can be layered upon other wire protocols such as HTTP, FTP, and SMTP

J2EE supports these Internet protocols through servlets

Hence, servlets and JSP technology will become the entry point into a J2EE framework for web services

Within J2EE, servlets, JSPs, EJBs, JMS resources, JDBC drivers, and J2EE CA adapters provide access to the business logic and enterprise resources that a web service needs

Servlets and JSPs are designed to encapsulate page-based flow and logic and can also work with numerous Internet protocols



The servlets are responsible for extracting the SOAP contents from another wire packet.

The SOAP contents must then be parsed so the servlet can acquire access to the elements and attributes contained within the SOAP document.

A servlet must contain the logic for the following:

Envelope parsing

Parsing attachments

Validating message format

Validating XML

Rapid XML parsing

XML-Java binding

Payload conversion
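A few of the tasks listed above (envelope parsing, validating the message format and payload conversion) can be sketched in a handful of lines. The sketch below is written in Python with the standard xml.etree.ElementTree module rather than as a Java servlet, and the sample getPrice request is a hypothetical message made up for illustration. Attachment handling and schema validation are left out.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def extract_payload(raw):
    """Parse a SOAP message and return (operation name, parameters)."""
    root = ET.fromstring(raw)                      # raises ParseError if not well-formed
    if root.tag != f"{{{SOAP_NS}}}Envelope":       # validate the message format
        raise ValueError("not a SOAP envelope")
    body = root.find(f"{{{SOAP_NS}}}Body")         # envelope parsing
    if body is None or len(body) == 0:
        raise ValueError("SOAP Body is missing or empty")
    operation = body[0]
    # Payload conversion: drop the namespaces and map child elements to a dict.
    name = operation.tag.split("}")[-1]
    params = {child.tag.split("}")[-1]: child.text for child in operation}
    return name, params

# Hypothetical request, as it might arrive in the body of an HTTP POST.
raw = (
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    '<soap:Body><getPrice xmlns="urn:example:catalog">'
    '<item>XML handbook</item></getPrice></soap:Body></soap:Envelope>'
)
operation, params = extract_payload(raw)
```

The returned operation name and parameter dictionary are what the entry-point layer would hand on to the business logic behind it.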

Here the mapping and translation between a SOAP-over-HTTP message and a backend J2EE component such as an EJB or JMS destination may not be exposed explicitly in the servlet layer. Higher-level layering may implicitly hide that information from the programmer. The servlet API may actually be extended across an RMI or JMS infrastructure and exposed to the service at the ultimate remote destination.

Based upon how WSDL, JAXM, and JAX-RPC eventually define the behavior of web services, four fundamental types of messages can be transported over SOAP:

Request/response

Solicit/response

One-way

Notification

These four types of behavior have already been explained. This is how the integration between web services and J2EE works.

4.6.4 The Java Web Service (JWS) Standard

A newly proposed standard called the Java Web Service (JWS) standard is currently in development. It is spearheaded by BEA Systems, which also has a reference implementation. What is it? It is a format designed to bring non-Java developers to J2EE. Sounds ambitious, doesn’t it? At the core of the JWS specification is the idea that developers don’t create J2EE components. Rather, developers create a web service as a single Java class that contains the code for the intended web service. The Java class then has a number of simple, predefined JavaDoc tags that indicate different behavioral implementations of the web service. Based on the values of the JavaDoc tags inserted into the Java class, a behind-the-scenes code generator then creates all the J2EE components required to implement the web service.

The JWS JavaDoc system has tags representing a full range of web service behaviors, including stateless methods, stateful methods, and asynchronous invocations. The challenge left to JWS implementations is to take the definition of the JavaDoc tags and generate J2EE components that implement this behavior in a reliable and available manner.

Since the idea is appealing, tool vendors can support BEA’s prototype implementation. It also comes with a nice IDE that ties together design, coding, and testing. The concept of deployment is completely hidden from the developer. The goal is to have a framework for developing web services with J2EE that is similar to working in Visual Basic.

Different vendors such as IBM, BEA, Oracle and HP implement these APIs according to the J2EE specification. However, each vendor also provides additional features while developing these APIs.

Summary

XML and SOAP provide the data and transport facility; web services provide the protocols for discovery and connection. Using these, service oriented applications can be developed. Microsoft’s .NET framework and the Sun-led J2EE are the two platforms available for vendors to choose from. Which platform to choose? The bottom line is that the choice between the two will always be a choice between product platforms, driven by the services a vendor can provide and a company’s vision of its future.

Questions

1. Web service
a. Is a process
b. Is a technology
c. Is a phenomenon
d. All of the above

2. Which of the following is not possible through web services
a. The desire to allow businesses to use the Internet
b. To improve collaboration with customers, partners and suppliers
c. To have complex trading partner interactions
d. None of the above

3. Web services registry support
a. White pages
b. Yellow pages
c. Green pages
d. All of the above

4. White pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above

5. Yellow pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above

6. Green pages provide
a. Contact information of a given business
b. Categories of businesses based on existing standards
c. Technical information about the web services provided by a given business
d. None of the above

7. Web services is meant for
a. Human to computer interactions
b. Computer to computer interactions
c. Human to human interactions
d. None of the above

8. The web services triad architecture does not include
a. A service provider
b. A service requester
c. A broker
d. Directories

9. Which of the following specification(s) are included in UDDI framework?
a. UDDI Programmer’s API Specification
b. UDDI Data structure Specification
c. UDDI Service specification
d. a & b

10. Which of the following is a major data structure used by UDDI Programmer API?
a. businessEntity
b. businessService
c. tModel
d. All of the above

11. QOS issues are addressed by UDDI by defining a calling convention that involves the use of cached
a. businessEntity
b. businessService
c. bindingTemplates
d. None of the above

12. OASIS stands for
a. Organization for Advanced Software Initiative Systems
b. Organization for Advancement of Software and Information Systems
c. Organization for the Advancement of Structured Information Standards
d. None of the above

13. The technical architecture of ebXML consists of
a. Messaging
b. Business processes
c. Registries and Repositories
d. All of the above

14. What is CPA?
a. Collaboration Protocol Agreement
b. Collaborative Partner Agents
c. Collaboration Protocol Application
d. None of the above

15. Which one is not a technology component of .NET?
a. Development tools
b. Specialized servers
c. Legacy software
d. Devices

16. MTS stands for
a. Microsoft Transaction System
b. Microsoft Technology Solutions
c. Microsoft Technical Support
d. Microsoft Transaction Server

17. Adapters are needed to
a. Integrate web services
b. Compose Web services
c. Build connections with Web services
d. Build connections with legacy systems

18. HP Web services Registry is used to publish and discover
a. Public registries
b. Private registries
c. Both public and private registries
d. None of the above

19. JAXR
a. Allows access to emerging XML messaging standards
b. Is the API for doing XML-based procedure calls
c. Provides a uniform standard interface to registries of XML business data
d. None of the above

20. The .NET approach to software integration is based on a
a. Hub-and-spoke configuration
b. Ring configuration
c. Star configuration
d. None of the above

Answers

1 - d, 2 - c, 3 - d, 4 - a, 5 - b, 6 - c, 7 - b, 8 - d, 9 - d, 10 - d, 11 - c, 12 - c, 13 - d, 14 - a, 15 - c, 16 - d, 17 - d, 18 - c, 19 - c, 20 - a

Part A

1. Define the term “Web Service”.
2. What are the driving forces behind web services?
3. Define “UDDI”, “XML” and “SOAP”.
4. Define the term “Registry”.
5. What is the risk or disadvantage associated with web services?
6. Explain the purpose, operation and information dealt with by White Pages.
7. Explain the purpose, operation and information dealt with by Yellow Pages.
8. Explain the purpose, operation and information dealt with by Green Pages.
9. Describe the major components of a web service architecture.
10. What are the key technologies that web services rely upon?
11. What makes up the UDDI family of specifications?
12. How does UDDI address QOS?
13. What do you mean by WSDL?
14. What is Electronic Business XML?
15. What are the components of the technical architecture of ebXML?
16. What is a Collaboration Protocol Profile?
17. What is a business message?
18. Define Collaboration Protocol Agreement.
19. What is a passport?
20. Explain the Liberty Alliance Project.
21. What are the technology components of .NET architecture?
22. What are the five main components of a .NET platform?
23. Name the APIs that are included in the Web services pack.
24. What are the components of BEA WebLogic E-Business platform?
25. What are the adapters used by Oracle to extend the .NET framework?

Part B

1. Write a short note about Web services.
2. What are the opportunities and risks associated with Web services?
3. Write a brief note on the different directories of UDDI.
4. Discuss the Web services Architecture.
5. Discuss the key technologies that web services rely upon.
6. Explain UDDI failure and recovery.
7. Write a short note on WSDL.
8. Write short notes on ebXML technologies.
9. Write short notes on Passport.
10. Discuss the technology components of .NET architecture.
11. Discuss the five main components of .NET platform.
12. What is CLR?
13. What are the APIs provided in a Web services pack?
14. What are the components of a BEA WebLogic E-Business platform?
15. Write short notes on Oracle adapters.

Part C

1. Explain web services in detail.
2. Discuss UDDI in detail.
3. What are the features of ebXML technology?
4. Explain the .NET architecture, .NET platform and the .NET framework.
5. Explain in detail J2EE and its support for XML and web services.



UNIT V

XML SECURITY

5.1 INTRODUCTION

You have been introduced to many concepts in web services. In this Unit we are going to look into web services security. You may wonder whether it is necessary to study this. Won’t the technologies already introduced take care of the security issues? The following section addresses this question, the various levels of web security considerations and their advantages.

5.1.1 Issues

The new levels of data exchange, data sharing and interoperability introduce new challenges for security. Unlike closed environments, this open and loosely coupled environment has to meet the challenges of providing security. Some of the issues are discussed below:

Generally HTTP traffic flows via port 80, which is accepted as an open hole in the firewall. All web applications and their interfaces are publicly available for everyone’s access, and they use port 80. Is it safe? Can we assume that all the information coming through port 80 is safe? Applications that provide front ends for critical data will increasingly be exposed through HTTP and accessible to anyone in the outside world. For example, these applications can even be published in a public directory for anyone to discover. The important issue here is to check the security of the web service being used through port 80.

It may be argued that since data is wrapped in SOAP envelopes it is secured. But doesn’t the envelope itself reveal the structure and meaning of the data being sent over the wire?

Sending and receiving applications don’t have to be implemented using the same software platforms; i.e., they don’t have to have the same security libraries from the same vendor. Therefore, don’t we need a set of standardized, platform-independent security solutions?

If we use some encryption technique for an XML file, which is generally extremely verbose, is it not too expensive? Wrapping data in XML can tremendously increase the size of the data that needs to be encrypted.



The vision of web services includes enabling spontaneous supply-chain communities or trading communities via dynamic discovery. This vision requires complex interactions, in which a SOAP message traverses multiple intermediaries. You may not have a pre-existing business arrangement with some of these intermediaries, and these intermediaries may not be built on a common infrastructure. How are encryption keys managed in such an environment?
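The verbosity issue raised in the list above is easy to see with a small measurement. The sketch below wraps a single value in a minimal SOAP envelope and compares the sizes; the amount element and the value itself are hypothetical, and the exact ratio will differ for real messages.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def wrap(value):
    """Wrap a single value in a minimal SOAP envelope (hypothetical 'amount' element)."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    ET.SubElement(body, "amount").text = value
    return ET.tostring(envelope, encoding="unicode").encode("utf-8")

raw = "100.00".encode("utf-8")
wrapped = wrap("100.00")
overhead = len(wrapped) / len(raw)   # how many times larger the XML form is
```

Encrypting the whole envelope therefore means encrypting many more bytes than the data itself, which is one reason why encrypting only selected parts of an XML document is attractive.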

Currently, a new set of security techniques is being developed to address these issues. Many of them are still at the exploration stage, with only partial, immature, early stage solutions identified. However, existing security technologies with proven track records still have their advantages. In fact, the new techniques are intended to build upon or augment existing security technologies such as Public Key Infrastructures (PKI), Secure HTTP (HTTPS), and the Secure Sockets Layer (SSL). Here we focus on the new security issues and solutions that have come about as a result of web services and their related technologies. Before that, let us look at an overview of security.

5.2 SECURITY OVERVIEW

As we know, any e-business application requires secured transactions, which depend on Confidentiality, Authentication and Data integrity. These three security requirements fulfil the need to send and receive information safely when the information is very sensitive and confidential. They are defined below:

Confidentiality: This ensures that the information is not made available to unauthorized agents, processes and entities. Anyone who happens to get the information accidentally, or who deliberately taps the data stream, should not be able to understand it.

Authentication: This is the ability to determine that the message really comes from the claimed sender. Along with this, non-repudiation is also to be considered. What is non-repudiation? It means preventing the originator of a document from denying it. For a business transaction to be valid, neither the sender nor the receiver should later be able to deny participation.

Data integrity: This ensures that the information that arrives at the destination is the original message, which has not been tampered with or altered in transit.

5.2.1 Cryptography

All three of the above dimensions of secured transactions rest on cryptography. All cryptographic algorithms use some function or formula to encode the information so that it is difficult to determine its meaning without an appropriate key to decode it. Approaches to cryptography fall into two main categories: single-key cryptography and public-key cryptography. Let us look at them in detail in the following sections:




5.2.1.1 Single-Key Cryptography

As the name implies, systems based on single-key cryptography use a single secret key both to encipher and to decipher the information. The system may use a simple letter-offset technique, in which each letter is replaced by another letter of the alphabet, or a state-of-the-art 1024-bit encryption key to mathematically compute a substitute for each letter. In either case the secret key still has to be made known to the recipient, and sending it creates the possibility of leaking it. What kind of cryptography is used in fixed devices such as ATMs? Single-key cryptography. Since the encryption key can be determined in advance, it can be stored in the server or in the ATM itself. However, single-key systems don’t work well on the web, where a transaction arises whenever a user shows interest in doing business and there is no opportunity to agree on a key in advance. Hence the single-key method is not suitable in this environment, since the key has to be transferred. The solution is public-key cryptography, which uses two keys, one private and the other public, to encode and decode data.
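The letter-offset technique mentioned above can be sketched in a few lines of Python. The key value 3 and the sample message are arbitrary choices for illustration, and a cipher this simple offers no real protection. The point to notice is that the same secret key is needed for both enciphering and deciphering.

```python
import string

ALPHABET = string.ascii_uppercase

def shift(text, key):
    """Replace each letter by the letter `key` positions later in the alphabet."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)  # leave digits, spaces and punctuation unchanged
    return "".join(result)

def encipher(plaintext, key):
    return shift(plaintext, key)

def decipher(ciphertext, key):
    return shift(ciphertext, -key)  # the same secret key undoes the offset

secret_key = 3  # must be known to both sender and recipient
ciphertext = encipher("PAY 100", secret_key)   # "SDB 100"
plaintext = decipher(ciphertext, secret_key)   # "PAY 100"
```

Anyone who learns secret_key can read every message, which is why getting the key to the recipient safely is the weak point of single-key systems.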

5.2.1.2 Public-Key Cryptography

Public-key cryptography depends on a mathematical algorithm that generates two keys. One of the two keys is used for encryption and the other for decryption. The keys work as a pair: when one key of the pair is used to encrypt the data, only the other key of the pair can decrypt it. One key is made public, and the other is kept private. Thus public-key cryptography enables secure communication between parties without the need to exchange a secret key. In sender-receiver usage, the sender uses the recipient’s public key to encrypt the data. Only the intended receiver can decrypt the encrypted data, because only he holds the corresponding private key. The problematic part of public-key cryptography is the generation, distribution, and verification of keys. If anybody may want to do business with anybody else in any part of the world, how do the people involved get the key? How do they confirm that the key they have received is the original key sent by the authorized party and not a forgery?
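The idea of a key pair can be made concrete with a toy example. The sketch below uses the RSA algorithm as one possible public-key scheme (the text above does not prescribe a particular algorithm) with deliberately tiny primes so that the numbers stay readable. The primes, the exponent and the message value are illustrative assumptions, and keys this small give no security at all.

```python
# Toy key generation with tiny primes; real keys use primes hundreds of digits long.
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)

public_key = (e, n)          # published by the recipient
private_key = (d, n)         # never leaves the recipient

def apply_key(number, key):
    """Raise a number to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

message = 65                                    # a message encoded as a number below n
ciphertext = apply_key(message, public_key)     # sender encrypts with the public key
recovered = apply_key(ciphertext, private_key)  # only the private key decrypts
```

The sender never needs the private key, so nothing secret has to travel across the network before the first message is sent.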

Now the next question is whether public-key cryptography ensures all three dimensions of secured transactions.

Confidentiality: Since the owner of the private key never has to disclose the key to anyone, confidentiality is maintained. Messages encoded with a public key can be decoded only by the corresponding private key, ensuring that the message is kept confidential.

Authentication: Even though encoding with the public key guarantees secrecy, it does not authenticate the sender of the message, since anyone may hold the public key. On the other hand, if the message is encoded with the sender’s private key, it can be decoded with the sender’s public key, which shows that only the holder of that private key could have produced it. This provides authentication of the sender.

Data Integrity: Data integrity makes certain that the message received is the message sent. How can we ensure, using public-key cryptography, that the document has not been tampered with or altered? Public-key encryption alone does not answer this directly.

Hence what is the solution? Security can be provided by combining public-key cryptography with a validating technique. The usual technology for validating a message is called digital hashing. A digital hash is an algorithmically generated short string of characters that, for practical purposes, uniquely identifies a document. For example, a digital hash is generated for a document and sent along with the document. If the document is tampered with by any means during communication, recomputing the digital hash will yield a different result. If the hashes do not match, it is an indication that the data integrity of the document has been compromised.
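A short sketch shows how a digital hash exposes tampering. It uses SHA-256 from Python's standard hashlib module; the choice of SHA-256 and the sample documents are assumptions for illustration, since the text does not name a hashing algorithm.

```python
import hashlib

def digital_hash(document):
    """Return the SHA-256 digest of a document as a hexadecimal string."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()

original = "Pay the supplier 100 dollars"
sent_hash = digital_hash(original)           # sent along with the document

received = "Pay the supplier 900 dollars"    # altered in transit
intact = digital_hash(original) == sent_hash     # an unaltered copy still matches
tampered = digital_hash(received) != sent_hash   # the altered copy does not
```

Note that a hash on its own protects only against accidental or naive changes: an attacker who can alter the document can also recompute the hash, which is why the hash is combined with a private key in a digital signature.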

5.2.1.3 Digital Signatures

A digital signature is like engraving the identity of the document across the face of the document. In other words, it can be viewed as the electronic equivalent of a written signature. Can we use it along with public-key cryptography? Is it a method to ensure authentication and data integrity? Yes, a digital signature in combination with public-key encryption can be used by distributed applications to authenticate the identity of the sender of a message or document. It also ensures that the message or document has not been changed.

Example: A person X wants to electronically send a highly confidential message regarding his company's high-level policy matters to his attorney. He wants the assurance that nobody can intercept it along the way and make changes to it. In addition, he wants the guarantee that the document that reaches the attorney is the very same file he actually sent. The attorney should have the assurance that the information really came from person X. The following steps indicate the procedure to accomplish this using a digital signature and public-key cryptography:

Person X has to

Write the message
Create a digital hash of the message
Encrypt the original message and the digital hash with his private key
Send the encrypted document to his attorney

Upon receiving the message, the attorney has to

Decode the received document with Person X's public key, thereby guaranteeing that the message was actually sent by Person X
Compute the digital hash of the document received
Compare the computed digital hash with the hash contained in the message

If the hashes match, the message can safely be used; if they do not match, it can be inferred that the message has been tampered with. Note that this procedure establishes authenticity and integrity; to keep the content secret as well, the message would additionally be encrypted with the attorney's public key.
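The sign-and-verify procedure above can be sketched with the standard java.security.Signature class, which performs the hashing and the private-key operation in one step. The class name SignatureDemo and the SHA256withRSA algorithm are assumptions chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical demo class: Person X signs, the attorney verifies.
final class SignatureDemo {
    private static final String ALGORITHM = "SHA256withRSA"; // assumed algorithm

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Sender side: hash the message and transform the hash with the private key.
    static byte[] sign(String message, PrivateKey senderPrivateKey) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(senderPrivateKey);
            signer.update(message.getBytes(StandardCharsets.UTF_8));
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Receiver side: recompute the hash and check it against the signature
    // using the sender's public key.
    static boolean verify(String message, byte[] signature, PublicKey senderPublicKey) {
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(senderPublicKey);
            verifier.update(message.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If verify returns false, either the message was altered in transit or it was not signed with the private key that matches the given public key.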

With this overview of security in place, let us now see how private keys and certificates are managed.

5.2.2 Certificates and Private Key Management

Keeping certificates and private keys protected is one of the biggest security challenges. Even though private and public key pairs are mathematically generated and very difficult to guess, the problem of ensuring the confidentiality and authenticity of the keys received by users still exists. To tackle this, Certificate Authorities (CAs), which represent "trusted entities" in Web security, issue certificates.

Once a CA is chosen, the certificates of companies signed by that authority are trusted. However, trusting a CA is purely the user's choice. Netscape Navigator and Microsoft Internet Explorer come with a list of certificates for some trusted CAs (……..). The browsers provide functions to manage the list of trusted CAs and the expiry of the certificates issued.

5.3 CANONICALIZATION

Once a hash is computed for a document, even a minor change such as the introduction of white space in the document produces a completely different hash. In other words, a secure hash is intolerant of minor changes in a document. This intolerance is essential, since even a minor modification of the original document must be detected. However, this feature presents a problem for XML documents. XML documents are frequently parsed and reparsed as they are transferred from the sender to the recipient. In this process, the parsers can make insignificant modifications such as the elimination of white space or an empty line.

Because of this, the hashes will no longer match. The remedy is to put the XML document into a standard, or normalized, format before computing the digital hash. This process of converting the XML document into a standard format is known as canonicalization. With it, we can be confident that the sender and receiver will compute the same hash regardless of what processing occurred along the way. This canonical format was standardized by the W3C in the XML-Canonicalization (xml-c14n) specification. Some guidelines and high-level rules are available for converting a document to an xml-c14n-compliant canonical format. They are listed below:



UTF-8 is the encoding format for the document
Before parsing the document, the line breaks are normalized
Attribute values are normalized, as if by a validating processor
Character and parsed entity references are replaced
CDATA sections are replaced by their character content
The XML declaration and document type declaration (DTD) are removed
Start-end tag pairs are used to represent the empty elements
White space within start-end tags is normalized
White space outside of the document element is also normalized
All white space in character content is retained
Attribute value delimiters are set to double quotes
Special characters used in attribute values and character content are replaced by character references
Superfluous namespace declarations are removed from each element
Default attributes are added to each element
Lexicographic order is imposed on the namespace declarations and attributes of each element

Using the above guidelines, the XML document is normalized before the hash is computed.
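To make the idea concrete, here is a deliberately simplified sketch that applies only a few of the rules above (removal of the XML declaration, double-quoted attribute values, attributes in lexicographic order, start-end tag pairs for empty elements). It is not an xml-c14n-compliant implementation; the class name SimpleCanonicalizer and its methods are hypothetical, and namespaces, comments and default attributes are ignored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical, simplified canonicalizer; NOT a replacement for xml-c14n.
final class SimpleCanonicalizer {

    // Parses the XML text and re-serializes it in one fixed, predictable form.
    static String canonicalize(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            write(doc.getDocumentElement(), out); // the XML declaration is not copied
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    private static void write(Element element, StringBuilder out) {
        out.append('<').append(element.getTagName());
        // Attributes in lexicographic order, values delimited by double quotes.
        Map<String, String> sorted = new TreeMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            sorted.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        sorted.forEach((name, value) ->
                out.append(' ').append(name).append("=\"").append(escape(value)).append('"'));
        out.append('>');
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                write((Element) child, out);
            } else if (child.getNodeType() == Node.TEXT_NODE
                    || child.getNodeType() == Node.CDATA_SECTION_NODE) {
                out.append(escape(child.getNodeValue())); // CDATA becomes plain character content
            }
        }
        // Empty elements are written as a start-end tag pair.
        out.append("</").append(element.getTagName()).append('>');
    }

    private static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Base64-encoded SHA-256 hash of the given text.
    static String digest(String text) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Two documents that differ only in quoting style, attribute order or empty-element notation hash differently as raw text but identically once both have passed through canonicalize.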

5.4 XML SECURITY FRAMEWORK

As we learned from the previous sections, web services that exchange XML require a security framework. The following sections explain the three XML security technologies driven by the W3C:

XML Encryption
XML Digital Signature
XML Key Management Services

These three technologies are the building blocks of the XML security architecture.

5.4.1 XML Encryption

Secure exchange of web service messages can be achieved with XML encryption, provided XML is the technology chosen to implement the task. When the XML file to be encrypted contains a lot of information, is it necessary to encrypt the entire contents, or is there a facility to encrypt only selected information depending on its confidentiality? Here XML encryption comes in handy. XML encryption supports the encryption of all or part of an XML document. The specification is flexible: it allows any of the following to be encrypted:

The complete XML document
An element and all its sub-elements
The content of an XML element
A reference to a resource outside the document

Thus XML encryption extends the power of the XML digital signature system by enabling the encryption of a message that has been digitally signed. Since XML encryption is not bound to any specific encryption scheme, additional information has to be provided on the following:

The encrypted information itself or a reference to the location of the data
Information, or a reference to information via a uniform resource identifier (URI), about the keys used in the encryption

The specification outlines a standard way to encrypt any form of digital content and permits encryption of a full XML message, a partial XML message, or an XML message that contains sections that were previously encrypted. For easy recall, the procedure is summarized in the following steps:

Select all or part of an XML document to be encrypted
Apply canonicalization to the entire XML document
Encrypt the resulting canonicalized XML document using public-key encryption
Send the encrypted XML to the intended recipient

Let us see, with an example file, how to specify the encryption of a full XML message, a partial XML message, or an XML message that contains sections that were previously encrypted.

5.4.1.1 Encrypting XML data

The concept is explained with an example. Nowadays, purchasing items over the internet is common. These online transactions rely on payment by credit card, which needs secure information exchange between the parties. Here is an example of Mr. John X's purchase of an item by credit card. The following XML document contains the credit card information related to one of the purchases made by Mr. John X.

<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
  <Name>John X Ramanoria</Name>
  <CreditCard Limit='25000' Currency='INR'>
    <Cre_Number>2525 5252 2255</Cre_Number>
    <Cre_Issuer>Universal Bank</Cre_Issuer>
    <Validity_Upto>09/11</Validity_Upto>
  </CreditCard>
</PaymentDetail>

The above segment indicates that Mr. John X is using a credit card bearing the number 2525 5252 2255 with a limit of 25,000 INR. The card is issued by a bank called Universal Bank and is valid up to 09/11.

As we discussed earlier, there are different ways of applying encryption to an XML document. The choice depends entirely on which part of the document is to be kept confidential. For example, if we intend not to disclose any information about the purchase, the whole document is to be encrypted. If only the credit card information is to be protected, only the CreditCard element is to be encrypted. If only the credit card number is to be protected, only the element Cre_Number is to be encrypted. We will see the XML equivalent of each scenario in the following examples.

Example 1: Entire XML document Encryption

If the complete document, beginning at the root tag, is to be encrypted, then all the elements are encrypted as a single encrypted string in the following way:

<?xml version='1.0'?>
<EncryptedData xmlns='http://www.w3.org/2001/04/xmlenc#'>
  <CipherData>
    <CipherValue>A1B2C3D4E5F6G7H8</CipherValue>
  </CipherData>
</EncryptedData>

Example 2: Encryption of a Sub-element and Its Content

Depending on the situation, if the name of the person is less sensitive than the credit card information, it is possible to keep only the critical data confidential. In this example, the name of the person may be shown but not the other information. This can be achieved by encrypting only the CreditCard element, as shown below.

<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
  <Name>John X Ramanoria</Name>
  <EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Element'
                 xmlns='http://www.w3.org/2001/04/xmlenc#'>
    <CipherData>
      <CipherValue>A1B2C3D4E5F6G7H8</CipherValue>
    </CipherData>
  </EncryptedData>
</PaymentDetail>

Example 3: Partial XML Element Encryption

If only part of an XML element needs to be encrypted, rather than the entire element, that is also possible. In other words, if only the card's number, issuer and validity period are to be kept confidential, the following code achieves it.

<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
  <Name>John X Ramanoria</Name>
  <CreditCard Limit='25000' Currency='INR'>
    <EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Content'
                   xmlns='http://www.w3.org/2001/04/xmlenc#'>
      <CipherData>
        <CipherValue>A1B2C3D4E5F6G7H8</CipherValue>
      </CipherData>
    </EncryptedData>
  </CreditCard>
</PaymentDetail>

In this example, only the child elements Cre_Number, Cre_Issuer and Validity_Upto have been encrypted. In plain form they are:

<Cre_Number>2525 5252 2255</Cre_Number>
<Cre_Issuer>Universal Bank</Cre_Issuer>
<Validity_Upto>09/11</Validity_Upto>

This clearly shows that only part of the element has been encrypted.



Example 4: Encryption of XML element content only

If only the credit card number is to be kept confidential, it is enough to encrypt the content of the element <Cre_Number>. Here all the elements and their values are open to all except the number.

<?xml version='1.0'?>
<PaymentDetail xmlns='http://universalbank.org'>
  <Name>John X Ramanoria</Name>
  <CreditCard Limit='25000' Currency='INR'>
    <Cre_Number>
      <EncryptedData Type='http://www.w3.org/2001/04/xmlenc#Content'
                     xmlns='http://www.w3.org/2001/04/xmlenc#'>
        <CipherData>
          <CipherValue>A1B2C3D4E5F6G7H8</CipherValue>
        </CipherData>
      </EncryptedData>
    </Cre_Number>
    <Cre_Issuer>Universal Bank</Cre_Issuer>
    <Validity_Upto>09/11</Validity_Upto>
  </CreditCard>
</PaymentDetail>

Here the attribute Type='http://www.w3.org/2001/04/xmlenc#Content' indicates to the receiver that the content of the element alone has been encrypted.
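The content-only case can be sketched with the JDK's DOM and javax.crypto APIs alone. This is an illustrative sketch, not a conforming XML Encryption implementation: the class ContentEncryptionDemo, its methods, the choice of AES-GCM and the convention of storing the initialization vector in front of the cipher text are assumptions, and the EncryptionMethod and KeyInfo children that a real processor would add are omitted.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: encrypt only the content of one element.
final class ContentEncryptionDemo {
    static final String XMLENC_NS = "http://www.w3.org/2001/04/xmlenc#";
    private static final int IV_LENGTH = 12;

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Replaces the text content of the first element named tagName with an
    // EncryptedData element of Type ...#Content.
    static void encryptContent(Document doc, String tagName, SecretKey key) {
        try {
            Element target = (Element) doc.getElementsByTagNameNS("*", tagName).item(0);
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] cipherText = cipher.doFinal(
                    target.getTextContent().getBytes(StandardCharsets.UTF_8));
            // Assumed layout: IV followed by the cipher text, Base64-encoded together.
            byte[] payload = new byte[iv.length + cipherText.length];
            System.arraycopy(iv, 0, payload, 0, iv.length);
            System.arraycopy(cipherText, 0, payload, iv.length, cipherText.length);

            Element encryptedData = doc.createElementNS(XMLENC_NS, "EncryptedData");
            encryptedData.setAttribute("Type", XMLENC_NS + "Content");
            Element cipherData = doc.createElementNS(XMLENC_NS, "CipherData");
            Element cipherValue = doc.createElementNS(XMLENC_NS, "CipherValue");
            cipherValue.setTextContent(Base64.getEncoder().encodeToString(payload));
            cipherData.appendChild(cipherValue);
            encryptedData.appendChild(cipherData);

            while (target.hasChildNodes()) {
                target.removeChild(target.getFirstChild()); // drop the plain text
            }
            target.appendChild(encryptedData);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the CipherValue inside the named element and returns the plain text.
    static String decryptContent(Document doc, String tagName, SecretKey key) {
        try {
            Element target = (Element) doc.getElementsByTagNameNS("*", tagName).item(0);
            String encoded = target.getElementsByTagNameNS(XMLENC_NS, "CipherValue")
                    .item(0).getTextContent();
            byte[] payload = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(128, payload, 0, IV_LENGTH));
            byte[] plain = cipher.doFinal(payload, IV_LENGTH, payload.length - IV_LENGTH);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling encryptContent(doc, "Cre_Number", key) on the payment document leaves the name, issuer and validity readable while the card number is replaced by an EncryptedData element, mirroring Example 4.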

5.4.2 XML Digital Signature

What is a digital signature? In simple words, it can be described as the electronic equivalent of a written signature. Why is it required? Think of a distributed environment, where it is necessary to authenticate the identity of the sender of a message or document. Digital signatures serve exactly that situation. In addition, a digital signature ensures that the message or document is unchanged. Let us now investigate how XML digital signatures can be generated and used.

The XML digital signature design defines an optional XML element that facilitates the inclusion of a digital signature within an XML document. This gives any web service the means to ensure data integrity and authentication when communicating with any other web service. In addition to specifying syntax, the design makes recommendations about the types of data that require a digital signature.

In this way, the XML Digital Signature specification defines the required elements and the rules for processing them. These signatures provide integrity, message authentication and signer authentication services for XML data.

5.4.2.1 Digital Signature Elements

The XML Digital Signature specification defines a set of XML elements for describing the details of the signatures. Here is a list of some of the elements.

SignedInfo
CanonicalizationMethod
SignatureMethod
Reference
KeyInfo
Transforms
DigestMethod
DigestValue

5.4.2.2 Steps in Signature Generation

Let us see the steps needed to digitally sign an XML document using the XML signature elements:

Create a SignedInfo element with SignatureMethod, CanonicalizationMethod and Reference
Canonicalize the XML document
Calculate the SignatureValue based on algorithms specified in SignedInfo
Construct the Signature element that includes SignedInfo, KeyInfo and SignatureValue
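The steps above can be sketched with the XML Digital Signature API that ships with the JDK (package javax.xml.crypto.dsig). The class name EnvelopedSignatureDemo and its helper methods are hypothetical. Note one deliberate difference from the listings in this section: the sketch uses SHA-256 based algorithms rather than SHA-1, because recent Java releases may reject SHA-1 signatures during validation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.util.Collections;
import javax.xml.crypto.dsig.CanonicalizationMethod;
import javax.xml.crypto.dsig.DigestMethod;
import javax.xml.crypto.dsig.Reference;
import javax.xml.crypto.dsig.SignedInfo;
import javax.xml.crypto.dsig.Transform;
import javax.xml.crypto.dsig.XMLSignature;
import javax.xml.crypto.dsig.XMLSignatureFactory;
import javax.xml.crypto.dsig.dom.DOMSignContext;
import javax.xml.crypto.dsig.dom.DOMValidateContext;
import javax.xml.crypto.dsig.keyinfo.KeyInfo;
import javax.xml.crypto.dsig.keyinfo.KeyInfoFactory;
import javax.xml.crypto.dsig.spec.C14NMethodParameterSpec;
import javax.xml.crypto.dsig.spec.TransformParameterSpec;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical demo class: create and validate an enveloped XML signature.
final class EnvelopedSignatureDemo {
    private static final String RSA_SHA256 =
            "http://www.w3.org/2001/04/xmldsig-more#rsa-sha256";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required by the signature API
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Appends a Signature element as the last child of the document element.
    static void sign(Document doc, KeyPair keyPair) {
        try {
            XMLSignatureFactory factory = XMLSignatureFactory.getInstance("DOM");
            // Reference: URI "" means the whole document; the enveloped
            // transform excludes the Signature element itself from the digest.
            Reference reference = factory.newReference("",
                    factory.newDigestMethod(DigestMethod.SHA256, null),
                    Collections.singletonList(factory.newTransform(
                            Transform.ENVELOPED, (TransformParameterSpec) null)),
                    null, null);
            // SignedInfo: CanonicalizationMethod, SignatureMethod and Reference.
            SignedInfo signedInfo = factory.newSignedInfo(
                    factory.newCanonicalizationMethod(
                            CanonicalizationMethod.INCLUSIVE, (C14NMethodParameterSpec) null),
                    factory.newSignatureMethod(RSA_SHA256, null),
                    Collections.singletonList(reference));
            // KeyInfo carrying the public key as a KeyValue.
            KeyInfoFactory keyInfoFactory = factory.getKeyInfoFactory();
            KeyInfo keyInfo = keyInfoFactory.newKeyInfo(Collections.singletonList(
                    keyInfoFactory.newKeyValue(keyPair.getPublic())));
            // Canonicalization and SignatureValue calculation happen inside sign().
            XMLSignature signature = factory.newXMLSignature(signedInfo, keyInfo);
            signature.sign(new DOMSignContext(keyPair.getPrivate(), doc.getDocumentElement()));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Recomputes the digest and checks the signature with the given public key.
    static boolean isValid(Document doc, PublicKey publicKey) {
        try {
            XMLSignatureFactory factory = XMLSignatureFactory.getInstance("DOM");
            Node signatureNode =
                    doc.getElementsByTagNameNS(XMLSignature.XMLNS, "Signature").item(0);
            DOMValidateContext context = new DOMValidateContext(publicKey, signatureNode);
            return factory.unmarshalXMLSignature(context).validate(context);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The validator here is handed the public key directly; deciding whether that key can be trusted is a separate question.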

The elements are explained below with example XML segments. Look at the following XML segment first. It simply describes a purchase order for an item to be delivered to a particular address.

<PurchaseOrder xmlns="url: xxx.purchase">
  <DeliveredTo countryname="INDIA">
    <cus_name>Veda</cus_name>
    <street>12 Chittankudi</street>
    <city>Puducherry</city>
    <state>Pondicherry</state>
    <pincode>605004</pincode>
  </DeliveredTo>
  <items>
    <item_names partNum="52525252">
      <productName>KinderJoy candy</productName>
      <quantity>200</quantity>
      <price>6000</price>
    </item_names>
  </items>
</PurchaseOrder>

Now look at the following XML segment, which adds the signature information.

<PurchaseOrder xmlns="url: xxx.purchase">
  <DeliveredTo countryname="INDIA">
    <cus_name>Veda</cus_name>
    <street>12 Chittankudi</street>
    <city>Puducherry</city>
    <state>Pondicherry</state>
    <pincode>605004</pincode>
  </DeliveredTo>
  <items>
    <item_names partNum="52525252">
      <productName>KinderJoy candy</productName>
      <quantity>200</quantity>
      <price>6000</price>
    </item_names>
  </items>
  <Signature Id="EnvelopedSig" xmlns="http://www.w3.org/2000/09/xmldsig#">
    <SignedInfo Id="EnvelopedSig.SigInfo">
      <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
      <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
      <Reference Id="EnvelopedSig.Ref" URI="">
        <Transforms>
          <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
        </Transforms>
        <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
        <DigestValue>yHIsORnxE3nAObbjMKVo1qEbToQ=</DigestValue>
      </Reference>
    </SignedInfo>
    <SignatureValue Id="EnvelopedSig.SigValue">
      GqWAmNzBCXrogn0BlC2VJYA8CS7gu9xH/XVWFa08e
    </SignatureValue>
    <KeyInfo Id="EnvelopedSig.KeyInfo">
      <KeyValue>
        <RSAKeyValue>
          <Modulus>
            AIvPYJVd5zFrRRrJzB/awFLXb73kSlWqHao+3nxuF38rZPRTkGIKjD7rw4
            Vvml7nKlqWg/NhCLWCQFWZ
          </Modulus>
          <Exponent>AQAB</Exponent>
        </RSAKeyValue>
      </KeyValue>
    </KeyInfo>
  </Signature>
</PurchaseOrder>

Is it complex? Is the XML segment very large? Even though it looks large and complex, it actually is not. Let us explore each element in detail.

Digest

Here we have two elements, <DigestMethod> and <DigestValue>, as follows.

<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>yHIsORnxE3nAObbjMKVo1qEbToQ=</DigestValue>



Let us define digest first. A digest is the result of applying a mathematical algorithm (a secure hash) to a portion of the message, which ensures that the data being signed cannot be tampered with undetected. As soon as the digest is created, the next step is to add all the additional signed information and then create a digest of that as well. Is the job over? Not yet. This second digest is encrypted and written into the XML message itself as the digital signature.

In the above example, the selected algorithm and the initial digest are contained in the <DigestMethod> and <DigestValue> elements.

The final digested, encrypted digital signature is contained in the <SignatureValue> element. The decryption key is stored in the <KeyInfo> element. The recipient can determine whether the signature is valid by decrypting the digest and repeating the whole process that the sender performed to create the digest. If the resulting digest matches the original, the signed content was almost certainly not tampered with.

The <Reference> Element

Generating the message digest is the next issue. This is done with the help of the <Reference> element. The <Reference> element includes the information required to perform any data transformation or normalization used along the way, including canonicalization. For illustration, a digital signature can be associated with an XML document in the different ways specified below:

Enveloped

The signature is a child of the data being signed.

Enveloping

The signature encloses the data being signed.

Detached

The signature is a sibling of the element being signed and is referenced by a local link, or it can be located elsewhere on the network.

The above information should be specified using the <Transform> tag, which is available inside the signature. In the example given, we chose to use the enveloped method:

<Transforms>
  <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
</Transforms>

Other examples of transforms are base64 encoding, XPath filtering, XSLT transformation, and schema validation.




As we pointed out earlier, the selected algorithm and the digest are specified with these tags:

<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>yHIsORnxE3nAObbjMKVo1qEbToQ=</DigestValue>

Now it is important to take a close look at the <Signature> element. The <Signature> element contains a <SignedInfo> element, which specifies the data that is actually signed and the algorithms used to sign it. <SignedInfo> has three elements: <CanonicalizationMethod>, <SignatureMethod>, and <Reference>.

The Signature Method

The next step is to track and specify the actual method used to create the signature (denoted by the <SignatureMethod> element). After the canonical version of the XML is derived, the data that is part of the <SignedInfo> element needs to be converted into the actual signature value (and placed in the <SignatureValue> element). The <SignatureMethod> element specifies the algorithm that will be used for this operation.

The algorithm used to create the signature and, finally, the signature itself are specified in the <SignatureMethod> and <SignatureValue> tags:

<SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
<Reference Id="EnvelopedSig.Ref" URI="">
<SignatureValue Id="EnvelopedSig.SigValue">
  e76Tduvq/N8kVd0SkYf2QZAC+j1IqUPFQe8CNA0CfUrHZdiS4TDDVv4sf0V1c6UBj7
  zT 7leCQxAdgpOg/2Cxc=
</SignatureValue>

In this example segment, when the receiver gets the message, the signature is decrypted using the sender's public key and the resulting digest is compared with a freshly computed one, which verifies the sender's signature. Who has to provide the key information? In the following listing, the <KeyInfo> element holds the decryption key:

<KeyInfo Id="EnvelopedSig.KeyInfo">
  <KeyValue>
    <RSAKeyValue>
      <Modulus>
        mJVd5zFrRRrJzB/awFLXb73kSlWqHao+3nxuF38r
        Rk0HmqgsoKgWVvml7nKlqWg/NhCLWCQFWZ
      </Modulus>
      <Exponent>AQAB</Exponent>
    </RSAKeyValue>
  </KeyValue>
</KeyInfo>

Note that the XML signature does not address whether such key information can be trusted. Whose responsibility is that? Generally, the application has to determine how trustworthy the key is. Unless there is another way to verify that the supplied decryption key really belongs to the sender, there is little point to the process. Anyone could intercept the message, change its contents, generate a new public/private key pair, re-sign the document, and assert that the new public key belongs to the sender. This is where digital certificates come into the picture.

The certificate contains the binding between the identity of the public key's owner and the key itself. If the <KeyInfo> element is omitted, for example, the recipient is expected to identify the key to be used from the application context. This type of issue is addressed in the XKMS specification, which is discussed in the next section. Using XKMS or another public-key infrastructure (PKI), the recipient of the message can obtain the digital certificate, extract the public key from it, and verify that this key does belong to the sender.
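Before any trust decision can be made, the recipient has to turn the Base64 text of the <Modulus> and <Exponent> elements into a usable key object. The following sketch shows one way to do that with the standard java.security classes; the class name RsaKeyValueDemo and its method are hypothetical, and the sketch says nothing about whether the resulting key should be trusted.

```java
import java.math.BigInteger;
import java.security.GeneralSecurityException;
import java.security.KeyFactory;
import java.security.interfaces.RSAPublicKey;
import java.security.spec.RSAPublicKeySpec;
import java.util.Base64;

// Hypothetical demo class: rebuild an RSA public key from an RSAKeyValue.
final class RsaKeyValueDemo {

    static RSAPublicKey fromKeyValue(String modulusBase64, String exponentBase64) {
        try {
            // The MIME decoder ignores the spaces and line breaks that often
            // appear inside long Base64 values in XML documents.
            Base64.Decoder decoder = Base64.getMimeDecoder();
            // Signum 1 forces a positive number regardless of the leading bit.
            BigInteger modulus = new BigInteger(1, decoder.decode(modulusBase64));
            BigInteger exponent = new BigInteger(1, decoder.decode(exponentBase64));
            return (RSAPublicKey) KeyFactory.getInstance("RSA")
                    .generatePublic(new RSAPublicKeySpec(modulus, exponent));
        } catch (GeneralSecurityException e) {
            throw new IllegalArgumentException("not a usable RSA key value", e);
        }
    }
}
```

The exponent value AQAB used in the listings decodes to the bytes 01 00 01, that is, the commonly used public exponent 65537.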

5.5 XKMS

5.5.1 Key Management

Keeping public and private keys, digital signatures, and digital certificates organized and secure is one of the biggest challenges in deploying all these new encryption, digital signature, and authentication technologies. Hence the need has arisen for a methodology to manage these security components. In response, the XML Key Management Specification (XKMS) has emerged under the backing of the W3C. The goal of XKMS is to provide standardized XML-based transaction definitions for the management of authentication, encryption, and digital signature services. The previous sections discussed the XML Encryption and XML Digital Signature specifications. However, these specifications assume that the web service responsible for processing the XML exists in an environment where keys and certificates are kept safe and secure. They also assume that the web service programmer knows which certificates and keys to use. XKMS will provide a set of XML definitions that allow developers to contact a third party, which helps in locating and providing the appropriate keys and certificates. The benefit of letting a third party do this confidential job is that it frees the web service programmer from having to track the availability of keys or certificates and ensure their validity.




In other words, XKMS will provide a standardized set of XML definitions that allow developers to contact and use remote trusted third-party services. The trusted third-party services will provide the following:

encryption and decryption services
creation of keys
management of keys
authentication of keys and digital signatures

The specification defines a set of tags used to query external key management and signature validation services. For example, to check the authenticity of a certificate, a client might ask a remote service questions such as "Is this a valid certificate?" or "Provide the value of the key managed by you." Thus XKMS provides the facility to manage keys.

XKMS was submitted to the W3C by Microsoft, VeriSign and webMethods and is backed by a range of companies such as HP, IBM, Lenovo etc. XKMS is thus one of the three W3C specifications that define the XML security architecture.

5.5.2 XKMS Structure

On the whole, XKMS specifies protocols for distributing and registering public keys. It is suitable for use in conjunction with the proposed standard for XML signature and as a companion standard for XML encryption. The structure of XKMS contains two sections:

XML Key Information Service Specification (X-KISS)

XML Key Registration Service Specification (X-KRSS)

Let us explore the sections in detail.

XML Key Information Service Specification

X-KISS defines a protocol for a trust service. It helps in managing the public-key information contained in documents that conform to the XML signature specification. The basic objective of the protocol design is to relieve XML programmers of the complex task of writing code to process the XML signature ds:KeyInfo element. The underlying PKI may be based on a different specification such as X.509, the international standard for public-key certificates, or Pretty Good Privacy (PGP), the widely available public-key encryption system. Any trust policy can be used along with the XML signature specification.



Whenever a person signs a document, it is not necessary to specify any key information other than the value of the <KeyInfo> element. The value may include the key name, certificate name, key identifier and so on. Alternatively, a link may be provided to a location that contains the required KeyInfo details.

XML Key Registration Service Specification

The registration of public-key information is done through the protocol that X-KRSS specifies. Once a key is registered, it can be used along with other web services. The same protocol may also be used for recovery of private keys. The protocol provides for authentication of the applicant and, when the key pair (public key and private key) is generated by the applicant, for proof of possession of the private key. If the private key is generated by the registration service, a means of communicating the private key to the client is provided.

The following sections explain the key retrieval, location and validate services with some example XML documents:

Key retrieval

If the client wants to obtain the decryption key from a remote source, XKMS provides a simple method: the <RetrievalMethod> tag inside the <KeyInfo> element of the XML signature can be used for this. The following segment assumes that a service exists that can provide information about a given key.

<KeyInfo>
  <RetrievalMethod
      URI="http://www.KeyFil.samp/ValidateKey"
      Type="http://www.w3.org/2000/09/xmldsig#X509Certificate"/>
</KeyInfo>

This search for a key is very simple and does not require the service to enforce the validity of the key it returns.

Location service

If the application client wants to query a service for public-key information, the location service offers a set of tags for the purpose. If a web service client wants to encrypt something with the recipient's public key, it has to know the key value. To meet this requirement, it contacts the key location service to obtain that key. The following listing shows the <Locate>, <Query>, and <Respond> tags used in the request:




<Locate>
  <Query>
    <KeyInfo>
      <KeyName>Varanam AAyeeram</KeyName>
    </KeyInfo>
  </Query>
  <Respond>
    <string>KeyName</string>
    <string>KeyValue</string>
  </Respond>
</Locate>

In this example XML segment, the <Query> tag provides the name of the requested key, and the <Respond> element lists the items that the client would like to know about. The response looks like this:

<LocateResult>
  <Result>Success</Result>
  <Answer>
    <KeyInfo>
      <KeyName>Varanam AAyeeram</KeyName>
      <KeyValue>the actual key value</KeyValue>
    </KeyInfo>
  </Answer>
</LocateResult>
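On the client side, a response of this shape can be read with the standard DOM parser. The following is a small sketch under the assumption that the response uses the unprefixed element names shown above; the class LocateResultDemo and its methods are hypothetical and not part of any XKMS toolkit.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical demo class: read the items of a LocateResult response.
final class LocateResultDemo {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse response", e);
        }
    }

    // Returns the trimmed text of the first element with the given name,
    // or null if the response contains no such element.
    static String firstText(Document response, String tagName) {
        NodeList matches = response.getElementsByTagName(tagName);
        return matches.getLength() == 0 ? null : matches.item(0).getTextContent().trim();
    }

    // The request was processed successfully if <Result> says Success.
    static boolean succeeded(Document response) {
        return "Success".equals(firstText(response, "Result"));
    }
}
```

A client would first check succeeded(response) and only then read the KeyName and KeyValue items it asked for in the <Respond> element.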

Validate Service

The correspondence between a key and an attribute should be validated. The Validate Service, available through a trusted third party, can be used to get this done: the third party validates the binding between a key and an attribute. For instance, look at the following query:

<Validate>
  <Query>
    <Status>Valid</Status>
    <KeyInfo>
      <KeyName>...</KeyName>
      <KeyValue>...</KeyValue>
    </KeyInfo>
  </Query>
  <Respond>
    <string>KeyName</string>
    <string>KeyValue</string>
  </Respond>
</Validate>

If this query is sent to the Validate Service, the following result is produced.

<ValidateResult>
  <Result>Success</Result>
  <Answer>
    <KeyBinding>
      <Status>Valid</Status>
      <KeyID>http://www.xmltcenr.org/assert/20-39</KeyID>
      <KeyInfo>
        <KeyName>...</KeyName>
        <KeyValue>...</KeyValue>
      </KeyInfo>
      <ValidityInterval>
        <NotBefore>2000-09-20T12:00:00</NotBefore>
        <NotAfter>2000-10-20T12:00:00</NotAfter>
      </ValidityInterval>
    </KeyBinding>
  </Answer>
</ValidateResult>

The XML segment shows the result generated for the given query and sent to the application client. The value of the <Result> element is Success, which indicates that the request was processed successfully by the service. Similarly, the <Status> element indicates the outcome of the processing. The value Valid, in this case, means that the binding between the key and the attribute is valid.

The <ValidityInterval> element is optional. It indicates the time span for which the Validate Service's results are considered valid. Now the question may arise: once digital certificates or keys are generated, are they valid forever? No, they are not unconditionally valid; they can be (and frequently are) assigned a specific time limit, after which they expire and are no longer valid.
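A client that receives a <ValidityInterval> has to compare it with the time at which it wants to use the key. A minimal sketch with the java.time API follows; the class ValidityIntervalDemo and its method are hypothetical, and the timestamps are assumed to be in the same time zone as the instant being checked, since the example values carry no zone information.

```java
import java.time.LocalDateTime;

// Hypothetical demo class: check a time against NotBefore / NotAfter.
final class ValidityIntervalDemo {

    // Returns true if 'when' lies inside the interval, bounds included.
    static boolean isWithin(String notBefore, String notAfter, LocalDateTime when) {
        LocalDateTime start = LocalDateTime.parse(notBefore); // e.g. 2000-09-20T12:00:00
        LocalDateTime end = LocalDateTime.parse(notAfter);
        return !when.isBefore(start) && !when.isAfter(end);
    }
}
```

For the interval in the listing above, a check made on 1 October 2000 succeeds, while one made on 1 November 2000 fails because the result has expired.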




In addition, XKMS also defines requests and responses for the following areas:

Key registration

How to register your key information with a third-party KMS?

Key revocation

How to send a request to the third-party KMS to tell it that you no longer want it to manage the key on your behalf?

Key recovery

If you forget your private key, what can you do? XKMS offers a solution: it describes how to send a request to obtain the private key and what the response looks like. The specification does not state the rules under which the private key should be returned. For example, it may be the policy of the service to cancel the old key and issue a new one after a certain period. That decision is up to the policy of the individual provider.

VeriSign is one of the primary drivers of XKMS and has already released a Java toolkit that supports XKMS development. To download the product, visit http://www.xmltrustcenter.org/xkms/download.htm.

Java Toolkits

IBM XML Security Suite and the Phaos XML Toolkit are some of the Java toolkits available for XML security. The toolkits use Xerces and Xalan to parse the XML data. Signatures are assembled using the toolkits' own APIs, and the same APIs are used for encrypting the data. The Phaos sample simply uses parser APIs such as doc.getElementsByTagName(tagName) to access the element to be encrypted, as shown in the following listing:

// Copyright © Phaos Technologies
public class XEncryptTest
{
    public static void main (String[] args) throws Exception
    {
        ... // usage, command line args...
        // get the XML file and retrieve the XML Element to be encrypted
        File xmlFile = new File(inputFileName);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document doc = db.parse(xmlFile);
        Element inputElement = null;
        NodeList list = doc.getElementsByTagName(tagName);
        if (list.getLength() != 0)
            inputElement = (Element) list.item(0);
        else
        {
            System.err.println("XML element with tagName " + tagName + " unidentified.");
            System.exit(1);
        }
        // Create a new XEEncryptedData instance with the owner
        // Document of the input xml file, the data type URI and
        // the Id "ED" for this EncryptedData element.
        XEEncryptedData encData = XEEncryptedData.newInstance(doc, "ED", dataType);
        ... // determine encryption algorithm
        // set up the EncryptionMethod child element
        XEEncryptionMethod encMethod = encData.createEncryptionMethod(algURI);
        encData.setEncryptionMethod(encMethod);
        // set up the symmetric key to be used in encryption
        SymmetricKey key = null;
        File keyFile = new File(keyFileName);
        ... // File stuff
        // set up the ds:KeyInfo child element with the keyName
        XSKeyInfo keyInfo = encData.createKeyInfo( );
        keyInfo.addKeyInfoData(encData.createKeyName(keyName));
        encData.setKeyInfo(keyInfo);
        // set a nonce value to be prepended to the plain text
        byte[] nonce = new byte[16];
        encData.setNonce(RandomBitsSource.getDefault().randomBytes(nonce));
        // encrypt the XML element and replace it with the
        // newly generated EncryptedData element
        System.out.print("Encrypting the XML data ... ");
        XEEncryptedData newEncData =
            XEEncryptedData.encryptAndReplace(inputElement, key, encData);
        System.out.println("done");
        // output the XML Document with the new EncryptedData element to a
        // file
    }
}

The Phaos toolkit was much easier to set up and run than the IBM toolkit. This piece of code makes a call to encryptAndReplace( ). This method takes the element that we have given it, encrypts it by using the given key, and replaces the original element with the appropriately tagged, encrypted element.
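The encrypt-and-replace idea can also be illustrated without a vendor toolkit. The following is a minimal sketch that uses only standard JDK classes (DOM and javax.crypto). It is not the Phaos API, and its output is not compliant with the W3C XML Encryption format. The class name EncryptAndReplaceSketch, its method names, the choice of AES-GCM and the plain EncryptedData/CipherData/CipherValue elements without namespaces are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative only: NOT the Phaos API and NOT W3C XML Encryption-compliant output.
public class EncryptAndReplaceSketch {
    private static final int IV_LEN = 12; // bytes, the usual IV size for AES-GCM

    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Serializes a DOM node to text without the XML declaration.
    public static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Encrypts the first element named tagName and replaces it in the document.
    public static String encryptAndReplace(String xml, String tagName, byte[] key)
            throws Exception {
        Document doc = parse(xml);
        NodeList list = doc.getElementsByTagName(tagName);
        if (list.getLength() == 0) {
            throw new IllegalArgumentException("No element named " + tagName);
        }
        Element target = (Element) list.item(0);

        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(128, iv));
        byte[] ct = cipher.doFinal(serialize(target).getBytes(StandardCharsets.UTF_8));

        // Store IV followed by ciphertext so that decryption can recover the IV.
        byte[] payload = new byte[iv.length + ct.length];
        System.arraycopy(iv, 0, payload, 0, iv.length);
        System.arraycopy(ct, 0, payload, iv.length, ct.length);

        Element enc = doc.createElementNS(null, "EncryptedData");
        Element cipherData = doc.createElementNS(null, "CipherData");
        Element cipherValue = doc.createElementNS(null, "CipherValue");
        cipherValue.setTextContent(Base64.getEncoder().encodeToString(payload));
        cipherData.appendChild(cipherValue);
        enc.appendChild(cipherData);
        target.getParentNode().replaceChild(enc, target);
        return serialize(doc);
    }

    // Reverses encryptAndReplace: decrypts the first EncryptedData element in place.
    public static String decryptAndRestore(String xml, byte[] key) throws Exception {
        Document doc = parse(xml);
        Element enc = (Element) doc.getElementsByTagName("EncryptedData").item(0);
        byte[] payload = Base64.getDecoder().decode(enc.getTextContent().trim());
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(128, payload, 0, IV_LEN));
        byte[] plain = cipher.doFinal(payload, IV_LEN, payload.length - IV_LEN);
        Element original = parse(new String(plain, StandardCharsets.UTF_8))
                .getDocumentElement();
        enc.getParentNode().replaceChild(doc.importNode(original, true), enc);
        return serialize(doc);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8); // demo key only
        String xml = "<order><item>book</item><card>4111-1111</card></order>";
        String encrypted = encryptAndReplace(xml, "card", key);
        System.out.println(encrypted);
        System.out.println(decryptAndRestore(encrypted, key));
    }
}
```

The sketch assumes a document without namespace prefixes on the encrypted element. A real toolkit also records the algorithm and key name (the EncryptionMethod and KeyInfo children set up in the listing above), which are omitted here for brevity.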

On the whole, Web services security is still an emerging area, and researchers and vendors will have to work together to handle it properly.

Single-sign-on

What is single sign-on? It is the ability of an end user or application to access other applications within a secure environment without needing to be validated by each application. The most common example of single-sign-on technology is in web-based corporate intranet applications.

Why is such an environment useful? In this setting, users may want to use various applications that give access to their timetables, project schedules, expense reports and health benefits. If the user had to be authenticated by each application individually, the result would be inconvenient and slow, and would limit the value of the intranet site. Single sign-on is one solution: it allows access to all applications without additional intervention after the initial sign-on, using a profile that defines what the user is allowed to do.

Many companies provide products for web-based, single-sign-on authentication and authorization, including Netegrity, Securant (now a part of RSA), Oblix, and Verisign. These products work with the help of an intermediary process that controls and manages the passing of user credentials from one application to another. Users are assigned a permit that carries their rights information and allows them to access many applications without authenticating to each one. This permit allows applications within the secure environment to shift the burden of authentication and authorization to a trusted third party, leaving the application free to focus on the implementation of business logic.
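The permit mechanism can be sketched in a few lines of Java. This is a minimal illustration, not the design of any of the products named above: the class PermitSketch, the pipe-separated permit layout and the use of an HMAC-SHA256 shared secret are all assumptions. A real deployment would more likely have the trusted third party sign the permit with a private key. The sketch also assumes that user names and rights contain no "|" character.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical single-sign-on permit: "user|rights|expiry|signature".
public class PermitSketch {
    private final byte[] secret; // known only to the authority and the trusted applications

    public PermitSketch(byte[] secret) {
        this.secret = secret.clone();
    }

    private String sign(String payload) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(secret, "HmacSHA256"));
        byte[] tag = mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(tag);
    }

    // Called once by the trusted authority, after the initial sign-on.
    public String issue(String user, String rights, long expiresAtEpochSeconds)
            throws Exception {
        String payload = user + "|" + rights + "|" + expiresAtEpochSeconds;
        return payload + "|" + sign(payload);
    }

    // Called by each application instead of authenticating the user again.
    public boolean allows(String permit, String right, long nowEpochSeconds)
            throws Exception {
        int cut = permit.lastIndexOf('|');
        if (cut < 0) {
            return false;
        }
        String payload = permit.substring(0, cut);
        String signature = permit.substring(cut + 1);
        // Constant-time comparison of the expected and presented signatures.
        boolean authentic = MessageDigest.isEqual(
                sign(payload).getBytes(StandardCharsets.UTF_8),
                signature.getBytes(StandardCharsets.UTF_8));
        if (!authentic) {
            return false; // permit was altered or not issued by the authority
        }
        String[] parts = payload.split("\\|");
        if (parts.length != 3) {
            return false;
        }
        long expires = Long.parseLong(parts[2]);
        return nowEpochSeconds < expires
                && Arrays.asList(parts[1].split(",")).contains(right);
    }

    public static void main(String[] args) throws Exception {
        PermitSketch authority =
                new PermitSketch("demo-secret".getBytes(StandardCharsets.UTF_8));
        long now = System.currentTimeMillis() / 1000;
        String permit = authority.issue("alice", "timetable,expenses", now + 3600);
        System.out.println(permit);
        System.out.println(authority.allows(permit, "expenses", now));
        System.out.println(authority.allows(permit, "payroll", now));
    }
}
```

Each application only verifies the permit and checks the requested right, so the burden of authenticating the user stays with the issuing authority.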

The single-sign-on concept is easily extended to web services. Web services can be given a permit (placed in an XML/SOAP message) that can be used to validate the service with other web services. However, the secure use of web services will depend on the ability to exchange user credentials on a scale never seen before. Individual services will reside in a variety of protected environments, each using various security products and technologies. Providing a way to integrate these environments and enable their interoperability is critical for the secure and effective use of these services.

Based on XML, the Security Assertion Markup Language (SAML) is an almost complete specification proposed by the Organization for the Advancement of Structured Information Standards (OASIS). The primary goal of SAML is to enable interoperability between different systems that provide security services. The SAML specification does not define new technology or approaches for authentication or authorization. Rather, it defines a common XML language that describes the information or outputs generated by these systems.

5.5.3 Guidelines for signing XML documents

Signing XML documents needs care, since any change in the document, such as the introduction of white space or a change of case, tends to change the signature. The following two points should be kept in mind when signing a document:

1. Content presentation techniques may introduce changes.
2. Transformation may alter the content.
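The sensitivity of a signature to small changes can be seen from the digest on which it is based. The following minimal sketch uses only the JDK's MessageDigest class; the class name DigestSensitivitySketch and the sample documents are assumptions for illustration, and SHA-256 is used simply as a readily available hash. It computes the digest of a small document, of the same document re-indented, and of the same document with one letter changed in case.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Shows that white space or case changes produce a different digest.
public class DigestSensitivitySketch {

    // Returns the SHA-256 digest of the text as a lowercase hex string.
    public static String sha256Hex(String text) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(text.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        String original = "<order><item>book</item></order>";
        String reindented = "<order>\n  <item>book</item>\n</order>";
        String recased = "<order><item>Book</item></order>";

        System.out.println("original:   " + sha256Hex(original));
        System.out.println("reindented: " + sha256Hex(reindented));
        System.out.println("recased:    " + sha256Hex(recased));
    }
}
```

All three digests differ, so a signature computed over the original would not verify against either variant. Canonicalization removes some purely syntactic differences (for example attribute order and line-break conventions) before the digest is computed, but it does not make changes to element content disappear.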

XML relies on transformations and substitutions during the processing of XML documents. For example, if an XML document includes an embedded style sheet or references to an external style sheet, the transformed document should be presented to the user rather than the document without the style sheet. In this case, the signer should be careful to sign not only the original XML but also the other information that may affect the presentation.

If due consideration is not given to handling both the original and the transformed document, signature verification will return a different result than intended. As in any security infrastructure, the security of the overall system will depend on the security and integrity of procedures and personnel as well as on procedural enforcement.

Summary

One of the important aspects of web commerce is security. While it is possible to use standard security protocols to encrypt and authenticate XML, the structure and definition of XML and its use in SOAP raise issues that require specialized security solutions. W3C has developed XML Encryption and XML Signature to provide for the selective signing and encryption of XML elements and content. We have also seen how trust issues are handled by XKMS, which builds on the services of XML Signature and XML Encryption and relies on established certificate authorities.




QUESTIONS

Part A

1. Define non-repudiation.
2. What do you mean by data integrity?
3. What do you mean by confidentiality?
4. What are the two basic approaches used in cryptography?
5. What kind of cryptography is used in fixed devices such as ATM machines?
6. Define digital hashing.
7. Who are Certificate Authorities (CA)?
8. Define canonicalization.
9. Define digest.
10. Explain key management.
11. ________ is one of the three W3C specifications that define the XML security architecture.
12. XKMS specifies the protocols for distributing and registering ________
13. The structure of XKMS contains ________ and ________
14. What do you mean by single sign-on?
15. What are the two key points for signing XML documents?

Part B

1. What are the basic security needs of an e-business application?
2. Discuss single-key cryptography.
3. Discuss public-key cryptography.
4. How does public-key cryptography address the three dimensions of secured transactions of an e-business application?
5. What do you mean by digital signature? Explain.
6. Discuss certificates and private key management.
7. Discuss canonicalization.
8. What are the guidelines and high-level rules to convert the document to an xml-c14n-compliant canonical format?
9. Does XML encryption support the encryption of an entire XML document or part of it? Explain.
10. Explain with a suitable example the process of encrypting XML data.
11. What are the elements of the XML digital signature specification?
12. What are the steps in generating XML digital signatures?
13. Explain the role of the <Reference> element in generating a message digest.
14. Discuss the <signature> method.
15. What is XKMS and what does the standardized set of XML definitions provided by XKMS do?
16. Discuss the XML Key Information Service Specification.
17. Discuss the XML Key Registration Service Specification.
18. How can a client retrieve a decryption key from a remote source in XKMS?
19. How can an application client query a service for public key information in XKMS?
20. Discuss the Java toolkits available for XML security.
21. What are the guidelines for signing XML documents?

Part C

1. What are the security issues of an open, loosely coupled environment?
2. Discuss in detail the two approaches to cryptography.
3. Discuss in detail the XML security framework and its components.
4. Explain with a suitable example the process of encrypting XML data.
5. Explain in detail how XML digital signatures are generated and used.
6. Explain how XKMS defines requests and responses for key management.

Objective type questions

1. Authentication is
   a. Preventing the originator of a document from denying it
   b. Someone who deliberately tapped the data stream should not be able to understand the valuable information
   c. Ensuring that the information arrived at the destination is the original message, which has not been tampered with or altered in transit
   d. None of these

2. What are the two basic approaches used in cryptography?
   a. Private key cryptography
   b. Public key cryptography
   c. Single key cryptography
   d. b & c

3. What kind of cryptography is used in fixed devices such as ATM machines?
   a. Private key cryptography
   b. Public key cryptography
   c. Single key cryptography
   d. b & c




4. Converting the XML document into a standard format is known as
   a. Canonicalization
   b. Parsing
   c. Normalizing
   d. None of these

5. Digital hashing is
   a. algorithmically generated short string of characters that uniquely identifies a document
   b. electronic equivalent of a written signature
   c. converting digital information to written signature
   d. None of these

6. Digital Signature is
   a. electronic equivalent of a written signature
   b. algorithmically generated short string of characters that uniquely identifies a document
   c. converting digital information to written signature
   d. None of these

7. Certificate Authorities (CA)
   a. Provide digital signature services
   b. Provide digital hashing services
   c. "trusted entities" in Web security, issuing certificates that assure the confidentiality and authentication of the keys received by the users
   d. None of these

8. Guideline(s) to convert the document to an xml-c14n-compliant canonical format
   a. UTF-8 is the encoding format for the document
   b. Before parsing the document, the line breaks are normalized
   c. Attribute values are normalized, as if by a validating processor
   d. All of these

9. One of the three XML security technologies driven by W3C is
   a. XML digital hashing service
   b. XML certificate provision service
   c. XML Key Management Services
   d. None of these

10. XML encryption supports the encryption of
   a. The complete XML document
   b. An element and all its sub elements
   c. A reference to a resource outside the document
   d. All of these

11. Which one is not an XML digital signature element?
   a. CanonicalizationMethod
   b. Transforms
   c. Hash value
   d. Reference

12. Digest is
   a. application of a mathematical algorithm/secured hash to a portion of message, which ensures the data being signed cannot be tampered with
   b. appended message for security
   c. digital value calculated for a written document
   d. None of these

13. The <Reference> element includes the information required to do
   a. Data transformation
   b. Normalization
   c. Canonicalization
   d. All of these

14. The structure of XKMS contains
   a. XML Key Information Service Specification (X-KISS)
   b. XML Key Registration Service Specification (X-KRSS)
   c. Both a & b
   d. None of these

15. Which one of the following is not defined by XKMS?
   a. Key registration
   b. Key revocation
   c. Key recovery
   d. None of these

ANSWERS

1 - a, 2 - d, 3 - c, 4 - a, 5 - a, 6 - a, 7 - c, 8 - d, 9 - c, 10 - d, 11 - c, 12 - a, 13 - d, 14 - c, 15 - d

