Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar...

35
Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005

Transcript of Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar...

Page 1: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A:XML and XML Schema

Service-Oriented Computing: Semantics, Processes, Agents– Munindar P. Singh and Michael N. Huhns, Wiley, 2005

Page 2: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 2Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Highlights of this Chapter

XML and Vocabularies Well-Formedness Namespaces and Qualified Names XML Extensions XML Schema XML Query Languages XPath XSLT Limitations

Page 3: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 3Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Brief Introduction to XML

Basics Parsing Storage Transformations

Page 4: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 4Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Markup History

None Ad hoc tags SGML (Standard Generalized Markup L):

complex, few reliable tools HTML (HyperText ML): simple,

unprincipled, mixes structure and display XML (eXtensible ML): simple, yet

extensible subset of SGML to capture new vocabularies Machine processible Comprehensible to people: easier debugging

Page 5: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 5Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Basics and Namespaces

<?xml version="1.0"?> <!– not part of the document per se <arbitrary:toptag xmlns=“http://one.default.namespace/if-

needed”xmlns:arbitrary=“http://wherever.it.might.be/arbit-ns”

      xmlns:random=“http://another.one/random-ns”>

    <arbitrary:atag attr1=“v1” attr2=“v2”>Optional text also known as PCDATA

<arbitrary:btag attr1=“v1” attr2=“v2” /></arbitrary:atag><random:simple_tag/><random:atag attr3=“v3”/> <!– compare with

arbitrary:atag above </arbitrary:toptag>

Page 6: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 6Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Parsing and Validating An XML document maps to a parse tree.

Each tag ends once: nesting structure (one root) Each attribute occurs at most once; quoted string

Well-formed XML documents can be parsed Applications have an explicit or implicit syntax

for their particular XML-based tags If explicit, may be expressed in DTDs and XML

Schemas Best referred to definitions elsewhere XML Schemas, expressed in XML, are superior to DTDs

When docs are produced by external components, they should be validated

Page 7: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 7Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Schema A data definition language for XML:

defines a notion of schema validity Same syntax as regular XML documents Local scoping of subelement names Incorporates namespaces Types

Primitive (built-in): string, integer, float, date, … Primitive (built-in): ID (key), IDREF (foreign key) simpleType constructors: list, union Restrictions: intervals, lengths, enumerations,

regex patterns, Flexible ordering of elements

Key and referential integrity constraints

Page 8: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 8Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Schema: complexType

Specifies types of elements with structure: Must use a compositor if ¸

1subelements Subelements with types Min and max occurrences (default 1) of

subelements Elements with text content not easy:

ignore EMPTY elements: easy. Example?

Page 9: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 9Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Schema: Compositors

Sequence: ordered Can occur within other compositors Allows varying min and max occurrence

All: unordered Must occur directly below root element Max occurrence of each element is 1

Choice: exclusive or Can occur within other compositors

Page 10: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 10Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Schema: Key Namespaces

http://www.w3.org/2001/XMLSchema Conventional prefix: xsd Terms for defining schemas: schema, element,

attribute, … The tag schema has an attribute

targetNamespace http://www.w3.org/2001/XMLSchema-

instance Conventional prefix: xsi Terms for use in instances: schemaLocation, null

targetNamespace: user-defined

Page 11: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 11Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Schema Instance Doc

<Music xmlns=http://a.b.c/Musexmlns:xsi=“the standard-xsi”

xsi:schemaLocation=“a-schema-as-a-URI a-schema-location-as-a-URL”>

…</Music>Define null values as <aTag

xsi:nil=“true”/>

Page 12: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 12Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Creating Schema Docs: 1

<schema xmlns=“the-standard-xsd”targetNamespace=“the-target”>

<include schemaLocation=“part-one.xsd”/><include schemaLocation=“part-two.xsd”/>

<!– schemaLocation as in xsd, not xsi

</schema>Included into the same namespace as the

including space.

Page 13: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 13Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Creating Schema Docs: 2

Use imports instead of include Specify namespaces from which

schemas are to be imported Location of schemas not required and

may be ignored if provided

Page 14: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 14Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Document Object Model (DOM)

Basis for parsing XML, which provides a node-labeled tree in its API Conceptually simple: traverse by requesting

tag, its attribute values, and its children Processing program reflects document

structure Can edit documents Inefficient for large documents: parses them

first entirely to build the tree even if a tiny part is needed

Page 15: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 15Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

DOM Example [Simeoni 2003]

Element s = d.getDocumentElement();NodeList l =

s.getElementsByTagName(“member”); Element m = (Element) l.item(0);int code = m.getAttribute(“code”);NodeList kids = m.getChildNodes();Node kid = kids.item(0);String tagName =

((Element)kid).getTagName();…

Page 16: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 16Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Simple API for XML (SAX)

Parser generates a sequence of events: startElement, endElement, …

Programmer implements these as callbacks More control for the programmer

Processing program does not reflect document structure

Page 17: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 17Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

SAX Example [Simeoni 2003]

class MemberProcess extends DefaultHandler {public void startElement (String uri, String n,

String qName, Attributes attrs) {if (n.equals(“member”)) code = attrs.getValue(“code”);if (n.equals(“project”)) inProject = true;buffer.reset(); }

public void endElement (String uri, String n, String qName) {if (n.equals(“project”)) inProject = false;if (n.equals(“member”) && !inProject)

name = buffer.toString().trim(); } }

Page 18: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 18Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Programming with XML

Current approaches concentrate on structure but ignore meaning Difficult to construct and maintain Treat everything as a string Inadequate type checking can hide

errors Emerging approaches (e.g., JAXB)

provide superior binding from XML to programming languages Primitives such as unmarshal to

materialize an object from XML

Page 19: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 19Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Uses of XML

Exchanging information across software components

Storing information in nonproprietary format

XML documents represent structured descriptions: Products, services, catalogs Contracts Queries, requests, invocations (as in SOAP)

Data-centric versus document-centric (irregular, heterogeneous data, depend on entire doc for app-specific meaning) views

Page 20: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 20Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Data-Centric View<relation>

<tuple><attr1>V11</attr1>… <attrn>V1n</attrn></tuple>…<tuple><attr1>Vm1</attr1>… <attrn>Vmn</attrn></tuple>

</relation> Extract and store into DB via

mapping to DB model Regular, homogeneous tags May be expensive if repeatedly

parsed and instantiated

Page 21: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 21Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Document-Centric View

Storing docs in DBs Use character large objects (clobs)

within DB Store paths to external files containing

docs Combine with some structured

elements with search conditions for both structured elements and unstructured clobs or files

Heterogeneity also complicates mappings to traditional typed OO programming languages

Page 22: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 22Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Directions Limitations of XML

Doesn’t represent meaning Enables multiple representations for the

same information; transform if models known

Trends: sophisticated approaches for Querying and manipulating XML, e.g.,

XSLT Binding to PLs and DBs Semantics, e.g., RDF, DAML, OWL, …

Page 23: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 23Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XML Query Languages

XPath XPointer XSLT XQuery

Page 24: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 24Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XPath

Model XML documents as trees with nodes Elements Attributes Text (PCDATA) Comments Root node: above root of document

Page 25: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 25Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Achtung!

Parent in XPath is like parent as traditionally in computer science

Child in XPath is confusing: An attribute is not the child of its parent Makes a difference for certain kinds of

recursion (e.g., apply-templates discussed in XSLT)

Our terminology is based on the traditional terminology: e-children, a-children, t-children Sets via et- or ta-, etc.

Page 26: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 26Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XPath Paths

Leading /: root /: indicates walking down a tree .:current node ..:parent node @attr: to access values for the

given attribute text() comment()

Page 27: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 27Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XPath Navigation

Select children according to position, e.g., [j], where j could be 1 … last()

Descendant-or-self operator, // .//elem finds all elems under the current //elem finds all elems in the document

Ancestors: not needed in this course Wildcard, *:

collects e-children of the node where it is applied, but omits the t-children

@*: finds all attribute values

Page 28: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 28Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XPath Queries

Incorporate selection conditions in XPath Attributes: //Song[@genre=“jazz”] Elements: //Song[starts-with(.//group, “Led”)] Existence of attribute: //Song[@genre] Existence of subelement: //Song[group] Boolean operators: and, not, or Set operator: union (|); none others Arithmetic operators: >, <, … String functions: contains(), concat(), length(), Aggregates: sum(), count()

Page 29: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 29Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XPointer

Combines XPath with URLs URL to get to a document; XPath to

walk down the document Can be used to formulate queries,

e.g., Song-URL#xpointer(//

Song[@genre=“jazz”])

Page 30: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 30Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XSLT

A functional programming language

A stylesheet specifies transformations on a document<?xml version=“1.0”?><?xml-stylesheet type=“text/xsl”

href=“URL-to-dot-xsl”?> <!– the sheet to use <main-tag>…</main-tag>

Page 31: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 31Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XSLT Stylesheets

Use the XSLT namespace, conventionally abbreviated as xsl Includes primitives: Copy-of <for-each select=“…”> <if test=“…”> <choose >

Page 32: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 32Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XSLT Templates: 1

A pattern to specify where a given transform should apply This match only works on the root:<xsl:template match=“/”>…</xsl:template> Only anonymous templates in this

course

Page 33: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 33Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XSLT Templates: 2

Can be applied recursively on the et-children via

<xsl:apply-templates/> By default, if no other template matches,

recursively apply to et-children of current node (ignores attributed) and to root:

<xsl:template match=“*|/”><xsl:apply-templates/>

</xsl:template> Can over-apply; to override the default,

may need an empty template:<xsl:template match=“…”/> <!– e.g., match all text()

Page 34: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 34Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

XSLT Templates: 3

Subtleties of XSLT matching are beyond our scope

Discuss some examples

Page 35: Appendix A: XML and XML Schema Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005.

Appendix A 35Service-Oriented Computing: Semantics, Processes, Agents - Munindar Singh and

Michael Huhns

Appendix A Summary

XML enables information sharing XML is well established

Several aspects are worked out Lots of tools Works with databases and

programming languages XML provides a useful substrate for

service-oriented computing