Extending XML Schemas XML Schemas: Best Practices A set of guidelines for designing XML Schemas...

Post on 01-Jan-2016

228 views 1 download

Tags:

Transcript of Extending XML Schemas XML Schemas: Best Practices A set of guidelines for designing XML Schemas...

Extending XML Schemas

XML Schemas: Best Practices

A set of guidelines for designing XML Schemas

Created by discussions on xml-dev

Not “All Powerful”

• XML Schemas is very powerful• However, it is not "all powerful". There are many constraints that it cannot

express. Here are some examples:– Ensure that the value of the aircraft <Elevation> element is greater than the value of the obstacle

<Height> element.– Ensure that:

• if the value of the attribute, mode, is "air", then the value of the element, <Transportation>, is either airplane or hot-air balloon

• if mode="water" then <Transportation> is either boat or hovercraft• if mode="ground" then <Transportation> is either car or bicycle.

– Ensure that the value of the <PaymentReceived> is equal to the value of <PaymentDue>, where these elements are in separate documents!

• To check all our constraints we will need to supplement XML Schemas with another tool. In this tutorial we will cover each of these examples and show how the constraints may be implemented using either XSLT/XPath or Schematron. At the end we will examine the pros and cons of each approach.

Two Approaches to Extending XML Schemas

• XSLT/XPath– The first approach is to supplement the XSD

document with a stylesheet

• Schematron– The second approach is to embed the additional

constraints within <appinfo> elements in the XSD document. Then, a tool (Schematron) will extract and process those constraints.

Enhancing XML Schemas using XSLT/XPath

Your XMLdata

SchemaValidator

XMLSchema

XSLProcessor

XSLT Stylesheetcontaining code

to check additionalconstraints

valid

valid

Now you know your XML data is valid!

Enhancing XML Schemas using Schematron

Your XMLdata

SchemaValidator

XMLSchema

valid

Now you know your XML data is valid!

Schematron valid

<?xml version="1.0"?><Demo xmlns="http://www.demo.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.demo.org demo.xsd"> <A>10</A> <B>20</B></Demo>

First Example: Verify that A > B

Constraints which must be implemented: 1. <Demo> contains a sequence of elements: <A> then <B> 2. <A> contains an integer; <B> contains an integer 3. The value of element <A> is greater than the value of element <B>Note: an XML Schema document can implement the first two constraints, but not the last.

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.demo.org" xmlns="http://www.demo.org" elementFormDefault="qualified"> <xsd:element name="Demo"> <xsd:complexType> <xsd:sequence> <xsd:element name="A" type="xsd:integer"/> <xsd:element name="B" type="xsd:integer"/> </xsd:sequence> </xsd:complexType> </xsd:element></xsd:schema>

Demo.xsd

Note that this schema implements constraints 1 and 2.

<?xml version="1.0"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:d="http://www.demo.org" version="1.0"> <xsl:output method="text"/>

<xsl:template match="/"> <xsl:if test="/d:Demo/d:A &lt; /d:Demo/d:B"> <xsl:text>Error! A is less than B</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <!-- carriage return --> <xsl:text>A = </xsl:text><xsl:value-of select="/d:Demo/d:A"/> <xsl:text>&#xD;&#xA;</xsl:text> <!-- carriage return --> <xsl:text>B = </xsl:text><xsl:value-of select="/d:Demo/d:B"/> </xsl:if> <xsl:if test="/d:Demo/d:A &gt;= /d:Demo/d:B"> <xsl:text>Instance document is valid</xsl:text> </xsl:if> </xsl:template></xsl:stylesheet>

Demo.xsl (see example01/xslt)

This stylesheet implements constraint 3!

Output

• Here is the output that results when the stylesheet is run against demo.xml:

Error! A is less than BA = 10B = 20

Introduction to Schematron

• Schematron enables you to embed assertions (which are expressed using XPath syntax) within <appinfo> elements.

XMLSchema

SchematronExtract assertionsfrom <appinfo>elements

XMLData

Valid/invalid

Introduction to Schematron (cont.)

• State a constraint using an <assert> element

• A <rule> is comprised of one or more <assert> elements

• A <pattern> is comprised of one or more <rule> elements

<pattern> <rule> <assert> … </assert> <assert> … </assert> </rule></pattern>

The <assert> Element

• Use the assert element to state a constraint

<assert test="d:A &gt; d:B">A should be greater than B</assert>

XPath expression Text description of the constraint

In the <assert> element you express a constraint using an XPathexpression. Additionally, you state the constraint in naturallanguage. The later helps to make your schemas self-documenting.

The <rule> Element

• Use this element to specify the context for the <assert> elements.

<rule context="d:Demo"> <assert test="d:A &gt; d:B">A should be greater than B</assert></rule>

"Within the context of the Demo element, we assert that the Aelement should be greater than the B element."

The value of the context attribute is an XPath expression.

The <diagnostic> Element

• You can associate an <assert> element with a <diagnostic> element.

• Use the <diagnostic> element for printing error messages when the XML data fails the assertion. – In the <diagnostic> element you can use:

• Literal strings

• The <value-of …> element (from XSLT) to print out the value of elements

• The <diagnostic> elements are embedded within a <diagnostics> element.

• The diagnostics go after a <pattern> element.

<pattern name="Check A greater than B"> <rule context="d:Demo"> <assert test="d:A &gt; d:B" diagnostics="lessThan"> A should be greater than B </assert> </rule></pattern><diagnostics> <diagnostic id="lessThan"> Error! A is less than B A = <value-of select="d:A"/> B = <value-of select="d:B"/> </diagnostic></diagnostics>

Schematron Schema

• Recall that Schematron extracts the assertions, rules, etc out of the <appinfo> elements.

• It builds a Schematron schema from the stuff that it extracts.

XMLSchema Extract assertions,

rules, etc from the<appinfo> elements

SchematronSchema

Schematron

XMLData

Valid/invalid

Demo.xsd Demo.xml_sch

Demo.xml

Schematron Schema (cont.)

• Schematron requires all the Schematron elements (e.g., <assert>, <rule>, etc) to be qualified with the namespace prefix sch:– This is how the Schematron engine identifies

the Schematron elements.

<sch:pattern name="Check A greater than B"> <sch:rule context="d:Demo"> <sch:assert test="d:A &gt; d:B" diagnostics="lessThan"> A should be greater than B </sch:assert> </sch:rule></sch:pattern><sch:diagnostics> <sch:diagnostic id="lessThan"> Error! A is less than B A = <sch:value-of select="d:A"/> B = <sch:value-of select="d:B"/> </sch:diagnostic></sch:diagnostics>

Need to Inform Schematron of the Namespace of the XML Data• You need to inform Schematron of the

namespace of the XML data.

• Also, you need to tell it the namespace prefix that you are using in the <appinfo> section to identify the XML data elements.

• Use the <ns> element to do this.

<sch:ns prefix="d" uri="http://www.demo.org"/>

<?xml version="1.0"?><xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.demo.org" xmlns="http://www.demo.org" xmlns:sch="http://www.ascc.net/xml/Schematron” elementFormDefault="qualified"> <xsd:annotation> <xsd:appinfo> <sch:title>Schematron validation</sch:title> <sch:ns prefix="d" uri="http://www.demo.org"/> </xsd:appinfo> </xsd:annotation> <xsd:element name="Demo"> <xsd:annotation> <xsd:appinfo> <sch:pattern name="Check A greater than B"> <sch:rule context="d:Demo"> <sch:assert test="d:A &gt; d:B" diagnostics="lessThan">A should be greater than B</sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="lessThan"> Error! A is less than B. A = <sch:value-of select="d:A"/> B = <sch:value-of select="d:B"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo> </xsd:annotation> <xsd:complexType> <xsd:sequence> <xsd:element name="A" type="xsd:integer" minOccurs="1" maxOccurs="1"/> <xsd:element name="B" type="xsd:integer" minOccurs="1" maxOccurs="1"/> </xsd:sequence> </xsd:complexType> </xsd:element></xsd:schema>

See example01 Do Lab1

Example 2

• Constraint: Ensure that an Aircraft’s Elevation is greater than the Height of

all Obstacles <?xml version="1.0"?><Flight xmlns="http://www.aviation.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.aviation.org Aircraft.xsd"> <Aircraft type="Boeing 747"> <Elevation units="feet">3300</Elevation> </Aircraft> <Obstacle type="mountain"> <Height units="feet">26000</Height> </Obstacle></Flight>

Aircraft.xml (see example02)

Check that the aircraft is higherthan all obstacles.In this example itisn’t, so we want tocatch this (kind ofan important check,don’t you agree?)

<?xml version="1.0"?><xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:a="http://www.aviation.org" version="1.0"> <xsl:output method="text"/>

<xsl:template match="/"> <xsl:if test="/a:Flight/a:Aircraft/a:Elevation &lt; /a:Flight/a:Obstacle/a:Height"> <xsl:text>Danger! The aircraft is about to collide with a </xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/@type"/><xsl:text>!</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text>aircraft elevation = </xsl:text><xsl:value-of select="/a:Flight/a:Aircraft/a:Elevation"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/@type"/><xsl:text> height = </xsl:text> <xsl:value-of select="/a:Flight/a:Obstacle/a:Height"/> </xsl:if> <xsl:if test="/a:Flight/a:Aircraft/a:Elevation &gt;= /a:Flight/a:Obstacle/a:Height"> <xsl:text>Instance document is valid</xsl:text> </xsl:if> </xsl:template></xsl:stylesheet>

XSLT Implementation of Checking the Constraint

Aircraft.xsl (see example02)

Output

• Here is the output that results when the stylesheet is run against Aircraft.xml:

Danger! The aircraft is about to collide with a mountain!aircraft elevation = 3300mountain height = 26000

<xsd:annotation> <xsd:appinfo> <sch:pattern name="Check Aircraft Elevation is greater than Obstacle Height"> <sch:rule context="a:Flight"> <sch:assert test="a:Aircraft/a:Elevation &gt; a:Obstacle/a:Height" diagnostics="lessThan"> The Aircraft Elevation should be greater than Obstacle Height. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="lessThan"> Danger! The aircraft is about to collide with a <sch:value-of select="a:Obstacle/@type"/>... aircraft elevation = <sch:value-of select="a:Aircraft/a:Elevation"/>... <sch:value-of select="a:Obstacle/@type"/> height = <sch:value-of select="a:Obstacle/a:Height"/> </sch:diagnostics> </xsd:appinfo> </xsd:annotation>

Schematron Implementation of Checking the Constraint

(This annotation is embedded within Aircraft.xsd) Do Lab2

Example 3

• Constraint: Ensure that the value of the <Transportation> element is appropriate for the value specified by the mode attribute (see table below)

mode Transportation

air airplane,hot-air balloon

water boat,hovercraft

ground car,bicycle

<?xml version="1.0"?><JourneyToTibet xmlns="http://www.journey-to-tibet.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.journey-to-tibet.org journeyToTibet.xsd"> <Trip segment="1" mode="air"> <Transportation>boat</Transportation> </Trip> <Trip segment="2" mode="water"> <Transportation>boat</Transportation> </Trip> <Trip segment="3" mode="ground"> <Transportation>car</Transportation> </Trip> <Trip segment="4" mode="water"> <Transportation>hovercraft</Transportation> </Trip> <Trip segment="5" mode="air"> <Transportation>hot-air balloon</Transportation> </Trip></JourneyToTibet>

journeyToTibet.xml (see example03)

Need to check each Trip elementto ensure thatthe Transportationelement has a legal value for the given mode.In this examplethe first Trip(segment) isillegal and theremaining segments arelegal.

<xsl:template match="j:JourneyToTibet"> <xsl:for-each select="j:Trip"> <xsl:choose> <xsl:when test="(@mode='air') and (not((j:Transportation='airplane') or (j:Transportation='hot-air balloon')))"> <xsl:text>Error! When the mode is 'air' then Transportation must be either airplane or hot-air balloon</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> </xsl:when> <xsl:when test="(@mode='water') and (not((j:Transportation='boat') or (j:Transportation='hovercraft')))"> <xsl:text>Error! When the mode is 'water' then Transportation must be either boat or hovercraft</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> </xsl:when>

XSLT Implementation of Checking the Constraint

Cont. -->

<xsl:when test="(@mode='ground') and (not((j:Transportation='car') or (j:Transportation='bicycle')))"> <xsl:text>Error! When the mode is 'ground' then Transportation must be either car or bicycle</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> segment = </xsl:text><xsl:value-of select="@segment"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> mode = </xsl:text><xsl:value-of select="@mode"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> Transportation = </xsl:text><xsl:value-of select="j:Transportation"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> </xsl:when> <xsl:otherwise> <xsl:text>segment </xsl:text><xsl:value-of select="@segment"/><xsl:text> is valid.</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> </xsl:otherwise> </xsl:choose> </xsl:for-each> </xsl:template>

journeyToTibet.xsl (see example03)

Output

• Here is the output that results when the stylesheet is run against journeyToTibet.xml:

Error! When the mode is 'air' then Transportation must be either airplane or hot-air balloon segment = 1 mode = air Transportation = boat

segment 2 is valid.

segment 3 is valid.

segment 4 is valid.

segment 5 is valid.

<xsd:annotation> <xsd:appinfo> <sch:pattern name="Check the Transportation value is appropriate for the given mode"> <sch:rule context="j:Trip"> <sch:assert test="((@mode='air') and (j:Transportation='airplane')) or ((@mode='air') and (j:Transportation='hot-air balloon')) or ((@mode='water') and (j:Transportation='boat')) or ((@mode='water') and (j:Transportation='hovercraft')) or ((@mode='ground') and (j:Transportation='car')) or ((@mode='ground') and (j:Transportation='bicycle'))" diagnostics="wrongTransportationForTheMode"> If the mode is air then Transportation should be either airplane or hot-air balloon. If the mode is water then Transportation should be either boat or hovercraft. If the mode is ground then Transportation should be either car or bicycle. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="wrongTransportationForTheMode"> Error! Transportation does not have a legal value for the given mode. segment = <sch:value-of select="@segment"/> mode = <sch:value-of select="@mode"/> Transportation = <sch:value-of select="j:Transportation"/> </sch:diagnostic> </sch:diagnostics> </xsd:appinfo></xsd:annotation>

(This annotation is embedded within journeyToTibet.xsd)

Schematron Implementation of Checking the Constraint

<xsd:appinfo> <sch:pattern name="Check the Transportation value is appropriate for the given mode"> <sch:rule context="j:Trip[@mode='air']"> <sch:assert test="(j:Transportation='airplane') or (j:Transportation='hot-air balloon')" diagnostics="wrongTransportationForTheMode"> If the mode is air then Transportation should be either airplane or hot-air balloon. </sch:assert> </sch:rule> <sch:rule context="j:Trip[@mode='water']"> <sch:assert test="(j:Transportation='boat') or (j:Transportation='hovercraft')" diagnostics=" wrongTransportationForTheMode "> If the mode is water then Transportation should be either boat or hovercraft. </sch:assert> </sch:rule> <sch:rule context="j:Trip[@mode='ground']"> <sch:assert test="(j:Transportation='car') or (j:Transportation='bicycle')" diagnostics=" wrongTransportationForTheMode "> If the mode is ground then Transportation should be either car or bicycle. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id=" wrongTransportationForTheMode "> Error! Transportation does not have a legal value for the given mode. segment = <sch:value-of select="@segment"/>. mode = <sch:value-of select="@mode"/>. Transportation = <sch:value-of select="j:Transportation"/>. </sch:diagnostic> </sch:diagnostics></xsd:appinfo>

Alternate Schematron Implementation

(see journeyToTibet_v2.xsd in example03)

Observation

• In this last example there is a definite difference in the complexity of implementations. Schematron provides a much simpler way of expressing the constraints than XSLT.

Do Lab3

Example 4

• Constraint: Ensure that the value of the <PaymentReceived> equals the value of <PaymentDue>, where these two elements are in different documents.

Invoice.xml

<PaymentDue>…</PaymentDue>

CheckingAccount.xml

<PaymentReceived>…</PaymentReceived>

Check that these are equal.

<?xml version="1.0"?><Invoice xmlns="http://www.invoice.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.invoice.org Invoice.xsd">

<PaymentDue currency="USD">199.00</PaymentDue></Invoice>

Invoice.xml (see example04)

<?xml version="1.0"?><CheckingAccount xmlns="http://www.checking-account.org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation= "http://www.checking-account.org CheckingAccount.xsd">

<PaymentReceived currency="USD">79.00</PaymentReceived></CheckingAccount>

CheckingAccount.xml (see example04)

<xsl:template match="i:Invoice"> <xsl:variable name="checking-account" select="document('file://localhost/.../CheckingAccount.xml')"/> <xsl:if test="not(i:PaymentDue = $checking-account/c:CheckingAccount/c:PaymentReceived)"> <xsl:text>Payment due is NOT equal to payment received!</xsl:text> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> Payment due = </xsl:text><xsl:value-of select="i:PaymentDue"/> <xsl:text>&#xD;&#xA;</xsl:text> <xsl:text> Payment received = </xsl:text> <xsl:value-of select="$checking-account/c:CheckingAccount/c:PaymentReceived"/> </xsl:if> <xsl:if test="i:PaymentDue = $checking-account/c:CheckingAccount/c:PaymentReceived"> <xsl:text>Instance document is valid</xsl:text> </xsl:if> </xsl:template>

XSLT Implementation of Checking the Constraint

CheckDocuments.xsl (see example04)

Output

• Here is the output that results when the stylesheet is run against Invoice.xml and CheckingAccount.xml:

Payment due is NOT equal to payment received! Payment due = 199.00 Payment received = 79.00

<xsd:appinfo> <sch:pattern name="Check payment due equals payment received"> <sch:rule context="i:Invoice"> <sch:assert test="i:PaymentDue = document('file://.../CheckingAccount.xml')/c:CheckingAccount/c:PaymentReceived" diagnostics="amountsDiffer"> Payment due should equal payment received. </sch:assert> </sch:rule> </sch:pattern> <sch:diagnostics> <sch:diagnostic id="amountsDiffer"> Payment due is NOT equal to payment received! ... Payment due = <sch:value-of select="i:PaymentDue"/>... Payment received = <sch:value-of select="document('file://.../CheckingAccount.xml')/c:CheckingAccount/c:PaymentReceived"/> </sch:diagnostic> </sch:diagnostics></xsd:appinfo>

Schematron Implementation of Checking the Constraint

(This appinfo is embedded within Invoice.xsd)

Advantages of using XSLT/XPath to Implement

Additional Constraints– Application Specific Constraint Checking: Each application can

create its own stylesheet to check constraints that are unique to the application. Thus, each application can enhance the schema without touching it!

– Core Technology: XSLT/XPath is a "core technology" which is well supported, well understood, and with lots of material written on it.

– Expressive Power: XSLT/XPath is a very powerful language. Most, if not every, constraint that you might ever need to express can be expressed using XSLT/XPath. Thus you don't have to learn multiple schema languages to express your additional constraints

– Long Term Support: XSLT/XPath is well supported, and will be

around for a long time.

Disadvantages of using XSLT/XPath to Implement

Additional Constraints– Separate Documents: With this approach you write your XML

Schema document, then you write a separate XSLT/XPath document to express additional constraints. Keeping the two documents in synch needs to be carefully managed

– Overkill? On the previous slide it was stated that an advantage of using XSLT/XPath is that it a very rich, expressive language. As a language for expressing constraints perhaps it is too much. As we have seen with these examples, only a few XSLT/XPath elements were needed to express all the contraints. So XSLT/XPath gives a lot of unnecessary overhead.

Advantages of using Schematron to Implement Additional

Constraints– Collocated Constraints: We saw how Schematron could be used to

express additional constraints. We saw that you embed the Schematron directives within the XML Schema document. There is something very appealing about having all the constraints expressed within one

document rather than being dispersed over multiple documents. – Application Specific Constraint Checking: In all of our examples we

showed the Schematron elements embedded within the XSD document. However, that is not necessary. Alternatively, you can create a standalone Schematron schema. With this later approach applications can then create a Schematron schema to express their unique constraints.

– Simpler: The Schematron vocabulary is very simple, comprised of a handful of elements. As we have seen, with this handful of elements we are able to express all the desired assertions.

Disadvantages of using Schematron to Implement

Additional Constraints– Multiple Schema Languages may be Required: From the

examples shown in this tutorial it appears that Schematron is sufficiently powerful to express any additional constraints that you might have. It is possible, however, that there are constraints that it cannot express, or cannot express easily. Consequently, you may need to supplement Schematron with another language to express

all the additional constraints.

Inside Schematron

• The “Schematron engine” is actually just an XSL stylesheet!

XMLSchema Extract assertions,

rules, etc from the<appinfo> elements

SchematronSchema

validate.xsl

XMLData

valid/invalid

Demo.xsd Demo.xml_sch

Demo.xml

XSD2Schtrn.xsl

Convert to a stylesheet

Schematron-basic.xslskeleton1-5.xsl

XSLProcessor