© Prof. Dr. Stefan Böttcher - http://wwwcs.upb.de/cs/boettcher – invited talk at UTS - 29/03/06 - XML Data Management / 1
XML Data Management
Prof. Dr. Stefan Böttcher
University of Paderborn (Germany)
1. XML standards: XML , DTD , XML Schema
2. DOM , SAX , XPath
3. XSLT
Data centric XML - XML data storage
<doc>
  <order>
    <customer> Alice </customer>
    <PC> pc400 </PC>
  </order>
  <order>
    <customer> Bob </customer>
    <PC> pc500 </PC>
  </order>
  <order>
    <customer> Carla </customer>
    <PC> pc600 </PC>
  </order>
</doc>

%      customer  PC
order( Alice     pc400 ).
order( Bob       pc500 ).
order( Carla     pc600 ).

[Figure labels: doc (root of the tree), markup (tag), content]
eXtensible Markup Language (XML)
XML - a family of standards:
XML (eXtensible Markup Language): data format exchangeable across different operating systems, applications, and enterprises; often used for content
XPath: path expressions used for navigation in XML trees; used within other XML standards (e.g. XSL(T))
XSL (eXtensible Stylesheet Language): used to describe layout of content / to convert data
many more standards: XQuery (queries), DTD (type definition), XML Schema (integrity constraints)
Unique Standard for Content
DTD or XML Schema:
defines structure of all XML trees exchanged
=> unique data format for all participants
data formats exchangeable across company borders
New data exchange formats and languages based on XML, example:
ebXML (E-Business XML) as a basis for OTA (OpenTravel Alliance)
data exchange between travel agency, airline etc.
Consequence of these standards: (economic) force to use the standard
Separation of content and layout
content (product1.xml)
content (product2.xml)
layout (customer1.xsl)
layout (technican2.xsl)
HTML file
combines requested data with requested layout
Separation of content and layout (2)
consequences:
• 1 (content) data source for different layouts (technician, seller, customer, re-seller, ...)
• layout may change without changing content (different logo, different seller or customer, different employee or job, new view of data)
• reuse 1 layout for different content (frame with company logo, ...)
• content may change without changing layout (new prices, …)
XML on Java servers
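• XML + XSL separate layout and content
• layout (.xsl file)
• content data (.xml file)
• combine them in the web server
[Figure: client/server diagram. The browser (client) calls a servlet (server). The servlet takes an XML file and an XSL file as input, transforms XML+XSL into HTML, and returns the generated HTML page to the browser.]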
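The server-side step (XML file plus XSL file in, HTML out) can be sketched with the standard javax.xml.transform API. This is only a minimal sketch, not the servlet from the talk: the class name XmlXslToHtml, the helper transform, and the inline XML and XSL strings are assumptions made for illustration. A real servlet would read the two files and write the result to the HTTP response.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XmlXslToHtml {
    // Hypothetical content (would normally come from the .xml file).
    static final String XML =
        "<order><customer>Alice</customer><PC>pc400</PC></order>";

    // Hypothetical layout (would normally come from the .xsl file).
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/\">"
      + "<html><body>"
      + "<p>customer is <xsl:value-of select=\"order/customer\"/></p>"
      + "<p>PC is <xsl:value-of select=\"order/PC\"/></p>"
      + "</body></html>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Combines content and layout and returns the generated HTML page.
    static String transform(String xml, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(XML, XSL));
    }
}
```

Swapping in a different XSL string changes the layout without touching the content, which is the separation the slide describes.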
XML syntax
XML - Prolog:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="xmlbsp1.xsl"?>
  version="1.0": version
  encoding="iso-8859-1": character set
  standalone="yes": without DTD !
  xml-stylesheet: used stylesheet (only inside IE5)
XML - main part:
<order>
  <customer> Alice </customer>
  <PC> pc400 </PC>
</order>
  <customer>: start tag
  <customer> Alice </customer>: element
  Alice: text node
  </customer>: end tag
XML syntax (2)
In the XML main part:
<offers>
  <offer supplier="vobis" item="pc500" ></offer>
  <offer supplier="IBM" item="pc600" />
</offers>
  supplier, item: attributes
  "vobis", "pc500": attribute values
  />: end of tag (no text)
  <offer ...></offer>: element that (arbitrarily) has no text node
XML syntax (3)
all tags must be closed ( <tag> ... </tag> or <singleTag /> )
incorrectly nested tags not allowed ( <tag1> <tag2> ... </tag1> </tag2> )
case-sensitive ( <tag> different from <Tag> )
attribute values must be quoted ( e.g. <p align="center"> )
text must be enclosed in elements
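The rules above are exactly what a parser checks when it decides whether a document is well-formed. The following is a minimal sketch using the standard JAXP DocumentBuilder. The class name WellFormedCheck and the helper isWellFormed are assumptions introduced for illustration, not part of the talk.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedCheck {
    // Returns true if the parser accepts the text as well-formed XML.
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // a well-formedness rule was violated
        }
    }

    public static void main(String[] args) throws Exception {
        String[] samples = {
            "<tag> ... </tag>",                    // closed tag
            "<singleTag />",                       // empty-element tag
            "<tag1><tag2> ... </tag1></tag2>",     // incorrectly nested
            "<tag> ... </Tag>",                    // case mismatch
            "<p align=center></p>",                // unquoted attribute value
            "text outside any element"             // text not enclosed
        };
        for (String s : samples) {
            System.out.println(isWellFormed(s) + "  " + s);
        }
    }
}
```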
XML document as a tree
<doc>
  <customer name="Alice">
    <order> ... </order>
    <address> </address>
  </customer>
  <customer>
    <order/>
    <address/>
  </customer>
</doc>

doc
  customer (name = "Alice")
    order
    address
  customer
    order
    address
XML node types
7 kinds of nodes:
root - has no parent node
element
text - leaf node (has no child node)
attribute - leaf node (has no child node)
comment - leaf node (has no child node)
namespace - leaf node (has no child node)
processing-instruction - leaf node (has no child node)
DTD and XML Schema
DTD (the older standard):
+ defines the structure (nesting of tags) of the documents
    <customer>
      <order>
        <item> …
+ defines structural dependencies, e.g. every order contains at least one item element
XML Schema (the newer standard) additionally:
+ binds XML elements to types defined in the XML Schema
+ defines domains
+ defines integrity constraints
Document-Type-Definition (DTD)
<!-- DTD xmlbsp2d.dtd for example xmlbsp2d.xml -->
<!ELEMENT orders ( order )* >          root element, arbitrarily many order elements
<!ELEMENT order ( customer , PC ) >    sequence required
<!ELEMENT customer (#PCDATA) >         parsed char data
<!ELEMENT PC (#PCDATA) >

<?xml version="1.0" encoding="iso-8859-1" standalone="no"?>
<!DOCTYPE orders SYSTEM "xmlbsp2d.dtd">
<?xml-stylesheet type="text/xsl" href="xmlbsp2.xsl"?>
<orders>
  <order>
    <customer> Alice </customer>
    <PC> pc400 </PC>
  </order>
  <order> ... </order>
</orders>
Element declarations in DTDs
<!ELEMENT PC (#PCDATA) >           text (no elements)
<!ELEMENT offer EMPTY >            empty
<!ELEMENT supplies (offer) >       1 sub-element
<!ELEMENT offers (offer)* >        ? 0 or 1, * arbitrarily many, + at least 1 sub-element
<!ELEMENT order (customer,PC) >    sequence
<!ELEMENT payment (cash|card) >    choice
<!ELEMENT E ((A|B)*,C,(D)?)+ >     parenthesis
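A validating parser enforces such declarations. The sketch below checks small documents against the orders DTD from the previous slide, embedded as an internal subset so the example is self-contained. The class name DtdValidation and the helper isValid are assumptions for illustration; the JAXP calls (setValidating, setErrorHandler) are the standard ones.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class DtdValidation {
    // The orders DTD, written as an internal subset of the document.
    static final String DTD =
        "<!DOCTYPE orders [\n"
      + "  <!ELEMENT orders (order)* >\n"
      + "  <!ELEMENT order (customer, PC) >\n"
      + "  <!ELEMENT customer (#PCDATA) >\n"
      + "  <!ELEMENT PC (#PCDATA) >\n"
      + "]>\n";

    // Returns true if DTD + body parses without validity or syntax errors.
    static boolean isValid(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored */ }
            // Validity violations are reported as (recoverable) errors.
            public void error(SAXParseException e) throws SAXException { throw e; }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        try {
            builder.parse(new InputSource(new StringReader(DTD + body)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // sequence (customer, PC) respected
        System.out.println(isValid(
            "<orders><order><customer>Alice</customer><PC>pc400</PC></order></orders>"));
        // wrong order of sub-elements
        System.out.println(isValid(
            "<orders><order><PC>pc400</PC><customer>Alice</customer></order></orders>"));
    }
}
```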
Attribute Declarations in DTDs
<!-- DTD xmlbsp2d.dtd for the example xmlbsp2d.xml -->
<!ELEMENT offers (offer)* >                     root element, arbitrarily many offer elements
<!ELEMENT offer EMPTY >                         empty
<!ATTLIST offer supplier CDATA #REQUIRED        type (char data), attribute must occur
                item     CDATA #REQUIRED >

<offers>
  <offer supplier="vobis" item="pc500" ></offer>
  <offer supplier="IBM" item="pc600" />
</offers>
Part 1 (XML) - summary
• XML : tree structure for content
• DTD : structure definition
• XML-Schema additionally: type checking and logical consistency checking
well documented standards
http://www.w3c.org
Part 2: Search and Navigation in XML Documents
• DOM - Parser
• SAX - Parser
• the XML Path language XPath
Axes in XML document trees (1)
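[Figure: document tree with the nodes doc, customer, order, address, PC, user manual and the attribute @nr, illustrating the axes self, child, parent, ancestor, ancestor-or-self, descendant, descendant-or-self, following, following-sibling and attribute.]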
Axes in XML document trees (2)
<doc>
  <customer> … </customer>
  <customer>
    <name> … </name>
    <order> ... </order>
    <address> … </address>
  </customer>
  <customer> … </customer>
</doc>
[Figure: tree of this document (doc; three customer nodes; name, order and address below the middle customer) with the regions self::, ancestor::, descendant::, preceding:: and following:: marked.]
Axes in XML document trees (3)
The following axes select for a given context node:
• child:: its child nodes
• descendant:: its descendants (= children and their descendants)
• parent:: the parent node (only root does not have a parent)
• ancestor:: nodes on the path to the root (= parent and its ancestors)
• following-sibling:: siblings (nodes with identical parent) following in document order (empty for attribute and namespace nodes)
• preceding-sibling:: inverse of following-sibling (empty for attribute and namespace nodes)
• following:: all nodes following the context node in document order (excluding descendant, attribute and namespace nodes)
• preceding:: all nodes preceding the context node in document order (excluding ancestor, attribute and namespace nodes)
• attribute:: its attributes (empty for each non-element node)
• namespace:: its namespace nodes (empty for each non-element node)
Axes in XML document trees (4)
More axes that can be used in XPath expressions:
the following axes select for a given context node:
• self:: the context node itself
• descendant-or-self:: the context node and its descendants
• ancestor-or-self:: the context node and its ancestors
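The axes can be tried out with the standard javax.xml.xpath API. In this minimal sketch the document, the class name AxesDemo and the helper names are assumptions chosen for illustration; the context node is the order element of the first customer, and each axis is appended to the path that reaches it.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class AxesDemo {
    // Hypothetical document, similar to the trees on the slides.
    static final String XML =
        "<doc>"
      + "<customer><name>Alice</name><order><PC/></order><address/></customer>"
      + "<customer><order/></customer>"
      + "</doc>";

    // Path to the context node: the order of the first customer.
    static final String CONTEXT = "/doc/customer[1]/order";

    // Evaluates an XPath expression and returns the names of the
    // selected nodes, separated by blanks.
    static String names(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(nodes.item(i).getNodeName());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String[] axes = {
            "child", "parent", "ancestor", "descendant-or-self",
            "following-sibling", "preceding-sibling", "following", "preceding"
        };
        for (String axis : axes) {
            System.out.println(axis + ":: " + names(CONTEXT + "/" + axis + "::*"));
        }
    }
}
```

For this document, child:: yields PC, parent:: yields customer, following-sibling:: yields address and preceding-sibling:: yields name; preceding:: also yields only name, because the ancestors doc and customer are excluded from that axis.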
The Document Object Model (DOM)
XML document as a tree in main memory:
+ the program can navigate arbitrarily from node to node; easy to program
- consumes much memory
- long loading time until document is in main memory

doc
  customer (name = "Alice")
    order
    address
  customer
    order
    address
DOM Parser Java API (1)
DOMParser parser = new DOMParser();        // instantiate parser
try {
  parser.parse(uri);                       // parse text found at uri
  Document doc = parser.getDocument();     // get document root
  recurseNodes(doc, …);                    // work on document
} catch (Exception e) { … }

public void recurseNodes(Node node, …)     // recursively on all nodes
{ … ;
  switch (node.getNodeType())              // depending on node type
  { case Node.DOCUMENT_NODE: …             // if root node …
    case Node.ELEMENT_NODE: …              // if element node …
    case Node.TEXT_NODE: …                 // if text node …
  }
}
DOM-Parser-Java-API (2)
public void recurseNodes(Node node, …)               // recursively on all nodes
{ …
  String name = node.getNodeName();                   // read element name
  …
  NodeList nodes = node.getChildNodes();              // collect all children
  for (int i=0; i<nodes.getLength(); i++)
    recurseNodes(nodes.item(i), "");                  // call each child node
  …
  NamedNodeMap attributes = node.getAttributes();     // get attribute list (null for non-element nodes)
  if (attributes != null)
    for (int i=0; i<attributes.getLength(); i++) {
      Node current = attributes.item(i);              // get 1 attribute
      System.out.print(" " + current.getNodeName() +  // attribute name
        "=\"" + current.getNodeValue() +              // attribute value
        "\"");                                        // quote attribute value
    }
}
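The two fragments above can be combined into a complete program. The following is a minimal sketch that swaps in the standard JAXP DocumentBuilder for the DOMParser class used on the slides; the class name DomWalk, the helper walk and the choice to collect the output in a StringBuilder are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomWalk {
    // Visits the node and, recursively, all of its descendants.
    static void recurseNodes(Node node, StringBuilder out) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attributes = node.getAttributes();
                for (int i = 0; i < attributes.getLength(); i++) {
                    Node current = attributes.item(i);
                    out.append(' ').append(current.getNodeName())
                       .append("=\"").append(current.getNodeValue()).append('"');
                }
                out.append('>');
                break;
            }
            case Node.TEXT_NODE:
                out.append(node.getNodeValue());
                break;
            default:
                break; // document node, comments, ...: nothing to print
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            recurseNodes(children.item(i), out);
        }
    }

    // Loads the whole document into memory, then walks the tree.
    static String walk(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        StringBuilder out = new StringBuilder();
        recurseNodes(doc, out);
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(walk(
            "<doc><customer name=\"Alice\"><order>pc400</order></customer></doc>"));
    }
}
```

Because the complete tree is available before recurseNodes starts, the traversal order is the program's choice; visiting children before printing the element, for example, would only require reordering the statements.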
Simple API for XML (SAX)

Parser accesses at most one XML element node at a time:
- can navigate and process nodes only in document order; less flexible programming than DOM
+ needs less space in main memory
+ loading document nodes into main memory is fast

Elements in the order in which the parser reaches them:
 1.  <doc>
 2.    <customer name="Alice">
 3.      <order>
 4. 5. 6.  ...
         </order>
 7.      <address>
         </address>
       </customer>
 8.    <customer>
 9.      <order/>
10.      <address/>
       </customer>
     </doc>
SAX-Parser-Java-API
// Parser calls this procedure once, when parsing the document starts
public void startDocument() throws SAXException { … }

// SAX parser calls this once for each start tag of an element
public void startElement( String namespaceURI, String localName,
                          String qName, Attributes atts)
                          throws SAXException { …  // code example:
  for(int i=0; i<atts.getLength(); i++) {          // for each attribute
    out.println( atts.getQName(i) + "=\"" + atts.getValue(i)+"\"");
  }                                                // output attribute name and attribute value
  … }

// SAX parser calls this once for each end tag of an element
public void endElement( String namespaceURI, String localName,
                        String qName) throws SAXException { … }

// SAX parser calls this once when end of document is reached
public void endDocument() throws SAXException { … }
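The callbacks above become a runnable program once they are placed in a handler class and passed to a parser. This is a minimal sketch using the standard JAXP SAXParserFactory and DefaultHandler; the class name SaxTrace, the helper trace and the log format are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class SaxTrace extends DefaultHandler {
    // Collects one line per callback, in the order the parser issues them.
    private final StringBuilder log = new StringBuilder();

    @Override
    public void startDocument() {
        log.append("startDocument\n");
    }

    @Override
    public void startElement(String namespaceURI, String localName,
                             String qName, Attributes atts) {
        log.append("start ").append(qName);
        for (int i = 0; i < atts.getLength(); i++) {   // for each attribute
            log.append(' ').append(atts.getQName(i))
               .append("=\"").append(atts.getValue(i)).append('"');
        }
        log.append('\n');
    }

    @Override
    public void endElement(String namespaceURI, String localName, String qName) {
        log.append("end ").append(qName).append('\n');
    }

    @Override
    public void endDocument() {
        log.append("endDocument\n");
    }

    // Parses the text and returns the sequence of callbacks as a string.
    static String trace(String xml) throws Exception {
        SaxTrace handler = new SaxTrace();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(trace(
            "<doc><customer name=\"Alice\"><order/><address/></customer></doc>"));
    }
}
```

The handler never sees the document as a whole: whatever it needs later (here, the log) it has to store itself, which is the price for the low memory footprint.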
Navigation along axes of an XML document
XML document
Axes:
child-axis
  /child::doc/child::customer/child::order
  /doc/customer/order
attribute-axis
  /child::doc/child::customer/attribute::name
  /doc/customer/@name

doc
  customer (name = "Alice")
    order
    address
  customer
    order
    address
XML Path language XPath (1)
/    root node
.    current context node
/ child::doc / child::customer      absolute path (starting at root)
. / child::order / child::PC        relative path (starting at current context node)
Each part between slashes (e.g. child::order) is a location step; the whole path is an XPath expression.

doc
  customer (name = "Alice")
    order
      PC
    address
XPath (2): Retrieval of XML data
XML document
XPath expression:
/ child::doc / child::customer [attribute::name="Alice"] / child::order
  child::customer [attribute::name="Alice"]: location step
  [attribute::name="Alice"]: filter expression

doc
  customer (name = "Alice")
    order
    address
  customer
    order
    address
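A filter expression like the one above can be evaluated with the standard javax.xml.xpath API. In this minimal sketch the document content (two customers with one order each), the class name XPathFilterDemo and the helper select are assumptions for illustration. The two-argument evaluate call returns the string value of the first selected node.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathFilterDemo {
    // Hypothetical document with one order per customer.
    static final String XML =
        "<doc>"
      + "<customer name=\"Alice\"><order>pc400</order><address/></customer>"
      + "<customer name=\"Bob\"><order>pc500</order><address/></customer>"
      + "</doc>";

    // Returns the text of the first node selected by the expression.
    static String select(String expr) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expr, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        // long form, as on the slide
        System.out.println(select(
            "/child::doc/child::customer[attribute::name=\"Alice\"]/child::order"));
        // abbreviated form of the same kind of query
        System.out.println(select("/doc/customer[@name=\"Bob\"]/order"));
    }
}
```

The first call prints pc400 and the second prints pc500: the filter keeps only the customer whose name attribute matches before the next location step is applied.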
XPath (3) – Location steps
XPath-Location-Expression ::=
    LocationStep1 / … / LocationStepN       (relative path)
  | / LocationStep1 / … / LocationStepN     (absolute path)
e.g. child::customer [attribute::name="Alice"] / descendant::order

LocationStepI ::= Axis-Specifier '::' NodeTest ( '[' FilterExpression ']' )*

examples (given in long form):
child::customer [attribute::name="Alice"]
parent::*                                    node test is always successful (*)
descendant-or-self::address
ancestor-or-self::* [descendant::customer]
XPath (4): Node (name) tests
axis-specifier::Ename    selects only elements (or attributes) with the name Ename that are reachable from the context node through the specified axis
axis-specifier::*        selects all elements (or attributes) that are reachable from the context node through the specified axis
example:
descendant-or-self::customer    selects all customer descendant nodes of the current context node (and the context node itself, if it is a customer element)
Summary XML-Navigation and XPath
• XML : tree structure of data, includes axes
• DTD and XML-Schema : type checking, consistency checking
• DOM : XML parser - loads the complete document into main memory
• SAX : XML parser - works in document order
• XPath : declarative path language, supports qualified search using filters
documentation sources at http://www.w3c.org
Part 3:
Transformation of XML Documents using XSLT
XML and XSL - examples (1)
XML:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="xmlbsp1.xsl"?>
<order>
  <customer>Alice</customer>
  <PC>pc400</PC>
</order>
XSL:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
</xsl:stylesheet>
[Figure: result of XML+XSL]
XSL default templates for elements, …
default template for elements and the root:
<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>
transform inner nodes

default template for text nodes and attribute nodes:
<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>
shows text values and attribute values
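The effect of these default templates can be observed by running a stylesheet that defines no templates of its own. The following is a minimal sketch with the standard javax.xml.transform API; the class name DefaultTemplatesDemo and the helper apply are assumptions for illustration, and the stylesheet adds an xsl:output declaration with method="text" (not on the slides) so that the result is plain text without an XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class DefaultTemplatesDemo {
    // Stylesheet without any template: only the default templates apply.
    static final String EMPTY_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given document.
    static String apply(String xsl, String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Elements are traversed, text nodes are copied: prints Alicepc400
        System.out.println(apply(EMPTY_XSL,
            "<order><customer>Alice</customer><PC>pc400</PC></order>"));
    }
}
```

The element template only recurses and the text template copies values, so the output is the concatenation of all text nodes in document order. Attribute values would not appear, because xsl:apply-templates without a select attribute visits child nodes only.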
XML and XSL - examples (1a)
XML:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<order>
  <customer>Alice</customer>
  <PC>pc400</PC>
</order>
XSL:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="order/*">
    found a successor node of order
  </xsl:template>  <!-- does not visit child nodes of visited nodes ! -->
</xsl:stylesheet>
[Figure: result of XML+XSL]
XML and XSL - examples (1b)
XML:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<order>
  <customer>Alice</customer>
  <PC>pc400</PC>
</order>
XSL:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="*">      <!-- applicable to each element (Tag) -->
    node found
    <xsl:apply-templates/>      <!-- continue with child nodes -->
  </xsl:template>               <!-- including text nodes (PCDATA) -->
</xsl:stylesheet>
[Figure: result of XML+XSL]
XML and XSL - example (1c)
XSL:
<xsl:stylesheet ...>
  <xsl:template match="*">                    <!-- applicable to each element node -->
    node found: <xsl:value-of select="."/>    <!-- show text included by current node -->
    its successor node: <xsl:apply-templates/> <!-- process child nodes too -->
  </xsl:template>                             <!-- text (PCDATA) nodes are processed too -->
</xsl:stylesheet>
[Figure: result of XML+XSL]
XML and XSL - example (1e)
XML:
<?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="xmlbsp1e.xsl"?>
<order>
  <customer>Alice</customer>
  <PC>pc400</PC>
</order>
XSL:
<?xml version="1.0" encoding="iso-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <xsl:template match="/">
    <html> <body>
      customer and PC – in HTML
      customer is <xsl:value-of select="order/customer"/>
      PC is <xsl:value-of select="order/PC"/>
    </body> </html>
  </xsl:template>
</xsl:stylesheet>
[Figure: result of XML+XSL]
Node types & XSL default templates
default template for elements and the root:
<xsl:template match = "*|/"><xsl:apply-templates/>
</xsl:template>
transform inner nodes
default template for text nodes and attributes:
<xsl:template match="text()|@*">
  <xsl:value-of select="."/>
</xsl:template>
show text values and attribute values
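The effect of these two default templates can be mimicked outside XSLT. The following is a minimal Python sketch, not an XSLT processor: it assumes the standard-library module xml.etree.ElementTree, and the function name apply_default_templates is hypothetical. Elements are only traversed, and text nodes are copied to the output, so a stylesheet without any templates of its own yields the concatenated text of the document.

```python
import xml.etree.ElementTree as ET


def apply_default_templates(elem):
    """Mimic the default rules: elements recurse, text nodes are copied."""
    out = [elem.text or ""]  # text node before the first child, if any
    for child in elem:
        # rule for "*|/": process the child nodes
        out.append(apply_default_templates(child))
        # rule for "text()": copy the text node that follows the child
        out.append(child.tail or "")
    return "".join(out)


doc = ET.fromstring("<order><customer>Alice</customer><PC>pc400</PC></order>")
result = apply_default_templates(doc)
print(result)  # Alicepc400
```

Attribute values do not appear in this sketch: the default rule for attributes only fires when attributes are explicitly selected, which a plain xsl:apply-templates does not do.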
Node types and XSL default templates (2)
default template for comments and processing instructions:
<xsl:template match="comment()|processing-instruction()"></xsl:template>
do nothing with comments and processing instructions
default behaviour for namespace nodes:
do not output namespace nodes
HTML file and HTML output
<html>
<body>
<table width="100%" border="1">
<tr>
  <td> customer : </td><td> PC : </td>
</tr>
<tr>
  <td>Alice</td><td>pc500</td>
</tr>
<tr>
  <td>Bob</td><td>pc600</td>
</tr>
</table>
</body>
</html>
XSLT stylesheet and HTML output
first template:
<html>
<body>
<table width="100%" border="1">
<tr>
  <td> customer : </td><td> PC : </td>
</tr>
start here for every customer node
repeat (separate template):
<tr>
  <td> Name of customer </td><td> PC of customer </td>
</tr>
until here for every customer node
</table>
</body>
</html>
XSLT stylesheet and HTML output
<xsl:template match="/">
<html>
<body>
<table width="100%" border="1">
<tr>
  <td> customer : </td><td> PC : </td>
</tr>
<xsl:apply-templates/> <!-- work on inner nodes too -->
</table>
</body>
</html>
</xsl:template>
<xsl:template match="order"> <!-- for every order node -->
<tr>
  <td> <xsl:value-of select="customer"/> </td>
  <td> <xsl:value-of select="PC"/> </td>
</tr>
</xsl:template> <!-- example program xmlbsp2.xsl -->
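The same two-step structure (one frame for the table, one row per order node) can be sketched in an application program. This is a Python approximation using xml.etree.ElementTree, not the author's xmlbsp2.xsl. The sample data and the function names order_row and to_html are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed sample input, shaped like the order documents on the earlier slides.
XML = """<doc>
  <order><customer>Alice</customer><PC>pc500</PC></order>
  <order><customer>Bob</customer><PC>pc600</PC></order>
</doc>"""


def order_row(order):
    """Counterpart of the template matching "order": one table row."""
    customer = order.findtext("customer", default="").strip()
    pc = order.findtext("PC", default="").strip()
    return f"<tr><td>{escape(customer)}</td><td>{escape(pc)}</td></tr>"


def to_html(xml_text):
    """Counterpart of the template matching "/": frame plus all rows."""
    root = ET.fromstring(xml_text)
    rows = "\n".join(order_row(o) for o in root.iter("order"))
    return (
        '<html><body><table width="100%" border="1">\n'
        "<tr><td> customer : </td><td> PC : </td></tr>\n"
        f"{rows}\n"
        "</table></body></html>"
    )


html = to_html(XML)
print(html)
```

Unlike the stylesheet, the sketch drops the whitespace between elements instead of copying it through the default text template.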
Generating an SQL script file with XSLT
<xsl:template match="/">
  create table order( customer char(10) , PC char(10) ) ;
  <xsl:apply-templates/>
</xsl:template>
<xsl:template match="order">
  insert into order values( '<xsl:value-of select="customer"/>' ,
                            '<xsl:value-of select="PC"/>' ) ;
</xsl:template>
example program: xmlbsp2sql.xsl
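To check that a script generated this way really loads into a database, the idea can be sketched in Python with xml.etree.ElementTree and sqlite3. This is an illustration under stated assumptions, not the xmlbsp2sql.xsl program: the sample data and the helper names sql_literal and to_sql_script are hypothetical. Because order is a reserved word in SQL, the sketch uses the table name orders instead.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumed sample input, shaped like the order documents on the earlier slides.
XML = """<doc>
  <order><customer>Alice</customer><PC>pc500</PC></order>
  <order><customer>Bob</customer><PC>pc600</PC></order>
</doc>"""


def sql_literal(value):
    """Quote a string as an SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def to_sql_script(xml_text, table="orders"):
    """Emit one create statement, then one insert per order element."""
    root = ET.fromstring(xml_text)
    lines = [f"create table {table}( customer char(10) , PC char(10) ) ;"]
    for order in root.iter("order"):
        customer = order.findtext("customer", default="").strip()
        pc = order.findtext("PC", default="").strip()
        lines.append(
            f"insert into {table} values( "
            f"{sql_literal(customer)} , {sql_literal(pc)} ) ;"
        )
    return "\n".join(lines)


script = to_sql_script(XML)
print(script)

# Run the generated script against an in-memory database to verify it.
conn = sqlite3.connect(":memory:")
conn.executescript(script)
rows = conn.execute(
    "select customer, PC from orders order by customer"
).fetchall()
print(rows)
```

When the target database is reachable directly, parameterized inserts are usually preferable to generated string literals. The script form is kept here because the slide is about producing a script file.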
Advantages of XML and XSL
Generate HTML on a web server: transform: data.xml + layout.xsl -> x.html
XML is
• transformable by XSL files
• transformable by application programs (e.g. written in Java)
• compressible and storable as compact data (zip, …)
• exchangeable across applications, devices, enterprises, …
• combinable with very many other languages (C#, ...)
• ...
More advantages of XSL
database -> XML -> WML or pdf or ps or HTML
database -> XML -> database
XML document 1 -> XML document 2
[Figure labels: original database (DB), database query, XML1, XSL, HTML, WML, XML2, other company, other database (DB)]
Flat XML database mappings
flat database -> XML mapping:
database -> <database> … </database>
table -> <table> … </table>
row -> <row> … </row>
attribute value -> <value> … </value>
advantage(s):
+ easy to understand
+ easy to implement
[Figure: tree with root "database", a "table" node below it, "row" nodes below the table, and "value" leaves below each row]
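The flat mapping is simple enough to sketch in a few lines. The following Python example assumes the standard-library modules sqlite3 and xml.etree.ElementTree. The table name orders, its sample rows, and the function name table_to_xml are hypothetical and serve only as an illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table_name):
    """Map one table to <database><table><row><value>…</value></row></table></database>."""
    database = ET.Element("database")
    table = ET.SubElement(database, "table")
    # table_name is assumed to be a trusted identifier, not user input.
    for db_row in conn.execute(f"select * from {table_name} order by rowid"):
        row = ET.SubElement(table, "row")
        for cell in db_row:
            ET.SubElement(row, "value").text = str(cell)
    return database


conn = sqlite3.connect(":memory:")
conn.execute("create table orders( customer char(10), PC char(10) )")
conn.executemany(
    "insert into orders values(?, ?)",
    [("Alice", "pc400"), ("Bob", "pc500")],
)

xml_text = ET.tostring(table_to_xml(conn, "orders"), encoding="unicode")
print(xml_text)
```

The generic element names make the mapping easy to implement, but table and column names do not survive in the XML unless they are added, for example as attributes.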
XSL stylesheet generates database script
create table order( … )
start here for every customer
insert into order values ( name of customer , order of customer , address of customer ) ;
until here for every customer
[Figure: a database query on the DB yields XML1; the XSL file ("layout" and XPath expressions selecting content from the XML file) generates a database script (DB script), which is loaded into a DB]