Advanced Databases - XML
Advanced Databases
Mourad Gridach
Program: Computer Engineering. EST d’Agadir
Academic Year: 2019 - 2020
XML Databases
Outline
- Well-Formed XML
  - DOM
  - SAX
- Displaying XML
  - CSS
  - XSLT
- Validating XML
  - DTD
  - XSD
- Querying XML
  - XPath
  - XQuery
Introduction
- XML: eXtensible Markup Language
- Semi-structured (hierarchical) data model
- Increasingly popular
- Based on trees
- Standard for data representation and exchange
- Document format similar to HTML
  - Tags describe content instead of formatting
XML Document Example
Basic Concepts
- Tagged elements (nested)
- Attributes
- Text
Relational Model versus XML

                 Relational                        XML
Structure        Tables                            Hierarchy, trees, graphs
Schema           Fixed in advance                  Flexible (see the earlier example)
Queries          Simple (relational algebra, SQL)  Complex
Implementation   RDBMS                             Newer, integrated into most RDBMSs
“Well-Formed” XML (1)
- An XML document is well-formed if it respects some basic structural requirements:
  - Single root element ("Bookstore" in the example below)
  - Matched tags, proper nesting
  - Unique attributes within each element
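The three requirements above can be checked with any XML parser. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the helper name `is_well_formed` and the tiny Bookstore strings are assumptions made for illustration, not the course's demo files.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical miniature documents, one per requirement.
good = '<Bookstore><Book ISBN="ISBN-22" Price="100"><Title>AI</Title></Book></Bookstore>'
two_roots = '<Bookstore/><Bookstore/>'                                  # more than one root
bad_nesting = '<Bookstore><Book><Title>AI</Book></Title></Bookstore>'   # tags overlap
duplicate_attribute = '<Bookstore><Book Price="100" Price="90"/></Bookstore>'

for name, doc in [("good", good), ("two_roots", two_roots),
                  ("bad_nesting", bad_nesting),
                  ("duplicate_attribute", duplicate_attribute)]:
    print(name, is_well_formed(doc))
```

Only the first document is accepted; each of the other three breaks exactly one of the requirements.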
“Well-Formed” XML (2)
[Diagram: an XML document goes into an XML parser, which outputs either the parsed XML or a "Not well-formed" error]
- Examples of XML parsers
  - DOM (Document Object Model)
  - SAX (Simple API for XML)
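To make the difference between the two parsing styles concrete, here is a small sketch that extracts book titles twice with Python's standard library: once with DOM (`xml.dom.minidom`, which builds the whole tree in memory) and once with SAX (`xml.sax`, which fires events while reading). The sample document and the names `titles_with_dom`, `titles_with_sax` and `TitleHandler` are assumptions chosen for illustration.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document.
DOC = ("<Bookstore><Book><Title>AI</Title></Book>"
       "<Book><Title>Databases</Title></Book></Bookstore>")


def titles_with_dom(xml_text: str) -> list:
    """DOM: parse the whole document into a tree, then navigate it."""
    dom = xml.dom.minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("Title")]


class TitleHandler(xml.sax.ContentHandler):
    """SAX: react to start/end/text events as the parser streams the input."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._inside_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Title":
            self._inside_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so it is accumulated.
        if self._inside_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Title":
            self.titles.append("".join(self._buffer))
            self._inside_title = False


def titles_with_sax(xml_text: str) -> list:
    handler = TitleHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.titles


print(titles_with_dom(DOC))
print(titles_with_sax(DOC))
```

Both functions return the same titles. The DOM version is shorter, while the SAX version never holds the full tree in memory, which matters for large documents.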
Displaying XML Document (1)
- Use a rule-based language to translate XML to HTML
  - Cascading Style Sheets (CSS)
  - Extensible Stylesheet Language (XSL)
Displaying XML Document (2)
[Diagram: the XML document (data) is parsed, and a CSS/XSL interpreter applies the rules to the parsed XML data to produce an HTML document for viewing]
Validating XML
DTDs and XSD
“Validating” an XML Document
- Adheres to the basic structural requirements (single root, matched tags, proper nesting, unique attributes)
- Also adheres to a content-specific specification
  - Document Type Definition (DTD)
  - XML Schema Definition (XSD)
[Diagram: an XML document and a DTD or XSD go into a validating XML parser, which outputs either the parsed XML or a "Not valid" error]
Document Type Definition (DTD)
- Grammar-like language for specifying elements, attributes, nesting, ordering, and number of occurrences
[Figure: example XML document and its DTD]
Examples - Demos
- Validating an XML document:
  - xmllint --valid --noout Bookstore-DTD.xml
- Examples
  1. Change “Edition CDATA #IMPLIED>” to “Edition CDATA #REQUIRED>”.
  2. Add “Edition="2nd"” to the document.
  3. Switch the order of "First_Name" and "Last_Name":
     - <First_Name>Jeffrey</First_Name>
     - <Last_Name>Ullman</Last_Name>
Examples (2)
1. Add “remark” to the first book :
<remark> </remark>
2. Add a magazine :
<Magazine Month="January" Year="2015">
<Title> Artificial Intelligence Magazine </Title>
</Magazine>
XML Schema (XSD)
Validating XML Documents
XML Schema (Reminder)
[Diagram: an XML document and a DTD or XSD go into a validating XML parser, which outputs either the parsed XML or a "Not valid" error]
XML Schema Definition (XSD)
- Extensive language
- Like DTDs, can specify elements, attributes, nesting, ordering, and number of occurrences
- Also data types, keys, (typed) pointers, and more
  - An XSD is itself written in XML
  - The XSD is kept in a separate file from the XML document (unlike a DTD, which can be embedded in the document)
Basic Concepts
- Data types
- Keys
- References
- Occurrences
Data Type
- In an XML Schema, we can specify the data type of a value (integer, string, etc.)
- In the Bookstore example, "Price" is declared as an integer.
- If we change the value of "Price" to a non-numeric string, validation fails with an error.
Keys
- The equivalent of primary keys in the relational model
- Declared with the keyword "key"
- Key values must be unique
- "@ISBN" is a key for "Book".
- "@Ident" is a key for "Author".
- Remark: if two authors have the same "@Ident", validation reports an error.
References
- Declared with the keyword "keyref"
- Each reference value must match the value of an existing key
- The fields are selected with XPath (see later)
- The attribute "@authIdent" must reference "AuthorKey".
- The attribute "@book" must reference "BookKey".
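The meaning of the key and reference constraints can be illustrated without an XSD validator. The sketch below is not XML Schema validation; it is a hand-written Python check of the same two rules (key values are unique, every reference points to an existing key). The document layout, the "Auth" element name, and the helper names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical document: "Auth" and the overall layout are assumed, not taken from the course files.
DOC = """
<Bookstore>
  <Book ISBN="ISBN-22" Price="100">
    <Title>AI</Title>
    <Authors><Auth authIdent="JU"/><Auth authIdent="XX"/></Authors>
  </Book>
  <Author Ident="JU"><First_Name>Jeffrey</First_Name><Last_Name>Ullman</Last_Name></Author>
  <Author Ident="JU"><First_Name>Jane</First_Name><Last_Name>Doe</Last_Name></Author>
</Bookstore>
"""


def duplicate_keys(root, element, attribute):
    """Key rule: return the attribute values that appear on more than one element."""
    counts = Counter(e.get(attribute) for e in root.iter(element)
                     if e.get(attribute) is not None)
    return sorted(value for value, n in counts.items() if n > 1)


def dangling_references(root, ref_attribute, key_element, key_attribute):
    """Reference rule: return reference values that match no existing key."""
    keys = {e.get(key_attribute) for e in root.iter(key_element)}
    return sorted(e.get(ref_attribute) for e in root.iter()
                  if e.get(ref_attribute) is not None
                  and e.get(ref_attribute) not in keys)


root = ET.fromstring(DOC)
print(duplicate_keys(root, "Author", "Ident"))                    # key violation
print(dangling_references(root, "authIdent", "Author", "Ident"))  # reference violation
```

In this sample, "JU" is reported as a duplicated key and "XX" as a reference with no matching author, which are the two situations a validating parser would reject.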
Occurrences (1)
- Indicates the number of times an element can appear at a given position in an XML document
- By default, it is 1.
- "minOccurs" specifies the minimum number of times an element must appear.
- "maxOccurs" specifies the maximum number of times an element may appear.
Occurrences (2) - Example
- For an author, maxOccurs = "unbounded": a book can have many authors.
- No "minOccurs" is given: the default is 1, so at least one author is required.
- For a remark, minOccurs = "0": the remark is optional.
- No "maxOccurs" is given: the default is 1, so at most one remark is allowed.
Examples - Demos
- For validation, we use the command:
  xmllint --schema Bookstore.xsd --noout Bookstore-xsd.xml
- Examples
  1. Change the value of "Price" to a string and try to validate the document:
     <Book ISBN = "ISBN-0-13-713526-2" Price = "hi">
  2. Test the keys: replace <Author Ident="JU"> with <Author Ident="HG">.
Querying XML
N.B. The demos are done using “editiX-XML Editor”, which can be downloaded for free.
Introduction
- Not nearly as mature as querying relational data
  - Newer
  - No underlying algebra
- We will explore:
  - XPath: path expressions plus conditions.
  - XQuery: built on XPath (close to SQL)
  - Others: XSLT, XLink, XPointer (outside the scope of this course).
Querying XML
XPath
XPath
- XPath: path expressions plus conditions.
  - The XML document is viewed as a tree:

Bookstore
  Book
    @ISBN = "ISBN-22"
    @Price = "100"
    Title = "AI"
    Authors
      Author
        Last_Name = "Ullman"
        First_Name = "Jeff"
      Author
  Book
  Magazine
XPath: Basic Concepts (1)
1. Expressions
- "/" (slash): the root element, or a separator between steps.
  - Example: Bookstore/Book
- "*": matches any element.
  - Example: Bookstore/Book/*
- "@Att": selects an attribute.
  - Example: @ISBN
- "//": any descendant, at any depth (the underlying axis also includes the current element).
  - Example: Bookstore/Book//* selects all elements below "Book": "Title", "Authors", "Author", "Last_Name", "First_Name".
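These basic expressions can be tried out in Python with the standard `xml.etree.ElementTree` module, which supports a subset of XPath. The following is a sketch under two assumptions: the sample document is a made-up miniature Bookstore, and paths are written relative to the root element (so `/Bookstore/Book/Title` becomes `Book/Title`), because that is how ElementTree's `findall` works.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature Bookstore document.
DOC = """
<Bookstore>
  <Book ISBN="ISBN-22" Price="100">
    <Title>AI</Title>
    <Authors>
      <Author><First_Name>Jeff</First_Name><Last_Name>Ullman</Last_Name></Author>
    </Authors>
  </Book>
  <Magazine Month="January" Year="2015">
    <Title>Artificial Intelligence Magazine</Title>
  </Magazine>
</Bookstore>
"""

root = ET.fromstring(DOC)  # root is the <Bookstore> element

# /Bookstore/Book/Title
book_titles = [t.text for t in root.findall("Book/Title")]

# /Bookstore/*/Title  ("*" matches any child element, here Book and Magazine)
any_titles = [t.text for t in root.findall("*/Title")]

# //Title  (titles at any depth)
all_titles = [t.text for t in root.findall(".//Title")]

# /Bookstore/Book/@ISBN  (ElementTree cannot select attribute nodes,
# so the attribute is read from the selected elements instead)
isbns = [b.get("ISBN") for b in root.findall("Book")]

# /Bookstore/Book//*  (every element below Book, in document order)
below_book = [e.tag for e in root.findall("Book//*")]

print(book_titles, any_titles, all_titles, isbns, below_book)
```

A full XPath engine is needed for features outside this subset, such as numeric comparisons or functions like `contains()`.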
XPath: Basic Concepts (2)
2. Conditions
- Conditions are written between square brackets: "[condition]".
- Example: [@Price < 50]
3. Built-in functions (many)
  - "contains(s1, s2)": true if the string "s1" contains "s2".
  - "name()": returns the tag name of the current element.
XPath: Basic Concepts (3)
4. Navigation axes (13 axes)
  - "parent::": selects the parent element.
  - "following-sibling::": selects the siblings that come after the current element.
  - "descendant::": selects the descendants.
  - "self::": selects the current element.
XPath Examples (1)
- Titles of books:
  - /Bookstore/Book/Title
- We can use "*":
  - /Bookstore/*/Title
- All the titles:
  - //Title
- All the elements:
  - //*
XPath Examples (2)
- The "ISBN" of each book:
  - /Bookstore/Book/@ISBN
- Books that cost less than 80 DH:
  - /Bookstore/Book[@Price < 80]
- Titles of books that cost less than 80 DH:
  - /Bookstore/Book[@Price < 80]/Title
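Python's built-in ElementTree does not evaluate numeric conditions such as `[@Price < 80]`, so one way to reproduce the last two queries is to select the books with a path and apply the condition in Python. This is a sketch of that idea, with a made-up two-book document and a hypothetical helper name `books_cheaper_than`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """<Bookstore>
  <Book ISBN="ISBN-1" Price="70"><Title>Databases</Title></Book>
  <Book ISBN="ISBN-22" Price="100"><Title>AI</Title></Book>
</Bookstore>"""

root = ET.fromstring(DOC)


def books_cheaper_than(root, limit):
    """Python stand-in for /Bookstore/Book[@Price < limit]."""
    # Attribute values are strings, so they are converted before comparing.
    return [b for b in root.findall("Book") if float(b.get("Price")) < limit]


# Stand-in for /Bookstore/Book[@Price < 80]/Title
cheap_titles = [b.findtext("Title") for b in books_cheaper_than(root, 80)]
print(cheap_titles)
```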
XPath Examples (3)
- Titles of books that have a remark:
  - /Bookstore/Book[Remark]/Title
- Titles of books that cost less than 90 DH and have "Ullman" as an author:
  - /Bookstore/Book[@Price < 90 and Authors/Author/Last_Name= "Ullman"]/Title
- Titles of books whose remark contains the word "great":
  - //Book[contains(Remark, "great")]/Title
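The three queries above combine an existence test, a conjunction, and a string function. The sketch below mirrors each of them in Python on a made-up three-book document; only the `[Remark]` existence test is handled by ElementTree itself, the other two conditions are written as ordinary Python filters.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """<Bookstore>
  <Book Price="85">
    <Title>Databases</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors>
    <Remark>A great introduction</Remark>
  </Book>
  <Book Price="100">
    <Title>AI</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors>
  </Book>
  <Book Price="50">
    <Title>Networks</Title>
    <Authors><Author><Last_Name>Doe</Last_Name></Author></Authors>
    <Remark>Dense reading</Remark>
  </Book>
</Bookstore>"""

root = ET.fromstring(DOC)
books = root.findall("Book")

# /Bookstore/Book[Remark]/Title : "[Remark]" keeps books that have a Remark child
with_remark = [b.findtext("Title") for b in root.findall("Book[Remark]")]

# /Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman"]/Title
cheap_ullman = [
    b.findtext("Title") for b in books
    if float(b.get("Price")) < 90
    and any(n.text == "Ullman" for n in b.findall("Authors/Author/Last_Name"))
]

# //Book[contains(Remark, "great")]/Title : a missing Remark counts as an empty string
great = [b.findtext("Title") for b in root.iter("Book")
         if "great" in (b.findtext("Remark") or "")]

print(with_remark, cheap_ullman, great)
```

Note that the comparison `Last_Name = "Ullman"` in XPath is true as soon as one of the selected last names matches, which is why the Python version uses `any(...)`.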
Querying XML
XQuery
XQuery
- Expression language (compositional)
- Each expression operates on and returns a sequence of elements
- XPath is one type of expression used by XQuery
- More complex than SQL
XQuery: The FLWOR Expression
  for $var in expr       (: iterates over the N items produced by expr :)
  let $var := expr       (: binds the whole value of expr to $var :)
  where condition
  order by expr
  return expr            (: returns M elements as the result :)
- "where" and "order by" are optional; "return" is mandatory, and the XQuery grammar also expects at least one "for" or "let" clause.
- "for" and "let" can be repeated and interleaved.
- Close to SQL
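For readers who know a programming language better than XQuery, each FLWOR clause has a rough everyday counterpart. The sketch below is an analogy in Python, not an XQuery implementation: it evaluates a made-up query ("titles of books with more than one author, ordered by title") over a hypothetical document, and the comments show the XQuery clause each line stands for.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """<Bookstore>
  <Book><Title>Databases</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author>
             <Author><Last_Name>Doe</Last_Name></Author></Authors></Book>
  <Book><Title>AI</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors></Book>
  <Book><Title>Compilers</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author>
             <Author><Last_Name>Roe</Last_Name></Author></Authors></Book>
</Bookstore>"""

root = ET.fromstring(DOC)


def coauthored_titles(root):
    rows = []
    for b in root.findall("Book"):              # for $b in /Bookstore/Book
        n = len(b.findall("Authors/Author"))    # let $n := count($b/Authors/Author)
        if n > 1:                               # where $n > 1
            rows.append((b.findtext("Title"), n))
    rows.sort(key=lambda row: row[0])           # order by $b/Title
    return rows                                 # return the title and $n


print(coauthored_titles(root))
```

The "for" clause produces one binding per book, the "let" clause computes one value per binding, and the result has at most as many items as there are bindings that pass the "where" condition.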
Mixing Queries and XML
- <Result> { …query goes here… } </Result>
- The expression inside { …query… } is evaluated by the XQuery processor, and its result is placed inside the element.
XQuery Example (1)
- Titles of books that cost less than 90 DH and have "Ullman" as an author:
    for $b in /Bookstore/Book
    where $b/@Price < 90
      and $b/Authors/Author/Last_Name = "Ullman"
    return $b/Title
- In this case, the "expr" of the "for" clause is an XPath expression.
XQuery Example (2)
- Titles and prices of books, ordered by price:
    for $b in /Bookstore/Book
    order by xs:int($b/@Price)
    return <Book>
             { $b/Title }
             <Price> { $b/data(@Price) } </Price>
           </Book>
- "xs:int($b/@Price)" converts the string value of "@Price" to an integer, so the ordering is numeric.
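The reason for the integer conversion is easy to see by running the same idea in Python. This sketch (sample data and the function name `books_by_price` are assumptions) sorts books by price, builds a new result element for each one, and contrasts the numeric order with what a plain string comparison would give.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
DOC = """<Bookstore>
  <Book Price="85"><Title>Databases</Title></Book>
  <Book Price="100"><Title>AI</Title></Book>
  <Book Price="60"><Title>Compilers</Title></Book>
</Bookstore>"""

root = ET.fromstring(DOC)


def books_by_price(root):
    """Build <Result> with one <Book><Title/><Price/></Book> per book, cheapest first."""
    result = ET.Element("Result")
    # int(...) plays the role of xs:int($b/@Price) in the query.
    for b in sorted(root.findall("Book"), key=lambda b: int(b.get("Price"))):
        out = ET.SubElement(result, "Book")
        ET.SubElement(out, "Title").text = b.findtext("Title")
        ET.SubElement(out, "Price").text = b.get("Price")
    return result


ordered = books_by_price(root)
numeric_order = [b.findtext("Price") for b in ordered.findall("Book")]
string_order = sorted(b.get("Price") for b in root.findall("Book"))

print(ET.tostring(ordered, encoding="unicode"))
print(numeric_order, string_order)
```

Compared as strings, "100" sorts before "60" because the comparison goes character by character, which is the mistake the conversion avoids.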
XQuery Examples (3)
- Last names of all authors:
    for $n in //Last_Name
    return $n
- Remark: the result contains duplicates.
XQuery Examples (4)
- Use the function "distinct-values()" to remove the duplicates:
    for $n in distinct-values(//Last_Name)
    return $n
- Change the display:
    for $n in distinct-values(//Last_Name)
    return <Last_Name> {$n} </Last_Name>
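The same two steps, removing duplicates and then wrapping each value in an element again, can be sketched in Python. The sample document is made up, and `dict.fromkeys` is used here only as a convenient way to drop duplicates while keeping first-seen order; XQuery itself does not promise any particular order for `distinct-values()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in which one author appears twice.
DOC = """<Bookstore>
  <Book><Authors>
    <Author><Last_Name>Ullman</Last_Name></Author>
    <Author><Last_Name>Doe</Last_Name></Author>
  </Authors></Book>
  <Book><Authors>
    <Author><Last_Name>Ullman</Last_Name></Author>
  </Authors></Book>
</Bookstore>"""

root = ET.fromstring(DOC)

# for $n in //Last_Name return $n  (duplicates are kept)
names = [n.text for n in root.iter("Last_Name")]

# distinct-values(//Last_Name) : plain string values, duplicates removed
distinct = list(dict.fromkeys(names))

# return <Last_Name> {$n} </Last_Name> : wrap each value in an element again
elements = []
for value in distinct:
    e = ET.Element("Last_Name")
    e.text = value
    elements.append(ET.tostring(e, encoding="unicode"))

print(names, distinct, elements)
```

Because `distinct-values()` returns plain values rather than elements, the tags disappear from the output, which is why the last query rebuilds them explicitly.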
Conclusion
- Querying XML is still less mature than querying relational data
  - Newer
  - No underlying algebra behind the languages
- We explored:
  - XPath: path expressions plus conditions.
  - XQuery: built on XPath (close to SQL)