Advanced Databases - XML

Advanced Databases: XML Databases. Mourad Gridach. Program: Computer Engineering, EST d’Agadir. Academic Year: 2019-2020.

Page 1: Advanced Databases - XML

Advanced Databases

Mourad Gridach

Program: Computer Engineering, EST d’Agadir

Academic Year: 2019-2020

XML Databases

Page 2: Advanced Databases - XML

Outline

• Well-Formed XML
  - DOM
  - SAX

• Displaying XML
  - CSS
  - XSLT

• Validating XML
  - DTD
  - XSD

• Querying XML
  - XPath
  - XQuery

Page 3: Advanced Databases - XML

Introduction

• XML: eXtensible Markup Language

• Semi-structured (hierarchical) data model

• Increasingly popular

• Based on trees

• Standard for data representation and exchange

• Document format similar to HTML
  - Tags describe content instead of formatting

Page 4: Advanced Databases - XML

XML Document Example

Page 5: Advanced Databases - XML

Basic Concepts

• Tagged elements (nested)

• Attributes

• Text
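
A minimal Python sketch of these three concepts, using the standard-library `xml.etree.ElementTree` module. It is not part of the original slides, and the Bookstore data is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical Bookstore document, used only for illustration.
DOC = """<Bookstore>
  <Book ISBN="ISBN-0-13-713526-2" Price="100" Edition="3rd">
    <Title>A First Course in Database Systems</Title>
    <Authors>
      <Author>
        <First_Name>Jeffrey</First_Name>
        <Last_Name>Ullman</Last_Name>
      </Author>
    </Authors>
  </Book>
</Bookstore>"""

root = ET.fromstring(DOC)
book = root.find("Book")           # a tagged element nested inside <Bookstore>

print(book.tag)                    # element name
print(book.attrib)                 # attributes, as a dict of strings
print(book.find("Title").text)     # text content of the nested <Title>
```

Attribute values always come back as strings here; any typing has to be imposed separately, for example by a schema.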

Page 6: Advanced Databases - XML

Relational Model versus XML

                  Relational            XML
Structure         Tables                Hierarchy, trees, graphs
Schema            Fixed in advance      Flexible (see the earlier example)
Queries           Simple (RA, SQL)      Complex
Implementation    RDBMS                 Newer; integrated into most RDBMSs

Page 7: Advanced Databases - XML

“Well-Formed” XML (1)

• A well-formed XML document respects some basic structural requirements:
  - Single root element (“Bookstore” in the running example)
  - Matched tags, proper nesting
  - Unique attributes within elements
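
The three requirements can be checked with any XML parser. The sketch below is an illustration (not from the slides) that relies on Python's standard `xml.etree.ElementTree`, whose parser raises `ParseError` on input that is not well-formed; the helper name `is_well_formed` and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True when the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per requirement; all strings are made up for illustration.
samples = {
    "ok": "<Bookstore><Book ISBN='1'><Title>AI</Title></Book></Bookstore>",
    "two roots": "<Book/><Book/>",
    "bad nesting": "<Book><Title>AI</Book></Title>",
    "duplicate attribute": "<Book ISBN='1' ISBN='2'/>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```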

Page 8: Advanced Databases - XML

“Well-Formed” XML (2)

[Diagram: an XML document is fed to an XML parser, which outputs either the parsed XML or a “not well-formed” error]

• Examples of XML parsers
  - DOM (Document Object Model)
  - SAX (Simple API for XML)
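
To make the difference between the two concrete, here is a small sketch (not from the slides) using Python's standard `xml.dom.minidom` and `xml.sax` modules. DOM builds the whole tree in memory and lets the program navigate it; SAX streams through the document and calls back on each event. The class name `TitleHandler` and the sample data are assumptions made for the example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document, used only for illustration.
DOC = ("<Bookstore><Book><Title>AI</Title></Book>"
       "<Book><Title>Databases</Title></Book></Bookstore>")

# DOM: parse everything first, then query the in-memory tree.
dom = minidom.parseString(DOC)
dom_titles = [t.firstChild.data for t in dom.getElementsByTagName("Title")]

# SAX: react to start/end/text events while the document streams by.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "Title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        if self._in_title:
            self._buffer.append(content)   # text may arrive in several chunks

    def endElement(self, name):
        if name == "Title":
            self.titles.append("".join(self._buffer))
            self._in_title = False

handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)

print(dom_titles)
print(handler.titles)
```

Both approaches recover the same titles; SAX keeps only the state the handler chooses to store, which matters for large documents.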

Page 9: Advanced Databases - XML

Displaying XML Document (1)

• Use a rule-based language to translate XML to HTML:
  - Cascading Style Sheets (CSS)
  - Extensible Stylesheet Language (XSL)

Page 10: Advanced Databases - XML

Displaying XML Document (2)

[Diagram: the XML document (data) passes through a parser into a CSS/XSL interpreter, which applies the rules and produces an HTML document to look at]

Page 11: Advanced Databases - XML

XML Data

Validating XML

DTDs and XSD

Page 12: Advanced Databases - XML

“Validating” an XML Document

• Adheres to the basic structural requirements (single root, matched tags, proper nesting, unique attributes)

• Also adheres to a content-specific specification:
  - Document Type Definition (DTD)
  - XML Schema (XSD, XML Schema Definition)

Page 13: Advanced Databases - XML

“Validating” an XML Document

[Diagram: the XML document and a DTD or XSD are fed to a validating XML parser, which outputs either the parsed XML or a “not valid” error]

Page 14: Advanced Databases - XML

Document Type Definition (DTD)

• Grammar-like language for specifying elements, attributes, nesting, ordering, and number of occurrences

[Figure: an XML document shown alongside its DTD]
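
As an illustration of what such a grammar can look like, the sketch below embeds a DTD in a small Bookstore document and, when the `xmllint` tool is installed, validates it with the `--valid --noout` flags. The DTD and the data are assumptions written for this example, not the course's actual files, and Python's standard library does not validate against DTDs itself, which is why the check is delegated to `xmllint`.

```python
import os
import shutil
import subprocess
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD (illustrative, not the course file).
DOC = """<?xml version="1.0"?>
<!DOCTYPE Bookstore [
  <!ELEMENT Bookstore (Book | Magazine)*>
  <!ELEMENT Book (Title, Authors, Remark?)>
  <!ATTLIST Book ISBN CDATA #REQUIRED
                 Price CDATA #REQUIRED
                 Edition CDATA #IMPLIED>
  <!ELEMENT Magazine (Title)>
  <!ATTLIST Magazine Month CDATA #REQUIRED
                     Year CDATA #REQUIRED>
  <!ELEMENT Title (#PCDATA)>
  <!ELEMENT Authors (Author+)>
  <!ELEMENT Remark (#PCDATA)>
  <!ELEMENT Author (First_Name, Last_Name)>
  <!ELEMENT First_Name (#PCDATA)>
  <!ELEMENT Last_Name (#PCDATA)>
]>
<Bookstore>
  <Book ISBN="ISBN-0-13-713526-2" Price="100">
    <Title>A First Course in Database Systems</Title>
    <Authors>
      <Author>
        <First_Name>Jeffrey</First_Name>
        <Last_Name>Ullman</Last_Name>
      </Author>
    </Authors>
  </Book>
</Bookstore>
"""

def validate_with_xmllint(text):
    """Return True/False from xmllint, or None when xmllint is not installed."""
    if shutil.which("xmllint") is None:
        return None
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as handle:
        handle.write(text)
    try:
        done = subprocess.run(
            ["xmllint", "--valid", "--noout", handle.name], capture_output=True
        )
        return done.returncode == 0   # non-zero exit code: not valid
    finally:
        os.remove(handle.name)

print(ET.fromstring(DOC).tag)         # the document is at least well-formed
print(validate_with_xmllint(DOC))     # True, False, or None without xmllint
```

In this DTD, `(Title, Authors, Remark?)` fixes the order of the children and makes the remark optional, `Author+` requires at least one author, and `#IMPLIED` marks the `Edition` attribute as optional.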

Page 15: Advanced Databases - XML

Examples - Demos

• Validating an XML document:
  - xmllint --valid --noout Bookstore-DTD.xml

• Examples

1. Change “Edition CDATA #IMPLIED>” to “Edition CDATA #REQUIRED>”.

2. Add “Edition="2nd"” to the document.

3. Switch the order of “First_Name” and “Last_Name”:
  - <First_Name>Jeffrey</First_Name>
  - <Last_Name>Ullman</Last_Name>

Page 16: Advanced Databases - XML

Examples (2)

1. Add “remark” to the first book :

<remark> </remark>

2. Add a magazine :

<Magazine Month="January" Year="2015">

<Title> Artificial Intelligence Magazine </Title>

</Magazine>


Page 17: Advanced Databases - XML

XML Schema (XSD)

Validating XML Documents

Page 18: Advanced Databases - XML

XML Schema (Reminder)

[Diagram: the XML document and a DTD or XSD are fed to a validating XML parser, which outputs either the parsed XML or a “not valid” error]

Page 19: Advanced Databases - XML

XML Schema Definition (XSD)

• Extensive language

• Like DTDs, can specify elements, attributes, nesting, ordering, and number of occurrences

• Also data types, keys, (typed) pointers, and more
  - An XSD is itself written in XML
  - The XSD is kept in a file separate from the XML document (unlike a DTD, which can be embedded in the document)

Page 20: Advanced Databases - XML

Basic Concepts

• Data types

• Keys

• References

• Occurrences

Page 21: Advanced Databases - XML

Data Type

• In an XML Schema, we can specify the data type of a value (integer, string, etc.)

• In the Bookstore schema, “Price” is an integer.

• If we change the value of “Price” to a non-numeric string, validation reports an error.
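
A sketch of how a typed attribute can be declared, and of what the type check amounts to. The XSD fragment and the name `BookType` are assumptions for illustration, not the course's schema, and the `conforms` helper is only a rough stand-in for a real validator such as `xmllint --schema`: Python's standard library can read the XSD (it is ordinary XML) but cannot validate against it.

```python
import re
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema fragment: Price is declared as an integer.
XSD_FRAGMENT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="BookType">
    <xs:attribute name="ISBN" type="xs:string" use="required"/>
    <xs:attribute name="Price" type="xs:integer" use="required"/>
  </xs:complexType>
</xs:schema>"""

# An XSD is itself XML, so an ordinary XML parser can read the declarations.
schema = ET.fromstring(XSD_FRAGMENT)
declared = {a.get("name"): a.get("type") for a in schema.iter(XS + "attribute")}

def conforms(value: str, xsd_type: str) -> bool:
    """Rough emulation of a type check; only xs:integer is actually tested."""
    if xsd_type == "xs:integer":
        return re.fullmatch(r"[+-]?[0-9]+", value.strip()) is not None
    return True   # xs:string (and anything else) is accepted in this sketch

print(declared)
print(conforms("100", declared["Price"]))   # an integer literal
print(conforms("hi", declared["Price"]))    # not an integer
```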

Page 22: Advanced Databases - XML

Keys

• The equivalent of primary keys in the relational model

• Declared with the keyword “key”

• Key values must be unique

• “@ISBN” is a key for “Book”.

• “@Ident” is a key for “Author”.

• Remark: if two authors have the same “@Ident”, we get a validation error.

Page 23: Advanced Databases - XML

References

• Declared with the keyword “keyref”

• Each referencing value must match an existing key value

• The selector and field are written in XPath (see later)

• The attribute “@authIdent” must be a reference to “AuthorKey”.

• The attribute “@book” must be a reference to “BookKey”.
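
The sketch below shows what key and keyref declarations can look like in XSD, and emulates the two checks in plain Python. The names `BookKey`, `AuthorKey`, `@ISBN`, `@Ident`, and `@authIdent` come from the slides; the `Auth` element, the name `AuthorKeyRef`, the sample data, and the `check` helper are assumptions. This is an emulation for intuition only (it handles just simple child paths and attribute fields), not a schema validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical identity constraints attached to the <Bookstore> element.
CONSTRAINTS = """<xs:element name="Bookstore"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:key name="BookKey">
    <xs:selector xpath="Book"/>
    <xs:field xpath="@ISBN"/>
  </xs:key>
  <xs:key name="AuthorKey">
    <xs:selector xpath="Author"/>
    <xs:field xpath="@Ident"/>
  </xs:key>
  <xs:keyref name="AuthorKeyRef" refer="AuthorKey">
    <xs:selector xpath="Book/Authors/Auth"/>
    <xs:field xpath="@authIdent"/>
  </xs:keyref>
</xs:element>"""

def values(root, constraint):
    """Collect the field values of the nodes picked by the selector."""
    selector = constraint.find(XS + "selector").get("xpath")
    field = constraint.find(XS + "field").get("xpath").lstrip("@")
    return [node.get(field) for node in root.findall(selector)]

def check(doc_text):
    """Return a list of key/keyref violations (empty when none are found)."""
    root = ET.fromstring(doc_text)
    spec = ET.fromstring(CONSTRAINTS)
    errors, keys = [], {}
    for key in spec.findall(XS + "key"):
        vals = values(root, key)
        keys[key.get("name")] = set(vals)
        if len(vals) != len(set(vals)):          # keys must be unique
            errors.append(f"duplicate value for key {key.get('name')}")
    for ref in spec.findall(XS + "keyref"):
        for v in values(root, ref):              # references must resolve
            if v not in keys[ref.get("refer")]:
                errors.append(f"{ref.get('name')}: {v!r} matches no {ref.get('refer')}")
    return errors

GOOD = """<Bookstore>
  <Book ISBN="ISBN-1" Price="100"><Authors><Auth authIdent="JU"/></Authors></Book>
  <Author Ident="JU"/>
  <Author Ident="JW"/>
</Bookstore>"""
BAD_KEY = GOOD.replace('Ident="JW"', 'Ident="JU"')           # two authors share JU
DANGLING = GOOD.replace('authIdent="JU"', 'authIdent="HG"')  # HG is not an author

for doc in (GOOD, BAD_KEY, DANGLING):
    print(check(doc))
```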

Page 24: Advanced Databases - XML

Occurrences (1)

• Indicate the number of times an element can appear at a given place in an XML document

• By default, the number is 1.

• “minOccurs” specifies the minimum number of times an element may appear.

• “maxOccurs” specifies the maximum number of times an element may appear.

Page 25: Advanced Databases - XML

Occurrences (2) - Example

• For an author, “maxOccurs="unbounded"”: any number of authors.

• Absence of “minOccurs”: the default is 1, so at least one author is required.

• For a remark, “minOccurs="0"”: the remark is optional.

• Absence of “maxOccurs”: the default is 1, so at most one remark is allowed.
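
To see the defaults at work, this sketch reads `minOccurs` and `maxOccurs` from a small XSD fragment and counts the children of a book against them. The fragment (with `Author` placed directly under the book for brevity) and the helper names are assumptions for illustration; the code checks counts only, not ordering or types, so it is not a substitute for a real validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical content model for a book.
XSD_FRAGMENT = """<xs:sequence xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Title" type="xs:string"/>
  <xs:element name="Author" type="xs:string" maxOccurs="unbounded"/>
  <xs:element name="Remark" type="xs:string" minOccurs="0"/>
</xs:sequence>"""

def occurrence_bounds(fragment):
    """Map each element name to (min, max); max is None for 'unbounded'."""
    bounds = {}
    for el in ET.fromstring(fragment).findall(XS + "element"):
        low = int(el.get("minOccurs", "1"))      # default minOccurs is 1
        high = el.get("maxOccurs", "1")          # default maxOccurs is 1
        bounds[el.get("name")] = (low, None if high == "unbounded" else int(high))
    return bounds

def counts_ok(book, bounds):
    """Check that every child element appears an allowed number of times."""
    for name, (low, high) in bounds.items():
        n = len(book.findall(name))
        if n < low or (high is not None and n > high):
            return False
    return True

bounds = occurrence_bounds(XSD_FRAGMENT)
two_authors = ET.fromstring(
    "<Book><Title>AI</Title><Author>A</Author><Author>B</Author></Book>")
no_author = ET.fromstring("<Book><Title>AI</Title></Book>")

print(bounds)
print(counts_ok(two_authors, bounds), counts_ok(no_author, bounds))
```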

Page 26: Advanced Databases - XML

Examples - Demos

• For validation, we use the command:

xmllint --schema Bookstore.xsd --noout Bookstore-xsd.xml

• Examples

1. Change the value of “Price” to a string and try to validate the document:

<Book ISBN="ISBN-0-13-713526-2" Price="hi">

2. Test the keys: replace “<Author Ident="JU">” with “<Author Ident="HG">”.

Page 27: Advanced Databases - XML

Querying XML

N.B. The demos are done with “editiX-XML Editor”, which can be downloaded for free.

Page 28: Advanced Databases - XML

Introduction

• Not nearly as mature as querying relational data
  - Newer
  - No underlying algebra

• We will explore:
  - XPath: path expressions plus conditions.
  - XQuery: based on XPath (close to SQL)
  - Others: XSLT, XLink, XPointer (outside the scope of this course).

Page 29: Advanced Databases - XML

Querying XML

XPath

Page 30: Advanced Databases - XML

XPath

• XPath: path expressions plus conditions.
  - The XML document is viewed as a tree:

Bookstore
  Book
    @ISBN = "ISBN-22"
    @Price = "100"
    Title = "AI"
    Authors
      Author
        First_Name = "Jeff"
        Last_Name = "Ullman"
      Author
  Book
  Magazine

Page 31: Advanced Databases - XML

XPath: Basic Concepts (1)

1. Expressions

• “/” (slash): root element or separator.
  - Example: /Bookstore/Book

• “*”: matches any element.
  - Example: /Bookstore/Book/*

• “@Att”: an attribute.
  - Example: @ISBN

• “//”: any descendant of an element (shorthand for a descendant-or-self step).
  - Example: /Bookstore/Book//* selects the elements below “Book”, such as “Title”, “Authors”, “Author”, “Last_Name”, and “First_Name”.
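
These path expressions can be tried out in Python with the standard `xml.etree.ElementTree` module, which supports a limited subset of XPath. The sketch below is an illustration with made-up data; paths are written relative to the root element, and since that subset cannot return attributes directly, `@ISBN` is read with `.get()`.

```python
import xml.etree.ElementTree as ET

# Hypothetical data following the tree used in the slides.
DOC = """<Bookstore>
  <Book ISBN="ISBN-22" Price="100">
    <Title>AI</Title>
    <Authors>
      <Author><First_Name>Jeff</First_Name><Last_Name>Ullman</Last_Name></Author>
    </Authors>
  </Book>
  <Magazine Month="January" Year="2015">
    <Title>Artificial Intelligence Magazine</Title>
  </Magazine>
</Bookstore>"""

root = ET.fromstring(DOC)   # root is the <Bookstore> element

book_titles = [t.text for t in root.findall("Book/Title")]       # /Bookstore/Book/Title
any_titles = [t.text for t in root.findall("*/Title")]           # /Bookstore/*/Title
book_children = [c.tag for c in root.findall("Book/*")]          # /Bookstore/Book/*
isbns = [b.get("ISBN") for b in root.findall("Book")]            # /Bookstore/Book/@ISBN
last_names = [n.text for n in root.findall(".//Last_Name")]      # //Last_Name

print(book_titles, any_titles, book_children, isbns, last_names)
```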

Page 32: Advanced Databases - XML

XPath: Basic Concepts (2)

2. Conditions

• Conditions are written between square brackets: “[condition]”.

• Example: [@Price < 50]

3. Built-in functions (many)
  - “contains(s1, s2)”: true if “s1” contains “s2”.
  - “name()”: returns the tag name of the current element.

Page 33: Advanced Databases - XML

XPath: Basic Concepts (3)

4. Navigation “axes” (13 axes)
  - “parent::”: selects the parent element.
  - “following-sibling::”: selects the siblings that come after the current element.
  - “descendant::”: selects the descendants.
  - “self::”: selects the current element.
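
The axes can be pictured as different ways of walking the tree from a current node. Python's `xml.etree.ElementTree` does not implement the axis syntax, so the sketch below emulates three of them by hand on a made-up `<Book>` element; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, used only for illustration.
DOC = ("<Book><Title>AI</Title>"
       "<Authors><Author><First_Name>Jeff</First_Name>"
       "<Last_Name>Ullman</Last_Name></Author></Authors>"
       "<Remark>Great</Remark></Book>")

root = ET.fromstring(DOC)

# ElementTree elements do not store their parent, so build a lookup table.
parent_of = {child: parent for parent in root.iter() for child in parent}

title = root.find("Title")                                 # the "current" node

parent = parent_of[title]                                  # parent::*
siblings = list(parent)
following = siblings[siblings.index(title) + 1:]           # following-sibling::*
descendants = [e for e in root.iter() if e is not root]    # descendant::* from <Book>

print(parent.tag)
print([e.tag for e in following])
print([e.tag for e in descendants])
```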

Page 34: Advanced Databases - XML

XPath Examples (1)

• Book titles:
  - /Bookstore/Book/Title

• We can use “*” (titles of any child of “Bookstore”):
  - /Bookstore/*/Title

• All the titles:
  - //Title

• All the elements:
  - //*

Page 35: Advanced Databases - XML

XPath Examples (2)

• The “ISBN” of each book:
  - /Bookstore/Book/@ISBN

• Books that cost less than 80 Dh:
  - /Bookstore/Book[@Price < 80]

• Titles of books that cost less than 80 Dh:
  - /Bookstore/Book[@Price < 80]/Title

Page 36: Advanced Databases - XML

XPath Examples (3)

• Titles of books that have a remark:
  - /Bookstore/Book[Remark]/Title

• Titles of books that cost less than 90 Dh and have “Ullman” as an author:
  - /Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman"]/Title

• Titles of books whose remark contains the word “great”:
  - //Book[contains(Remark, "great")]/Title
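
For readers without an XPath engine at hand, the three queries above can be approximated in Python. The standard `xml.etree.ElementTree` module understands the existence test `[Remark]`, but not numeric comparisons or `contains()`, so those two conditions are written as ordinary Python filters. The data and the helper `last_names` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: three books with made-up titles and prices.
DOC = """<Bookstore>
  <Book Price="85"><Title>Databases</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors>
  </Book>
  <Book Price="100"><Title>AI</Title>
    <Authors><Author><Last_Name>Ullman</Last_Name></Author></Authors>
    <Remark>A great book</Remark>
  </Book>
  <Book Price="50"><Title>Networks</Title>
    <Authors><Author><Last_Name>Smith</Last_Name></Author></Authors>
    <Remark>Short</Remark>
  </Book>
</Bookstore>"""

root = ET.fromstring(DOC)

def last_names(book):
    return [n.text for n in book.findall("Authors/Author/Last_Name")]

# /Bookstore/Book[Remark]/Title
with_remark = [t.text for t in root.findall("Book[Remark]/Title")]

# /Bookstore/Book[@Price < 90 and Authors/Author/Last_Name = "Ullman"]/Title
cheap_ullman = [
    b.findtext("Title")
    for b in root.findall("Book")
    if int(b.get("Price")) < 90 and "Ullman" in last_names(b)
]

# //Book[contains(Remark, "great")]/Title
great = [
    b.findtext("Title")
    for b in root.iter("Book")
    if "great" in b.findtext("Remark", default="")
]

print(with_remark, cheap_ullman, great)
```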

Page 37: Advanced Databases - XML

Querying XML

XQuery

Page 38: Advanced Databases - XML

XQuery

• Expression language (compositional)

• Each expression operates on and returns a sequence of elements

• XPath is one type of expression used by XQuery

• More complex than SQL

Page 39: Advanced Databases - XML

XQuery: The FLWOR Expression

for $var in expr        (expr produces N elements; $var is bound to each one in turn)
let $var := expr        (assigns the result of expr to the variable $var)
where condition
order by expr
return expr             (returns M elements as results)

• “where” and “order by” are optional; at least one “for” or “let” clause and the “return” clause are required.

• “for” and “let” can be repeated and interleaved.

• Close to SQL
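
One way to read a FLWOR expression is as a loop with a local binding, a filter, a sort, and a projection. The Python sketch below mirrors each clause in a comment; it is an analogy written for this transcript with made-up data, not output from an XQuery engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical data, used only for illustration.
DOC = """<Bookstore>
  <Book Price="85"><Title>Databases</Title></Book>
  <Book Price="100"><Title>AI</Title></Book>
  <Book Price="50"><Title>Networks</Title></Book>
</Bookstore>"""

root = ET.fromstring(DOC)

matches = []
for b in root.findall("Book"):                  # for $b in /Bookstore/Book
    price = int(b.get("Price"))                 # let $p := xs:int($b/@Price)
    if price < 90:                              # where $p < 90
        matches.append((price, b.findtext("Title")))

matches.sort(key=lambda pair: pair[0])          # order by $p
titles = [title for _, title in matches]        # return $b/Title

print(titles)
```

Here the loop visits N = 3 books and the result holds M = 2 titles, which is the N-in, M-out behavior noted on the slide.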

Page 40: Advanced Databases - XML

Mixing Queries and XML

• <Result> { …query goes here… } </Result>

• { …query… } is evaluated by the XQuery processor.

Page 41: Advanced Databases - XML

XQuery Example (1)

• Titles of books that cost less than 90 Dh and have “Ullman” as an author:

for $b in /Bookstore/Book
where $b/@Price < 90
  and $b/Authors/Author/Last_Name = "Ullman"
return $b/Title

• In this case, the “expr” of the “for” clause is an XPath expression.

Page 42: Advanced Databases - XML

XQuery Example (2)

• Titles and prices of books, ordered by price:

for $b in /Bookstore/Book
order by xs:int($b/@Price)
return
  <Book>
    { $b/Title }
    <Price> { $b/data(@Price) } </Price>
  </Book>

• “xs:int($b/@Price)” converts the string value of “@Price” to an integer, so the ordering is numeric.

Page 43: Advanced Databases - XML

XQuery Examples (3)

• All authors’ last names:

for $n in //Last_Name

return $n

• Remark: we get duplicates.

Page 44: Advanced Databases - XML

XQuery Examples (4)

• Use the function “distinct-values()” to remove the duplicates:

for $n in distinct-values(//Last_Name)
return $n

• Change the display:

for $n in distinct-values(//Last_Name)
return <Last_Name> {$n} </Last_Name>
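
The same two steps, removing duplicates and then wrapping each value in a new element, can be mimicked in Python as a rough analogy. The data and the helper `wrap` are made up for this sketch. Note that `dict.fromkeys` keeps first-seen order, whereas the order returned by `distinct-values()` is implementation-dependent.

```python
import xml.etree.ElementTree as ET

# Hypothetical data in which one author appears twice.
DOC = """<Bookstore>
  <Book><Authors>
    <Author><Last_Name>Ullman</Last_Name></Author>
    <Author><Last_Name>Smith</Last_Name></Author>
  </Authors></Book>
  <Book><Authors>
    <Author><Last_Name>Ullman</Last_Name></Author>
  </Authors></Book>
</Bookstore>"""

root = ET.fromstring(DOC)

names = [n.text for n in root.findall(".//Last_Name")]   # //Last_Name, with duplicates
distinct = list(dict.fromkeys(names))                    # like distinct-values(...)

def wrap(name):
    """Build <Last_Name>name</Last_Name>, like the element constructor in return."""
    element = ET.Element("Last_Name")
    element.text = name
    return ET.tostring(element, encoding="unicode")

wrapped = [wrap(n) for n in distinct]

print(names)
print(distinct)
print(wrapped)
```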

Page 45: Advanced Databases - XML

Conclusion

• XML querying is still less mature than relational querying
  - Newer
  - No underlying algebra comparable to relational algebra

• We explored:
  - XPath: path expressions plus conditions.
  - XQuery: based on XPath (close to SQL)