XML and the Semi-Structured Data Model

Post on 24-Jan-2016



1

XML and the Semi-Structured Data Model

2

Motivation

• We have seen that relational databases are very convenient to query. However:
  – There is a LOT of data that is not in relational databases!

• Perhaps the most widely accessed database is the web, and it certainly isn’t a relational database.

3

Documents Vs. Databases

Documents                        Databases
Paragraphs, sentences            Tables, tuples
Easy for people to understand    Easy for computers to understand
Static                           Dynamic

4

Querying the Web

• The web can be queried using a search engine. However, we can’t ask questions like:
  – What is the weather in Zanzibar today?
  – What is the lowest price for which a Jaguar is sold on the web?
• Problems:
  – There are no facilities for asking complex questions, such as aggregation of data
  – Words have overloaded meanings (Jaguar)

5

Understanding the Web

• In order to query the web, we must be able to understand it.

• Two computer science approaches:
  – Artificial Intelligence approach
  – Database approach

6

Artificial Intelligence Approach

“The web is unstructured and we must deal with it”

• Use techniques for machine learning to understand the web.

• Example: To understand the word “Jaguar”, check whether it appears on a page with the words “car” or “automobile”, or rather with “jungle” and “Amazon”

• Problem: Such techniques tend to be inexact and have a large percentage of mistakes

7

Database Approach

“The web is unstructured and we will structure it”

• Sometimes problems that are very difficult can be solved easily by enforcing a standard

• Encourage the use of XML as a standard for data exchange on the web

8

Example XML Document

<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares="100">
    <ticker exch="NASDAQ">WEBM</ticker>
  </buy>
  <sell shares="30">
    <ticker exch="NYSE">GE</ticker>
  </sell>
</transaction>

(Callouts on the slide label the parts of the markup: opening tag, attribute name, attribute value, element, closing tag.)

9

XML Representation of a Table

<?xml version="1.0"?>
<ROWSET>
  <ROW num="1">
    <ENAME>KING</ENAME>
    <SAL>5000</SAL>
  </ROW>
  <ROW num="2">
    <ENAME>SCOTT</ENAME>
    <SAL>3000</SAL>
  </ROW>
</ROWSET>

The table being represented:

ENAME    SAL
KING     5000
SCOTT    3000
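The mapping from a table to the ROWSET layout above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function name rows_to_rowset is an assumption and is not part of any tool mentioned in these slides.

```python
import xml.etree.ElementTree as ET

def rows_to_rowset(columns, rows):
    """Build a ROWSET element: one ROW per tuple, one child element per column."""
    rowset = ET.Element("ROWSET")
    for number, row in enumerate(rows, start=1):
        row_el = ET.SubElement(rowset, "ROW", num=str(number))
        for column, value in zip(columns, row):
            # The column name becomes the tag, the value becomes its text.
            ET.SubElement(row_el, column).text = str(value)
    return rowset

rowset = rows_to_rowset(["ENAME", "SAL"], [("KING", 5000), ("SCOTT", 3000)])
xml_text = ET.tostring(rowset, encoding="unicode")
print(xml_text)
```

The printed document has the same shape as the slide: a ROWSET holding two ROW elements, each with an ENAME and a SAL child.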

10

Very Unstructured XML

<?xml version="1.0"?>
<DamageReport>
  The insured’s <Vehicle Make="Volks"> Beetle </Vehicle> broke through the guard rail and plummeted into the ravine. The cause was determined to be <Cause>faulty brakes </Cause>. Amazingly there were no casualties.
</DamageReport>
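Even a document like this one, which is mostly free text, still exposes its tagged parts to a program. The following Python sketch, using only the standard library, pulls out the vehicle and the cause and also recovers the plain sentence. The whitespace inside the tags has been trimmed here for readability.

```python
import xml.etree.ElementTree as ET

report_xml = """<DamageReport>
The insured's <Vehicle Make="Volks">Beetle</Vehicle> broke through the guard rail
and plummeted into the ravine. The cause was determined to be
<Cause>faulty brakes</Cause>. Amazingly there were no casualties.
</DamageReport>"""

report = ET.fromstring(report_xml)

# The tagged fragments can be addressed by name...
vehicle = report.find("Vehicle")
cause = report.find("Cause")

# ...while itertext() walks all text, tagged or not, in document order.
full_text = " ".join("".join(report.itertext()).split())

print(vehicle.get("Make"), vehicle.text, "|", cause.text)
print(full_text)
```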

11

XML Vs. HTML

• XML and HTML are siblings: both derive from SGML. HTML is an application of SGML, while XML is a simplified subset of it.

• HTML has a fixed set of tag and attribute names, each associated with a specific meaning.

• XML allows any tag and attribute names; they are not associated with any predefined meaning.

• HTML is used to specify visual style.
• XML is used to specify meaning.

12

Rules for Creating XML Documents

13

Rule 1 – XML Declaration

• An XML document should begin with an XML declaration. A simple declaration is:

<?xml version="1.0"?>

Other things, such as the character encoding, can also be specified in the declaration.

14

Rule 2 – Document Element

• Use exactly one top-level document element.

Legal example:
<?xml version="1.0"?>
<Question> This is legal </Question>

Illegal example (two top-level elements):
<?xml version="1.0"?>
<Question> Is this legal? </Question>
<Answer> No. </Answer>

15

Rule 3 – Match Opening and Closing Tags

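• Every opening tag must have a matching closing tag, and elements must be properly nested. XML is case sensitive, so tag names must match exactly. The following examples are both illegal:

Examples:
<Question> This is not legal </QUESTION>
<Question> <B> Is this legal? </Question> </B>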
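Rules 2 and 3 can be checked mechanically with any XML parser. The sketch below uses the parser in Python's standard library; the helper name is_well_formed is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

examples = {
    "one document element": "<Question> This is legal </Question>",
    "two top-level elements": "<Question> Is this legal? </Question><Answer> No. </Answer>",
    "case mismatch": "<Question> This is not legal </QUESTION>",
    "bad nesting": "<Question> <B> Is this legal? </Question> </B>",
}

results = {label: is_well_formed(text) for label, text in examples.items()}
for label, ok in results.items():
    print(label, "->", "legal" if ok else "illegal")
```

Only the first example is accepted; the parser rejects the other three.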

16

Rule 4 – Comments

• Comments are written between the delimiters <!-- and -->. Comments can’t appear as attribute values or within a tag.

Example:
<!-- This is a legal comment -->
<Question <!-- This is illegal -->>
  Why is this illegal?
  <!-- This is a legal comment -->
</Question>

17

Rule 5 – Element Names

• Element and attribute names may contain letters, digits, hyphens, underscores, and periods. They must begin with a letter or an underscore and cannot contain spaces.

Legal names:
<_legal> <This-is-OK>

Illegal names:
<2-Part-Question> <Two Part Question>
<Question 4You="Yes">

18

Rule 6 – Attribute Values

• Attribute values:
  – go in opening tags
  – must be enclosed in matching quotes (' or ")
  – should contain only text, not tags

Legal example:
<Question Poster="Yitzchak">Do you like XML? </Question>
<Answer Poster='Yaakov'>I do.</Answer>

19

Rule 6 – Continued

Illegal Examples:

Mismatched quotes:
<Question Poster="Yitzchak'>Do you like XML? </Question>

Attribute in a closing tag:
<Question>Do you like XML? </Question Poster="Yitzchak">

Tag inside an attribute value:
<Question Poster="<first>Yitzchak</first>">Do you like XML? </Question>

20

Rule 7 – Empty Elements

• Empty elements are elements that do not contain text or nested elements. They can be written in a compact syntax:

<Person First="Shmuel" Last="Levy"></Person>

is the same as

<Person First="Shmuel" Last="Levy" />

21

Abstract View of XML

22

A Different Data Model

                  Relational        Semi-Structured
Abstract model    Sets of tuples    Labeled directed graph
Concrete model    Tables            XML documents
Standard for      Storing data      Data exchange

23

An Example

<?xml version="1.0"?>
<transaction>
  <account>89-344</account>
  <buy shares="100">
    <ticker exch="NASDAQ">WEBM</ticker>
  </buy>
  <sell shares="30">
    <ticker exch="NYSE">GE</ticker>
  </sell>
</transaction>

24

Corresponding Tree

transaction
  account
    89-344
  buy
    shares
      100
    ticker
      exch
        NASDAQ
      WEBM
  sell
    shares
      30
    ticker
      exch
        NYSE
      GE
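The tree view can be produced directly from the document. The following Python sketch is one possible rendering, not a standard algorithm: it treats each element name and each attribute name as an edge label, and each text or attribute value as a leaf. The function name tree_lines is an assumption.

```python
import xml.etree.ElementTree as ET

transaction_xml = """<transaction>
  <account>89-344</account>
  <buy shares="100">
    <ticker exch="NASDAQ">WEBM</ticker>
  </buy>
  <sell shares="30">
    <ticker exch="NYSE">GE</ticker>
  </sell>
</transaction>"""

def tree_lines(element, depth=0):
    """Return the labeled tree as indented lines: labels first, values below them."""
    indent = "  " * depth
    lines = [indent + element.tag]
    for name, value in element.attrib.items():
        lines.append(indent + "  " + name)      # attribute name is a label
        lines.append(indent + "    " + value)   # attribute value is a leaf
    text = (element.text or "").strip()
    if text:
        lines.append(indent + "  " + text)      # element text is a leaf
    for child in element:
        lines.extend(tree_lines(child, depth + 1))
    return lines

lines = tree_lines(ET.fromstring(transaction_xml))
print("\n".join(lines))
```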

25

Using XML

• Querying XML: There are query languages that query XML and return XML. Examples: XQuery, XPath, SQL4X

• Displaying XML: An XML document can have an associated style sheet that specifies how the document should be displayed, for example by transforming it to HTML. Examples: CSS, XSL

26

Namespaces

• Namespaces are used to attach an accepted meaning to a set of tags.

• Syntax for defining a namespace:

<SomeElement xmlns:prefixname="namespaceURL">

The namespace will be recognized within the SomeElement element.

27

Example Namespace

<irs:Form id="1040" xmlns:irs="http://www.irs.gov">
  <irs:Name>Tina Wells</irs:Name>
  <PhoneNumber>03-5655666</PhoneNumber>
</irs:Form>

• In order for the namespace to be recognized in all elements, the declaration should be in the document element.
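A namespace-aware parser makes the effect of the prefix visible. In this Python sketch (standard library only), the prefixed elements are reported with the namespace URL attached, while the unprefixed PhoneNumber element belongs to no namespace. The closing tag is written as irs:Form so that the document is well formed.

```python
import xml.etree.ElementTree as ET

form_xml = """<irs:Form id="1040" xmlns:irs="http://www.irs.gov">
  <irs:Name>Tina Wells</irs:Name>
  <PhoneNumber>03-5655666</PhoneNumber>
</irs:Form>"""

form = ET.fromstring(form_xml)

# ElementTree expands each prefix to {namespace-URL}local-name.
tags = [form.tag] + [child.tag for child in form]

# When searching, the prefix is mapped to the URL explicitly.
name = form.find("irs:Name", {"irs": "http://www.irs.gov"})

print(tags)
print(name.text)
```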

28

XSQL Pages

29

What are XSQL Pages?

• XSQL pages are XML documents that have SQL queries embedded in them.

• When a user requests to view an XSQL page, the web server:
  1. Dynamically computes the embedded queries
  2. Translates the query results into XML
  3. Inserts the results in the proper places in the document
  4. Transforms the result to HTML if a stylesheet is given

30

A Simple Example

<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
  SELECT sname
  FROM Sailors
</xsql:query>

You should specify the connection and the namespace on the document element.

31

Page Seen in Browser

<?xml version="1.0"?>
<ROWSET>
  <ROW num="1">
    <SNAME>Rusty</SNAME>
  </ROW>
  <ROW num="2">
    <SNAME>Justin</SNAME>
  </ROW>
</ROWSET>

• A ROWSET element encloses the query result.

• Each ROW element encloses one row of the result.

• Each column value in a row is enclosed in a tag named after its column.

32

Another Example

<?xml version="1.0"?>
<RESULTS connection="scott" xmlns:xsql="urn:oracle-xsql">
  Here is something interesting:
  <xsql:query>
    SELECT sname, age + rating as ra
    FROM Sailors
    WHERE sid = 13
  </xsql:query>
</RESULTS>

33

Resulting Document

<?xml version="1.0"?>
<RESULTS>
  Here is something interesting:
  <ROWSET>
    <ROW num="1">
      <SNAME>Rusty</SNAME>
      <RA>55</RA>
    </ROW>
  </ROWSET>
</RESULTS>
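To make the page-to-result step concrete, here is a rough Python imitation of what the server does with a page like the one above. It is not the Oracle XSQL implementation: it uses an in-memory SQLite database in place of the "scott" connection, the function name expand_queries is an assumption, and the sample row (age 45, rating 10) is invented so that the sum comes out to 55. The sketch only handles queries nested inside another element.

```python
import sqlite3
import xml.etree.ElementTree as ET

XSQL_NS = "urn:oracle-xsql"

def expand_queries(page_xml, connection):
    """Replace every nested xsql:query element with a ROWSET of its results."""
    root = ET.fromstring(page_xml)
    query_tag = "{%s}query" % XSQL_NS
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != query_tag:
                continue
            cursor = connection.execute(child.text)
            columns = [info[0].upper() for info in cursor.description]
            rowset = ET.Element("ROWSET")
            for number, row in enumerate(cursor, start=1):
                row_el = ET.SubElement(rowset, "ROW", num=str(number))
                for column, value in zip(columns, row):
                    ET.SubElement(row_el, column).text = str(value)
            rowset.tail = child.tail
            parent[index] = rowset          # put the result where the query was
    root.attrib.pop("connection", None)     # the connection name is not echoed back
    return root

# Hypothetical data standing in for the Sailors table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age INTEGER)")
conn.execute("INSERT INTO Sailors VALUES (13, 'Rusty', 10, 45)")

page_xml = """<RESULTS connection="scott" xmlns:xsql="urn:oracle-xsql">
Here is something interesting:
<xsql:query>
  SELECT sname, age + rating AS ra
  FROM Sailors
  WHERE sid = 13
</xsql:query>
</RESULTS>"""

result = expand_queries(page_xml, conn)
print(ET.tostring(result, encoding="unicode"))
```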

34

Using Parameters

• Your page can use parameters. The value of a parameter param is the first of the following that is available:
  1. The value of the URL parameter param, if supplied
  2. The value of the HTTP session object param, if supplied
  3. The value of the closest ancestor’s attribute named param, if present
  4. An empty string
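The lookup order listed above amounts to a simple fallback chain. The Python sketch below models it with plain dictionaries; the function name resolve_param and the way the ancestors are passed in (a list of attribute dictionaries, closest ancestor first) are assumptions made for illustration.

```python
def resolve_param(name, url_params, session, ancestor_attrs):
    """Return the parameter value using the four-step fallback order."""
    if name in url_params:                 # 1. URL parameter
        return url_params[name]
    if name in session:                    # 2. HTTP session object
        return session[name]
    for attrs in ancestor_attrs:           # 3. closest ancestor attribute first
        if name in attrs:
            return attrs[name]
    return ""                              # 4. empty string

ancestors = [{"sname": "Joe"}, {"sname": "Outer", "connection": "scott"}]

from_url = resolve_param("sname", {"sname": "Jim"}, {"sname": "Sam"}, ancestors)
from_session = resolve_param("sname", {}, {"sname": "Sam"}, ancestors)
from_attribute = resolve_param("sname", {}, {}, ancestors)
missing = resolve_param("rating", {}, {}, ancestors)
print(from_url, from_session, from_attribute, repr(missing))
```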

35

Example with Parameters

<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            sname="Joe">
  SELECT *
  FROM Sailors
  WHERE sname = '{@sname}'
</xsql:query>

36

Evaluating the Query

• Suppose the XSQL document is at:
  http://cs.huji.ac.il/~db/query1.xsql
• Then, requesting the URL:
  http://cs.huji.ac.il/~db/query1.xsql?sname=Jim
  will return all the details of Jim.
• Requesting
  http://cs.huji.ac.il/~db/query1.xsql
  will return all the details of Joe (the default value).
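The two requests differ only in what replaces {@sname} in the query text. A minimal Python sketch of that substitution, assuming it is purely textual, is shown below; the function name substitute is hypothetical, and the session lookup step is left out for brevity.

```python
import re

def substitute(template, url_params, defaults):
    """Replace each {@name} with the URL value, else the default, else ''."""
    def lookup(match):
        name = match.group(1)
        return url_params.get(name, defaults.get(name, ""))
    return re.sub(r"\{@([\w-]+)\}", lookup, template)

template = "SELECT * FROM Sailors WHERE sname = '{@sname}'"
defaults = {"sname": "Joe"}   # the attribute on the xsql:query element

with_url_value = substitute(template, {"sname": "Jim"}, defaults)
with_default = substitute(template, {}, defaults)
print(with_url_value)
print(with_default)
```

In this sketch whatever arrives in the URL becomes part of the SQL text unchanged, so a real page should be careful about which parameters it exposes this way.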

37

A Strange Example

<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            select="*" where="1=1" order="1">
  SELECT {@select}
  FROM {@from}
  WHERE {@where}
  ORDER BY {@order}
</xsql:query>

38

Customizing Results

• The query tag can have different attributes that customize the query results. Here are some of the important options:
  – max-rows: The maximum number of rows returned
  – skip-rows: The number of rows to skip before returning rows
  – rowset-element: The name of the rowset element
  – row-element: The name of the row element

39

Customizing Results

<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql"
            skip="0" max-rows="2" skip-rows="{@skip}">
  SELECT *
  FROM Program
  ORDER BY url
</xsql:query>

By calling the same page with different values for skip, we can see the different programs
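The combined effect of skip-rows and max-rows is ordinary paging. The Python sketch below shows the arithmetic on a plain list; the function name visible_rows and the sample URLs are invented for the illustration.

```python
def visible_rows(rows, skip_rows=0, max_rows=None):
    """Drop the first skip_rows rows, then keep at most max_rows of the rest."""
    remaining = rows[skip_rows:]
    if max_rows is None:
        return remaining
    return remaining[:max_rows]

# Hypothetical query result, already ordered by url.
programs = ["a.example", "b.example", "c.example", "d.example", "e.example"]

first_page = visible_rows(programs, skip_rows=0, max_rows=2)
second_page = visible_rows(programs, skip_rows=2, max_rows=2)
last_page = visible_rows(programs, skip_rows=4, max_rows=2)
print(first_page, second_page, last_page)
```

Requesting the page with skip=0, skip=2, and skip=4 would correspond to the three calls above.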

40

Notes

• An XSQL document can have many queries.
• The queries can appear within arbitrary XML tags.

• We can produce XML that has a more nested structure using the CURSOR function...

41

Remembering Subqueries in the SELECT Clause

• Subqueries in the SELECT clause must return a single value. What do we do if we want, for each boat, all the sailors who reserved it?

• We want each bid to be associated with a table of Sailors data!

42

Using the CURSOR Function

<?xml version="1.0"?>
<xsql:query connection="scott" xmlns:xsql="urn:oracle-xsql">
  SELECT bid,
         CURSOR(SELECT S.sid, sname
                FROM Sailors S, Reserves R
                WHERE S.sid = R.sid
                  AND R.bid = B.bid) AS Reservers
  FROM Boats B
</xsql:query>

43

<?xml version="1.0"?>
<ROWSET>
  <ROW num="1">
    <BID>113</BID>
    <RESERVERS>
      <RESERVERS_ROW num="1">
        <SID> 13 </SID>
        <SNAME> Joe </SNAME>
      </RESERVERS_ROW>
      <RESERVERS_ROW num="2">
        ....
      </RESERVERS_ROW>
    </RESERVERS>
  </ROW>
</ROWSET>

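Note that the alias given to the CURSOR expression (Reservers) is used to name the inner elements: RESERVERS takes the place of the inner ROWSET tag, and RESERVERS_ROW takes the place of the inner ROW tag.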
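The nested shape can be imitated outside Oracle to see where each tag comes from. SQLite has no CURSOR function, so the Python sketch below runs one inner query per boat instead; this is a stand-in for the technique, not the way XSQL evaluates CURSOR. The function name and the second sailor (Dana) are invented.

```python
import sqlite3
import xml.etree.ElementTree as ET

def boats_with_reservers(connection):
    """Build ROWSET/ROW/RESERVERS/RESERVERS_ROW from Boats, Sailors, Reserves."""
    rowset = ET.Element("ROWSET")
    boats = connection.execute("SELECT bid FROM Boats ORDER BY bid").fetchall()
    for number, (bid,) in enumerate(boats, start=1):
        row = ET.SubElement(rowset, "ROW", num=str(number))
        ET.SubElement(row, "BID").text = str(bid)
        reservers = ET.SubElement(row, "RESERVERS")   # named after the alias
        inner = connection.execute(
            "SELECT S.sid, S.sname FROM Sailors S, Reserves R "
            "WHERE S.sid = R.sid AND R.bid = ? ORDER BY S.sid",
            (bid,),
        ).fetchall()
        for inner_number, (sid, sname) in enumerate(inner, start=1):
            inner_row = ET.SubElement(reservers, "RESERVERS_ROW", num=str(inner_number))
            ET.SubElement(inner_row, "SID").text = str(sid)
            ET.SubElement(inner_row, "SNAME").text = sname
    return rowset

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Boats (bid INTEGER);
    CREATE TABLE Sailors (sid INTEGER, sname TEXT);
    CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
    INSERT INTO Boats VALUES (113);
    INSERT INTO Sailors VALUES (13, 'Joe');
    INSERT INTO Sailors VALUES (22, 'Dana');
    INSERT INTO Reserves VALUES (13, 113);
    INSERT INTO Reserves VALUES (22, 113);
""")

nested = boats_with_reservers(conn)
reserver_names = [e.text for e in nested.findall("ROW/RESERVERS/RESERVERS_ROW/SNAME")]
print(ET.tostring(nested, encoding="unicode"))
```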

44

Setting Page Level Parameters

• The following statement defines a parameter pname. The value of pname is the value in the first column of the first row of the query result.

• The parameter pname will be recognized in the page.

<xsql:set-page-param name="pname">
  SELECT Statement
</xsql:set-page-param>

45

Example

<?xml version="1.0"?>
<page connection="scott" xmlns:xsql="urn:oracle-xsql">
  <xsql:set-page-param name="num-stories">
    SELECT headings_num
    FROM user_prefs
    WHERE userid = {@user}
  </xsql:set-page-param>
  <xsql:query max-rows="{@num-stories}">
    SELECT title, url
    FROM latest_news
  </xsql:query>
</page>

46

Another Way to Define a Page Level Parameter

• Page level parameters can also be set with the statement:

<xsql:set-page-param name="pname" value="val"/>

• For example:

<xsql:set-page-param name="num-stories" value="10"/>

47

Additional Options

• The set-page-param element can have the following attributes:
  – only-if-unset: If the value is “yes” then the parameter will be set only if it has no value
  – ignore-empty-value: If the value is “yes” then the parameter will be set only if its value will not be an empty string
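The two options can be read as guards in front of an assignment. The Python sketch below models page parameters as a dictionary; the function name set_page_param is an assumption, and so is the choice to treat an empty string as "no value" (which matches the rule that a missing parameter evaluates to an empty string).

```python
def set_page_param(params, name, value, only_if_unset=False, ignore_empty_value=False):
    """Set params[name] = value unless one of the two options blocks it."""
    if only_if_unset and params.get(name):
        return params            # already has a non-empty value: leave it alone
    if ignore_empty_value and value == "":
        return params            # new value is empty: do not overwrite
    params[name] = value
    return params

params = {"num-stories": "10"}

set_page_param(params, "num-stories", "25", only_if_unset=True)   # blocked
set_page_param(params, "user", "", ignore_empty_value=True)       # blocked
set_page_param(params, "user", "tina", only_if_unset=True)        # applied
set_page_param(params, "num-stories", "5")                        # applied
print(params)
```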

48

Setting Cookie Values

• The following statement defines a parameter pname. The value of pname is the value in the first column of the first row of the query result.

• The parameter pname will be recognized until the cookie expires.

<xsql:set-cookie name="pname">
  SELECT Statement
</xsql:set-cookie>

49

Additional Attributes for Set-Cookie

• The set-cookie element can have the following attributes:
  – max-age: The number of seconds before the cookie expires (by default, the cookie expires when the user exits the current browser instance)
  – only-if-unset
  – ignore-empty-value

50

Example

<?xml version="1.0"?>
<page connection="scott" xmlns:xsql="urn:oracle-xsql">
  <xsql:set-cookie name="siteuser" max-age="31536000"
                   only-if-unset="yes" ignore-empty-value="yes">
    SELECT username
    FROM site_users
    WHERE username = '{@username}' AND password = '{@password}'
  </xsql:set-cookie>
  <!-- Other Actions Here -->
</page>

51

DML or PL/SQL

• We can do DML (update, insert, delete) or call PL/SQL procedures with the following basic syntax:

<xsql:dml>
  DML Statement
</xsql:dml>

or

<xsql:dml>
  BEGIN
    Any valid PL/SQL Statement
  END;
</xsql:dml>

52

Example

<xsql:dml>
  INSERT INTO page_requests_log(page, userid)
  VALUES('page12.xsql', '{@siteuser}')
</xsql:dml>

If successful, the following element is added to the page:

<xsql-status action="xsql:dml" rows="n" />

Otherwise, an error element is added:

<xsql-error action="xsql:dml"> ... </xsql-error>