On Views and XML
Serge Abiteboul, INRIA, PODS 1999
05/99 Views and XML - Serge Abiteboul 2
Organization
Introduction
XML
View := Query
+:= Change Control
+:= Objects
+:= Structured & Semistructured Data
+:= Active Features
+:= Incomplete Information
+:= more...
Warning
This is not a survey on database views
This is not a tutorial on XML
This is about the use of XML and e-commerce as excuses to survey some works on views cast in a fashionable context: O2Views, views of OEM, ActiveViews, Lorel/Ozone... (and also to motivate future work)
Executive Summary: Database folks should be interested in XML views, and more and more are.
Footnote: this is a great way to recycle your old results on views, incomplete information, deductive databases, universal instance assumption, dependency theory, etc.
Introduction: XML in short
Document mark-up language; descendant of SGML
Standard for data exchange on the Web
We are interested here in data exchange and not in document editing and retrieval
EXAMPLE: EDI Electronic Data Interchange
Standard for business data exchange
Two standards:
ANSI X12 in the US -- all B2G by end of 1999
EDIFACT worldwide -- UN committee
[Figure labels: translate, EDI, transmit]
<!DOCTYPE Book-Order PUBLIC "-//Editor//DTD Book Order Message//EN">
<Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in">
<title>Editor Lite-EDI Book Ordering</title> <Order-No>967634</Order-No>
<Message-Date>19961002</Message-Date> <Buyer-EAN>5412345000176</Buyer-EAN>
<Order-Line Reference-No="0528837">
<ISBN>0316907235</ISBN>
<Author-Title>Labaln, Brian/Chrome</Author-Title>
<Quantity>2</Quantity>
</Order-Line>
<Order-Line Reference-No="0528838">
<ISBN>0856674427</ISBN>
<Author-Title>Parry, Linda (ed)/William Morris</Author-Title>
<Quantity>1</Quantity>
</Order-Line>
<input type="checkbox" name="partial" value="allowed"/>
<text>Tick here if a delayed/partial supply of order is acceptable</text>
<input type="checkbox" name="confirmation" value="requested"/>
<text>Tick here if Confirmation of Acceptance of Order is to be returned by e-mail</text>
<input type="checkbox" name="DeliveryNote" value="required"/>
<text>Tick here if e-mail Delivery Note is required to confirm details of delivery</text>
<E-Address>E-mail address: <input name="e-address" size="25"></input></E-Address>
<Language>Please respond in:<select name="response-language">
<option value="EN" selected="selected">English</option><option value="FR">Français</option>
<option value="DE">Deutsch</option> <option value="ES">Espagnol</option>
<option value="IT">Italian</option> </select></Language>
<input type="submit" value="Press here to send completed form to supplier"/>
</Book-Order>
data in XML/EDI
I personally prefer:
XML
Some noise and confusion
Is the syntax important? No
What is XML?
the means to exchange tree/graph data on the Web
an object-oriented API for it
more
A (simplified) model for XML
XML-tree :- list(node)
node :- string | element | ref node
element :- label list(att : string) list(node)
label :- string
att :- string
an attribute occurs at most once
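A minimal Python sketch of this simplified model. The class names `Element` and `Ref` and the helper `labels` are hypothetical, chosen only for illustration. Attributes are kept in a dict so that each attribute occurs at most once; the sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Element:
    """element :- label list(att : string) list(node)"""
    label: str
    # A dict keeps one value per name: "an attribute occurs at most once".
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


@dataclass
class Ref:
    """ref node: a pointer to another node, which turns the tree into a graph."""
    target: "Node"


# node :- string | element | ref node
Node = Union[str, Element, Ref]
# XML-tree :- list(node)
XMLTree = List[Node]


def labels(tree: XMLTree) -> List[str]:
    """Collect element labels in document order, without following references."""
    found: List[str] = []
    for node in tree:
        if isinstance(node, Element):
            found.append(node.label)
            found.extend(labels(node.children))
    return found


person = Element("person", children=[
    Element("name", children=["Serge Abiteboul"]),
    "PODS invited speaker",
    Element("address", children=[
        Element("city", children=["Le Chesnay"]),
        Element("zip", children=["92310"]),
    ]),
])
print(labels([person]))
```

Strings, elements and references are the only node kinds, so a traversal only has to distinguish these three cases.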
XML in short
<person>
<name>Serge Abiteboul</name>
PODS invited speaker
<a xml:link="simple" href="gif/serge.gif">old picture</a>
<address><city>Le Chesnay</city><zip>92310</zip></address>
<a xml:link="simple" href="www-rocq.inria.fr/~abitebou">Web</a>
</person>
DTD: grammar
DCD: some typing
DOM: object API
RDF: metadata
XPointer/XLink ...
XML Views
[Figure: Web browsers; a view server (query, publish & subscribe, crawler & filter engine, security manager, request broker, business intelligence, output/report/delivery); sources: information repository, data warehouse, OLAP, image/video, reports]
What databases can bring to XML is query optimization and query rewriting
View := Query
View = Query
As in the relational model:
use of query optimization techniques
use of query rewriting techniques
processing queries using views
Main issue: virtual vs. materialized
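The virtual vs. materialized distinction can be illustrated with a small Python sketch. The class names `VirtualView` and `MaterializedView`, the `cheap` query and the sample catalog are all hypothetical; this shows the trade-off and does not describe any particular system.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Query = Callable[[List[Row]], List[Row]]


class VirtualView:
    """Re-evaluates the query against the base data on every access."""

    def __init__(self, base: List[Row], query: Query) -> None:
        self.base, self.query = base, query

    def rows(self) -> List[Row]:
        return self.query(self.base)


class MaterializedView:
    """Stores the query result; it must be refreshed after the base changes."""

    def __init__(self, base: List[Row], query: Query) -> None:
        self.base, self.query = base, query
        self.stored = query(base)

    def rows(self) -> List[Row]:
        return self.stored

    def refresh(self) -> None:
        self.stored = self.query(self.base)


def cheap(rows: List[Row]) -> List[Row]:
    """The view definition: a query over the base data."""
    return [r for r in rows if r["price"] < 50]


catalog = [{"code": "A", "price": 42}, {"code": "B", "price": 78}]
virtual = VirtualView(catalog, cheap)
materialized = MaterializedView(catalog, cheap)

catalog.append({"code": "C", "price": 10})   # the base data changes
virtual_count = len(virtual.rows())          # always up to date, recomputed each time
stale_count = len(materialized.rows())       # cheap to read, but stale
materialized.refresh()
fresh_count = len(materialized.rows())
print(virtual_count, stale_count, fresh_count)
```

The virtual view pays the query cost on each access; the materialized view pays it on refresh and needs some change-control mechanism to know when to refresh.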
B2C: Comparative Shopping
http://www.addall.com
24 bookstores searched in about 10 seconds
between $42 and $78: that’s why people will use them!
What DB can bring to XML is the control of changes
View +:= Change Control
Some of the most studied problems for relational views
update propagation: incremental updates
view update problem
D2V: Incremental Updates
a customer has loaded portions of the catalog
some prices change: no need to reload the entire catalog
many such examples on the Web
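A minimal sketch of incremental propagation, assuming a hypothetical delta format of (product code, new price) pairs and a hypothetical helper `apply_delta`. Only the changed entries are sent and applied; the rest of the customer's copy is left alone.

```python
from typing import Dict, List, Tuple

# Hypothetical delta format: (product code, new price).
Delta = List[Tuple[str, float]]


def apply_delta(cached: Dict[str, float], delta: Delta) -> int:
    """Apply price changes to the customer's cached portion of the catalog.

    Changes to products the customer never loaded are ignored.
    Returns the number of cached entries that were updated.
    """
    updated = 0
    for code, new_price in delta:
        if code in cached:
            cached[code] = new_price
            updated += 1
    return updated


cached_prices = {"F2GHYYRF": 1200.0, "X1": 50.0}   # portion loaded by the customer
delta = [("F2GHYYRF", 1100.0), ("Z9", 5.0)]        # changes since the last load
updated = apply_delta(cached_prices, delta)
print(updated, cached_prices)
```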
V2D: View Update
Sometimes considered less of an issue: the Web is read only!
Many Web applications involve updates
We may be able to annotate the products of the catalog
some of the data is in read mode
some data is not visible (this is only a view!)
some data may be updated
Example: Change Detection
A customer (self) is in a department (self.department) and may want to see only the current promotions of products in this department (MyPromotions)
let MyPromotions be
select I.*
from I in Catalog.promotions.item
where I.department = self.department
Query Subscription: Changes [from Chawathe’s thesis]
Changes in labeled graphs: as in DOEM
[Figure: labeled graph with change annotations. Labels shown: Catalog, promotion, item, name, department, price, description, self. Values shown: Gismos78, electronic, super sale, £234, £278. Dates shown: 99/02/01, 01/05/03.]
Query Subscription: Changes
Change of the value of an atomic vertex
Creation of a new vertex
Addition/removal of an edge
Change of the label on an edge: add/remove
Move a vertex: add/remove
Annotations on edges and vertices
Query Subscription: Queries
select P.code, P.description
from P in Catalog.product
where P.price <changed>Q (vertex annotation)
where P.<added>description (edge annotation)
where P.price <changed <old=Q’, date T>>Q (data in annotation)
and Q - Q’ > 100 and T > “99/04/03”
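The last query (data in annotation) can be mimicked over in-memory objects. This is only a sketch: the classes `Changed` and `Product` and the sample data are hypothetical, and each annotation's old value is compared with the current price, which is a simplification when a price changed several times.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Changed:
    """A <changed> annotation on a vertex: the old value and the change date."""
    old: float
    date: str  # "YY/MM/DD" as on the slide; compared as a string within one century


@dataclass
class Product:
    code: str
    description: str
    price: float
    price_changes: List[Changed] = field(default_factory=list)


def big_recent_increases(products: List[Product],
                         min_increase: float = 100,
                         after: str = "99/04/03") -> List[Tuple[str, str]]:
    """select P.code, P.description where Q - Q' > min_increase and T > after."""
    return [
        (p.code, p.description)
        for p in products
        if any(p.price - c.old > min_increase and c.date > after
               for c in p.price_changes)
    ]


products = [
    Product("A", "gismo", 400, [Changed(234, "99/05/01")]),   # +166, recent
    Product("B", "widget", 300, [Changed(250, "99/05/01")]),  # +50 only
    Product("C", "gadget", 500, [Changed(100, "99/02/01")]),  # large, but too old
]
result = big_recent_increases(products)
print(result)
```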
Query Subscription: Examples
On the first of each month, send me the list of all products in my interest list whose price increased by more than 10%
Each time there are ten new employees, send me their names and departments
Notify me if the price of this house decreases
Similar to: on event, when condition, do action
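The "on event, when condition, do action" pattern can be sketched as a tiny rule engine. The names `Rule`, `RuleEngine`, the event name `price_changed` and the item `house-17` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


@dataclass
class Rule:
    event: str                          # on event
    condition: Callable[[Event], bool]  # when condition
    action: Callable[[Event], None]     # do action


class RuleEngine:
    def __init__(self) -> None:
        self.rules: List[Rule] = []

    def register(self, rule: Rule) -> None:
        self.rules.append(rule)

    def emit(self, name: str, payload: Event) -> None:
        """Fire every rule subscribed to this event whose condition holds."""
        for rule in self.rules:
            if rule.event == name and rule.condition(payload):
                rule.action(payload)


notifications: List[str] = []
engine = RuleEngine()
# "Notify me if the price of this house decreases"
engine.register(Rule(
    event="price_changed",
    condition=lambda e: e["item"] == "house-17" and e["new"] < e["old"],
    action=lambda e: notifications.append(f"{e['item']}: {e['old']} -> {e['new']}"),
))
engine.emit("price_changed", {"item": "house-17", "old": 300, "new": 280})  # decrease
engine.emit("price_changed", {"item": "house-17", "old": 280, "new": 290})  # increase
print(notifications)
```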
XML +:= World of Objects
The underlying model for XML is object-based and XML views should be based on OO(DB) technology
Views +:= World of objects
API for XML: Document Object Model (DOM)
Views XML as object-oriented
Allows designing C++ or Java applications
E.g.:
use subclass Promotion of XMLNode
Catalog.promotions is only a set of virtual elements
the list of promotions is generated on demand based on the nature of customers
Views in OODB: O2Views
Virtual values, as for relational views
entirely virtual XML document, e.g., view of relational data
virtual attributes
e.g., product: code, name, price, …
alternatives = the set of products that are “similar” and are on promotion
Views in OODB: O2Views
Virtual class: a set of database objects that are grouped together and as such acquire a new interface
catalog1/DTD1, …, catalog17/DTD17
products are represented differently in each catalog
a unique DTD that allows viewing all products
each product can be “viewed” with that DTD
Views in OODB: O2Views
Imaginary class: groups objects that are all virtual, e.g., join of two relations
For more: see Souza’s thesis
XML data/views +:= semistructured + structured data
XML should also allow the exchange of structured data as in relational/ODMG models
Semistructured + Structured Data
If we know about the structure of data, not using it may damage performance
The use of structure facilitates the programming of applications, e.g., in Java
Structure may be useful to explain data to users
For more: see Lahiri’s thesis [and Ozone = OQL + Lorel ]
Web catalog - continued
Product-basic (all products): category=electronic, subcategory=sound, name=Gismo223, code=F2GHYYRF, selling-price=1200FF
Product-specific (for Gismos only): voltage=list(110,220), Gismo-norm=GHTF333
External resources: description=http://m.ec.fr/cat/Gismo, reviews=http://reviews.com/Gismo
Private data: buying-price=100$, quantity-in-stock=20000, supplier=Sears, authorized-discount=30%
This data in XML
<product>
<basic> <cat>electronic <subcat>sound</subcat></cat> <n>Gismo223</n> <c>F2GHYYRF</c> <sp currency="French-franc">1200</sp> </basic>
<specific> <v>110</v> <v>220</v> <Gismo-norm>GHTF333</Gismo-norm> </specific>
<external> … </external>
<private> <bp currency="dollar">100</bp> <qis>20000</qis> <s>Sears</s> <ad>30</ad> </private>
</product>
What is such data exactly?
A mix of structured and semistructured data with pointers between the two worlds
Purely XML. Then use a relation as a materialized view:
Product(name, code, category, subcategory, price, rest)
Index on name and subcategory
select P.name, P.price
from P in Product
where P.subcategory = “sound”
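A sketch of this idea with Python's standard `sqlite3` and `xml.etree.ElementTree` modules. The helper `materialize`, the index names and the choice of two single-column indexes (rather than one composite index) are assumptions; the sample document reuses the product values from the slides.

```python
import sqlite3
import xml.etree.ElementTree as ET

doc = """<product>
  <basic><cat>electronic <subcat>sound</subcat></cat>
    <n>Gismo223</n><c>F2GHYYRF</c><sp currency="French-franc">1200</sp></basic>
  <specific><v>110</v><v>220</v><Gismo-norm>GHTF333</Gismo-norm></specific>
</product>"""

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Product (name, code, category, subcategory, price, rest)")
db.execute("CREATE INDEX by_name ON Product (name)")
db.execute("CREATE INDEX by_subcategory ON Product (subcategory)")


def materialize(xml_text: str) -> None:
    """Extract the structured fields; keep the remaining XML in 'rest'."""
    root = ET.fromstring(xml_text)
    basic = root.find("basic")
    cat = basic.find("cat")
    # Everything outside <basic> stays semistructured, serialized as text.
    rest = "".join(ET.tostring(child, encoding="unicode")
                   for child in root if child.tag != "basic")
    db.execute(
        "INSERT INTO Product VALUES (?, ?, ?, ?, ?, ?)",
        (basic.findtext("n").strip(), basic.findtext("c").strip(),
         cat.text.strip(), cat.findtext("subcat").strip(),
         float(basic.findtext("sp")), rest),
    )


materialize(doc)
rows = db.execute(
    "SELECT name, price FROM Product WHERE subcategory = ?", ("sound",)
).fetchall()
print(rows)
```

The structured columns can be indexed and queried relationally, while the `rest` column keeps the less regular part of each product as XML text.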
Digression: storage of XML
as blobs
generic mapping: ignore the structure
specific mapping
relational
object
hybrid
As blobs
<product>
<basic> <cat>electronic <subcat>sound</subcat></cat> <n>Gismo22</n> <c>F2GHYYRF</c> <sp currency="French-franc">1200</sp> </basic>
<specific> <v>110</v> <v>220</v> <Gismo-norm>GHTF333</Gismo-norm> </specific>
<external> … </external>
<private> <bp currency="dollar">100</bp> <qis>20000</qis> <s>Sears</s> <ad>30</ad> </private>
</product>
+ full-text index
Generic mapping
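Edges (source, label, target):
root  product  o1
o1    basic    o2
o2    cat      o3
o2    subcat   o4
o2    n        o5
o2    c        o6
o2    sp       o7
...

Values (object, value):
o3  electronic
o4  sound
o5  Gismo223
o6  F2GHYYRF
o7  1200
...

Attributes (object, attribute, value):
o7   currency  French-franc
o12  currency  dollar
...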
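A sketch of such a generic mapping in Python, using the standard `xml.etree.ElementTree` parser. The function name `shred` and the three-table layout (edges, values, attributes) are assumptions made for illustration; the sample document lists `subcat` as a sibling of `cat`, as the edge table above does.

```python
import itertools
import xml.etree.ElementTree as ET


def shred(xml_text: str):
    """Map any XML document to three generic tables, ignoring its structure.

    Returns (edges, values, attributes) where
      edges      = [(source oid, label, target oid)]
      values     = [(oid, text)]
      attributes = [(oid, attribute name, attribute value)]
    Text that follows a child element (tail text) is ignored in this sketch.
    """
    counter = itertools.count(1)
    edges, values, attributes = [], [], []

    def visit(parent_oid: str, element: ET.Element) -> None:
        oid = f"o{next(counter)}"          # oids are assigned in document order
        edges.append((parent_oid, element.tag, oid))
        text = (element.text or "").strip()
        if text:
            values.append((oid, text))
        for name, value in element.attrib.items():
            attributes.append((oid, name, value))
        for child in element:
            visit(oid, child)

    visit("root", ET.fromstring(xml_text))
    return edges, values, attributes


doc = ('<product><basic><cat>electronic</cat><subcat>sound</subcat>'
       '<n>Gismo223</n><c>F2GHYYRF</c>'
       '<sp currency="French-franc">1200</sp></basic></product>')
edges, values, attributes = shred(doc)
for row in edges:
    print(row)
```

The same code works for any document, which is the appeal of the generic mapping; the price is that every query becomes a sequence of joins over the edge table.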
Specific
class Product type tuple(
  cat: string; subcat: set(string);
  n: string; c: string; price: Price; specific: OEM;
  external: list(tuple(label: string; val: URL));
  private pr: tuple(bp: Price; qis: integer; supplier: Company))
type Price: tuple(sum: int; currency: Currency);
What is better? Hybrid?
Need for comparative studies
My feeling/common sense?:
Use structure for very structured portions of data
Use semistructured storage for less structured portions, or portions whose structure evolves a lot
Use blobs for components accessed mostly via full-text indexing, e.g., paragraphs in a document
Views +:= Active Features
Active Views
System developed at INRIA
Long term goals:
Declarative specification of data intensive applications with cooperation between partners
Ease of use and fast deployment
(Automatic) verification
Architecture
[Figure: components shown: O2, XML repository, Java client, Java RMI, Web browser, O2 notification, Java AVApi, Java application, DOM]
Motivations
Database applications:
passive behavior
closed systems
persistence, concurrency, access control
New needs:
interactions between clients, e.g., notification
change control
reactive behavior
E.g., e-commerce, cooperative work
Illustration of Interactions: Notification
In the vendor view:
when Customer.entersDept(dept)
if dept = self.dept
then notify me
Notification
[Figure: AVServer; AVClient (customer); AVClient (vendor in book dept); messages shown: entersDept(book), notify]
Illustration of Interaction : Change Control
In the customer view:
let monitored MyPromotions be
select I.name, I.price
from I in Catalog.promotions.item
where I.department = self.department
read, write, append, monitored, refresh, deferred…
simpler case: monitoring of the catalog
Change control
[Figure: AVServer and two AVClients; steps shown: 1 Read, 2 Read, 3 Modification, 4 Write, 5 Notification, 6 Notification, 7 Read]
Choices
All XML
XML repository
XML query language
XML views
Declarative specification
almost no code to write
compilation to an executable application
active rules
Important Aspects
workflow
e.g., customization: to search for a biblio ref, look first in my own files, otherwise look in dblp, otherwise look…
activities (search, buy, accounting, chat…)
active rules
logical traces
notifications
View +:= Incomplete Information
Use something like Imielinski-Lipski tables
Example: portal
Q1: gismo vendors
{ V | ∃P sell(V, gismo, P) }
Q1 = v1, v2, v3, v4, v5

Q2: price for each vendor
{ V, P | sell(V, gismo, P) }

Q3: cheap gismo vendors
{ V | ∃P (sell(V, gismo, P) and P < 80) }

Q2:
comp  price
v1    109
v2    X
v3    99
v4    89
v5    Y

Q3:
comp  price  cond
v2    X      X<80
v5    Y      Y<80
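A minimal sketch of how Q3 can be answered over Q2 when some prices are unknown, in the spirit of conditional tables. The representation (a price is either a number or a named unknown) and the function `cheap_vendors` are assumptions for illustration, not the formal Imielinski-Lipski construction.

```python
from typing import List, Optional, Tuple, Union

# Q2 as a table with nulls: a price is either a known number or a named
# unknown such as "X" (values taken from the slide).
Price = Union[int, str]
q2: List[Tuple[str, Price]] = [
    ("v1", 109), ("v2", "X"), ("v3", 99), ("v4", 89), ("v5", "Y"),
]


def cheap_vendors(table: List[Tuple[str, Price]],
                  limit: int) -> List[Tuple[str, Price, Optional[str]]]:
    """Select rows with price < limit, keeping a condition for unknown prices.

    Returns rows (vendor, price, condition); condition is None when the row
    certainly qualifies, or a string such as "X<80" when it only possibly does.
    """
    result = []
    for vendor, price in table:
        if isinstance(price, str):        # unknown price: keep, with a condition
            result.append((vendor, price, f"{price}<{limit}"))
        elif price < limit:               # known and cheap: certain answer
            result.append((vendor, price, None))
    return result


q3 = cheap_vendors(q2, 80)
certain = [vendor for vendor, _, cond in q3 if cond is None]
possible = [vendor for vendor, _, _ in q3]
print(q3)
```

With the slide's data there is no certain answer; v2 and v5 are possible answers, each guarded by the condition on its unknown price.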
Example: more portal
Load all electronic products
expiration: e.g., to recover storage space, for all products loaded before May 1st, discard images and text of annotations
give me the gismos that have been annotated by Jeff Ullman, and the annotations
View +:= workspace, distribution, cache...
Just to say, there is much more to it...
Conclusion
Some Challenges: Semistructured Data Processing
XML storage under non-generic form
XML query language & optimization
XML bulk loading
data conversion, integration
incomplete information
Some Challenges: Change Control and View Interaction
update detection
incremental propagation
temporal XML: versions, DOEM...
rule and trigger management
management of a large number of (personalized) user active views
Some Challenges: Workflow
workflow management: task sequencing
declarative specification of applications
program verification
Conclusion
Database folks should be interested in XML Views and more and more are...