XML-λ: an extendible framework for manipulating XML data
XML-λ, KSI, 2004
Jaroslav Pokorny
Charles University
Praha
Two approaches to XML
logical or physical
Idea: XML as a database
– DB of XML documents
– "mix" of (relational) DB and XML data
– XML views (over non-XML and/or XML data)
Advantages:
– independence of the original platforms and models of the processed data
– more flexibility in design and manipulation (integration, updates, querying)
Two approaches to XML
Implications:
– implementations: XML DBs (native, or via relational, OO, OR)
– special demands on query languages
  • how to make them powerful
  • how to describe their semantics
  • how to implement them
– new types of software: wrappers, mediators
(Personal) goal: to develop a powerful formal approach appropriate for manipulating both XML and non-XML data
Outline
XML – briefly
XML – functional data model
functional typing of XML (and non-XML) data
LT language
XML-schema, XML-database
XML-λ framework
Conclusions
XML – an example
<!DOCTYPE biblio [
  <!ELEMENT biblio (book | monograph)*>
  <!ELEMENT book (title, author*)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT monograph (title, author, editor)>
  <!ATTLIST monograph year CDATA #REQUIRED>
  <!ELEMENT editor (monograph*)>
  <!ELEMENT author (name, address?)>
  <!ELEMENT name (firstname?, surname)>
  <!ELEMENT firstname (#PCDATA)>
  <!ELEMENT surname (#PCDATA)>
  <!ELEMENT address (locality, ZIP)>
  <!ELEMENT locality (#PCDATA)>
  <!ELEMENT ZIP (#PCDATA)>
]>
XML – an example
<book>
  <title> Fundamentals of DBS </title>
  <author>
    <name>
      <firstname> Ramez </firstname>
      <surname> Elmasri </surname>
    </name>
    <address>
      <locality> Arlington </locality>
      <ZIP> 76019 </ZIP>
    </address>
  </author>
  <author>
    <name>
      <firstname> Shamkant </firstname>
      <surname> Navathe </surname>
    </name>
  </author>
</book>
XML model
Usually: tree- or graph-oriented
Here: inspired by the functional approach to conceptual modelling
[Figure: diagram with nodes DEPARTMENT, MEMBER*, PROJECT*]
For example, the HIT data model from the 1980s.
Synopsis of the approach
Typing XML data
– Background: a functional type system (a base of primitive types + functions, tuples, and unions)
– Extensions to:
  • typing XML regular expressions
  • typing XML elements
Querying XML elements
– a general typed λ-calculus (functional variables and constants, tuples, applications of functions, λ-abstractions)
  • XML-database schema as a set of typed variables
  • XML-database as any valuation of these variables
– XML-λ: a syntactic variant of the typed λ-calculus over XML data
Typing XML data - informally
E … a set of abstract elements. The content of an abstract element is either a string from PCDATA (the simplest case), a sequence of abstract subelements (or groups), or empty.
Ex.: <phone>781 7090</phone> is an instance of a phone element object.
For an e ∈ E, phone(e) returns, e.g., the phone number ‘781 7090’.
The phone element object is conceived as a (partial) function from E into PCDATA.
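A minimal Python sketch of this reading, assuming that abstract elements are represented by opaque integer identifiers and that a partial function is stored as a dictionary (both are illustrative choices, not part of the slides):

```python
from typing import Dict, Optional

# Assumption: abstract elements (members of E) are opaque integer ids.
AbstractElement = int

# The phone element object: a partial function from E into PCDATA,
# stored as a dict. Ids missing from the dict are "undefined".
phone: Dict[AbstractElement, str] = {1: "781 7090"}


def apply_partial(f: Dict[AbstractElement, str], e: AbstractElement) -> Optional[str]:
    """Apply a partial function; None stands for 'undefined at e'."""
    return f.get(e)


phone_of_e1 = apply_partial(phone, 1)  # defined for element 1
phone_of_e2 = apply_partial(phone, 2)  # undefined: element 2 is not a phone
```

Using `None` for "undefined" is only one possible encoding of partiality; raising an exception would serve equally well.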
Typing XML data - informally
Ex.: <!ELEMENT name (firstname?, surname)> is conceived as a set of functions from E into E × E.
The current name element object, i.e. the one stored in a given XML database, is a function assigning to each abstract element e ∈ E at most one pair of abstract elements.
Hierarchy of notions: element type, element object, element
Functional typing
B … a set of symbols (the base)
T ::= S              primitive type
    | (T1 → T2)      functional type
    | (T1,...,Tn)    tuple type
    | (T1 + T2)      union type
where S ∈ B
Remark: relations are ((T1,...,Tn) → BOOL)-objects!
Functional typing
Interpretation:
– members of B … mutually disjoint non-empty sets
– (T1 → T2) … the set of all (total or partial) functions from T1 into T2
– (T1,...,Tn) … the Cartesian product T1 × ... × Tn
– (T1 + … + Tn) … the union of the sets Ti
Exs.:
– arithmetic operations: +, -, *, / are ((NUMBER, NUMBER) → NUMBER)-objects
– logic:
  • and/((BOOL, BOOL) → BOOL)
  • the universal and the existential R-quantifiers are ((R → BOOL) → BOOL)-objects
  • the R-identity =R is a ((R,R) → BOOL)-object
– aggregation functions: COUNTR/((R → BOOL) → NUMBER)
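The typings above can be mirrored by higher-order functions. The sketch below is only an illustration under one assumption not made on the slides: the carrier set of R is finite and passed in explicitly, so that the quantifiers can be computed by enumeration. The names `forall`, `exists`, and `count` are hypothetical.

```python
from typing import Callable, Iterable

Pred = Callable[[int], bool]  # an (R -> BOOL)-object, with R = int here


def forall(domain: Iterable[int]) -> Callable[[Pred], bool]:
    """Universal R-quantifier: an ((R -> BOOL) -> BOOL)-object."""
    items = list(domain)
    return lambda p: all(p(x) for x in items)


def exists(domain: Iterable[int]) -> Callable[[Pred], bool]:
    """Existential R-quantifier: an ((R -> BOOL) -> BOOL)-object."""
    items = list(domain)
    return lambda p: any(p(x) for x in items)


def count(domain: Iterable[int]) -> Callable[[Pred], int]:
    """COUNT over R: an ((R -> BOOL) -> NUMBER)-object."""
    items = list(domain)
    return lambda p: sum(1 for x in items if p(x))


NUMBERS = range(1, 6)  # hypothetical finite carrier {1, ..., 5}


def is_even(n: int) -> bool:
    return n % 2 == 0


all_even = forall(NUMBERS)(is_even)
some_even = exists(NUMBERS)(is_even)
how_many_even = count(NUMBERS)(is_even)
```

A set is passed to each function as its characteristic function, which is exactly the ((T1,...,Tn) → BOOL) reading of relations in the remark on the previous slide.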
Typing XML regular expressions
Let B = {PCDATA, BOOL, NAME}. The type system Treg over B is recursively defined as follows.
T ::= tag:PCDATA | tag:    elementary regular expression, where tag ∈ NAME
    | T*                   zero or more
    | T+                   one or more
    | T?                   zero or one
    | (T1 | T2)            alternative
where, in T*, T+ and T?, T is an alternative or elementary regular expression.
Typing XML regular expressions
Interpretation (examples):
(T1 | T2) … a set of objects of type T1 or T2
T* … (T → BOOL) /partially ordered model/
T* … ((T, NUMBER) → BOOL) /ordered model/
– Consider a function f of this type. For a pair (t, i), f(t, i) = TRUE iff t is the i-th object in an (ordered) set of T-objects.
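The two readings of T* can be sketched in Python as characteristic functions built from a list. This is an illustration only; the helper names are hypothetical, and 1-based positions are an assumption, since the slides do not say where counting starts.

```python
from typing import Callable, List


def unordered_star(items: List[str]) -> Callable[[str], bool]:
    """T* as a (T -> BOOL)-object: membership only."""
    members = set(items)
    return lambda t: t in members


def ordered_star(items: List[str]) -> Callable[[str, int], bool]:
    """T* as a ((T, NUMBER) -> BOOL)-object: f(t, i) holds iff t is the i-th item."""
    def f(t: str, i: int) -> bool:
        # Assumption: positions are counted from 1.
        return 1 <= i <= len(items) and items[i - 1] == t
    return f


authors = ["Elmasri", "Navathe"]
g = unordered_star(authors)
f = ordered_star(authors)
```

Here `f("Navathe", 2)` holds while `f("Navathe", 1)` does not, whereas `g("Navathe")` carries no positional information.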
Typing XML elements and attributes
Given: Treg over B, and E.
The type system TE induced by Treg (or simply TE if Treg is understood) contains the regular element expressions given by the following rules:
E ::= TAG:T | TAG:         elementary element types, where tag:T and tag: are elementary regular expressions over B
    | E* | E+ | E? | (E1 | E2)
    | TAG:(E1,...,En)      where tag ∈ NAME
Elementary element types and regular element expressions TAG:(E1,...,En) are called element types.
Typing XML elements and attributes
Semantics of element types:
TAG:PCDATA … the set of all (partial) functions from E to tag:PCDATA
… etc.
Attributes are also functions.
Ex.: year (of monograph) is a function assigning to each monograph its year (of issue).
Notation: EMONOGRAPH → CDATA
Example: BIBLIO element types
TITLE:PCDATA
FIRSTNAME:PCDATA
SURNAME:PCDATA
LOCALITY:PCDATA
ZIP:PCDATA
ADDRESS:(LOCALITY, ZIP)
BOOK:(TITLE, AUTHOR*)
NAME:(FIRSTNAME?, SURNAME)
MONOGRAPH:(TITLE, AUTHOR, EDITOR)
YEAR/(MONOGRAPH → CDATA)
EDITOR:MONOGRAPH*
AUTHOR:(NAME, ADDRESS?)
BIBLIO:(BOOK | MONOGRAPH)*
LT language (Language of Terms)
Func … constants, each of a fixed type; variables for each type from T.
Let the types T, T1, ..., Tn (n ≥ 1) be members of T.
– Typed constants and variables are terms.
– M(M1,...,Mn)       application
– λx1,...,xn(M)      λ-abstraction, where x1,...,xn are distinct variables
– (M1,...,Mn)        tuple
– Mi                 projections, for a term M of the form (M1,...,Mn)
– K:M                tagged term, where K/NAME. If M/T, then K:M/(E → T).
Schema and DB
XML-database schema, SXML, is a set of variables of types from TE.
Given a database schema SXML, an XML-database is any valuation of these variables.
Ex.: SURNAME, AUTHOR
XML-λ framework
What is it? The XML-λ framework is a subset of LT + syntactic sugar.
Features:
– queries are expressed by terms
  Ex.: AUTHOR (1)
  (RESULT: AUTHOR …. more "XML-like")
  Typically: λ .. ( .. …(expression)…), where expression/BOOL
  λx (AUTHOR(x)) does the same as (1)
– paths as compositions of functions
  Ex.: SURNAME(NAME(AUTHOR(m))), where m is a monograph abstract element object
  Notation: m.AUTHOR.NAME.SURNAME
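A path read as a composition of partial functions can be sketched in Python as follows. The element identifiers, the sample valuation of AUTHOR, NAME and SURNAME, and the helper `path` are all hypothetical. As a simplification, each dictionary maps an element directly to its single child (or string content) rather than to a tuple of subelements.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical valuation of three schema variables (integer ids stand for
# abstract elements; this is a simplification of the typed model).
AUTHOR: Dict[int, int] = {100: 10}        # monograph 100 has author 10
NAME: Dict[int, int] = {10: 20}           # author 10 has name 20
SURNAME: Dict[int, str] = {20: "Smith"}   # name 20 has surname 'Smith'


def path(*steps: Dict[Any, Any]) -> Callable[[Any], Optional[Any]]:
    """Compose partial functions left to right; None means 'undefined'."""
    def run(x: Any) -> Optional[Any]:
        for step in steps:
            if x is None:
                return None
            x = step.get(x)
        return x
    return run


# m.AUTHOR.NAME.SURNAME, written as a left-to-right composition
surname_of = path(AUTHOR, NAME, SURNAME)

# SURNAME(NAME(AUTHOR(m))) for m = 100, written as nested applications
nested = SURNAME[NAME[AUTHOR[100]]]
```

Both notations denote the same value for m = 100; the composed form additionally returns `None` when some step is undefined.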
XML-λ framework
– applications of logic, arithmetic, … functions
  ∃e (b.AUTHOR(e) and e.NAME.SURNAME = ‘Smith’), where b is a book abstract element object
  ∃b ∃e (b.AUTHOR(e) and e.NAME.SURNAME = ‘Smith’) is a YES/NO query.
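A small Python sketch of how such a formula could be evaluated, assuming the quantifiers over e and b are existential and range over a finite set of abstract elements. The data and all names below are hypothetical; `SURNAME_VIA_NAME` collapses the path e.NAME.SURNAME into one lookup for brevity.

```python
from typing import Dict, Set

# Hypothetical finite database: integer ids stand for abstract elements.
ELEMENTS: Set[int] = {1, 2, 10, 11, 12}
BOOKS: Set[int] = {1, 2}
AUTHORS_OF: Dict[int, Set[int]] = {1: {10, 11}, 2: {12}}
SURNAME_VIA_NAME: Dict[int, str] = {10: "Elmasri", 11: "Navathe", 12: "Smith"}


def author(b: int, e: int) -> bool:
    """b.AUTHOR(e): the characteristic function of the authors of b."""
    return e in AUTHORS_OF.get(b, set())


def has_author_smith(b: int) -> bool:
    """The formula with b free: is there an e with b.AUTHOR(e) and surname 'Smith'?"""
    return any(author(b, e) and SURNAME_VIA_NAME.get(e) == "Smith" for e in ELEMENTS)


# The closed formula: a YES/NO answer over the whole database.
yes_no = any(has_author_smith(b) for b in BOOKS)
```

With b left free the formula acts as a predicate on books; quantifying over b as well yields a single truth value.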
XML-λ framework
– restructuring
  λ name:x.NAME (λ title:y (.BOOK.(AUTHOR(x) and TITLE = y)))
  λ title:y (λ name:x.NAME (.BOOK.(AUTHOR(x) and TITLE = y)))
  Notation: tagged variables; content of abstract elements denoted by y, x
– aggregations + nesting
  D. For each book, find the number of its authors.
  λ x, n (.BOOK..(TITLE = x and COUNT(AUTHOR) = n))
  Notation: dots .. for omitting parts of paths and prefixes
– possibility to embed any user-defined function
XML-λ framework
D (XQuery):
for $x in doc("biblio1.xml")//book
let $n := count($x/author)
return <book>
         <name>{$x/title/text()}</name>
         <numb_of_auth>{$n}</numb_of_auth>
       </book>
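For comparison, query D (for each book, the number of its authors) can be sketched in Python with the standard-library ElementTree parser. The inline document is a hypothetical stand-in for biblio1.xml, reusing the book from the earlier example, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical stand-in for biblio1.xml.
BIBLIO = """
<biblio>
  <book>
    <title>Fundamentals of DBS</title>
    <author><name><firstname>Ramez</firstname><surname>Elmasri</surname></name></author>
    <author><name><firstname>Shamkant</firstname><surname>Navathe</surname></name></author>
  </book>
</biblio>
"""


def authors_per_book(xml_text: str) -> List[Tuple[str, int]]:
    """Return (title, number of author children) for every book element."""
    root = ET.fromstring(xml_text)
    result = []
    for book in root.iter("book"):
        title = (book.findtext("title") or "").strip()
        result.append((title, len(book.findall("author"))))
    return result


counts = authors_per_book(BIBLIO)
```

Each pair in `counts` corresponds to one binding of (x, n) in the term for D.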
Integration of heterogeneous information sources
[Figure: user, query/answer, typed objects; sources: relational schemes, DTDs, ADTs, classes in OO]
Conclusions
Issues:
– finding appropriate restrictions of XML-λ for querying
– implementation is in progress
The forthcoming paper:
– cleaning up the model (ordered and unordered)
– formal semantics of types, extensions to tagged variables
Future:
– XML-λ with tag variables
– semantics of XQuery in the XML-λ framework