A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario...
![Page 1: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/1.jpg)
A Type System for a Semistructured and XML Data Base Management System
Ph. D. Thesis Proposal
Dario Colazzo
![Page 2: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/2.jpg)
Thesis Goals
- Formal development and study of a type system for XML querying
- Implementation of a concrete type system for an XML data base management system: the Xtasy system
![Page 3: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/3.jpg)
Presentation outline
- Semistructured data and XML
- Data models
- Type languages: DTD, XML Schema
- Querying XML data: Tequyla
- Processing XML data: XDuce
- Thesis goals
![Page 4: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/4.jpg)
Semistructured data
- Irregular and unstable structure
- Self-describing representation
- No separate schema information: few guarantees of reliability and efficiency of applications
![Page 5: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/5.jpg)
OEM graph
[Figure: OEM graph for the address book. Root addrbook; edge labels person, name, first, second, addr, age, email; leaf values "Dario Colazzo", "Carlo", "Sartiani", "Pisa", 30.]
![Page 6: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/6.jpg)
XML syntax

    <addrbook>
      <person>
        <name>Dario Colazzo</name>
        <addr>Pisa</addr>
      </person>
      <person>
        <name>
          <first> Carlo </first>
          <second> Sartiani</second>
        </name>
        <addr>Pisa</addr>
        <email>[email protected]</email>
      </person>
    </addrbook>
![Page 7: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/7.jpg)
Attributes and element reference

    <db>
      <state id="01">
        <name>Italy</name>
        <code>IT</code>
      </state>
      .......
      <city region="Toscana" state-of="01">
        <name>Italy</name>
        <code>PI</code>
      </city>
    </db>
![Page 8: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/8.jpg)
XML Query Data Model
- Based on a forest of node-labeled trees (a set of documents)
- Several kinds of nodes:
  - element node
  - attribute node
  - value node
- Identifier and reference attributes are modeled as general attributes
![Page 9: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/9.jpg)
XML Tree
[Figure: XML tree for the address book, rooted at addrbook, with a legend distinguishing element nodes, attribute nodes and value nodes. Node labels: person, name, first, second, addr, age, email; values "Dario Colazzo", "Carlo", "Sartiani", "Pisa", 30.]
![Page 10: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/10.jpg)
XML schema languages
- Document Type Definitions (DTDs): schemas as grammars for documents, based on regular type expressions
- XML Schemas: closer to traditional type languages
![Page 11: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/11.jpg)
DTD
- Regular type expressions:
  - T | U: union
  - T, U: sequence
  - T*: zero or more
  - T?: zero or one
  - X = T[X]: recursive definitions
- Coupled-tag element declarations
- Global definitions
- Only one base type: string (PCDATA)
- No type reuse
![Page 12: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/12.jpg)
DTD, example

    <!DOCTYPE addrbook [
      <!ELEMENT addrbook (person*)>          <!-- person*: zero or more -->
      <!ELEMENT person (name, addr, tel?)>   <!-- tel?: zero or one -->
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT addr (#PCDATA)>
      <!ELEMENT tel (#PCDATA)>
    ]>
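To make the "schemas as grammars" reading concrete, here is a minimal Python sketch that checks a parsed document against the content models of the DTD above. It is an illustration under assumptions, not a DTD validator: the `CONTENT_MODELS` table and the `conforms` function are hypothetical names, each content model is hand-translated into a regular expression over child tag names, and text placement and attributes are ignored.

```python
import re
import xml.etree.ElementTree as ET

# Content models of the slide's DTD, hand-written as regular expressions
# over space-terminated child tag names. None stands for (#PCDATA).
CONTENT_MODELS = {
    "addrbook": r"(person )*",
    "person": r"name addr (tel )?",
    "name": None,
    "addr": None,
    "tel": None,
}

def conforms(element):
    """Return True if element and all its descendants match CONTENT_MODELS."""
    if element.tag not in CONTENT_MODELS:
        return False
    model = CONTENT_MODELS[element.tag]
    children = list(element)
    if model is None:                      # (#PCDATA): no child elements
        return not children
    child_tags = "".join(child.tag + " " for child in children)
    if re.fullmatch(model, child_tags) is None:
        return False
    return all(conforms(child) for child in children)

good = ET.fromstring(
    "<addrbook><person><name>Dario Colazzo</name>"
    "<addr>Pisa</addr></person></addrbook>"
)
# Same content, but addr comes before name, which the sequence forbids.
bad = ET.fromstring(
    "<addrbook><person><addr>Pisa</addr>"
    "<name>Dario Colazzo</name></person></addrbook>"
)
```

Here `conforms(good)` holds while `conforms(bad)` does not, because `name, addr, tel?` is an ordered sequence.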
![Page 13: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/13.jpg)
XML Schema
- Decoupled-tag: elements and types may be defined separately
- Local definitions
- Base types: integers, string, decimal, ...
- Type reuse:
  - type refinement
  - type extension with subtyping
![Page 14: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/14.jpg)
XML Schema, example

    <xsd:complexType name="person">
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="age" type="xsd:ageType"/>
      </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="newPerson" base="typeOfPerson" derivedBy="extension">
      <xsd:element name="car" type="xsd:string" />
    </xsd:complexType>
![Page 15: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/15.jpg)
Querying XML data
- XML querying is based on the use of patterns to select portions of a document
- Untyped query languages: XQL, XML-QL, Quilt
- Typed: Tequyla, XDuce (functional language)
- Forthcoming W3C query language: probably Quilt
![Page 16: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/16.jpg)
Tequyla
- SQL-like query language
- Free nesting of queries
- Typed:
  - query correctness
  - query typing
- Currently: only non-algorithmic definitions, and weak subtyping
![Page 17: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/17.jpg)
Tequyla queries
- The body of a Tequyla query is a from clause composed of XPath patterns
- x=addressbook.xml;
  binds x to the root element of addressbook.xml
- y in x//person/addr
  starting from the root (x), searches for a person element at an arbitrary depth (//), then for an addr sub-element (/), and finally binds the node found to y
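The two bindings can be mimicked with Python's standard `xml.etree.ElementTree`. This is only a sketch of the pattern semantics described on this slide, not the Tequyla implementation; the document is inlined as a string instead of being read from addressbook.xml, and the variable names are chosen for the illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<addrbook>
  <person><name>Dario Colazzo</name><addr>Pisa</addr></person>
  <person>
    <name><first>Carlo</first><second>Sartiani</second></name>
    <addr>Pisa</addr>
  </person>
</addrbook>
"""

# x=addressbook.xml; binds x to the root element (parsed from a string here).
x = ET.fromstring(DOCUMENT)

# y in x//person/addr: a person element at any depth below x, then an addr
# child. ElementTree spells the descendant step relative to x as ".//".
ys = x.findall(".//person/addr")

# One binding of y per matching node, in document order.
addr_texts = [y.text for y in ys]
```

Each element of `ys` corresponds to one binding of y; a where clause such as y="Pisa" would then filter these bindings.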
![Page 18: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/18.jpg)
A Tequyla query

    Q = from x=addressbook.xml;
             y in x//person/addr;
             z in x//person/name;
        where y="Pisa"
        select nome[z]

(the patterns in the from clause are XPath patterns)
![Page 19: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/19.jpg)
XDuce
- Typed functional language
- Regular expression types
- Type-based pattern language
![Page 20: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/20.jpg)
XDuce schema
A schema is a set of type definitions

    E = {
      Addressbook = addrbook[(Name, Addr, Tel?)*]
      Name = name[String]
      Addr = addr[String]
      Tel = tel[String]
    }
![Page 21: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/21.jpg)
An XDuce function: telephone list
Consider T = (Name, Addr, Tel?) in

    fun mkTelList : T* --> (Name, Tel)* =
        name[n], addr[a], tel[t], rest: T* --> name[n], tel[t], mkTelList(rest)
      | name[n], addr[a], rest: T*         --> mkTelList(rest)
      | ()                                 --> ()
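For readers unfamiliar with XDuce pattern matching, the same traversal can be sketched in Python. The encoding is an assumption of this sketch: a flat sequence of elements is a list of `(tag, text)` pairs, and `mk_tel_list` mirrors the three clauses of mkTelList. The static guarantee that the result has type (Name, Tel)* is of course lost; the input shape is checked at run time instead.

```python
from typing import List, Tuple

Element = Tuple[str, str]   # (tag, text), e.g. ("name", "Carlo")

def mk_tel_list(entries: List[Element]) -> List[Element]:
    """Keep name/tel pairs; drop addr and every entry that has no tel."""
    if not entries:                                   # | () --> ()
        return []
    if len(entries) < 2 or entries[0][0] != "name" or entries[1][0] != "addr":
        raise ValueError("input is not of type (Name, Addr, Tel?)*")
    if len(entries) > 2 and entries[2][0] == "tel":   # name, addr, tel, rest
        return [entries[0], entries[2]] + mk_tel_list(entries[3:])
    return mk_tel_list(entries[2:])                   # name, addr, rest

# Hypothetical data: the telephone number is invented for the example.
book = [
    ("name", "Dario Colazzo"), ("addr", "Pisa"), ("tel", "555-0100"),
    ("name", "Carlo Sartiani"), ("addr", "Pisa"),
]
tel_list = mk_tel_list(book)
```

The second entry has no tel element, so only the first entry contributes to `tel_list`.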
![Page 22: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/22.jpg)
XDuce subtyping: language inclusion
- XDuce provides a simple but rather powerful notion of subtyping, based on inclusion between sets of values
- Examples:
  - Name, Addr <: Name, Addr, Tel?
  - Name, Addr, Tel <: Name, Addr, Tel?
- XML Schema extension subtyping is not captured
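Subtyping as language inclusion can be illustrated with ordinary regular expressions. The sketch below is a bounded test, not the XDuce algorithm (which decides inclusion on tree automata): each element type is abbreviated to one letter, an assumption made here for brevity, and inclusion is checked only on words up to a fixed length.

```python
import re
from itertools import product

# One letter per element type: n = Name, a = Addr, t = Tel.
ALPHABET = "nat"

def included_up_to(sub: str, sup: str, max_len: int = 6) -> bool:
    """Check L(sub) ⊆ L(sup) on all words of length <= max_len.

    This can refute an inclusion, but a True answer only covers the
    words that were actually enumerated.
    """
    sub_re, sup_re = re.compile(sub), re.compile(sup)
    for length in range(max_len + 1):
        for letters in product(ALPHABET, repeat=length):
            word = "".join(letters)
            if sub_re.fullmatch(word) and not sup_re.fullmatch(word):
                return False
    return True

# Name, Addr <: Name, Addr, Tel?
first_example = included_up_to("na", "nat?")
# Name, Addr, Tel <: Name, Addr, Tel?
second_example = included_up_to("nat", "nat?")
# The converse fails: "nat" is in the left language but not in the right.
converse = included_up_to("nat?", "na")
```

Both examples from the slide pass the bounded test, while the converse direction is refuted by the word `nat`.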
![Page 23: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/23.jpg)
Xtasy type system
![Page 24: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/24.jpg)
Type language
- As expressive as DTD and XML Schema
- Base types
- Attributes and id/idref types
- Type refinement and extension
- Local type definitions
- Unordered sequence types
![Page 25: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/25.jpg)
Schema extraction and schema inference
- For untyped data, a schema will be inferred in the XML Schema style
- For typed XML data, the schema will be converted into the internal schema representation
- Type inference for query results
![Page 26: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/26.jpg)
Data conformity
- An algorithm will be defined to check data conformity to a schema
- The problem is EXPTIME-complete
- Optimization techniques exist
- Further ones have to be found to deal with unordered sequence types and id/idref types
![Page 27: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/27.jpg)
Query correctness
- Only type-correct queries will be executed
- Type correctness is based on a successful match between the structural requirements of the query and the type of the data to be queried
![Page 28: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/28.jpg)
Correct queries, an example (1/2)
Consider

    E = {
      Addressbook = addrbook[Person*]
      Person = (Name, Addr, Tel?)
      Name = name[String]
      Addr = addr[String]
      Tel = tel[String]
    }
![Page 29: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/29.jpg)
Correct queries, an example (2/2)
A correct query:

    Q = from x=addressbook.xml;
             y in x//person/addr;
             z in x//person/name;
        where y="Pisa"
        select nome[z]
![Page 30: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/30.jpg)
Correctness & union types
Consider:

    Q' = from x=addressbook.xml;
              y in x//person/addr;
              z in x//person/tel;
         where y="Pisa"
         select results[z]

Should we consider this query correct?
![Page 31: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/31.jpg)
Correctness & union types: existential approach
- The previous query is considered correct
- The user will be warned about optional elements required by patterns
![Page 32: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/32.jpg)
Total approach
- The previous query is considered incorrect
- Too severe a discipline
- Many queries with non-empty results would be rejected
![Page 33: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/33.jpg)
Type equivalences
- Several type equivalence laws will be considered
- In particular: (T | U), S = (T, S) | (U, S)
- Useful to simplify schema definitions
![Page 34: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/34.jpg)
Subtyping
- A subtype relation E <: E' will be defined such that: if a query Q is correct wrt E', then it is also correct wrt E
- Type extension will be supported: if E is an extension of E', then E <: E'
![Page 35: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/35.jpg)
Parametric polymorphism (1/3)
Used in some functional languages (e.g. ML and Haskell) to define generic functions, for example:

    function Sort (t: Type; L: List t; Ord: t × t → Bool): List t
    begin ..... end.

It will allow us to define generic queries
![Page 36: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/36.jpg)
Parametric polymorphism (2/3)
Parametric types fit well in the description of irregular data structures
For example

    E(t) = {
      Addressbook = addrbook[(Name, Addr, Tel?)*]
      Name = name[String]
      Addr = addr[t]
      Tel = tel[String]
    }

The content of addr elements can have, for example, a street and a city sub-element
![Page 37: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/37.jpg)
Parametric polymorphism (3/3)
A generic query:

    Q = t: Type; a: E(t) .
        from x = a;
             y in x//person/addr;
             z in x//person/name;
        where z="dario"
        select indirizzo[y]

More precise typing: the type Any* is different from t*
![Page 38: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/38.jpg)
Conclusions
The type system will provide:
- union types
- reference types
- recursive types
- subtyping
- parametric polymorphism
![Page 39: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/39.jpg)
Progress
![Page 40: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/40.jpg)
Presentation outline
- Proposal
- What has been done
- Ongoing and future work
![Page 41: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/41.jpg)
Thesis Goals
- Formal development and study of a type system for XML querying
- The query language is an abstract version of XQuery (W3C)
- The type language is expressive enough to capture the essence of current standards
![Page 42: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/42.jpg)
XQuery type system
- Only result analysis: the XQuery type system is defined to determine and check, at query-analysis time, the output type of a query on documents conforming to an expected input type.
- Query correctness is neither defined nor checked (only some ideas).
![Page 43: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/43.jpg)
What has been done
We have:
- formally defined the notion of query type correctness
- defined a type system to statically check it and to perform result analysis; the rules define a terminating algorithm
- introduced an approach to recursive types that is an alternative to the one used by XQuery
![Page 44: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/44.jpg)
Observations
- Our type system also performs query analysis and, in this respect, differs from the XQuery approach
- So far, we have considered a type system featuring product, union and recursive types
- We have found that these type mechanisms are sufficient to make the study interesting and (as we will see) rather subtle.
![Page 45: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/45.jpg)
Observations
- We have discovered that for particular queries (fortunately not frequent ones) the type system is not able to capture exactly the semantic characterization of correctness
- We have introduced a further notion of correctness, path covering, and provided rules to check this property
![Page 46: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/46.jpg)
Papers
- A first definition of the type system can be found in A Typed Text Retrieval Query Language for XML Documents, Journal of the American Society for Information Science and Technology (JASIS), Special Issue, 2001
- In Types for Correctness of Queries over Semistructured Data, the system has been improved by a finer notion of query correctness and by the notion of path covering. The work will be submitted to the WebDB2002 workshop
![Page 47: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/47.jpg)
Tequyla (or µXQuery)
- SQL-like query language
- Free nesting of queries
- Typed:
  - type conformance of data
  - query correctness
  - query typing (result analysis)
![Page 48: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/48.jpg)
Tequyla queries
- The body of a Tequyla query is a from clause composed of XPath patterns
- x=addressbook.xml;
  binds x to the root element of addressbook.xml
- y in x//person/addr
  starting from the root (x), searches for a person element at an arbitrary depth (//), then for an addr sub-element (/), and finally binds the node found to y
![Page 49: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/49.jpg)
Types

    T, U ::= ()       empty sequence
           | B        atomic type (char, int, …)
           | T + U    union
           | T; U     sequence
           | l[T]     element type
           | X        type name

Type environments: type definitions + type bindings for query free variables

    E ::= ()
        | X=T, E
        | x:X, E
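One possible in-memory representation of this type grammar is sketched below in Python. The class names (`Empty`, `Base`, `Alt`, `Seq`, `Elem`, `Ref`) and the `show` printer are assumptions made for the illustration and are not taken from the Xtasy implementation; each class corresponds to one production of T, and type definitions X=T are kept in a plain dictionary.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Empty:            # ()
    pass

@dataclass(frozen=True)
class Base:             # B, e.g. Int or String
    name: str

@dataclass(frozen=True)
class Alt:              # T + U
    left: "Type"
    right: "Type"

@dataclass(frozen=True)
class Seq:              # T; U
    first: "Type"
    second: "Type"

@dataclass(frozen=True)
class Elem:             # l[T]
    label: str
    content: "Type"

@dataclass(frozen=True)
class Ref:              # X
    name: str

Type = Union[Empty, Base, Alt, Seq, Elem, Ref]

def show(t: Type) -> str:
    """Print a type back in the notation of the grammar."""
    if isinstance(t, Empty):
        return "()"
    if isinstance(t, (Base, Ref)):
        return t.name
    if isinstance(t, Alt):
        return f"({show(t.left)} + {show(t.right)})"
    if isinstance(t, Seq):
        return f"({show(t.first)}; {show(t.second)})"
    return f"{t.label}[{show(t.content)}]"

# The definitions X = a[Y], Y = b[Int] + c[Int] used in the talk's examples.
DEFINITIONS = {
    "X": Elem("a", Ref("Y")),
    "Y": Alt(Elem("b", Base("Int")), Elem("c", Base("Int"))),
}
```

Frozen dataclasses give structural equality for free, which is convenient when types are compared or used as dictionary keys.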
![Page 50: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/50.jpg)
A type environment

    E = Addressbook = addrbook[Person*],
        Person = person[Name, Addr, (Tel + EMail)],
        Name = name[String],
        Addr = addr[String],
        Tel = tel[String],
        EMail = email[String],
        x: Addressbook
![Page 51: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/51.jpg)
A correct query

    Q ::= from y in x//person/addr;
               z in x//person/name;
          where y="Pisa"
          select nome[z]

(the patterns in the from clause are XPath patterns)
![Page 52: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/52.jpg)
An incorrect query

    Q ::= from x=addressbook.xml;
               y in x//person/address;
               z in x/name;
          where y="Pisa"
          select nome[z]
![Page 53: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/53.jpg)
Queries:

    Q1, Q2 ::= ()
             | VB
             | l[Q]
             | Q1; Q2
             | from x=Q1 select Q2
             | from x in Q1 select Q2
             | x
             | Q p

Observe: no where clauses.
![Page 54: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/54.jpg)
Some notation
- Given s = {x1 = f1, ..., xn = fn}, s :: E means that xi = fi ∈ s iff xi: T ∈ E and fi ∈ T
- E |-- Q means that each free variable x in Q is typed in E (x: T ∈ E)
![Page 55: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/55.jpg)
Definition of correctness: first step
Given a query Q, a schema E for its free variables, and s :: E:
1. [[Q]]s = <f, F> or
2. [[Q]]s = <f, NF>
Essentially, in s, Q correctly returns a forest f (case 1) if, for each Q' p in Q, the path p finds a match in the forest returned by Q'
![Page 56: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/56.jpg)
Query correctness
Given a query Q and E s.t. E |-- Q:
- Q is strongly correct iff for each s :: E, [[Q]]s = <f, F>
- Q is weakly correct iff there exists s :: E such that [[Q]]s = <f, F>
- Q is incorrect iff for each s :: E, [[Q]]s = <f, NF>
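The three definitions can be tried out by brute force on a small environment. The following Python sketch assumes the environment X = a[Y], Y = b[Int] + c[Int], x: X, so that every instance of x is either a[b[n]] or a[c[n]]; since only the child label matters for path matching, two representative instances stand in for all of them. The encoding of queries and the names `found` and `classify` are assumptions of this sketch. It follows the semantic definitions directly and is not the type system's rule-based check.

```python
from typing import FrozenSet, List

# Representative instances of x: the label of the single child of a.
INSTANCES = ["b", "c"]

Path = FrozenSet[str]   # x(/b+/c) is encoded as frozenset({"b", "c"})
Query = List[Path]      # x/b; x/c is encoded as [{"b"}, {"c"}]

def found(query: Query, child_label: str) -> bool:
    """[[Q]]s = <f, F>: every path in the query matches this instance."""
    return all(child_label in path for path in query)

def classify(query: Query) -> str:
    """Return the strongest of the three notions that holds."""
    outcomes = [found(query, instance) for instance in INSTANCES]
    if all(outcomes):
        return "strongly correct"
    if any(outcomes):
        return "weakly correct"
    return "incorrect"

union_path = classify([frozenset({"b", "c"})])           # x(/b+/c)
single_path = classify([frozenset({"b"})])               # x/b
missing_path = classify([frozenset({"d"})])              # x/d
disjoint = classify([frozenset({"b"}), frozenset({"c"})])  # x/b; x/c
```

Note that x/b; x/c comes out as incorrect: no single instance contains both a b child and a c child, even though each path taken alone is weakly correct.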
![Page 57: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/57.jpg)
Example: strongly correct query
Consider the type environment

    X = a[Y],
    Y = b[Int] + c[Int],
    x: X

and the query

    x(/b+/c)
![Page 58: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/58.jpg)
Example: weakly correct query
Consider the query

    x/b

Only some instances of type X contain the path /b

    X = a[Y], Y = b[Int] + c[Int], x: X
![Page 59: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/59.jpg)
Example: incorrect query
Consider the query

    x/d

No instance of type X contains the path /d

    X = a[Y], Y = b[Int] + c[Int], x: X
![Page 60: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/60.jpg)
Type system
To check correctness and to infer the type of query results, we have defined a set of rules that:
- define an algorithm: determinism + termination
- deal with recursion differently from the XQuery type system in some cases (// + guarded recursion)
- infer context-free types
- do not rely on any notion of type inclusion: only matching between paths and types
![Page 61: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/61.jpg)
Some properties
Given E |-- Q, if the system returns E |-- Q : <T, θ> with θ ∈ {s, w, i}, then
- [[Q]] ⊆ [[T]], and
- θ = s/i ⇒ Q is strongly correct/incorrect
If θ = w, then in most cases Q is weakly correct, but in some cases Q is strongly correct or, even worse, incorrect
![Page 62: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/62.jpg)
Weak correctness problem (1)
Unsoundness for the case θ=w (and incorrect queries) is due to particular queries where two different paths start from the same root (x) and traverse two "disjoint" paths
Example: x/b; x/c where

    x: X, X = a[Y], Y = b[Int] + c[Int]
![Page 63: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/63.jpg)
Observations
Observe that the problem does not arise for

    x/b; x/b   or   x/b; y/c

where

    x: X, y: X, X = a[Y], Y = b[Int] + c[Int]

Both queries are weakly correct, as inferred by the type system
![Page 64: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/64.jpg)
Strong correctness problem
Consider the strongly correct query

    x(/b+/c)

where

    x: X, X = a[Y], Y = b[Int] + c[Int]

In this case the type system infers: <b[Int]? + c[Int]?, w>
![Page 65: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/65.jpg)
Solution
- We have a possible solution for these problems
- It is based on a different representation of union types
- We are currently working on the definition of simple rules that implement this approach
![Page 66: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/66.jpg)
Path covering
- In strong correctness we require that for each alternative path in the input type there is a path selection in the query
- In the notion of path covering we require that each alternative expressed in the query appears in the input type
![Page 67: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/67.jpg)
Path covering, examples
Consider

    X = a[Y], Y = b[Int] + c[Int], x: X

and the query

    x(/b+/c+/d)

This path selection is not path-covered wrt X: the path /d is superfluous
The same holds for x(/b+/d), while both x(/b+/c) and x(/b) are path-covered
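For this one-step example, path covering reduces to a set comparison, sketched below in Python. The function names are hypothetical, and the sketch only handles a single union of child steps below x, with the alternatives of Y = b[Int] + c[Int] listed by hand; the general check in the type system works on arbitrary paths and types.

```python
def uncovered_paths(query_alternatives, type_alternatives):
    """Alternatives in the query that no alternative of the input type offers."""
    return sorted(set(query_alternatives) - set(type_alternatives))

def is_path_covered(query_alternatives, type_alternatives):
    """True if every alternative in the query appears in the input type."""
    return not uncovered_paths(query_alternatives, type_alternatives)

# Child labels offered by Y = b[Int] + c[Int] below the a element.
TYPE_ALTERNATIVES = {"b", "c"}

# x(/b+/c+/d): /d is reported as superfluous.
superfluous = uncovered_paths({"b", "c", "d"}, TYPE_ALTERNATIVES)
```

The list returned by `uncovered_paths` is the kind of static warning the slide describes: it names the query alternatives that can never match the input type.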
![Page 68: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/68.jpg)
Path covering
- It is useful for programmers, as they are statically informed about extra paths that may inefficiently attempt to match input data
- Moreover, they can improve and simplify their queries by eliminating superfluous paths or by substituting them with actually occurring ones
![Page 69: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/69.jpg)
Path covering
- The type system defined for correctness has been easily extended to check path covering
- The system constitutes a formal framework where several other notions of correctness can be defined and compared
![Page 70: A Type System for a Semistructured and XML Data Base Management System Ph. D. Thesis Proposal Dario Colazzo.](https://reader031.fdocuments.in/reader031/viewer/2022032704/56649d6c5503460f94a4c0b7/html5/thumbnails/70.jpg)
Ongoing and future work
Currently we are working on:
- the definition of (simple) rules that solve the unsoundness problems previously outlined
- the formal proofs of properties of the current system
In the next months we will:
- complete the development of the formal material for both the system for query correctness and the system for path covering
- extend the language with where clauses