1
Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database
Mo Yuanying and Ling Tok Wang
2
Contents
1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion
3
Main Accomplishment
This study provides efficient and consistent storage for semistructured data by developing algorithms that map an XML document to the logical ORA-SS model and then to an object-relational data store.
5
Related Works
(1) Using the file system: store each XML document as a separate operating system file, and use a DOM or SAX parser whenever the document is accessed by a query.
Disadvantages
-- XML files in ASCII format need to be parsed every time they are accessed, for either browsing or querying.
-- With DOM, the entire parsed file must be memory-resident during query processing.
-- It is hard to build and maintain indices on documents stored this way.
-- Update operations are difficult to implement.
6
Related Works
(2) Using a relational DBMS: XML data is stored in relations, and the XML query language (for example, XQuery) is translated to SQL and executed by the underlying relational database system.
-- The Edge Approach
-- The Attribute Approach
-- Universal Table
-- Normalized Universal Approach
-- STORED
Disadvantages
-- A great deal of redundancy
-- Difficult to search or update
-- Handling multi-valued attributes is expensive
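To make the redundancy and query cost of an edge-style relational mapping concrete, here is a minimal sketch. It assumes a simplified edge table in which every parent-child edge of the XML tree becomes one row; the column names, the sample document, and the helper `shred_to_edges` are illustrative assumptions, and published Edge schemes differ in detail.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document with one multi-valued element (hobby).
DOC = "<student><sno>s1</sno><hobby>chess</hobby><hobby>tennis</hobby></student>"

def shred_to_edges(xml_text):
    """Flatten an XML tree into (source, ordinal, name, target, value) rows."""
    rows = []
    counter = 0  # node IDs must be generated; the XML carries none

    def visit(elem, source, ordinal):
        nonlocal counter
        counter += 1
        target = counter
        text = (elem.text or "").strip() or None
        rows.append((source, ordinal, elem.tag, target, text))
        for pos, child in enumerate(elem, start=1):
            visit(child, target, pos)

    visit(ET.fromstring(xml_text), 0, 1)
    return rows

rows = shred_to_edges(DOC)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edge (source INTEGER, ordinal INTEGER, name TEXT, "
    "target INTEGER, value TEXT)"
)
conn.executemany("INSERT INTO edge VALUES (?, ?, ?, ?, ?)", rows)

# Listing the hobbies of student s1 already requires a self-join on the edge table.
hobbies = [
    r[0]
    for r in conn.execute(
        "SELECT h.value FROM edge k JOIN edge h ON h.source = k.source "
        "WHERE k.name = 'sno' AND k.value = 's1' AND h.name = 'hobby' "
        "ORDER BY h.ordinal"
    )
]
```

Even this four-node document needs generated node IDs and a join to answer a simple question, which is the kind of overhead the disadvantages above refer to.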
7
Related Works
(3) Using a storage manager: the XML query is parsed, translated to a suitable operator tree representation, optimized, and then executed by an XML query engine.
-- Shore
-- B-tree
Disadvantage
-- Inconvenient for search or update
8
Related Works
(4) Our approach: store ORA-SS in nested relations
Problems in existing storage approaches
-- Stored in flat files: querying or updating is time-consuming and difficult.
-- Relational DBMS: these approaches cannot capture the semantic information.
ORA-SS reflects the nested structure of semi-structured data and distinguishes between object classes, relationship types and attributes. It is possible to specify the degree of n-ary relationship types and to indicate whether an attribute belongs to a relationship type or to an object class. Such information is essential for designing an efficient and non-redundant storage organization for semi-structured data.
Multi-valued attributes are handled better in nested relations.
10
ORA-SS
A semantically richer data model for semi-structured data
3 main concepts
-- Object class
-- Relationship type
-- Attribute
13
ORA-SS
Example (cont)
The distinction between binary and ternary relationship types cannot be made in other semi-structured data models.
14
ORA-SS
-- ORA-SS can specify the degree of n-ary relationship types.
-- ORA-SS can indicate whether an attribute belongs to a relationship type or to an object class.
Existing semi-structured data models cannot specify such information, although it is essential for storage.
16
Storing Algorithm
ORA-SS to OR database
-- An object-relational database can handle multi-valued attributes efficiently.
-- Multi-valued attributes are treated as repeating groups in nested relations.
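A small sketch of why repeating groups help: the nested form keeps each multi-valued attribute as a list inside one tuple, whereas squeezing the same data into a single flat table repeats the single-valued attributes once per combination of values. The student record, the attribute names, and `to_flat_rows` are hypothetical, and a normalized flat design would instead split the data into several tables that must be joined back together.

```python
from itertools import product

# Hypothetical nested tuple: hobby and phone are repeating groups.
student = {
    "sno": "s1",
    "name": "Ann",
    "hobby": ["chess", "tennis"],
    "phone": ["111", "222", "333"],
}
MULTI = ["hobby", "phone"]

def to_flat_rows(record, multi):
    """Single flat table: one row per combination of multi-valued values."""
    single = {k: v for k, v in record.items() if k not in multi}
    rows = []
    for combo in product(*(record[m] for m in multi)):
        row = dict(single)
        row.update(zip(multi, combo))
        rows.append(row)
    return rows

flat_rows = to_flat_rows(student, MULTI)
nested_tuples = 1              # the nested relation stores one tuple
flat_tuples = len(flat_rows)   # the flat table stores 2 * 3 rows
```

Here the name "Ann" is stored once in the nested tuple but six times in the flat table.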
17
Storing Algorithm
ORA-SS to OR database: main rules
-- Each object class, together with its attributes, forms a nested relation in which multi-valued attributes become repeating groups (object relation).
-- Each relationship type (with the object classes involved in it), together with its attributes, forms a nested relation in which multi-valued attributes become repeating groups (relationship relation).
18
Storing Algorithm
(1) Object class translation algorithm
O1 The identifier and candidate keys of the object class are the primary key and candidate keys of the generated relation.
O2 Each single-valued attribute of the object class is a single-valued attribute of the generated relation.
O3 Composite attributes of the object class are represented directly: they are replaced by their components in the generated relation.
19
Storing Algorithm
Object class translation algorithm (cont)
O4 Each multi-valued attribute of the object class forms a repeating group in this relation.
O5 Each reference is a foreign key in this relation.
O6 Each disjunctive attribute is treated as two attributes.
O7 For an ID dependency relationship type, the rule for the ID-dependent object class is the same as the rule for a regular object class. The ID-dependent object class, together with its attributes, forms a nested relation within its parent object class.
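Rules O1 to O7 can be read as a single pass over the attributes of an object class. The sketch below is one possible reading, not the authors' implementation: the `Attribute` and `ObjectClass` descriptors, the dictionary layout of the generated relation, and the example object classes are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Attribute:
    name: str
    multi_valued: bool = False
    components: list = field(default_factory=list)    # set for composite attributes
    alternatives: list = field(default_factory=list)  # set for disjunctive attributes
    references: str = ""                              # referenced object class, if any

@dataclass
class ObjectClass:
    name: str
    identifier: str
    attributes: list                                  # excludes the identifier
    candidate_keys: list = field(default_factory=list)
    dependents: list = field(default_factory=list)    # ID-dependent object classes

def translate_object_class(oc):
    """Build a description of the nested relation for one object class."""
    relation = {
        "name": oc.name,
        "primary_key": oc.identifier,                  # O1
        "candidate_keys": list(oc.candidate_keys),     # O1
        "columns": [oc.identifier],
        "repeating_groups": [],
        "foreign_keys": {},
        "nested": [],
    }
    for attr in oc.attributes:
        if attr.multi_valued:                          # O4
            relation["repeating_groups"].append(attr.name)
        elif attr.components:                          # O3
            relation["columns"].extend(attr.components)
        elif attr.alternatives:                        # O6
            relation["columns"].extend(attr.alternatives)
        else:                                          # O2
            relation["columns"].append(attr.name)
        if attr.references:                            # O5
            relation["foreign_keys"][attr.name] = attr.references
    for dep in oc.dependents:                          # O7
        relation["nested"].append(translate_object_class(dep))
    return relation

# Hypothetical object class exercising each rule once.
student = ObjectClass(
    name="student",
    identifier="sno",
    attributes=[
        Attribute("name"),
        Attribute("address", components=["street", "city"]),
        Attribute("hobby", multi_valued=True),
        Attribute("contact", alternatives=["email", "phone"]),
        Attribute("advisor", references="lecturer"),
    ],
    dependents=[ObjectClass("project", "pno", [Attribute("title")])],
)
schema = translate_object_class(student)
```

In this reading, the ID-dependent class `project` ends up nested inside the `student` relation, while `hobby` becomes a repeating group rather than a separate table.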
21
Storing Algorithm
(2) Relationship type translation algorithm
R1 All the identifiers of the object classes participating in this relationship type form single-valued attributes of the nested relation. The key of the relationship type can be determined from the participation constraints of the relationship type.
R2 Each single-valued attribute of this relationship type is a single-valued attribute of the generated relation.
22
Storing Algorithm
Relationship type translation algorithm (cont)
R3 Composite attributes of the relationship type are represented directly: they are replaced by their components in the generated relation.
R4 Each multi-valued attribute of this relationship type forms a repeating group in this relation.
R5 A disjunctive relationship type is treated as two relationship types.
R6 There is no need to translate an ID dependency relationship type.
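Rules R1 to R4 can be sketched in the same style. The function name, its parameters, and the `enrol` example are hypothetical. The key is passed in by the caller because this sketch does not model participation constraints; under R5 a disjunctive relationship type would simply be translated by two calls, and under R6 an ID dependency relationship type needs no call at all.

```python
def translate_relationship_type(name, participant_ids, key,
                                single=(), composite=None, multi=()):
    """Build a description of the nested relation for one relationship type."""
    columns = list(participant_ids)                   # R1: participants' identifiers
    columns += list(single)                           # R2: single-valued attributes
    for components in (composite or {}).values():     # R3: composite -> components
        columns += list(components)
    return {
        "name": name,
        "key": list(key),                             # R1: chosen from participation
        "columns": columns,
        "repeating_groups": list(multi),              # R4: multi-valued attributes
    }

# Hypothetical many-to-many relationship type between course and student.
enrol = translate_relationship_type(
    name="enrol",
    participant_ids=["cno", "sno"],
    key=["cno", "sno"],
    single=["grade"],
    composite={"period": ["year", "semester"]},
    multi=["remark"],
)
```

Because `grade` is stored with the relationship relation rather than with either object class, it is recorded once per (course, student) pair.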
24
Storing Algorithm
Translation for Ordering and ANY
(3) Translation for Ordering
-- We define an additional attribute named ordinal within the ordered object class (i.e., the ordered attribute).
(4) Translation for ANY
-- An attribute whose structure is unknown, or which may have a different structure for different instances, is denoted as ANY.
-- We define a separate table (Identifier, ANY, ANY-value).
-- Identifier is the identifier of the object class or relationship type to which this ANY belongs.
-- ANY is the structure name (the tag) for each instance.
-- ANY-value is its value.
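Both translations can be tried out with ordinary tables. The sketch below uses SQLite from Python purely for convenience; the table names, the column names (`any_tag` and `any_value` stand in for ANY and ANY-value), and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# (3) Ordering: an extra ordinal column records the position of each author.
conn.execute("CREATE TABLE paper_author (paper_id TEXT, author TEXT, ordinal INTEGER)")
conn.executemany(
    "INSERT INTO paper_author VALUES (?, ?, ?)",
    [("p1", "second author", 2), ("p1", "first author", 1)],
)
ordered_authors = [
    r[0]
    for r in conn.execute(
        "SELECT author FROM paper_author WHERE paper_id = 'p1' ORDER BY ordinal"
    )
]

# (4) ANY: one separate table of (Identifier, ANY, ANY-value) triples.
conn.execute("CREATE TABLE any_attr (identifier TEXT, any_tag TEXT, any_value TEXT)")
conn.executemany(
    "INSERT INTO any_attr VALUES (?, ?, ?)",
    [
        ("s1", "nickname", "Annie"),    # instance s1 carries a nickname
        ("s2", "blog", "weekly notes"), # instance s2 carries a different tag
    ],
)
s2_extras = conn.execute(
    "SELECT any_tag, any_value FROM any_attr WHERE identifier = 's2'"
).fetchall()
```

The ordinal column restores document order regardless of insertion order, and the ANY table lets two instances of the same object class carry differently named content without changing the main relation.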
25
Storing Algorithm
Translation Results
-- Following these algorithms, a Normal Form ORA-SS schema results in normal form nested relations.
-- The undesirable update anomalies in semi-structured databases are removed, and any redundancy due to many-to-many relationships and n-ary relationships is controlled.
28
Conclusion
-- Our approach uses ORA-SS as the data model and an object-relational database as the database management system.
-- We can store and access semi-structured data correctly, more efficiently, and without avoidable redundancy.
-- No node IDs are needed in our approach.