© 2006 AG DBIS
DASMOD2006
DASMOD Project A3XDB: XML Databases
Christian Mathis
Databases and Information Systems Group
1st DASMOD Summer School
July 31st – August 13th
University of Kaiserslautern

A3XDB Project Members
Joint project of the Information Systems Group with the Software Technology Group
A3XDB is part of XTC (XML Transaction Coordinator)
Chairs
• Theo Härder (Information Systems)
• Arnd Poetzsch-Heffter (Software Technology)
Scientific Staff
• Michael Haustein (Locking and Recovery; Project Founder)
• Christian Mathis (Query Processing)
• Jose de Aguiar Moraes Filho (Cost Model)
• Karsten Schmidt (Adaptivity)
• Patrick Michel (Adaptivity)

Outline
Why XML Database Systems? And what do they look like?
Let's (sky-)dive into XTC
• L5: 33,000 ft. (XML Management)
  - XML, XQuery, DOM, SAX
• L4: 15,000 ft. (Node Management)
  - XML Tree
• L3: 10,000 ft. (Record Management)
  - Mapping onto Records, Pages
• L2: 5,000 ft. (Buffer Management)
  - DB Buffer
• L1: 0 ft. (I/O Management)
  - Containers, Blocks
Adaptivity Aspects

Why XML Database Systems (XDBMS)?
Q: When do I need an XML Database System?
A: When you have a lot of XML data.
• … and if you also need some of these nice DBMS features
  - ACID transactions
  - high-level data handling (declarative query processing)
  - efficient and parallel processing of large data volumes
  - high availability and fault tolerance
  - scalability w.r.t. transaction workload and data volumes
  - adaptive tuning
Examples:
• Document-centric view: document collections
  - books, articles, web pages, …
  → application: structure-sensitive information retrieval
• Data-centric view: semistructured data model
  - messages, configuration files, semistructured data per se
  → application: healthcare information management

What do XDBMS look like?
[Figure: two architectures. XOR: XQuery/XML requests pass through an XQuery Rewriter to an SQL DBMS that stores tables. ROX: SQL/tuple requests pass through an SQL Rewriter to an XQuery DBMS with a native XML store.]
• XOR: "XML over Relational"
• "Shredding" XML -> Tables
• ROX: "Relational over XML"
• "Native" XML storage
• SQL systems become legacy

XML Transaction Coordinator (XTC)
[Figure: layered architecture of the XTC server]
Clients: Browser, FTP Client, XTCdriver (DOM, SAX, XTCconnection)
Interface Services: Http Agent, Ftp Agent, DOM RMI, SAX RMI, API RMI
L5 XML Services: XML Manager, XSLT Processor, XQuery Processor
L4 Node Services: Node Manager
L3 Access Services: Record Mgr, Index Mgr, Catalog Mgr
L2 Propagation Control: Buffer Manager
L1 File Services: I/O Manager, Temp File Mgr
Transaction Services: Transaction Manager, Lock Manager, Deadlock Detector
OS File System: Transaction Log, Container Files, Container Logs, Temporary Files

L5 (33,000 ft.): Example XML Document
<bib>
  <book year="1994" id="1">
    <title>TCP/IP Illustrated</title>
    <author>
      <first>W.</first><last>Stevens</last>
    </author>
    <price>65.95</price>
  </book>
  <book year="2000" id="2">
    <title>Data on the Web</title>
    <author>
      <last>Abiteboul</last><first>Serge</first>
    </author>
    <author>
      <last>Buneman</last><first>Peter</first>
    </author>
    <author>
      <last>Suciu</last><first>Dan</first>
    </author>
    <price>39.95</price>
  </book>
  <book year="1999" id="3">
    <title>The Economics of . . . </title>
    <editor>
      <last>Gerbarg</last><first>Darcy</first><affiliation>CITI</affiliation>
    </editor>
    <price>129.95</price>
  </book>
</bib>

L5 (33,000 ft.): Example API Access
XQuery
  <result>{
    for $b in //book[@year=2000]
    where count($b/author) > 2
    return $b/title
  }</result>
DOM
  Node contextNode = document.getDocumentElement();
  // navigate to first book element
  contextNode = contextNode.getFirstChild();
  // navigate to next sibling book element
  contextNode = contextNode.getNextSibling();
SAX
  public void startElement(String namespaceURI, String lName, ...) {}
  public void endElement(String namespaceURI, String lName, ...) {}
  public void characters(char ch[], int start, int length) {}
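To make the three access styles comparable, here is a minimal sketch of how the SAX callbacks above could answer the same question as the XQuery example (titles of books from 2000 with more than two authors). It uses the standard JDK SAX parser, not the XTC driver; the class name `BookTitleHandler` and the helper `query` are made up for illustration. It also simplifies: the year is compared as a string, authors are counted anywhere inside a book, and plain title strings are returned instead of a `<result>` element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not part of XTC): streams over the document once
// and keeps the title of every book with year="2000" and more than two authors.
class BookTitleHandler extends DefaultHandler {
    final List<String> titles = new ArrayList<>();
    private final StringBuilder title = new StringBuilder();
    private boolean yearMatches;
    private boolean inTitle;
    private int authors;

    @Override
    public void startElement(String namespaceURI, String lName, String qName, Attributes atts) {
        if (qName.equals("book")) {
            // New book: reset the per-book state.
            yearMatches = "2000".equals(atts.getValue("year"));
            authors = 0;
            title.setLength(0);
        } else if (qName.equals("author")) {
            authors++;
        } else if (qName.equals("title")) {
            inTitle = true;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append instead of assign.
        if (inTitle) title.append(ch, start, length);
    }

    @Override
    public void endElement(String namespaceURI, String lName, String qName) {
        if (qName.equals("title")) {
            inTitle = false;
        } else if (qName.equals("book") && yearMatches && authors > 2) {
            titles.add(title.toString());
        }
    }

    // Convenience wrapper: parse an XML string and return the matching titles.
    static List<String> query(String xml) {
        BookTitleHandler handler = new BookTitleHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.titles;
    }
}
```

The contrast with the declarative query is the point: the handler has to carry its own state across callbacks, which the XQuery processor does for you.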

L5 (33,000 ft.): XTC Command Center
document handling
• store/delete documents
document navigation/modification/querying in transactional context
• DOM, SAX, XQuery
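
L4 (15,000 ft.) taDOM data model
<?xml version="1.0"?>
<bib>
  <book year="2004" id="book1">
    <title>The Title</title>
    <author>
      <first>FirstName</first>
      <last>LastName</last>
    </author>
    <price>49,99</price>
  </book>
</bib>
[Figure: taDOM tree of the document]
bib (element node)
  book (element node)
    attribute root node
      id (attribute node): string node "book1"
      year (attribute node): string node "2004"
    title (element node): text node T, string node "The Title"
    author (element node)
      first (element node): text node T, string node "FirstName"
      last (element node): text node T, string node "LastName"
    price (element node): text node T, string node "49,99"
Node types: attribute root node, element node, attribute node, string node, text node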

L4 (15,000 ft.) SPLID node addressing scheme
[Figure: the taDOM tree with a SPLID attached to every node]
1            bib
1.3          book
1.3.1        attribute root
1.3.1.3      id
1.3.1.3.1    "book1"
1.3.1.5      year
1.3.1.5.1    "2004"
1.3.3        title
1.3.3.3      T (text node)
1.3.3.3.1    "The Title"
1.3.5        author
1.3.5.3      first
1.3.5.3.3    T (text node)
1.3.5.3.3.1  "FirstName"
1.3.5.5      last
1.3.5.5.3    T (text node)
1.3.5.5.3.1  "LastName"
1.3.7        price
1.3.7.3      T (text node)
1.3.7.3.1    "49,99"
SPLID: Stable Path Labeling IDentifiers
• for document storage
• for query processing
• for locking support
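Why path labels help with storage, querying, and locking can be sketched in a few lines: the parent, the ancestor relationship, and the document order of two nodes all follow from the labels alone, without touching the document. The class `Splid` below is a hypothetical illustration, not XTC code. It assumes labels made only of the odd divisions shown on the slide; any special handling for labels assigned to nodes inserted later is left out.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical sketch of a path label such as 1.3.5.3 (not the XTC implementation).
final class Splid implements Comparable<Splid> {
    private final int[] divisions;

    Splid(String label) {
        this.divisions = Arrays.stream(label.split("\\."))
                               .mapToInt(Integer::parseInt)
                               .toArray();
    }

    private Splid(int[] divisions) {
        this.divisions = divisions;
    }

    // Parent label: drop the last division. Returns null for the root.
    // Assumption: every division corresponds to exactly one tree level.
    Splid parent() {
        if (divisions.length <= 1) return null;
        return new Splid(Arrays.copyOf(divisions, divisions.length - 1));
    }

    // True if this label is a proper prefix of the other label.
    boolean isAncestorOf(Splid other) {
        if (divisions.length >= other.divisions.length) return false;
        for (int i = 0; i < divisions.length; i++) {
            if (divisions[i] != other.divisions[i]) return false;
        }
        return true;
    }

    // Document order: division-wise comparison; a prefix sorts before its extensions.
    @Override
    public int compareTo(Splid other) {
        return Arrays.compare(divisions, other.divisions);
    }

    @Override
    public String toString() {
        return Arrays.stream(divisions)
                     .mapToObj(d -> Integer.toString(d))
                     .collect(Collectors.joining("."));
    }
}
```

For example, `new Splid("1.3.5.3").parent()` yields `1.3.5` (the author element), and `1.3.1.5` (the year attribute) sorts before `1.3.3` (the title element).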

L4 (15,000 ft.) Simple Locking Example
Access to an object
• modify: needs exclusive access, requests X lock
• read: needs shared access, requests R lock
Protocol: Compatibility Matrix
     -    R    X
R    +    +    -
X    +    -    -
On a tree: hierarchical locking!
[Figure: the SPLID-labeled document tree with lock annotations T1: X and T2: R; T2's R request on the tree is marked "OK!"]
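The R/X matrix and the tree example can be combined in a small sketch. It is deliberately not the taDOM protocol: instead of intention locks along the ancestor path, it treats every lock as covering a whole subtree and checks overlap by label prefix. All names (`SubtreeLockTable`, `request`) and the node labels in the usage note are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: subtree locks identified by the SPLID of the subtree root.
final class SubtreeLockTable {
    enum Mode { R, X }

    private static final class Lock {
        final String tx;
        final String splid;
        final Mode mode;

        Lock(String tx, String splid, Mode mode) {
            this.tx = tx;
            this.splid = splid;
            this.mode = mode;
        }
    }

    private final List<Lock> granted = new ArrayList<>();

    // R/X matrix from the slide: only R with R is compatible.
    static boolean compatible(Mode held, Mode requested) {
        return held == Mode.R && requested == Mode.R;
    }

    // Two subtrees overlap if the roots are equal or one root is an ancestor of the other.
    static boolean overlaps(String a, String b) {
        return a.equals(b) || a.startsWith(b + ".") || b.startsWith(a + ".");
    }

    // Grants the lock and returns true unless another transaction holds
    // an incompatible lock on an overlapping subtree.
    boolean request(String tx, String splid, Mode mode) {
        for (Lock lock : granted) {
            if (!lock.tx.equals(tx) && overlaps(lock.splid, splid)
                    && !compatible(lock.mode, mode)) {
                return false;
            }
        }
        granted.add(new Lock(tx, splid, mode));
        return true;
    }
}
```

If T1 holds X on the author subtree (1.3.5), T2 can still read the title subtree (1.3.3), but a read of 1.3.5.3 or of the whole book (1.3) would have to wait. A real lock manager would queue the request rather than return false.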

L4 (15,000 ft.) taDOM3+ Compatibility Matrix
      -    IR   NR   LR   SR   IX   NRIX LRIX SRIX CX   NRCX LRCX SRCX NU   LRNU SRNU NX   LRNX SRNX SU   SX
IR    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    +    -    -
NR    +    +    +    +    +    +    +    +    +    +    +    +    +    -    -    -    -    -    -    -    -
LR    +    +    +    +    +    +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -
SR    +    +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
IX    +    +    +    +    -    +    +    +    -    +    +    +    -    +    +    -    +    +    -    -    -
NRIX  +    +    +    +    -    +    +    +    -    +    +    +    -    -    -    -    -    -    -    -    -
LRIX  +    +    +    +    -    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -
SRIX  +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
CX    +    +    +    -    -    +    +    -    -    +    +    -    -    +    -    -    +    -    -    -    -
NRCX  +    +    +    -    -    +    +    -    -    +    +    -    -    -    -    -    -    -    -    -    -
LRCX  +    +    +    -    -    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -
SRCX  +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
NU    +    +    +    +    +    +    +    +    +    +    +    +    +    -    -    -    -    -    -    -    -
LRNU  +    +    +    +    +    +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -
SRNU  +    +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
NX    +    +    -    -    -    +    -    -    -    +    -    -    -    -    -    -    -    -    -    -    -
LRNX  +    +    -    -    -    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
SRNX  +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
SU    +    +    +    +    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -
SX    +    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -    -

L3 (10,000 ft.) XTC Document Index
• document mapped to records and distributed across fixed-size pages
• efficient DOM navigation
• prefix compression works
[Figure: document index on top of the document container; each record holds a SPLID plus node data (byte representation)]
Document index keys: 1 | 1.3.1.3.1 | 1.3.3.3 | 1.3.5.3.3 | 1.3.5.5.3.1
Document container pages:
Page 1: 1, 1.3, 1.3.1, 1.3.1.3
Page 2: 1.3.1.3.1, 1.3.1.5, 1.3.1.5.1, 1.3.3
Page 3: 1.3.3.3, 1.3.3.3.1, 1.3.5, 1.3.5.3
Page 4: 1.3.5.3.3, 1.3.5.3.3.1, 1.3.5.5, 1.3.5.5.3
Page 5: 1.3.5.5.3.1, 1.3.7, 1.3.7.3, 1.3.7.3.1
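Prefix compression pays off here because records are stored in SPLID order, so neighboring keys share long prefixes. The sketch below shows the idea on label strings: each key is stored as the length of the prefix it shares with its predecessor plus the remaining suffix. The class and method names are hypothetical, and working on characters is an assumption made for readability; the slide states that XTC stores a byte representation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of prefix (front) compression over sorted keys.
final class PrefixCompression {
    static final class Entry {
        final int shared;     // number of leading characters shared with the previous key
        final String suffix;  // the rest of the key

        Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    static List<Entry> compress(List<String> sortedKeys) {
        List<Entry> entries = new ArrayList<>();
        String previous = "";
        for (String key : sortedKeys) {
            int shared = 0;
            int limit = Math.min(previous.length(), key.length());
            while (shared < limit && previous.charAt(shared) == key.charAt(shared)) {
                shared++;
            }
            entries.add(new Entry(shared, key.substring(shared)));
            previous = key;
        }
        return entries;
    }

    static List<String> decompress(List<Entry> entries) {
        List<String> keys = new ArrayList<>();
        String previous = "";
        for (Entry entry : entries) {
            // Rebuild the key from the previous key's prefix and the stored suffix.
            String key = previous.substring(0, entry.shared) + entry.suffix;
            keys.add(key);
            previous = key;
        }
        return keys;
    }
}
```

For the keys 1.3.1, 1.3.1.3, only the suffix ".3" has to be stored for the second key. The price is that a key can only be rebuilt by scanning from the start of the page.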

Buffer Management
Buffer = main memory area with fixed number of frames for pages
Exploits reference locality
Typical BufferManager operations
• fetch page, allocate page, clear page, fix page, unfix page
Page replacement strategy: LRU or LRD-V2
Page addressing by 4-byte page number (external memory address)
[Figure: database buffer with frames holding data pages; each page carries a PageNumber (4 bytes) and a PageType (1 byte)]
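A minimal sketch of the fix/unfix idea with LRU replacement follows. It is not the XTC BufferManager: the names `LruBuffer`, `Frame`, and `contains` are made up, a page miss simply allocates an empty byte array where a real buffer would ask the I/O manager for the block, and dirty pages are not written back.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

// Hypothetical sketch of a buffer with a fixed number of frames and LRU replacement.
final class LruBuffer {
    static final class Frame {
        final int pageNumber;
        final byte[] data;
        int fixCount;

        Frame(int pageNumber, byte[] data) {
            this.pageNumber = pageNumber;
            this.data = data;
        }
    }

    private final int capacity;
    private final int pageSize;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<Integer, Frame> frames = new LinkedHashMap<>(16, 0.75f, true);

    LruBuffer(int capacity, int pageSize) {
        this.capacity = capacity;
        this.pageSize = pageSize;
    }

    // Fix a page: it stays in the buffer until the caller unfixes it.
    Frame fix(int pageNumber) {
        Frame frame = frames.get(pageNumber); // get() marks the page as most recently used
        if (frame == null) {
            if (frames.size() >= capacity) evictLeastRecentlyUsed();
            frame = new Frame(pageNumber, new byte[pageSize]); // placeholder for a block read
            frames.put(pageNumber, frame);
        }
        frame.fixCount++;
        return frame;
    }

    void unfix(Frame frame) {
        if (frame.fixCount > 0) frame.fixCount--;
    }

    boolean contains(int pageNumber) {
        return frames.containsKey(pageNumber); // does not change the LRU order
    }

    // Remove the least recently used page that nobody has fixed.
    private void evictLeastRecentlyUsed() {
        Iterator<Frame> it = frames.values().iterator();
        while (it.hasNext()) {
            if (it.next().fixCount == 0) {
                it.remove();
                return;
            }
        }
        throw new IllegalStateException("all frames are fixed");
    }
}
```

The fix count is what distinguishes a database buffer from a plain cache: a page in use must not be replaced, no matter how old its last reference is.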

I/O-Manager (1)
Container file is sliced into fixed-size blocks (blockSize == pageSize)
I/O-Manager handles container file
• read block, write block, allocate block, release block
Dynamic allocation of new external memory space if container is full
Index block at position 0 manages block size and extent size
Before-image block at position 1 for update-in-place with write-ahead log
Block addressing with 3-byte block number
[Figure: container layout. Block 0: index (block size, extent size); Block 1: before image; Blocks 2 … n: data blocks]
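The arithmetic behind fixed-size blocks and 3-byte block numbers can be sketched as follows. A block's file offset is its number times the block size, and three bytes allow at most 2^24 blocks per container. The class name `ContainerLayout` and the big-endian byte order are assumptions; the slide does not give the on-disk encoding.

```java
// Hypothetical sketch of block addressing inside one container file.
final class ContainerLayout {
    static final int MAX_BLOCKS = 1 << 24; // 3-byte block number

    private final int blockSize;

    ContainerLayout(int blockSize) {
        this.blockSize = blockSize;
    }

    // Byte offset of a block within the container file.
    long offsetOf(int blockNo) {
        if (blockNo < 0 || blockNo >= MAX_BLOCKS) {
            throw new IllegalArgumentException("block number out of range: " + blockNo);
        }
        return (long) blockNo * blockSize;
    }

    // Upper bound on the container size implied by the 3-byte block number.
    long maxContainerBytes() {
        return (long) MAX_BLOCKS * blockSize;
    }

    // Assumption: most significant byte first.
    static byte[] encode(int blockNo) {
        return new byte[] { (byte) (blockNo >>> 16), (byte) (blockNo >>> 8), (byte) blockNo };
    }

    static int decode(byte[] bytes) {
        return ((bytes[0] & 0xFF) << 16) | ((bytes[1] & 0xFF) << 8) | (bytes[2] & 0xFF);
    }
}
```

With an assumed block size of 8 KB, block 2 (the first data block) starts at byte 16,384, and a container can grow to 2^37 bytes, that is 128 GiB.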

Approaches to Adaptivity of System Behavior
DBS have a large number of tuning parameters
Choose default values for tuning: rules of thumb
• OK for workload-independent parameters: page size, striping unit, minimal buffer size
• insufficient for load balancing aspects: MPL limit, etc.
Hardware is cheap: the KIWI ("kill it with iron") principle
• OK if applied with care
• however, it often implies a waste of resources
Autonomic computing: online feedback control loop
• OK, but requires additional resources (cycles, memory, ...)

Automate some Tasks of the DBA
Process the loop automatically
• monitor – analyze – plan – react
• prediction needs quantitative models!
• additional information flow within / between layers
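One pass of the monitor, analyze, plan, react loop might look like the sketch below, here applied to the buffer size. Everything in it is an illustrative assumption: the class `BufferTuner`, the hit-ratio target, and especially the "double the buffer" rule, which merely stands in for the quantitative model that the slide says prediction needs.

```java
// Hypothetical sketch of a feedback control loop that tunes the number of buffer frames.
final class BufferTuner {
    private final double targetHitRatio;
    private final int maxFrames;
    private int frames;

    BufferTuner(double targetHitRatio, int initialFrames, int maxFrames) {
        this.targetHitRatio = targetHitRatio;
        this.frames = initialFrames;
        this.maxFrames = maxFrames;
    }

    // One loop iteration; returns the frame count in effect afterwards.
    int step(long hits, long requests) {
        // Monitor: derive the observed hit ratio from raw counters.
        double hitRatio = requests == 0 ? 1.0 : (double) hits / requests;
        // Analyze: is the system missing its target?
        boolean belowTarget = hitRatio < targetHitRatio;
        // Plan: placeholder rule, a real system would predict the effect with a model.
        int planned = belowTarget ? Math.min(maxFrames, frames * 2) : frames;
        // React: apply the new setting.
        frames = planned;
        return frames;
    }
}
```

The loop itself is cheap to write; the hard part is the plan step, because a wrong prediction turns tuning into oscillation.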

Local Self-Tuning – Index Selection
Automatic creation of indexes in L3
Analogy:
• Counting traffic locally: planning new resources?
• Global traffic observation: better solution!
Global self-tuning requires distributed knowledge
• Workload statistics collected in L5
• Use of path processing algorithms in L4
• Availability of alternative indexes in L3

Conclusions
XTC is a real database system
• Try it: www.xtc-project.de
We dived through the 5 XTC layers
• XML Management
• Node Management
• Record Management
• Buffer Management
• I/O Management
Adaptivity
• We are only at the beginning
• Central concept: online feedback control loop
• First step in XTC: let the components talk to each other