Mirror, Mirror on the Wall, What is the Best <XML> Database Solution of All?
Akmal B Chaudhri
Senior Architect
Informix Labs
Copyright © 2001 Informix 2

Disclaimer
  Any opinions expressed are mine and not necessarily those of my employer

Acknowledgements
  All trademarks are acknowledged
  Various people at Informix for some of the presentation material
  Robert Sutor, Lee Kheng Joo and Eve Maler

Abstract
  Managing XML documents becomes problematic as document collections grow. How do we successfully store and query document collections? One solution is a database. We will discuss the problem of integrating XML with databases and examine the choices, such as relational databases, object databases, object-relational databases and native XML servers.

Speaker Biography
  The speaker has been working in the area of Object Databases for 10 years
  He has previously worked for Reuters, Logica and Computer Associates, as well as on OODB research at City University

Life at Informix Labs
  I build linear accelerators!
  Technology Evangelist
  Technology Pragmatist

Agenda
  The Importance of XML
  Architectural View of XML
  Demonstrations
    XML to ORDB using “roll-your-own”
    XML to ORDB using Mapping Tool

Waves of Technology
  Mainframes
  Departmental Servers
  Client Server
  Internet Web Computing
  eCommerce

The Importance of XML
  By 2003, more than 75% of ebusiness applications will include XML, regardless of which language the application has been written in.

Tasks/Roles Assumed by XML
  [Bar chart: percentage per task/role, axis 0 to 60; bar values not recoverable]
  Other
  As a centralized database
  As a document repository, possibly replacing SGML repositories
  As middleware between an RDBMS and an e-commerce front end
  Data transfer between applications and systems
  Source: [Walker00]

Vendor Market Share 1999
  Vendor        Product   Revenue (US$ Million)   Market Share (%)
  Sterling/CA   Vision    3                       26.8
  SAG           Tamino    2.4                     21.4
  Poet          CMS       2.1                     18.8
  eXcelon       eXcelon   1.5                     13.4
  All Others              2.2                     19.6
  Source: [IDC00]

XML DBs Predicted Growth
  [Chart: predicted revenue in US$ Million (axis 0 to 800) by year, 1999 to 2004; plotted values not recoverable]
  Source: [IDC00]

XML Persistence Options
  Indexed File System
  Database System
    Relational
    Object
    Native
  Dynamic Hashing Libraries
  Hybrid
  Source: [Edwards01]

XML Database Products
  [Table columns: Type, Data-Centric, Document-Centric; cell marks not recoverable]
  Middleware
  XML-Enabled DBs
  Native XML DBs
  XML Servers
  XML App Servers
  CMS
  Persistent DOM
  Source: [Bourret00]

Data-Centric
  Fine-grained data
  Order of elements not significant
  Examples
    Sales Order
    Flight Schedule
    Restaurant Menu
    …
  Machine consumption
  Source: [Bourret00]

Document-Centric
  Large-grained data
  Order of elements is significant
  Examples
    Book
    Email
    Advertisement
    …
  Human consumption
  Source: [Bourret00]

Three Types of XML DBs
  XML Generating Database
  XML Document Database
  XML Component Database
  Source: [Chelsom00]

XML Generating Database
  XML is generated from the database
  [Diagram labels: XML Formatter, XML Documents]

XML Document Database
  Database stores complete XML documents or document fragments
  [Diagram labels: XML Documents]

XML Component Database
  Full XML awareness
  [Diagram labels: XML Documents, element structure <A><B>...</B></A>]
Object Databases
World Wide Web
Java Language
Enterprise Java Beans
eXtensible Markup Language

Cattell vs. Stonebraker
  “ODBMSs occupy a small niche market that has no broad appeal. The technology is in semi-rigor mortis, …”
  “Object-oriented databases are doing just fine, and the news of their demise is highly exaggerated.”
  Source: [Leavitt00]

DB Sales Revenue
              1999 (US$)      2001* (US$)
  RDB/ORDB    11.1 Billion    15.6 Billion
  OODB        211 Million     265 Million
  *Predicted
  Source: [IDC00]

XML and OO
  XML is not OO
    No inheritance
    No encapsulation
    No behaviour
    ...
  OODB is overkill for structured text
  Some Content Management Systems are built on top of OODBs

Native Databases
  Many vendors developing “Native” XML databases
  Documents needed in original form
    Structural information is maintained
    Storage, query and retrieval of structure and content
  Good for point solutions
  Support for non-XML data?

Relational Databases
  RDB products scale well
  Traditional and semi-structured data can co-exist and be used by multiple applications
  RDBs can process complex XML queries on large databases within seconds
  Source: [Florescu99]

Three Things We Need To Do
  Get XML into Database (storage)
  Get XML out of Database (retrieval)
  Query XML (processing)

XML Storage
  Multiple Relational Table Mapping
    purchase_order, customer, items
  XML DataPort Hierarchical Storage
    1.0 ORDER (id, date)
      1.1 PERSON (gender, age)
        1.1.1 NAME
          1.1.1.1 FAMILY
          1.1.1.2 GIVEN
        1.1.2 ADDRESS
      1.2 ITEM (id)
      1.3 ITEM (id)
  BLOB/CLOB Storage
  Sample document
    <?xml version='1.0'?>
    <ORDER id="abc123" date="27 Oct 1999">
      <PERSON age="50" gender="Male">
        <NAME>
          <FAMILY>Doe</FAMILY>
          <GIVEN>John</GIVEN>
        </NAME>
        <ADDRESS> ... </ADDRESS>
      </PERSON>
      <ITEM id="s1">Shirt</ITEM>
      <ITEM id="j2">Jacket</ITEM>
    </ORDER>
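
The multiple-table mapping can be sketched in Java. This is a minimal illustration under stated assumptions, not the tooling shown in the talk: it uses the standard JAXP DOM API to flatten the sample ORDER document into row-like arrays. The class name OrderShredder, the two row layouts and the idea of one PERSON row plus one row per ITEM are assumptions made for the example, and the elided ADDRESS content is left out.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not from the talk): flattens the sample ORDER
// document into row-like arrays, one layout per target table.
final class OrderShredder {

    // The slide's sample document, without the elided ADDRESS element.
    static final String SAMPLE =
        "<?xml version='1.0'?>"
        + "<ORDER id=\"abc123\" date=\"27 Oct 1999\">"
        + "<PERSON age=\"50\" gender=\"Male\">"
        + "<NAME><FAMILY>Doe</FAMILY><GIVEN>John</GIVEN></NAME>"
        + "</PERSON>"
        + "<ITEM id=\"s1\">Shirt</ITEM>"
        + "<ITEM id=\"j2\">Jacket</ITEM>"
        + "</ORDER>";

    // Rows for an assumed ITEM(order_id, item_id, description) table.
    static List<String[]> itemRows(String xml) {
        Element order = parse(xml).getDocumentElement();
        String orderId = order.getAttribute("id");
        List<String[]> rows = new ArrayList<>();
        NodeList items = order.getElementsByTagName("ITEM");
        for (int i = 0; i < items.getLength(); i++) {
            Element item = (Element) items.item(i);
            rows.add(new String[] {orderId, item.getAttribute("id"), item.getTextContent()});
        }
        return rows;
    }

    // One row for an assumed PERSON(order_id, family, given, gender, age) table.
    static String[] personRow(String xml) {
        Element order = parse(xml).getDocumentElement();
        Element person = (Element) order.getElementsByTagName("PERSON").item(0);
        return new String[] {
            order.getAttribute("id"),
            person.getElementsByTagName("FAMILY").item(0).getTextContent(),
            person.getElementsByTagName("GIVEN").item(0).getTextContent(),
            person.getAttribute("gender"),
            person.getAttribute("age")
        };
    }

    // Parses the text into a DOM tree; any parse failure becomes an unchecked exception.
    private static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : itemRows(SAMPLE)) {
            System.out.println(String.join(" | ", row));
        }
        System.out.println(String.join(" | ", personRow(SAMPLE)));
    }
}
```

Each array could then be bound to a prepared INSERT statement; the markup itself is discarded, which is the trade-off this storage model makes.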

Choosing RDB Storage Model
  If Relational schema already exists
    Consider mapping to multiple tables
  If no Relational schema exists
    Consider BLOB/CLOB model
  If documents needed in original form
    Consider hierarchical model
    Structural information is maintained
    Storage, query and retrieval of structure and content

XML Processing
  XML is SGML derivative
  HTML is SGML derivative
  Therefore …
  Tools used for HTML can be reworked for XML
    DTDs/XML Schema
    SELECT query results formatted as XML

XML Storage/Retrieval
  Multiple Relational Tables
    Roll-your-own
    Mapping Tool
    JDBC
  BLOB/CLOB
    Verity/Excalibur
  Hierarchical Storage

BLOB/CLOB
  BLOB storage for semi-structured data
  This is the usual approach
  Indexing is key to efficient query processing
    Full-text indexing for semi-structured data
    Advanced indexing for path queries

Indexing Example
  create table docs (id serial, xml_doc clob);
  insert into docs values (0, FileToClob('d:\xml\order_abc123.xml', 'server'));
  create index idx1 on docs (xml_doc vts_clob_ops) using vts in sbspace;
  select * from docs where vts_contains(xml_doc, '(John) <IN> GIVEN');
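
To make the path predicate in the query concrete: assuming '(John) <IN> GIVEN' means "the word John inside a GIVEN element", the same condition can be written as an XPath expression. The sketch below uses the standard javax.xml.xpath API and scans a single document in memory, so it illustrates only the meaning of the query, not how the vts index evaluates it. The class and method names are made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical illustration of a "word within element" path query.
final class PathQuerySketch {

    // True when at least one <element> in the document has text containing the word.
    // Assumes word and element contain no quote characters.
    static boolean containsIn(String xml, String word, String element) {
        String expr = "//" + element + "[contains(., '" + word + "')]";
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            // BOOLEAN return type: a non-empty node set converts to true.
            return (Boolean) xpath.evaluate(
                    expr, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XML or XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String doc = "<NAME><FAMILY>Doe</FAMILY><GIVEN>John</GIVEN></NAME>";
        System.out.println(containsIn(doc, "John", "GIVEN"));
        System.out.println(containsIn(doc, "John", "FAMILY"));
    }
}
```

An index-backed search answers the same question across the whole table without parsing every document, which is why the slide calls indexing the key to efficient query processing.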

JAXP Overview
  Java API for XML Parsing (JAXP) is currently available for programmatically accessing XML documents
  JAXP can be divided into three parts
    Simple API for XML (SAX)
    Document Object Model (DOM)
    Pluggability Layer

JAXP Glossary
  SAX - event-driven protocol, with the programmer providing callback methods that the parser invokes when parsing a document
  DOM - random-access protocol, which converts an XML document into a collection of in-memory objects
  Pluggability Layer - standardizes access to SAX/DOM by providing “Factory” methods for creating and configuring SAX parsers and creating DOM objects (type “Document”)
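
The difference between the two protocols is easiest to see side by side. The following is a small sketch using only the standard JAXP factories; the class and method names are invented for illustration. The SAX half records each element name as the parser calls back, while the DOM half builds the whole tree first and then looks up a node.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical comparison of the SAX and DOM access styles.
final class SaxVersusDom {

    // SAX: the parser invokes startElement() for every opening tag, in document order.
    static List<String> elementNamesViaSax(String xml) {
        final List<String> names = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return names;
    }

    // DOM: the document is held in memory, so any element can be looked up afterwards.
    static String textOfFirst(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getElementsByTagName(tag).item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML or find <" + tag + ">", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<NAME><FAMILY>Doe</FAMILY><GIVEN>John</GIVEN></NAME>";
        System.out.println(elementNamesViaSax(doc));
        System.out.println(textOfFirst(doc, "GIVEN"));
    }
}
```

Both factories come from the Pluggability Layer, so the code never names a concrete parser class.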

XML in JDBC 2.20
  We would like to support users who use JAXP in their JDBC applications without putting code that is specifically related to JDBC in the driver
  New static methods to facilitate storage and retrieval of XML data in database columns
  These methods not only support users of XML but also provide flexibility regarding which JAXP package the user is using

Storing XML Data
  The methods used during data storage will assist in
    Parsing the XML data
    Verifying that well-formed and/or valid XML data are stored
    Rejecting invalid XML data
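
A well-formedness check of this kind can be sketched with the standard JAXP SAX parser. This is not the Informix driver's code, only an assumption about what such a check might look like; the class and method names are hypothetical. The document is parsed with a do-nothing handler, and any fatal parse error means the text should be rejected before it reaches the database.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical pre-insert check: accept only well-formed XML text.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            // DefaultHandler ignores all content events and rethrows fatal errors.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // The parser reported a well-formedness error.
            return false;
        } catch (Exception e) {
            // Parser could not be created or the input could not be read.
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<a><b/></a>"));
        System.out.println(isWellFormed("<a><b></a>"));
    }
}
```

A caller would run this check first and skip the INSERT when it returns false.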

XMLtoString() Example
  -- Example of inserting an XML file into an lvarchar column
  create table tab1 (col1 lvarchar);

  try {
      String cmd = "insert into tab1 values(?)";
      PreparedStatement pstmt = conn.prepareStatement(cmd);
      pstmt.setString(1, UtilXML.XMLtoString("/tmp/x.xml"));
      pstmt.execute();
      pstmt.close();
  } catch (SQLException e) { ... }

Retrieving XML Data
  The methods used during data retrieval will assist in converting
    XML data to type “InputSource”, which is the standard input type for both SAX and DOM methods
    XML data to DOM
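
Both conversions have a direct equivalent in plain JAXP. The sketch below is an assumption about the general shape of such helpers, not the driver's UtilXML implementation: a column value read as a String is wrapped in an InputSource, which can then be handed to a SAX parser or, as here, to a DOM builder. The class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helpers mirroring the two conversions named on the slide.
final class ColumnValueToXml {

    // Wraps the text of a column value so that SAX or DOM parsers can read it.
    static InputSource toInputSource(String columnValue) {
        return new InputSource(new StringReader(columnValue));
    }

    // Parses the column value into an in-memory DOM Document.
    static Document toDom(String columnValue) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(toInputSource(columnValue));
        } catch (Exception e) {
            throw new IllegalArgumentException("column value is not parseable XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = toDom("<ORDER id=\"abc123\"><ITEM id=\"s1\">Shirt</ITEM></ORDER>");
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getDocumentElement().getAttribute("id"));
    }
}
```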

getInputSource() Example
  // Fetch XML data from an lvarchar column into an InputSource
  // for (SAX) parsing
  try {
      String sql = "select col1 from tab1";
      Statement stmt = conn.createStatement();
      ResultSet r = stmt.executeQuery(sql);

      // Other SAX parsers can go here if desired
      Parser p = ParserFactory.makeParser("com.sun.xml.parser.Parser");
      p.setDocumentHandler(new myHandler());
      p.setErrorHandler(new errHandler());

      while (r.next()) {
          InputSource i = UtilXML.getInputSource(r.getString(1));
          p.parse(i);
      }
      r.close();
  } catch (SQLException e) { ... }

DOM Support
  The DOM specification does not provide a standard way to create a DOM object
  JAXP provides factory methods that provide a standard way of creating DOM objects

InputStreamtoDOM() Example
  -- Fetch XML data from a text column into a DOM object
  create table tab2 (col1 text);

  try {
      String sql = "select col1 from tab2";
      Statement stmt = conn.createStatement();
      ResultSet r = stmt.executeQuery(sql);
      while (r.next()) {
          Document doc = UtilXML.InputStreamtoDOM(r.getAsciiStream(1));
      }
      r.close();
  } catch (SQLException e) { ... }

XML Parser
  JDBC driver uses Sun’s JAXP API and, by default, a non-validating XML parser
  The default can be changed in two ways, where <new parser> is the alternative parser
    % java -Dorg.xml.sax.parser=<new parser>
    System.setProperty("org.xml.sax.parser", "<new parser>");
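
What a validating parser adds can be shown without switching parser classes. The sketch below takes a different route from the system property on the slide: it asks the standard JAXP factory for a validating parser through setValidating, then parses a document that is well-formed but breaks its own DTD. The class name and the sample DTD are assumptions made for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical comparison of validating and non-validating parsing.
final class ValidationSketch {

    // Returns true when the document parses without error in the requested mode.
    static boolean parses(String xml, boolean validating) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(validating);
        DefaultHandler strict = new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException {
                // Validity errors are recoverable by default; treat them as failures here.
                throw e;
            }
        };
        try {
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), strict);
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException("XML parser unavailable", e);
        }
    }

    public static void main(String[] args) {
        String dtd = "<!DOCTYPE a [<!ELEMENT a (b)><!ELEMENT b (#PCDATA)>]>";
        String invalid = dtd + "<a><c/></a>"; // well-formed, but <c> is not declared
        System.out.println(parses(invalid, false));
        System.out.println(parses(invalid, true));
    }
}
```

The non-validating run only checks well-formedness, so the same text can be accepted in one mode and rejected in the other.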

JAXP Summary
  JDBC 2.20 XML support makes it easy to store/retrieve XML documents to/from an Informix Database using Sun’s JAXP 1.0 API
  Documents are checked for well-formedness or validity during insertion, because they are parsed using the SAX protocol
  Sun’s non-validating parser is used by default, but the ability to specify and use any parser is provided

Cloudscape
  Cloudscape can store Java objects in table columns
    Not just blobs – objects have structure
  Java code can accept different data and store as XML
  Embed XML formatter into Cloudscape
    Extend server

Cloudscape Demo: Tables
  create table xml_objects (
      dtd_name char(20) constraint dtd_name_primary_key primary key,
      xml serialize(xmlobject));

  create table dtd_nodes (
      nodename char(20) constraint nodename_primary_key primary key,
      contains_elements varchar(20),
      node_root boolean,
      contains_attributes varchar(20),
      attribute_required boolean,
      contains_data boolean,
      data_required boolean);

Cloudscape Demo: Java (1)
  import ...

  public class XMLObject implements Serializable {
      public Vector elementNames;
      public Vector elementValues;
      public String rtnString;

      // Constructor (no return type), so that new XMLObject(names, values) compiles
      public XMLObject(Vector names, Vector values) {
          this.elementNames = names;
          this.elementValues = values;
      }
  ...

Cloudscape Demo: Java (2)
  ...
      public String returnXMLFormat(String DTD) {
          genFromDTD dtd = new genFromDTD(); // XML Formatter
          // Call through the formatter instance created above
          rtnString = dtd.returnXMLFormat(this, DTD);
          return rtnString;
      }

      public String toString() {
          return "XMLObject Class";
      }
  }

Cloudscape Demo: SQL
  select xml.returnXMLFormat('BOOKS')
  from xml_objects
  where dtd_name = 'BOOKS';

Cloudscape XML Demo
  1. Start Cloudview
  2. View XML
  3. Compile Java Files
  4. Start Cloudview
  5. Start Web Server
  6. Start Browser
  7. Stop Web Server

Object Translator
  Provides an object view of a database
  Supports Java™/EJB (and VB/MTS)
  Builds an object model from a relational schema
  DBA can focus on the schema, developers focus on Java
  Outputs components
  Supports Cloudscape, Informix and other JDBC sources

Mapping/Modelling Process
  [Diagram: Object Translator Solution. Labels: OR Maps, Data Model, Object Model, UML, Compile-time SQL, Runtime Database Access, Reverse Engineer, Forward Engineer]

Object Translator 1.1
  Developer maps XML documents to map objects
  Generated Java objects become XML document handlers
    Store and restore the XML document data in the database
  XML markup is not stored or restored
  Allows applications to use existing schemas for incoming XML documents

Object Translator XML Demo
  Use an existing XML document
  Create links between elements of XML document and attributes of map object
  Generate Java files and servlet from map object
  Compile and run

Object Translator XML Demo
  1. Start Cloudview
  2. View XML
  3. Start OT
  4. Copy Files
  5. Start Web Server
  6. Start Browser
  7. Stop Web Server

What about Performance?
  A couple of independent benchmarks are being developed
    XMach-1
    XML Store
    ...

Example Performance Results
  We conclude DTD approach is the best strategy among the six approaches we studied and there is no clear need to build an “XML-specific” database system.
  Source: [Tian]

Final Thoughts ...
  Technology is moving fast
  Vendor marketing ahead of product capabilities
  Many “Beta” products available

Software Downloads
  Cloudscape
    http://www.cloudscape.com/
  Object Translator
    http://www.informix.com/idn-secure/webtools/ot/