XML, Databases and Business Intelligence
Presentation to the GCPCUG Data Warehousing SIG - March 19, 2001
Copyright © 2001 by Michael A. Mina - [email protected]

Presentation Overview
Introduction to XML
XML and Databases
XML and Business Intelligence
XML Resources

What is XML?
Extensible Markup Language - born 2/1998
Extensible - allows new markup languages
More than HTML, less than SGML
XML family of specifications
• XML, XSL, DOM, XML Namespaces, XLink, XPointer, XPath, etc.
More specifications on the way
• XML Schema, XML Query Language

Uses of XML
Data Storage
Data Interchange
Data Display/Rendering
It’s about data

Uses of XML
Data Storage
• Products marketed as “XML databases”
  – Tamino
  – TEXTML
• Texts dealing with XML databases
• XML-enabled databases

Uses of XML
When is XML Suited for Data Storage?
• Data needs to be accessed by many systems
• Hierarchical data
• Smaller data set
• Speed not critical
• Simpler queries used
• Data types not critical
• Data must be stored for a long time

Uses of XML
Data Interchange
• No middleware needed if applications can read and write XML
• By 2003, up to 80% of data interchange between applications over public networks will be in XML (per Gartner Group)

Uses of XML
Data Display/Rendering
• Present the same content differently for different devices
Before XML . . .
• Either support older standard only (e.g., HTML 3.2)
• Or develop multiple sets of pages and redirect user based on their browser

Uses of XML
With XML . . .
• One set of XML documents
  – One XSL document for each browser/device
• If a new device or new use for existing device emerges…
  – develop new standard protocol (e.g., WAP)
  – develop another XSL document

Uses of XML
Then either
• serve XML and XSL to client
Or
• transform XML with XSL at server
• serve appropriate markup to client

Why is XML needed?
Consider HTML
• HyperText Markup Language
• Based on SGML
• Most web pages use HTML

Why is XML needed?
Advantages of HTML
• Easy to learn compared to most programming languages
• Readily available authoring tools (even a text file editor)
• Readily available rendering tool
  – Browsers are free, all new PCs have browsers installed

Why is XML needed?
Disadvantages of HTML
• Deviation from its original purpose
  – Presentation should be based on a styling language
• Lack of extensibility
• Toleration of faulty code
  – acceptable for web page design
  – unacceptable for transmission of drug data

Why is XML needed?
Consider SGML
• Standard Generalized Markup Language
  – No toleration of faulty code
  – Completely extensible
• HTML, XML based on SGML

Why is XML needed?
The advantages of SGML are actually disadvantages in the web environment
Complete extensibility of SGML means
• It is not cost-effective to develop browsers to support SGML
• Potentially huge bandwidth and storage issues

Why is XML needed?
XML allows the use of metadata - “data about data”
HTML tags
• <p>The Gettysburg Address was written by Abraham Lincoln</p>
XML elements
• <document>The Gettysburg Address</document> was written by <president><author>Abraham Lincoln</author></president>

Basic XML
<?xml version="1.0"?>
<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME>
      <LAST_NAME>Sanford</LAST_NAME>
      <FIRST_NAME>Bill</FIRST_NAME>
    </NAME>
    <TITLE>VP, Controller</TITLE>
    <COMPANY>SDC, Inc.</COMPANY>
    <WEBSITE>www.sdcinc.biz</WEBSITE>
    <ADDRESS>4132 Homestead Rd.</ADDRESS>
    <CITY>Parma</CITY>
    <STATE>OH</STATE>
    <ZIP>44134</ZIP>
    <PHONE>
      <DIRECT>440-398-2098</DIRECT>
      <CELLULAR>440-123-4567</CELLULAR>
    </PHONE>
    <EMAIL>[email protected]</EMAIL>
  </ENTRY>
</CONTACTS>
XML Markup includes:
• XML declaration
• Root Element
• Elements
• Attributes
• Entities
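
To make the pieces of markup listed above concrete, here is a minimal sketch that reads a trimmed copy of the contact list with Python's standard xml.etree.ElementTree module. Python is not part of the original presentation, and the helper name describe is made up for illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the slide's contact list (several fields omitted).
CONTACTS_XML = """<?xml version="1.0"?>
<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME>
      <LAST_NAME>Sanford</LAST_NAME>
      <FIRST_NAME>Bill</FIRST_NAME>
    </NAME>
    <TITLE>VP, Controller</TITLE>
    <COMPANY>SDC, Inc.</COMPANY>
    <CITY>Parma</CITY>
    <STATE>OH</STATE>
  </ENTRY>
</CONTACTS>"""


def describe(root):
    """Collect the root element name, two attributes, and some element text."""
    database = root.find("DATABASE")
    entry = root.find("ENTRY")
    first = entry.findtext("NAME/FIRST_NAME")
    last = entry.findtext("NAME/LAST_NAME")
    return {
        "root": root.tag,                        # root element
        "database": database.text,               # element content
        "usertype": database.get("USERTYPE"),    # attribute on DATABASE
        "entry_number": entry.get("NUMBER"),     # attribute on ENTRY
        "name": first + " " + last,              # nested elements
    }


root = ET.fromstring(CONTACTS_XML)
summary = describe(root)
print(summary)
```

Note that every value comes back as a string, including the NUMBER attribute; nothing in the document itself says that "1" is a number.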

XHTML
Next generation of HTML
HTML specification rewritten to be XML compliant
XML is not going to replace HTML, XHTML is
Differences between HTML, XHTML include:
• lower case tags required
• proper nesting and closure of tags
• quoting attributes

Parsers
A parser is a program that processes an XML document.
IE includes a parser that allows the rendering of XML documents.
Parsers are either validating or non-validating.

Well-formedness
An XML document is well-formed if
• attribute values are in quotes
• tags are properly nested
• start and end tags are the same case
• there is one root element
• empty elements are formatted properly
If it’s not well-formed, it’s not XML
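
The rules above can be tried out with any non-validating parser. The sketch below uses Python's standard xml.etree.ElementTree (an assumption; the presentation itself uses IE's parser) and feeds it one small document per rule. The function name is_well_formed and the sample snippets are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# One sample per rule on the slide; only the first should be accepted.
samples = {
    "ok": '<ENTRY NUMBER="1"><CITY>Parma</CITY></ENTRY>',
    "unquoted attribute": "<ENTRY NUMBER=1><CITY>Parma</CITY></ENTRY>",
    "bad nesting": "<ENTRY><NAME><CITY>Parma</NAME></CITY></ENTRY>",
    "case mismatch": "<ENTRY><CITY>Parma</city></ENTRY>",
    "two roots": "<ENTRY/><ENTRY/>",
    "unclosed empty element": "<ENTRY><FAX></ENTRY>",
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, accepted in results.items():
    print(label, "->", "well-formed" if accepted else "rejected")
```

Unlike an HTML browser, the parser does not try to repair any of the faulty samples; it simply refuses them.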

Document Type Definition (DTD)
Used to specify how elements, attributes, etc. relate to each other
DTDs are not XML documents, but are used by them
DTDs do not support data typing
XML Schema being developed to address lack of data typing
• Schemas currently exist (e.g., Microsoft XDR)
• The W3C is working on an XML Schema recommendation

Document Type Definition (DTD)
<!ELEMENT CONTACTS (DATABASE, ENTRY+)>
<!ELEMENT DATABASE (#PCDATA)>
<!ATTLIST DATABASE USERTYPE (PERSONAL|CORPORATE) "PERSONAL">
<!ELEMENT ENTRY (NAME, TITLE, COMPANY, WEBSITE?, ADDRESS, CITY, STATE, ZIP, PHONE, PAGER?, FAX?, EMAIL?)>
<!ATTLIST ENTRY NUMBER CDATA #IMPLIED>
<!ELEMENT NAME (LAST_NAME, FIRST_NAME)>
<!ELEMENT LAST_NAME (#PCDATA)>
<!ELEMENT FIRST_NAME (#PCDATA)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT COMPANY (#PCDATA)>
<!ELEMENT WEBSITE (#PCDATA)>
<!ELEMENT ADDRESS (#PCDATA)>
<!ELEMENT CITY (#PCDATA)>
<!ELEMENT STATE (#PCDATA)>
<!ELEMENT ZIP (#PCDATA)>
<!ELEMENT PHONE (OFFICE?, DIRECT?, CELLULAR?)>
. . .ETC.
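
To show what the ENTRY content model above actually constrains, here is a rough sketch that turns a flat, comma-separated DTD sequence into a regular expression over child element names. This is not DTD validation: Python's standard library parsers are non-validating, and the sketch ignores choice groups, nesting, attributes and #PCDATA. The names model_to_regex and matches_model are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content model copied from the slide's <!ELEMENT ENTRY (...)> declaration.
ENTRY_MODEL = ("NAME, TITLE, COMPANY, WEBSITE?, ADDRESS, CITY, STATE, ZIP, "
               "PHONE, PAGER?, FAX?, EMAIL?")


def model_to_regex(model):
    """Turn a flat comma-separated DTD sequence into a regex over tag names."""
    parts = []
    for item in model.split(","):
        item = item.strip()
        # Keep the DTD occurrence marker (?, + or *) as the regex quantifier.
        suffix = item[-1] if item[-1] in "?+*" else ""
        name = item.rstrip("?+*")
        parts.append("(?:%s )%s" % (re.escape(name), suffix))
    return re.compile("".join(parts))


def matches_model(element, pattern):
    """True if the element's children appear in an order the model allows."""
    children = "".join(child.tag + " " for child in element)
    return pattern.fullmatch(children) is not None


ENTRY_PATTERN = model_to_regex(ENTRY_MODEL)

# Optional WEBSITE, PAGER and FAX are left out; that is still allowed.
complete = ET.fromstring(
    "<ENTRY><NAME/><TITLE/><COMPANY/><ADDRESS/><CITY/><STATE/><ZIP/>"
    "<PHONE/><EMAIL/></ENTRY>")
# TITLE is required by the model, so this entry should be rejected.
missing_title = ET.fromstring(
    "<ENTRY><NAME/><COMPANY/><ADDRESS/><CITY/><STATE/><ZIP/><PHONE/></ENTRY>")

print(matches_model(complete, ENTRY_PATTERN))
print(matches_model(missing_title, ENTRY_PATTERN))
```

Both sample entries are well-formed; only the first follows the content model. That is the difference between a well-formed document and a valid one.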

Validating XML
An XML document that conforms to its DTD is valid
Validating parsers
• IBM's XML4J Parser
  – online at http://www.oasis-open.org/cover/xml4j-check00.html
• IBM's DOMit: A servlet for XML validation
  – online at http://www.networking.ibm.com/xml/XmlValidatorForm.htm
• IE itself, modified by installing a download from http://msdn.microsoft.com

XSL
Extensible Stylesheet Language
Two specifications
• XSL Transformations (XSLT)
• XSL Formatting Objects
XSLT is a W3C recommendation, XSL Formatting Objects is not (yet)

XSLT
Transforms XML into other markup languages
Often used to transform XML to HTML
Limited query-like functionality

An XSL Document
<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
<xsl:template match="/">
<html>
  <head>
    <title>
      <xsl:value-of select="CONTACTS/DATABASE" />
    </title>
  </head>
<body style="background-color: DDDDDD;">
  <h2 align="center">
    <xsl:value-of select="CONTACTS/DATABASE" />
    <hr />
  </h2>
<!-- -->
  <xsl:for-each select="CONTACTS/ENTRY[COMPANY='SDC, Inc.']" order-by="NAME/LAST_NAME">
    <table align="center" width="400" style="font-family: sans-serif; font-size: 10pt; background-color: EEEEEE;">
      <tr>
        <td width="200"><b>
          <xsl:value-of select="NAME/FIRST_NAME" />
XSLT query-like functionality:
• SELECT: the select attribute of xsl:value-of and xsl:for-each
• WHERE: the [COMPANY='SDC, Inc.'] filter
• ORDER BY: the order-by attribute
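
The SELECT / WHERE / ORDER BY analogy can also be sketched outside of XSLT. The example below uses the limited XPath subset in Python's standard xml.etree.ElementTree as a stand-in for the XSL processor (an assumption; the slide relies on IE's engine). Entries 2 and 3 are invented sample data so that the filter and the sort have something to do.

```python
import xml.etree.ElementTree as ET

# Entry 1 comes from the slides; entries 2 and 3 are made up for this sketch.
CONTACTS_XML = """<CONTACTS>
  <DATABASE USERTYPE="PERSONAL">Contact List</DATABASE>
  <ENTRY NUMBER="1">
    <NAME><LAST_NAME>Sanford</LAST_NAME><FIRST_NAME>Bill</FIRST_NAME></NAME>
    <COMPANY>SDC, Inc.</COMPANY>
  </ENTRY>
  <ENTRY NUMBER="2">
    <NAME><LAST_NAME>Adams</LAST_NAME><FIRST_NAME>Pat</FIRST_NAME></NAME>
    <COMPANY>SDC, Inc.</COMPANY>
  </ENTRY>
  <ENTRY NUMBER="3">
    <NAME><LAST_NAME>Baker</LAST_NAME><FIRST_NAME>Lee</FIRST_NAME></NAME>
    <COMPANY>Other Co.</COMPANY>
  </ENTRY>
</CONTACTS>"""

root = ET.fromstring(CONTACTS_XML)

# WHERE: keep only entries whose COMPANY child has exactly this text.
matches = root.findall("ENTRY[COMPANY='SDC, Inc.']")

# ORDER BY: sort the filtered entries by last name.
matches.sort(key=lambda entry: entry.findtext("NAME/LAST_NAME"))

# SELECT: pull out the field to display.
first_names = [entry.findtext("NAME/FIRST_NAME") for entry in matches]
print(first_names)
```

The predicate is the same one the stylesheet uses in its for-each; the sort has to be done in Python because ElementTree's XPath subset has no ordering.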

An XSL Document
Other functionality in the same stylesheet:
• XSLT
• HTML
• CSS

XML, XSL and JavaScript
<html>
<head><title>Test XML Page</title></head>
<body>
<script language="JavaScript">
// Load the XML data synchronously with IE's ActiveX parser
var xmlObject = new ActiveXObject("microsoft.xmldom")
xmlObject.async = false
xmlObject.load("contacts.xml")

// Load the XSL stylesheet the same way
var xslObject = new ActiveXObject("microsoft.xmldom")
xslObject.async = false
xslObject.load("contacts.xsl")

// Apply the stylesheet to the XML and write the resulting markup into the page
document.write(xmlObject.transformNode(xslObject))
</script>
</body>
</html>

XML and Databases
Microsoft SQL Server 2000
Oracle products (various)
IBM DB2 UDB v. 7.1

Microsoft SQL Server 2000
SQL can retrieve results in XML format
Three XML modes: Raw, Auto, Explicit
Raw mode - result row tagged <row>
Auto mode - more control over tags
Explicit mode
• Default tags - table names, field names
• Overwrite by specifying DTD with query
• Specify shape of the XML tree
• Requires relatively complex SQL queries
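
To give a feel for what Raw mode output looks like, the sketch below imitates it with Python's sqlite3 and xml.etree.ElementTree modules. It does not talk to SQL Server (there the query would end in FOR XML RAW and the server would build the XML); the contacts table and the rows_to_raw_xml helper are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_raw_xml(cursor):
    """Render each result row as a <row> element, one attribute per column."""
    columns = [description[0] for description in cursor.description]
    fragments = []
    for record in cursor:
        row = ET.Element("row")
        for column, value in zip(columns, record):
            if value is not None:  # NULL columns are left out of the element
                row.set(column, str(value))
        fragments.append(ET.tostring(row, encoding="unicode"))
    return "\n".join(fragments)


# An in-memory table standing in for a real contacts database.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE contacts (last_name TEXT, first_name TEXT, city TEXT)")
connection.execute("INSERT INTO contacts VALUES ('Sanford', 'Bill', 'Parma')")

cursor = connection.execute(
    "SELECT first_name, last_name, city FROM contacts")
raw_xml = rows_to_raw_xml(cursor)
print(raw_xml)
```

Every row gets the same generic element name, which is why Raw mode gives the least control over the shape of the result.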

Microsoft SQL Server 2000
XML View Mapper
• Create schema file to relate XML Data Reduced (XDR) schema to SQL Server schema
Updategrams
• Express changes to XML document as database inserts, updates, and deletes

Oracle Products
Intelligent Webhouse Initiative
Oracle 8i - “the world’s first XML-enabled database”
Oracle Reports 6i
• Reports can be stored as XSL

Oracle Products
Oracle JDeveloper 3.1
• Allows development of web applications that process XML data
• Syntax-checking for XML, XSL
• XSQL: Java programs that read XML from and write XML to database
• Integration with Oracle 8i

IBM DB2 UDB v. 7.1
DB2 XML Extender
• facility to enable DB2 to work with XML
Net.Data
• macro language for DB2 UDB

IBM DB2 UDB v. 7.1
DB2 XML Extender
• Repository for XML and DTDs
• Storage methods
  – XML column
  – XML collection

IBM DB2 UDB v. 7.1
XML column
• Entire XML document stored in one column as an XML UDT
• Data Access Definition (DAD) defines indexes based on elements and attributes
XML collection
• Relational tables mapped to/from XML
• DAD maps DTD to tables and columns

IBM DB2 UDB v. 7.1
DB2 XML Extender also allows
• SQL to query XML based on elements and attributes
• Stored procedures to generate XML from DB2

IBM DB2 UDB v. 7.1
Net.Data
• Allows conversion of SQL results to XML
• Is not restricted to DB2 UDB as a data source

XML and Query Languages
XPath
• not based on XML
• limited functionality
• relatively difficult to understand
XSLT
• based on XML
• works with XPath, HTML, CSS
• also has limited functionality

XML and Query Languages
Per the W3C website:
"The mission of the XML Query working group is to provide flexible query facilities to extract data from real and virtual documents on the Web, therefore finally providing the needed interaction between the web world and the database world. Ultimately, collections of XML files will be accessed like databases."
(emphasis added)

XML Editors
Microsoft - XML Notepad
Tanyitech - Easy XML 1.0
  – $39 at http://www.tanyitech.com
Altova - XML Spy
  – $199 at http://www.xmlspy.com
Extensibility - Turbo XML
  – $269 at http://www.extensibility.com
Popkin Software - Envision XML
  – http://www.popkin.com

XML Servers/Databases
IxiaSoft - TEXTML Server
• http://www.ixiasoft.com
• TEXTML Server Lite, a free evaluation version, is available
Software AG - Tamino
• http://www.softwareag.com/tamino

XML and Business Intelligence
XML for Analysis
Common Warehouse Metamodel (CWM)
Predictive Model Markup Language (PMML)

XML for Analysis
A platform-independent Microsoft specification
Enables access to analytical data from XML for Analysis-compliant clients
Based on HTTP, XML, SOAP, OLE DB for OLAP, OLE DB for Data Mining
Supporters include AlphaBlox, Brio, Business Objects, Cognos, SAS, SPSS

Common Warehouse Metamodel
Per the CWM website (http://www.cwmforum.org):
“The purpose of OMG’s Common Warehouse Metadata Initiative (CWMI) is to enable easy interchange of metadata between data warehousing tools and metadata repositories in distributed heterogeneous environments.

Common Warehouse Metamodel
The CWM is a specification for modeling metadata (relational, non-relational, multidimensional) found in a data warehousing environment.
Instances of the metamodel are exchanged via XMI (XML Metadata Interchange) documents.
“The ultimate goal of CWM is to do for data warehousing and business intelligence tools what HTML did for web browsers.”

PMML
Predictive Model Markup Language
• Developed by the Data Mining Group (http://www.dmg.org/html/pmml_v1_1.html)
Allows reuse of predictive models between PMML-compliant applications

XML Resources
World Wide Web Consortium
• http://www.w3.org
The XML Industry Portal
• http://www.xml.org
XML101.com
• http://www.xml101.com
XML Magic
• http://www.xmlmagic.com
<closing>Thank You For Attending</closing>