Processing XML Part II Parser Operations with DOM and SAX overview XML Validation with examples...

Post on 22-Dec-2015

Processing XML Part II

• Parser Operations with DOM and SAX overview

• XML Validation with examples

• Processing XML with SAX (locally and on the internet)

FixedFloatSwap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

FixedFloatSwap.dtd

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Operation of a Tree-based Parser

[Diagram labels: XML Document, XML DTD, Tree-Based Parser, Valid, Document Tree, Application Logic]

Tree Benefits

• Some data preparation tasks require early access to data that is further along in the document (e.g. we wish to extract titles to build a table of contents)

• New tree construction is easier (e.g. xslt works from a tree to convert FpML to WML)
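The table-of-contents case can be sketched with the JAXP DOM API. This is only an illustration: the class name TocSketch, the element name "title" and the sample document are assumptions, not part of the course examples.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class; the <title> element name is an assumption.
class TocSketch {

    // Build the whole tree first, then collect every <title> in document order.
    static List<String> titles(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("title");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<book>"
            + "<chapter><title>Parsers</title><para>...</para></chapter>"
            + "<chapter><title>Validation</title><para>...</para></chapter>"
            + "</book>";
        System.out.println(titles(xml)); // prints [Parsers, Validation]
    }
}
```

Because the tree is complete before any application code runs, the titles can be read before the chapter bodies are processed.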

Operation of an Event Based Parser

[Diagram labels: XML Document, XML DTD, Event-Based Parser, Valid, Application Logic]

public void startDocument ()
public void endDocument ()
public void startElement (String name, AttributeList attrs)
public void endElement (String name)
public void characters (char buf [], int offset, int len)

public void error(SAXParseException e) throws SAXException
{
    System.out.println("\n\n--Invalid document ---" + e);
}

Event-Driven Benefits

• We do not need the memory required for trees

• Parsing can be done faster with no tree construction going on
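A minimal sketch of the event-driven style: the handler below counts elements as events arrive and never builds a tree. It uses the SAX2 DefaultHandler class rather than the SAX1 HandlerBase used elsewhere in these slides, since HandlerBase is deprecated. The class name ElementCounterSketch is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: keeps one counter instead of a document tree.
class ElementCounterSketch extends DefaultHandler {
    private int count = 0;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        count++; // one event per start tag; nothing else is stored
    }

    static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounterSketch handler = new ElementCounterSketch();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.count;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FixedFloatSwap><Notional>100</Notional>"
            + "<Fixed_Rate>5</Fixed_Rate><NumYears>3</NumYears>"
            + "<NumPayments>6</NumPayments></FixedFloatSwap>";
        System.out.println(countElements(xml)); // prints 5
    }
}
```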

XML Validation

A batch validating process involves comparing the DTD against a complete document instance and producing a report containing any errors or warnings.

Software developers should consider batch validation to be analogous to program compilation, with similar errors detected.

Interactive validation involves constant comparison of the DTD against a document as it is being created.

XML Validation

The benefits of validating documents against a DTD include:

• Programmers can write extraction and manipulation filters without fear of their software ever processing unexpected input.

• Using an XML-aware word processor, authors and editors can be guided and constrained to produce conforming documents.

XML Validation Examples

XML elements may contain further, embedded elements, and the entire document must be enclosed by a single document element.

The degree to which an element’s content is organized into child elements is often termed its granularity.

Some hierarchical structures may be recursive.

The Document Type Definition (DTD) contains rules for each element allowed within a specific class of documents.

// Validate.java
// We'll run this program against several XML files with DTDs.

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class Validate extends HandlerBase
{
    public static boolean valid = true;

    public static void main (String argv [])
    {
        if (argv.length != 1) {
            System.err.println ("Usage: java Validate filename.xml");
            System.exit (1);
        }

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);

        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv [0]), new Validate());
        } catch (Throwable t) {
            t.printStackTrace ();
        }
        System.out.println("Valid document is " + valid);
        System.exit (0);
    }

    public void error(SAXParseException e) throws SAXException
    {
        System.out.println(e.toString());
        valid = false;
    }
}
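For readers on a current JDK, the same check can be sketched with the SAX2 DefaultHandler class. This is an illustrative variant, not the course program: the class name ValidateSketch is hypothetical, and the DTD is placed in the internal subset so the example needs no files.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 variant of Validate; validity errors clear the flag.
class ValidateSketch extends DefaultHandler {
    private boolean valid = true;

    @Override
    public void error(SAXParseException e) {
        valid = false; // recoverable (validity) error
    }

    static boolean isValid(String xml) {
        ValidateSketch handler = new ValidateSketch();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            return false; // fatal errors (not well-formed) end up here
        }
        return handler.valid;
    }

    // Builds the swap document with the given content model for the root.
    static String swap(String contentModel) {
        return "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE FixedFloatSwap ["
            + "<!ELEMENT FixedFloatSwap (" + contentModel + ")>"
            + "<!ELEMENT Notional (#PCDATA)>"
            + "<!ELEMENT Fixed_Rate (#PCDATA)>"
            + "<!ELEMENT NumYears (#PCDATA)>"
            + "<!ELEMENT NumPayments (#PCDATA)>"
            + "]>"
            + "<FixedFloatSwap><Notional>100</Notional>"
            + "<Fixed_Rate>5</Fixed_Rate><NumYears>3</NumYears>"
            + "<NumPayments>6</NumPayments></FixedFloatSwap>";
    }

    public static void main(String[] args) {
        System.out.println("Valid document is "
            + isValid(swap("Notional, Fixed_Rate, NumYears, NumPayments")));
        System.out.println("Valid document is "
            + isValid(swap("Notional, Fixed_Rate, NumPayments")));
    }
}
```

The second call drops NumYears from the content model, so the unchanged document no longer matches its DTD.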

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is true

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is false

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Swaps SYSTEM "FixedFloatSwap.dtd">
<Swaps>
    <FixedFloatSwap>
        <Notional>100</Notional>
        <Fixed_Rate>5</Fixed_Rate>
        <NumYears>3</NumYears>
        <NumPayments>6</NumPayments>
    </FixedFloatSwap>
    <FixedFloatSwap>
        <Notional>100</Notional>
        <Fixed_Rate>5</Fixed_Rate>
        <NumYears>3</NumYears>
        <NumPayments>6</NumPayments>
    </FixedFloatSwap>
</Swaps>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Swaps (FixedFloatSwap+) >
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
Valid document is true

Quantity Indicators

    ?   0 or 1 time
    +   1 or more times
    *   0 or more times

The locations where document text data is allowed are indicated by the keyword ‘PCDATA’ (Parsed Character Data).

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>
        <StartYear>2000</StartYear>
        <EndYear>2002</EndYear>
    </NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Output of the program after being modified to display the error:

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Element "NumYears" does not allow "StartYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "StartYear" is not declared.
org.xml.sax.SAXParseException: Element "NumYears" does not allow "EndYear" -- (#PCDATA)
org.xml.sax.SAXParseException: Element type "EndYear" is not declared.
Valid document is false

There are strict rules which must be applied when an element is allowed to contain both text and child elements.

The PCDATA keyword must be the first token in the group, and the group must be a choice group (using “|” not “,”).

The group must be optional and repeatable.

This is known as a mixed content model.

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT Mixed (emph) >
<!ELEMENT emph (#PCDATA | sub | super)* >
<!ELEMENT sub (#PCDATA)>
<!ELEMENT super (#PCDATA)>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Mixed SYSTEM "Mixed.dtd">
<Mixed>
    <emph>H<sub>2</sub>O is water.</emph>
</Mixed>

Valid document is true

Attributes

An attribute is associated with a particular element by the DTD and is assigned an attribute type.

The attribute type can restrict the range of values it can hold.

Example attribute types include:

    CDATA indicates a simple string of characters
    NMTOKEN indicates a word or token
    A named token group such as (left | center | right)

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

C:\McCarthy\www\46-928\examples\sax>java Validate FixedFloatSwap.xml
org.xml.sax.SAXParseException: Attribute value for "currency" is #REQUIRED.
Valid document is false

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional currency = "Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

#IMPLIED means optional

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional currency = "Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ATTLIST Notional currency (Dollars | Pounds) #REQUIRED>
<!ATTLIST FixedFloatSwap note CDATA #IMPLIED>

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap note = "For your eyes only">
    <Notional currency = "Pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

Valid document is true

Document using a General Entity

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
    <!ENTITY bankname "Mellon National Bank and Trust" >
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Bank (#PCDATA) >
<!ELEMENT Notional (#PCDATA) >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Valid document is true
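A small sketch of what the application sees for a general entity: the parser expands &bankname; and delivers the replacement text through characters(). The class name EntityTextSketch is hypothetical, the handler is the SAX2 DefaultHandler, and the entity is declared in the internal subset so no DTD file is needed.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: collects the text content of <Bank>.
class EntityTextSketch extends DefaultHandler {
    static final String SAMPLE =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE FixedFloatSwap ["
        + "<!ENTITY bankname \"Mellon National Bank and Trust\">"
        + "]>"
        + "<FixedFloatSwap><Bank>&bankname;</Bank>"
        + "<Notional>100</Notional></FixedFloatSwap>";

    private final StringBuilder text = new StringBuilder();
    private boolean inBank = false;

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        if (qName.equals("Bank")) inBank = true;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("Bank")) inBank = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (inBank) text.append(ch, start, length);
    }

    static String bankName(String xml) {
        try {
            EntityTextSketch handler = new EntityTextSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.text.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bankName(SAMPLE));
    }
}
```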

XSLT Program

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:template match = "Bank">
        <WML>
            <CARD>
                <xsl:apply-templates/>
            </CARD>
        </WML>
    </xsl:template>

    <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
    </xsl:template>

</xsl:stylesheet>

C:\McCarthy\www\46-928\examples\sax>java -Dcom.jclark.xsl.sax.parser=com.jclark.xml.sax.CommentDriver com.jclark.xsl.sax.Driver FixedFloatSwap.xml FixedFloatSwap.xsl FixedFloatSwap.wml

C:\McCarthy\www\46-928\examples\sax>type FixedFloatSwap.wml

<?xml version="1.0" encoding="utf-8"?>

<WML><CARD>Mellon National Bank and Trust</CARD></WML>

XSLT OUTPUT

An external text entity

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
    <!ENTITY bankname SYSTEM "JustAFile.dat" >
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

JustAFile.dat

Mellon Bank And Trust Corporation
When you need a friend!

XSLT Output

<?xml version="1.0" encoding="utf-8"?>
<WML><CARD>Mellon Bank And Trust Corporation
When you need a friend!</CARD></WML>

Internal Parameter Entities

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments) >
<!ENTITY % parsedCharacterData "(#PCDATA)">
<!ELEMENT Notional %parsedCharacterData; >
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

General Entity defined in the DTD

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Bank> &bankname; </Bank>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments) >
<!ENTITY bankname "Mellon National Bank and Trust Corporation" >
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

CDATA Section

XML Document

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd">
<FixedFloatSwap>
    <Notional>100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
    <Note>
        <![CDATA[This is text that <b>will not be parsed for markup]]>
    </Note>
</FixedFloatSwap>

DTD

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments, Note) >
<!ELEMENT Notional (#PCDATA)>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >
<!ELEMENT Note (#PCDATA) >

XSLT Program

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

    <xsl:template match = "Note">
        <WML>
            <CARD>
                <xsl:apply-templates/>
            </CARD>
        </WML>
    </xsl:template>

    <xsl:template match = "Notional | Fixed_Rate | NumYears | NumPayments">
    </xsl:template>

</xsl:stylesheet>

<?xml version="1.0" encoding="utf-8"?><WML><CARD>

This is text that &lt;b&gt;will not be parsed for markup

</CARD></WML>

XSLT Output
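A brief sketch of how a parser hands CDATA content to an application: the characters inside the section arrive as plain text, with "<b>" left unparsed. The example uses the JAXP DOM API; the class name CdataSketch is hypothetical and the document omits the DTD reference so it runs without files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical example class: reads the text of <Note>, CDATA included.
class CdataSketch {
    static final String SAMPLE =
        "<FixedFloatSwap><Notional>100</Notional>"
        + "<Note><![CDATA[This is text that <b>will not be parsed for markup]]></Note>"
        + "</FixedFloatSwap>";

    static String noteText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // getTextContent() concatenates text and CDATA children.
            return doc.getElementsByTagName("Note").item(0).getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // prints: This is text that <b>will not be parsed for markup
        System.out.println(noteText(SAMPLE));
    }
}
```

When a serializer writes that text back out as ordinary character data, it has to escape the angle brackets, which is why the XSLT output shows &lt;b&gt;.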

DTD Components

Order.xml

<?xml version="1.0" encoding = "UTF-8"?>
<!DOCTYPE ORDER SYSTEM "order.dtd">
<!-- example order form -->
<ORDER SOURCE ="web" CUSTOMERTYPE="consumer" CURRENCY="USD">
    <addresses>
        <address ADDTYPE="billship">
            <firstname>Kevin</firstname>
            <lastname>Dick</lastname>
            <street ORDER="1">123 Anywhere Lane</street>
            <street ORDER="2">Apt 1b</street>
            <city>Palo Alto</city>
            <state>CA</state>
            <postal>94303</postal>
            <country>USA</country>
        </address>
        <address ADDTYPE="bill">
            <firstname>Kevin</firstname>
            <lastname>Dick</lastname>
            <street ORDER="1">123 Not The Same Lane</street>
            <street ORDER="2">Work Place</street>
            <city>Palo Alto</city>
            <state>CA</state>
            <postal>94300</postal>
            <country>USA</country>
        </address>
    </addresses>

An order may have more than one address.

    <lineitems>
        <lineitem ID="line1">
            <product CAT="MBoard">440BX Motherboard</product>
            <quantity>1</quantity>
            <unitprice>200</unitprice>
        </lineitem>
        <lineitem ID="line2">
            <product CAT = "RAM">128 MB PC-100 DIMM</product>
            <quantity>2</quantity>
            <unitprice>175</unitprice>
        </lineitem>
        <lineitem ID="line3">
            <product CAT="CDROM">40x CD-ROM</product>
            <quantity>1</quantity>
            <unitprice>50</unitprice>
        </lineitem>
    </lineitems>

Several products may be purchased.

    <payment>
        <card CARDTYPE="VISA">
            <cardholder>Kevin S. Dick</cardholder>
            <cardnumber>11111-22222-33333</cardnumber>
            <expiration>01/01</expiration>
        </card>
    </payment>
</ORDER>

The payment is with a Visa card.

Valid document is true

order.dtd

<?xml version="1.0" encoding="UTF-8"?>

<!-- Example Order form DTD adapted from XML: A Manager's Guide -->

<!-- Define an ORDER element -->

<!ELEMENT ORDER (addresses, lineitems, payment)>
<!ATTLIST ORDER
    SOURCE (web | phone | retail) #REQUIRED
    CUSTOMERTYPE (consumer | business) "consumer"
    CURRENCY CDATA "USD">

Define an order based on other elements.

<!ENTITY % anAddress SYSTEM "address.dtd" >
%anAddress;

<!-- Collection of Addresses -->
<!ELEMENT addresses (address+)>

<!ENTITY % aLineItem SYSTEM "lineitem.dtd" >
%aLineItem;

<!-- Collection of LineItems -->
<!ELEMENT lineitems (lineitem+)>

<!ENTITY % aPayment SYSTEM "payment.dtd" >
%aPayment;

The other elements are in their own DTD files, included as external parameter entities.

address.dtd

<!-- Address Structure -->
<!ELEMENT address (firstname, middlename?, lastname, street+, city, state, postal, country)>

<!ELEMENT firstname (#PCDATA)>
<!ELEMENT middlename (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT postal (#PCDATA)>
<!ELEMENT country (#PCDATA)>
<!ATTLIST address ADDTYPE (bill | ship | billship) "billship">
<!ATTLIST street ORDER CDATA #IMPLIED>

lineitem.dtd

<!ELEMENT lineitem (product,quantity,unitprice)>
<!ATTLIST lineitem ID ID #REQUIRED>

<!ELEMENT product (#PCDATA)>
<!ATTLIST product CAT (CDROM|MBoard|RAM) #REQUIRED>

<!ELEMENT quantity (#PCDATA)>
<!ELEMENT unitprice (#PCDATA)>

payment.dtd

<!ELEMENT payment (card | PO)>
<!ELEMENT card (cardholder, cardnumber, expiration)>
<!ELEMENT cardholder (#PCDATA)>
<!ELEMENT cardnumber (#PCDATA)>
<!ELEMENT expiration (#PCDATA)>
<!ELEMENT PO (number,authorization*)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT authorization (#PCDATA)>

<!ATTLIST card CARDTYPE (VISA|MasterCard|Amex) #REQUIRED>

Processing XML with SAX

• Important interfaces and classes are found in the org.xml.sax package

• We will look at the following interfaces and then study an example

    interface DocumentHandler -- reports on document events
    interface ErrorHandler -- reports on validity errors
    class HandlerBase -- implements both of the above plus two others

public interface DocumentHandler

Receive notification of general document events.

This is the main interface that most SAX applications implement: if the application needs to be informed of basic parsing events, it implements this interface and registers an instance with the SAX parser.

The parser uses the instance to report basic document-related events like the start and end of elements and character data.

Some methods from the DocumentHandler Interface

void characters(char[] ch, int start, int length)
    Receive notification of character data.
void endDocument()
    Receive notification of the end of a document.
void endElement(java.lang.String name)
    Receive notification of the end of an element.
void startDocument()
    Receive notification of the beginning of a document.
void startElement(java.lang.String name, AttributeList atts)
    Receive notification of the beginning of an element.

public interface ErrorHandler

Basic interface for SAX error handlers.

If a SAX application needs to implement customized error handling, it must implement this interface and then register an instance with the SAX parser. The parser will then report all errors and warnings through this interface.

Some methods are:

void error(SAXParseException exception)
    Receive notification of a recoverable error.
void fatalError(SAXParseException exception)
    Receive notification of a non-recoverable error.
void warning(SAXParseException exception)
    Receive notification of a warning.

public class HandlerBase
extends java.lang.Object
implements EntityResolver, DTDHandler, DocumentHandler, ErrorHandler

Default base class for handlers.

This class implements the default behaviour for four SAX interfaces: EntityResolver, DTDHandler, DocumentHandler, and ErrorHandler.

Input: FixedFloatSwap.dtd

<?xml version="1.0" encoding="utf-8"?>
<!ELEMENT FixedFloatSwap (Bank, Notional, Fixed_Rate, NumYears, NumPayments) >
<!ELEMENT Bank (#PCDATA)>
<!ELEMENT Notional (#PCDATA)>
<!ATTLIST Notional currency (dollars | pounds) #REQUIRED>
<!ELEMENT Fixed_Rate (#PCDATA) >
<!ELEMENT NumYears (#PCDATA) >
<!ELEMENT NumPayments (#PCDATA) >

Input: FixedFloatSwap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap SYSTEM "FixedFloatSwap.dtd" [
    <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional currency = "pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

Processing: Java event-driven processing

// NotifyStr.java
// Adapted from XML and Java by Maruyama, Tamura and Uramoto
// IBM Tokyo Research, Addison-Wesley

import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class NotifyStr extends HandlerBase
{
    public static void main (String argv [])
    {
        if (argv.length != 1) {
            System.err.println ("Usage: java NotifyStr filename.xml");
            System.exit (1);
        }
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        NotifyStr myHandler = new NotifyStr();
        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( new File(argv [0]), myHandler);
        } catch (Throwable t) {
            t.printStackTrace ();
        }
        System.exit (0);
    }

    public NotifyStr() {}

    public void startDocument() throws SAXException
    {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException
    {
        System.out.println("endDocument called:");
    }

    public void startElement(String Name, AttributeList aMap) throws SAXException
    {
        System.out.println("startElement called: element name =" + Name);
        // examine the attributes
        for(int i = 0; i < aMap.getLength(); i++) {
            String attName = aMap.getName(i);
            String type = aMap.getType(i);
            String value = aMap.getValue(i);
            System.out.println(" attribute name = " + attName +
                               " type = " + type + " value = " + value);
        }
    }

    public void endElement(String name) throws SAXException
    {
        System.out.println("endElement is called:" + name);
    }

    public void characters(char[] ch, int start, int length) throws SAXException
    {
        // build String from char array
        String dataFound = new String(ch,start,length);
        System.out.println("characters called:" + dataFound);
    }

    public void error(SAXParseException e) throws SAXException
    {
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}

Output

C:\McCarthy\www\46-928\examples\sax>java NotifyStr FixedFloatSwap.xml
startDocument called:
startElement called: element name =FixedFloatSwap
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
startElement called: element name =Notional
 attribute name = currency type = ENUMERATION value = pounds
characters called:100
endElement is called:Notional
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
endElement is called:FixedFloatSwap
endDocument called:
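NotifyStr is written against the SAX1 interfaces (HandlerBase, AttributeList), which are deprecated in current JDKs. As a rough sketch of the SAX2 equivalent of the attribute loop, the handler below extends DefaultHandler and reads attributes through the Attributes interface. The class name AttributeEventsSketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX2 counterpart of the attribute loop in NotifyStr.
class AttributeEventsSketch extends DefaultHandler {
    private final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes atts) {
        // SAX2 uses getQName(i) where SAX1 AttributeList used getName(i).
        for (int i = 0; i < atts.getLength(); i++) {
            seen.add(qName + "/" + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    static List<String> attributes(String xml) {
        try {
            AttributeEventsSketch handler = new AttributeEventsSketch();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.seen;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<FixedFloatSwap>"
            + "<Notional currency=\"pounds\">100</Notional>"
            + "</FixedFloatSwap>";
        System.out.println(attributes(xml)); // prints [Notional/currency=pounds]
    }
}
```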

Accessing the swap from Jigsaw

Saved under Www/fpml/ServerSwap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
    <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional currency = "pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

Servlet Code

// This servlet file is stored in WWW/Jigsaw/servlet/GetXML.java
// This servlet returns a user selected xml file from
// the Www/fpml directory and returns it to the client.

import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class GetXML extends HttpServlet
{
    public void doGet(HttpServletRequest req, HttpServletResponse res)
        throws ServletException, IOException
    {
        String theData = "";
        String extraPath = req.getPathInfo();
        extraPath = extraPath.substring(1);

        // read the file and write it to the client
        try {
            // open file and create a BufferedReader
            FileInputStream theFile = new FileInputStream(
                "c:\\Jigsaw\\Jigsaw\\Jigsaw\\Www\\fpml\\" + extraPath);
            //DataInputStream dis = new DataInputStream(theFile);
            InputStreamReader is = new InputStreamReader(theFile);
            BufferedReader br = new BufferedReader(is);

            // read the file into the string theData
            String thisLine;
            while((thisLine = br.readLine()) != null) {
                theData += thisLine + "\n";
            }
        } catch(Exception e) {
            System.err.println("Error " + e);
        }

        PrintWriter out = res.getWriter();

        out.write(theData);
        System.out.println("Wrote document to client");
        // write data to console
        System.out.println(theData);
        out.close();
    }
}

// Sax Client
import java.io.*;
import org.xml.sax.*;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;

public class JigsawNotifyStr extends HandlerBase
{
    public static void main (String argv [])
    {
        if (argv.length != 1) {
            System.err.println ("Usage: java JigsawNotifyStr filename.xml");
            System.exit (1);
        }

        String serverString = "http://localhost:8001/servlet/getXML/";
        String fileName = argv[0];

        InputSource is = new InputSource(serverString + fileName);

        System.out.println("Got the input source");

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);

        JigsawNotifyStr myHandler = new JigsawNotifyStr();

        try {
            SAXParser saxParser = factory.newSAXParser();
            saxParser.parse( is, myHandler);
        } catch (Throwable t) {
            System.out.println("Big error");
            t.printStackTrace ();
        }
        System.exit (0);
    }

    public JigsawNotifyStr() {}

    public void startDocument() throws SAXException
    {
        System.out.println("startDocument called:");
    }

    public void endDocument() throws SAXException
    {
        System.out.println("endDocument called:");
    }

    // startElement, endElement and characters: same as before

    public void error(SAXParseException e) throws SAXException
    {
        // describe each error and show each error method
        System.out.println("Parsing error");
        System.out.println(e.toString());
    }
}

Being served by the servlet

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE FixedFloatSwap [
    <!ENTITY bankname "Pittsburgh National Corporation">
]>
<FixedFloatSwap>
    <Bank>&bankname;</Bank>
    <Notional currency = "pounds">100</Notional>
    <Fixed_Rate>5</Fixed_Rate>
    <NumYears>3</NumYears>
    <NumPayments>6</NumPayments>
</FixedFloatSwap>

We have some parsing errors. Do you see why?

Got the input source
startDocument called:
Parsing error
org.xml.sax.SAXParseException: Element type "FixedFloatSwap" is not declared.
startElement called: element name =FixedFloatSwap
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Bank" is not declared.
startElement called: element name =Bank
characters called:Pittsburgh National Corporation
endElement is called:Bank
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Notional" is not declared.
Parsing error
org.xml.sax.SAXParseException: Attribute "currency" is not declared for element "Notional".
startElement called: element name =Notional
 attribute name = currency type = CDATA value = pounds
characters called:100
endElement is called:Notional
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "Fixed_Rate" is not declared.
startElement called: element name =Fixed_Rate
characters called:5
endElement is called:Fixed_Rate
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumYears" is not declared.
startElement called: element name =NumYears
characters called:3
endElement is called:NumYears
characters called:
Parsing error
org.xml.sax.SAXParseException: Element type "NumPayments" is not declared.
startElement called: element name =NumPayments
characters called:6
endElement is called:NumPayments
characters called:
endElement is called:FixedFloatSwap
endDocument called:

The InputSource Class

The SAX and DOM parsers need XML input. The “output” produced by these parsers amounts to a series of method calls (SAX) or an application programmer interface to the tree (DOM).

An InputSource object can be used to provide input to the parser.

[Diagram labels: InputSource, SAX or DOM, Tree, Events, application]

So, how do we build an InputSource object?

Some InputSource constructors:

    InputSource(String pathToFile);
    InputSource(InputStream byteStream);
    InputSource(Reader characterStream);

For example:

    String text = "<a>some xml</a>";
    StringReader sr = new StringReader(text);
    InputSource is = new InputSource(sr);
    :
    myParser.parse(is);
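A runnable sketch of the same idea, building an InputSource from a character stream and from a byte stream and feeding each to a SAX parser. The class name InputSourceSketch and its helper methods are hypothetical; the handler is the SAX2 DefaultHandler.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: two ways to wrap XML text in an InputSource.
class InputSourceSketch {

    static InputSource fromString(String xml) {
        return new InputSource(new StringReader(xml)); // character stream
    }

    static InputSource fromBytes(String xml) {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return new InputSource(new ByteArrayInputStream(bytes)); // byte stream
    }

    // Parses the source and returns the name of the document element.
    static String rootName(InputSource source) {
        final String[] root = new String[1];
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source,
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        if (root[0] == null) root[0] = qName;
                    }
                });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return root[0];
    }

    public static void main(String[] args) {
        String text = "<a>some xml</a>";
        System.out.println(rootName(fromString(text))); // prints a
        System.out.println(rootName(fromBytes(text)));  // prints a
    }
}
```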

But what about the DTD?

public interface EntityResolver

Basic interface for resolving entities.

If a SAX application needs to implement customized handling for external entities, it must implement this interface and register an instance with the SAX parser using the parser's setEntityResolver method.

The parser will then allow the application to intercept any external entities (including the external DTD subset and external parameter entities, if any) before including them.

EntityResolver

public InputSource resolveEntity(String publicId, String systemId)
{
    // Add this method to the client above. The systemId String
    // holds the path to the dtd as specified in the xml document.
    // We may now access the dtd from a servlet and return an
    // InputSource or return null and let the parser resolve the
    // external entity.
    System.out.println("Attempting to resolve" + "Public id :" + publicId +
                       "System id :" + systemId);
    return null;
}
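One way the resolver might supply the DTD itself is sketched below. Instead of fetching the DTD from a servlet, this illustration returns it from a string held in memory, so the example runs without a server. The class name ResolverSketch and the in-memory DTD are assumptions, and the handler is the SAX2 DefaultHandler, which JAXP's SAXParser.parse also uses as the entity resolver.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: serves the external DTD from memory.
class ResolverSketch extends DefaultHandler {
    static final String DTD =
        "<!ELEMENT FixedFloatSwap (Notional, Fixed_Rate, NumYears, NumPayments)>"
        + "<!ELEMENT Notional (#PCDATA)>"
        + "<!ELEMENT Fixed_Rate (#PCDATA)>"
        + "<!ELEMENT NumYears (#PCDATA)>"
        + "<!ELEMENT NumPayments (#PCDATA)>";

    private boolean valid = true;

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // The parser may pass an absolute URI, so compare the suffix only.
        if (systemId != null && systemId.endsWith("FixedFloatSwap.dtd")) {
            return new InputSource(new StringReader(DTD));
        }
        return null; // let the parser resolve anything else itself
    }

    @Override
    public void error(SAXParseException e) {
        valid = false;
    }

    static boolean isValid(String xml) {
        ResolverSketch handler = new ResolverSketch();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            factory.newSAXParser()
                   .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            return false;
        }
        return handler.valid;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE FixedFloatSwap SYSTEM \"FixedFloatSwap.dtd\">"
            + "<FixedFloatSwap><Notional>100</Notional>"
            + "<Fixed_Rate>5</Fixed_Rate><NumYears>3</NumYears>"
            + "<NumPayments>6</NumPayments></FixedFloatSwap>";
        System.out.println("Valid document is " + isValid(xml));
    }
}
```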