Introduction to XML - ITCourseware · Introduction to XML Introduction to XML Paul Hoffmann, Jamie...

Introduction to XML

Student Workbook

Page ii Rev 5.2.2 © 2007 ITCourseware, LLC

Introduction to XML

Paul Hoffmann, Jamie Romero, and Todd Wright

Published by ITCourseware, LLC, 7245 South Havana Street, Suite 100, Centennial, CO 80112

Editor: Jan Waleri

Editorial Assistant: Danielle North

Special thanks to: Many instructors whose ideas and careful review have contributed to the quality of this workbook, including Paul Hoffmann, Richard Raab, and Rob Seitz, and the many students who have offered comments, suggestions, criticisms, and insights.

Copyright © 2007 by ITCourseware, LLC. All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photo-copying, recording, or by an information storage retrieval system, without permission in writing from the publisher. Inquiries should be addressed to ITCourseware, LLC., 7245 South Havana Street, Suite 100, Centennial, Colorado, 80112. (303) 302-5280.

All brand names, product names, trademarks, and registered trademarks are the property of their respective owners.

Contents

Chapter 1 - Course Introduction
  Course Objectives
  Course Overview
  Using the Workbook
  Suggested References

Chapter 2 - Getting Started with XML
  Data and Document Structure
  XML
  Well-Formed XML
  Valid vs. Well-Formed XML
  Enforcing Valid Documents: DTD
  Enforcing Valid Documents: XML Schema
  Presentation Style
  XSL and XSLT
  Using XML
  Labs

Chapter 3 - Writing Well-Formed XML
  XML Fundamentals
  Tag Attributes
  Naming Rules
  Empty and Non-Empty Elements
  Nesting and Hierarchy of Tags
  Processing Instructions and the XML Declaration
  Other XML Constructs
  Entity and Character References
  Labs

Chapter 4 - Namespaces
  Why Namespaces?
  Namespace Prefixes and Declaration
  Multiple Namespace Declarations
  Declaring Namespaces in the Root Element
  Default Namespaces
  Labs

Chapter 5 - Validating XML with DTDs
  XML DTDs
  DOCTYPE
  Element Conditions and Quantifiers
  Attributes
  Attribute Types
  REQUIRED, IMPLIED, and FIXED
  Parsed General Entities
  Parsed Parameterized Entities
  DTDs and Namespaces
  Labs

Chapter 6 - Validating XML with XML Schemas
  Schema Overview
  A Minimal Schema
  Associating XML with a Schema
  Simple and Built-in Types
  Complex Types
  Element Declarations
  Attribute Declarations
  Choices
  Named Types and Anonymous Types
  Labs

Chapter 7 - Using XML Schema with Namespaces
  Qualified and Unqualified XML
  Associating Qualified XML with a Schema
  Associating a Schema with a Namespace
  Controlling Element and Attribute Qualification
  Merging Schema with the Same Namespace
  Merging Schemas with Different Namespaces
  Labs

Chapter 8 - Introduction to XSLT
  Stylesheet, Source, and Result
  XSLT Processors
  Processor Implementations
  XPath Basics
  xsl:stylesheet
  xsl:template
  xsl:value-of
  xsl:apply-templates
  xsl:output
  Labs

Chapter 9 - XPath Nodetypes
  XPath Expressions
  XPath Context
  XPath Location Steps
  Element and Root Nodes
  Text and Attribute Nodes
  Comment and Processing Instruction Nodes
  Namespace Nodes
  Wildcards
  Whitespace
  Default Template Rules
  Labs

Chapter 10 - XPath Axes and Predicates
  Location Paths and Location Steps
  Peer Axis Types
  More Peer Axis Types
  Descendant Axis Types
  Ancestor Axis Types
  Node Tests
  Predicates
  Functions
  Labs

Chapter 11 - XSLT Flow Control
  xsl:if
  xsl:choose
  xsl:for-each
  xsl:sort
  Named Templates
  Mode
  Labs

Chapter 12 - XML in Applications
  Reasons and Places for Using XML
  DOM Parsers
  SAX Parsers
  Web Services

Appendix A - Effective Document Design
  Design Goals
  Intended Audience
  Document Types
  Choosing a Validation Method
  Incorporating Namespaces
  Modular Document Design
  Planning for Extensibility

Solutions

Index

Chapter 1 - Course Introduction

Course Objectives

• Explain what XML is, and how it is used in application and document development.
• Write well-formed documents that conform to XML's basic rules of syntax.
• Validate XML documents with both DTDs and XML Schemas.
• Identify the key differences between DTDs and XML Schemas.
• Use XML Namespaces to distinguish between XML tags.
• Transform an XML document into an HTML document using XSLT.
• Use XPath to navigate a document tree.
• Explain how programs can use DOM and SAX to parse XML documents.

Course Overview

• Audience: Application developers, web developers and administrators, and XML authors.
• Prerequisites: HTML. Familiarity with web and data processing concepts. Programming experience is helpful, but not necessary.
• Classroom Environment:
  ◦ Workstation per student, with a text editor and an XML-compliant browser.

Using the Workbook

This workbook design is based on a page-pair, consisting of a Topic page and a Support page. When you lay the workbook open flat, the Topic page is on the left and the Support page is on the right. The Topic page contains the points to be discussed in class. The Support page has code examples, diagrams, screen shots and additional information. Hands On sections provide opportunities for practical application of key concepts. Try It and Investigate sections help direct individual discovery.

In addition, there is an index for quick lookup. Printed lab solutions are in the back of the book as well as online if you need a little help.

[Figure: a sample page-pair from the Java Servlets workbook (Rev 2.0.0, © 2002 ITCourseware, LLC), annotated with callouts that explain the page layout. The text of the two sample pages and the callouts follows.]

Sample Topic page (Java Servlets, Page 16): The Servlet Life Cycle

• The servlet container controls the life cycle of the servlet.
  ◦ When the first request is received, the container loads the servlet class and calls the init() method.
  ◦ For every request, the container uses a separate thread to call the service() method.
  ◦ When the servlet is unloaded, the container calls the destroy() method.
    ▪ As with Java's finalize() method, don't count on this being called.
• Override one of the init() methods for one-time initializations, instead of using a constructor.
  ◦ The simplest form takes no parameters.
      public void init() {...}
  ◦ If you need to know container-specific configuration information, use the other version.
      public void init(ServletConfig config) {...
  ◦ Whenever you use the ServletConfig approach, always call the superclass method, which performs additional initializations.
      super.init(config);

Sample Support page (Chapter 2 Servlet Basics, Page 17)

Hands On:

Add an init() method to your Today servlet that initializes a bornOn date, then print the bornOn date along with the current date:

Today.java

    ...
    public class Today extends GenericServlet {
        private Date bornOn;

        public void service(ServletRequest request,
                ServletResponse response) throws ServletException, IOException
        {
            ...
            // Write the document
            out.println("This servlet was born on " + bornOn.toString());
            out.println("It is now " + today.toString());
        }

        public void init() {
            bornOn = new Date();
        }
    }

Callout in the sample code: The init() method is called when the servlet is loaded into the container.

Callouts describing the layout:

• The Topic page provides the main topics for classroom discussion.
• The Support page has additional information, examples, and suggestions.
• Code examples are in a fixed font and shaded. The online file name is listed above the shaded area.
• Screen shots show examples of what you should see in class.
• Topics are organized into first (•), second (◦), and third (▪) level points.
• Pages are numbered sequentially throughout the book, making lookup easy.
• Callout boxes point out important parts of the example code.

Suggested References

Benz, Brian and John Durant. 2003. XML Programming Bible. John Wiley & Sons, Hoboken, NJ. ISBN 0764538292.

Box, Don, Aaron Skonnard and John Lam. 2000. Essential XML: Beyond Markup. Addison-Wesley, Reading, MA. ISBN 0201709147.

Deitel, Harvey M., et al. 2000. XML: How to Program. Prentice Hall, Englewood Cliffs, NJ. ISBN 0130284173.

Harold, Elliotte Rusty. 2003. Effective XML. Addison-Wesley, Reading, MA. ISBN 0321150406.

Harold, Elliotte Rusty. 2004. XML 1.1 Bible. John Wiley & Sons, Hoboken, NJ. ISBN 0764549863.

Harold, Elliotte Rusty and W. Scott Means. 2004. XML in a Nutshell, 3rd Edition. O'Reilly & Associates, Sebastopol, CA. ISBN 0596007647.

Hunter, David, et al. 2004. Beginning XML. Wrox Press, Hoboken, NJ. ISBN 0764570773.

Ray, Erik T. 2003. Learning XML. O'Reilly & Associates, Sebastopol, CA. ISBN 0596004206.

Simpson, John E. 2000. Just XML, 2nd Edition. Prentice Hall, Englewood Cliffs, NJ. ISBN 013018554X.

Skonnard, Aaron and Martin Gudgin. 2001. Essential XML Quick Reference: A Programmer's Reference to XML, XPath, XSLT, XML Schema, SOAP, and More. Addison-Wesley Professional, Reading, MA. ISBN 0201740958.

St. Laurent, Simon and Michael Fitzgerald. 2005. XML Pocket Reference, 3rd Edition. O'Reilly & Associates, Sebastopol, CA. ISBN 0596100507.

http://www.cafeconleche.org/
http://www.xml.com/
http://www.w3c.org/

Chapter 2 - Getting Started with XML

Objectives

• Write and view a simple XML document.
• Distinguish between well-formed and valid XML.
• Use a DTD or XML Schema to validate an XML document.
• Control the presentation style of an XML document.

Data and Document Structure

• In general terms, a document is any set of information or data.
  ◦ Publications (web, print, and so on) for people to read.
  ◦ Information passed by programs to other programs.
• A person can infer a document's structure intuitively.
  ◦ You can distinguish between a grocery list and a parts list by reading the items.
  ◦ You can identify an introductory paragraph by its position.
• To a computer program, a document is a sequence of characters.
  ◦ A user must point out parts of a document to the program.

Consider a simple document, like a memo.

bossmemo.txt

To: Boss
From: Me
Hey, I am really glad I'm in XML class this week.

You can easily identify the different parts of your memo: the sender, the recipient, the message.
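To a program, the same memo is only a string of characters. The Python sketch below is an illustration added for this point, not one of the workbook's lab files; the `split_memo` helper and its "Label: value" convention are assumptions made up for the example. It finds the recipient and sender only because the code spells out where to look.

```python
# The memo from bossmemo.txt as a program sees it: one string of characters.
MEMO_TEXT = (
    "To: Boss\n"
    "From: Me\n"
    "Hey, I am really glad I'm in XML class this week.\n"
)


def split_memo(text):
    """Split a memo into parts using an ad hoc 'Label: value' convention.

    Hypothetical helper: nothing in the text itself marks these parts,
    so the rules below are assumptions hard-coded by the programmer.
    """
    parts = {"to": None, "from": None}
    message_lines = []
    for line in text.splitlines():
        if line.startswith("To: "):
            parts["to"] = line[len("To: "):]
        elif line.startswith("From: "):
            parts["from"] = line[len("From: "):]
        elif line.strip():
            # Anything that is not a recognized label is treated as message text.
            message_lines.append(line)
    parts["message"] = " ".join(message_lines)
    return parts


if __name__ == "__main__":
    print(split_memo(MEMO_TEXT))
```

If the memo's author wrote "Recipient:" instead of "To:", this code would silently miss it, which is the kind of guesswork that explicit markup is meant to remove.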

XML

• XML stands for Extensible Markup Language.
• XML defines a way of marking up text to describe the structure of data.
  ◦ Tags identify the parts of the document.
  ◦ These tags build a hierarchy of elements that, in its entirety, makes a document.
• XML is a way of creating your own markup language.
  ◦ You define the tags to explain your data.
  ◦ Your tags describe meaning and structure, not appearance.
• XML is a standard for creating markup languages.
  ◦ That is, XML is a meta-markup language.
  ◦ Industries and organizations use XML to write rules defining their own markup languages.
• The World Wide Web Consortium (W3C) created and maintains the definition of XML.

Jon Bosak of Sun Microsystems formed the XML Working Group in conjunction with the W3C. The group looked for a middle ground between two existing markup languages: SGML, which was far too complex, and HTML, which was not powerful enough. The driving force behind XML is to build an infrastructure of markup languages that gives industries a standard means of data interchange. Current examples are Scalable Vector Graphics (SVG), MathML, and XHTML. Each industry defines the rules that make a document valid for its particular use. XHTML, for example, is a redefinition of HTML as a markup language that complies with XML standards. XML is about creating these specialized languages.

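1986 – SGML becomes an ISO standard.
1991 – HTML invented.
1996 – Work on XML begins; first working draft published.
1998 – XML 1.0 becomes a W3C Recommendation.
2004 – XML 1.1 Recommendation released (Second Edition in 2006).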

Hands On:

Create a file called bossmemo.xml. Add the following content to your file:

<memo>
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>

Load bossmemo.xml in your browser.

Well-Formed XML

• A well-formed document conforms to XML's basic rules of syntax.
  • Every open tag must be closed.
  • The open tag must exactly match the closing tag: XML is case-sensitive.
  • All elements must be embedded within a single root element.
  • Child tags must be closed before parent tags.
• "Well-formed" doesn't apply any validation tests to the content within the document.
  • That is, a well-formed document has correct XML tag syntax, but the elements might be invalid for the specified document type.
• Applications must reject your XML if it is not well-formed.

Hands On:

Modify bossmemo.xml, changing the name of the first tag to all uppercase letters:

<MEMO>
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>

bossmemo.xml is no longer well-formed, because the open and close tags do not match: XML tag names are case-sensitive. Try loading bossmemo.xml in your browser. What happens?

Now try this: fix the first tag, but swap the </to> and <from> tags:

<memo>
  <to>Boss<from>
  </to>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>

bossmemo.xml is no longer well-formed, because the elements now overlap: <from> opens inside <to> but is not closed before </to>. See what happens when you reload it in your browser. The browser (and any other application) is supposed to reject any XML that isn't well-formed, regardless of what the document type is.

Restore bossmemo.xml to its well-formed state:

<memo>
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>

Valid vs. Well-Formed XML

• A valid document conforms to the predefined rules of a specific type of document.
  • These rules can be written by the author of the XML document or by someone else.
    • They might be from the same company or the same industry.
  • The rules determine the type of data that each part of a document can contain.
• Not every application requires your XML to be valid in order to complete its task.
  • For example, a browser can just display a document with no special treatment of the element structure.
  • That is, some applications validate and others don't.
• Application-to-application usage usually requires valid XML.
  • The receiving application needs to know which data elements to expect.

Enforcing Valid Documents: DTD

• A Document Type Definition (DTD) defines rules for a specific type of document, including:
  • Names of elements, and how and where they can be used
  • The order of elements
  • Proper nesting and containment of elements
  • Element attributes
• A DTD consists of a list of element definitions:

  <!ELEMENT elementname rule>

  • A valid document of this type contains only these elements.
• To apply a DTD to an XML document, you can:
  • Include the DTD's element definitions within the XML document itself.
  • Provide the DTD as a separate file, whose name you reference in the XML document.
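The first way of applying a DTD, including the element definitions within the XML document itself, can be sketched as follows. This is a minimal illustration that reuses the memo elements from this chapter; the definitions sit between the square brackets of the DOCTYPE declaration (the internal subset).

```xml
<?xml version="1.0"?>
<!-- Internal DTD subset: the element definitions live inside the document. -->
<!DOCTYPE memo [
  <!ELEMENT memo (to,from,message)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT message (#PCDATA)>
]>
<memo>
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>
```

The second way keeps the same definitions in a separate file and replaces the bracketed block with a reference to that file, which lets many documents share one DTD.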

Hands On:

We'll create a DTD that declares which elements are valid in a memo and where in a memo the elements can occur. Create a new file named memo.dtd and enter these element definitions:

<!ELEMENT memo (to,from,message)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT message (#PCDATA)>

In bossmemo.xml, add a DOCTYPE declaration that identifies bossmemo.xml's document type, specifying memo.dtd as the file containing the DTD.

<!DOCTYPE memo SYSTEM "memo.dtd">
<memo>
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week.</message>
</memo>

Now have a program validate bossmemo.xml (the instructor will show you how to do this with the software available in the classroom). Your bossmemo.xml file should be valid. If it isn't, double-check bossmemo.xml and memo.dtd. Make sure the tag names in the XML file match the element names in the DTD; case is significant.

What happens if you add a new element after the </message> tag in your XML document (say, a P.S. element)?

...
  <message>Hey, I am really glad I'm in XML class this week.</message>
  <ps>You should take this class yourself!</ps>
</memo>

Is your document still valid? (Try it and see what happens, then remove this from your memo.)

Enforcing Valid Documents: XML Schema

• XML Schema is another popular mechanism for defining the grammar of a document.
  • The rules of XML Schema are another W3C recommendation.
  • XML Schema is expressed in the form of a separate XML file.
• Compared to DTD, XML Schema allows for more complex rules on the document.
  • XML Schema provides much more control over element and attribute datatypes.
  • Some datatypes are predefined, and new ones can be created.
• The syntax of XML Schema is XML, so it is composed of a series of XML tags.

  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="memo">
      <xsd:complexType>
      ...

  • Tags such as element and attribute declare the elements and attributes that make up your XML file and where they may appear.
  • Tags such as complexType and simpleType define the type of content the elements and attributes can have.

Hands On:

memo.xsd (provided in your chapter directory) is an XML Schema that validates bossmemo.xml:

memo.xsd

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="memo">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="to" type="xsd:string"/>
        <xsd:element name="from" type="xsd:string"/>
        <xsd:element name="message" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

The extension .xsd is the common convention for naming what are termed "XML Schema Definition" files. As you view this file, note a few things:

1. It is an XML file, so it needs to be well-formed and can even be validated.
2. It contains markup using the XML tag <xsd:element>, much in the same way that a DTD file uses the ELEMENT declaration to define an XML element.
3. The description of the content of the element is potentially very rich, starting with the XML tag <xsd:complexType>.

In order to validate against XML Schema, the XML file needs to indicate the location of the schema. Remove the DOCTYPE declaration from bossmemo.xml. Then, update the <memo> tag to include additional information:

<memo xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="memo.xsd">
  <to>Boss</to>
  <from>Me</from>
  <message>Hey, I am really glad I'm in XML class this week!</message>
</memo>

Use the same program you used to perform DTD validation to validate the bossmemo.xml file against the XML Schema. If there is any sort of failure, verify that your information is typed in the correct case (once again, case counts) and verify that bossmemo.xml and memo.xsd are located in the same directory.

Presentation Style

• XML describes the structure of your document.
  • Structure helps applications identify and manipulate parts of your document.
• Some applications present or transform your document (or a subset) in some manner.
  • A browser could present your document as a web page.
  • A spreadsheet could present your document as a data table.
• XML, by itself, says nothing about presentation.
• You can use various languages to manipulate XML data for presentation.
  • Cascading Style Sheets (CSS) are designed for specifying the display characteristics of data (HTML or XML) in a browser.
  • Extensible Stylesheet Language (XSL) defines general-purpose formatting characteristics for XML data.

XSL and XSLT

• XSL Formatting Objects (XSL-FO) allows you to define rules that directly manipulate the presentation of XML data.
  • You can use XSL-FO to define formatting rules that would apply to printers, typesetters, and various file formats.
  • An XSL-FO processor can transform XSL-FO directly into various formats, such as PostScript and PDF files.

[Diagram: XSL-FO → FO Processor → PDF, PostScript]

• XSL Transformations (XSLT) allows you to define how to transform an XML document into a different document.
  • For example, you could use XSLT to define how to transform your XML document into HTML, for use with HTML browsers.
• The application reads both the XML and XSL documents, formatting the display accordingly.
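To give a feel for what an XSL-FO processor consumes, here is a minimal sketch of a formatting-objects document for the memo. The file name, the master name memo-page, and the page dimensions are assumptions made for illustration; they do not come from the course files.

```xml
<?xml version="1.0"?>
<!-- Hypothetical memo.fo: an XSL-FO description of a one-page memo -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <!-- Page geometry; "memo-page" is an arbitrary name chosen here -->
    <fo:simple-page-master master-name="memo-page"
                           page-width="8.5in" page-height="11in"
                           margin-top="1in" margin-bottom="1in"
                           margin-left="1in" margin-right="1in">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="memo-page">
    <!-- The flow pours its blocks into the body region of each page -->
    <fo:flow flow-name="xsl-region-body">
      <fo:block font-weight="bold">To: Boss</fo:block>
      <fo:block font-weight="bold">From: Me</fo:block>
      <fo:block space-before="12pt">Hey, I am really glad I'm in XML class this week.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

In practice, a file like this is usually generated from XML data by an XSLT stylesheet rather than written by hand.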

Hands On:

Now let's use XSLT to define a transformation of an XML memo into an HTML file. A stylesheet called colormemo.xslt is provided in the chapter directory.

colormemo.xslt

<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body bgcolor="lightblue">
        <b>To: </b>
        <font color="red">
          <xsl:value-of select="memo/to"/>
        </font>
        <br></br>
        <b>From: </b>
        <font color="red">
          <xsl:value-of select="memo/from"/>
        </font>
        <br></br>
        <hr></hr>
        <font color="black">
          <xsl:value-of select="memo/message"/>
        </font>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>

Add a processing instruction to the top of your XML file referencing the XSLT stylesheet. The xml-stylesheet processing instruction is used by browsers to format the presentation of the XML file.

<?xml-stylesheet type="text/xsl" href="colormemo.xslt"?>
...

What do you see when you load bossmemo.xml in a browser?

Using XML

1. Create a DTD or XML Schema.

shoppingcart.dtd

<!ELEMENT shoppingcart (item+)>
<!ELEMENT item (id, description, quantity, price)>
<!ELEMENT id (#PCDATA)>
<!ELEMENT description (#PCDATA)>
<!ELEMENT quantity (#PCDATA)>
<!ELEMENT price (#PCDATA)>

  • This sets the rules for which elements are permitted and where they are permitted.

2. Mark up data using the elements defined in the DTD or XML Schema.

shoppingcart.xml

<!DOCTYPE shoppingcart SYSTEM "shoppingcart.dtd">
<shoppingcart>
  <item>
    <id>27757</id>
    <description>Get Well Soon Bouquet</description>
    <quantity>1</quantity>
    <price>48.95</price>
  </item>
  <item>
    <id>24623</id>
    <description>Colors of Summer Bouquet</description>
    <quantity>1</quantity>
    <price>69.95</price>
  </item>
</shoppingcart>

3. Optionally, create a stylesheet that defines presentation rules for your XML.

The plus (+) sign used in shoppingcart.dtd indicates that one or more item children are allowed within a shoppingcart element.
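For comparison, the same rules could be expressed as an XML Schema. The sketch below is one possible equivalent of shoppingcart.dtd, not a file from the course; the file name and the datatypes chosen for quantity and price are assumptions.

```xml
<?xml version="1.0"?>
<!-- Hypothetical shoppingcart.xsd: one possible schema equivalent of shoppingcart.dtd -->
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="shoppingcart">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Same meaning as item+ in the DTD: at least one, no upper limit -->
        <xsd:element name="item" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="id" type="xsd:string"/>
              <xsd:element name="description" type="xsd:string"/>
              <!-- Assumed datatypes; the DTD can only say #PCDATA -->
              <xsd:element name="quantity" type="xsd:positiveInteger"/>
              <xsd:element name="price" type="xsd:decimal"/>
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Unlike the DTD, this version would reject a quantity of "two" or a price of "cheap", because the element content is checked against a datatype.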

Often, XML designers will prototype their XML document first and then generate a schema or DTD using a tool.
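The optional third step, a stylesheet that defines presentation rules, might look something like the following for the shopping cart. This is an illustrative sketch only: the file name shoppingcart.xslt and the table layout are assumptions.

```xml
<?xml version="1.0"?>
<!-- Hypothetical shoppingcart.xslt: renders each item as a row of an HTML table -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <table border="1">
          <tr>
            <th>ID</th><th>Description</th><th>Quantity</th><th>Price</th>
          </tr>
          <!-- Repeat the row once per item element -->
          <xsl:for-each select="shoppingcart/item">
            <tr>
              <td><xsl:value-of select="id"/></td>
              <td><xsl:value-of select="description"/></td>
              <td><xsl:value-of select="quantity"/></td>
              <td><xsl:value-of select="price"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

A browser would apply it if shoppingcart.xml began with an xml-stylesheet processing instruction whose href pointed at this file.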

Labs

• Update your memo XML file so that it validates against memo.dtd using a DOCTYPE declaration. Make sure to remove the attributes on the memo tag so that it does not perform validation using XML Schema. Also, remove the stylesheet processing instruction from your XML. Verify that your memo validates using your validation software.
  (Solution: bossmemo1.xml)

• Add a subject element to your memo XML file. Try to validate your memo file using your validation software. Does the software validate your XML?
  (Solution: bossmemo2.xml)

• Change your DTD to support a subject element between the from and message elements.
  (Solutions: bossmemo3.xml, memo3.dtd)

• In your XML document, place the subject element after the message element. What happens when you try to validate your XML file? Is your XML valid? If it isn't valid, fix your XML.
  (Solution: bossmemo4.xml)

• Change your XML Schema to support the new subject element that is between the from and message elements. Change your XML file so that it can validate using this XML Schema. Test your changes.
  (Solutions: bossmemo5.xml, memo5.xsd)

• (Optional) Add formatting rules to colormemo.xslt so that the subject is displayed in bold text.
  (Solutions: bossmemo6.xml, colormemo6.xslt)

• The following table represents a list of books and their associated data. Use XML tags to describe the data. The data for the table can be found in books.txt. Make sure your tags describe all of the data's structure.
  Note: Just create an XML document with tags describing the data; don't attempt to format the data as a table.
  (Solution: books.xml)

ISBN        Title                    Author            Year Published
0316769487  The Catcher in the Rye   J.D. Salinger     1951
0446310786  To Kill a Mockingbird    Harper Lee        1960
0684801221  The Old Man and the Sea  Ernest Hemingway  1952
0451526341  Animal Farm              George Orwell     1945

• Based on books.xml, write a DTD for creating lists of books.
  (Solutions: books2.xml, books.dtd)

Chapter 6 - Validating XML with XML Schemas

Objectives

• Create an XML Schema to validate an XML document.
• Use built-in types for datatyping.
• Use simple types for basic XML content.
• Create complex types to model subelements and attributes.
• Declare elements and attributes.
• Model choices using schemas.

Schema Overview

• XML Schema is a W3C specification for validating the content of an XML document.
• Schemas add capabilities that are not present in DTDs.
  • They define datatypes for text elements, such as integer, string, date, and decimal.
    • Regular expressions may be used to further narrow down values for text data.
  • Schemas add more control of quantifiers than DTDs can provide with *, +, and ?.
  • They allow for default element values and use of an enumeration to specify possible element values.
    • DTDs only provide this functionality for attributes.
  • They are defined in XML syntax.
    • A schema is a well-formed XML document.
  • Schemas support namespace usage in XML documents.
• Similar to DTDs, schemas allow you to define elements and attributes.
• Unlike DTDs, schemas do not provide a way to define general entities.
  • Schemas only address validation.
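Two of the capabilities listed for schemas, default element values and enumerations of possible element values, can be sketched in a single declaration. The element name condition and its three values are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: only the three listed values are valid,
       and an empty condition element is treated as "fair" -->
  <xsd:element name="condition" default="fair">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="good"/>
        <xsd:enumeration value="fair"/>
        <xsd:enumeration value="critical"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:element>
</xsd:schema>
```

A DTD can restrict an attribute to a list of values in a similar way, but it has no comparable mechanism for element content.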

The following datatypes are predefined by XML Schema:

Type                Examples                   Notes
string              John Doe
boolean             true, false
decimal             -112.34, 56.78             Can have at least 18 decimal digits.
float               23.45, 12.34e-5            IEEE single-precision 32-bit floating point.
double              12.45, 456.8e+6            IEEE double-precision 64-bit floating point.
duration            P2Y5M23DT7H39M             Year, month, day, hour, minute.
dateTime            2001-11-22T15:59:32-07:00  Year, month, day, hour, minute, seconds, timezone (hours behind Coordinated Universal Time).
time                15:59:32-07:00             Hour, minute, seconds, timezone (hours behind Coordinated Universal Time).
date                2002-12-25                 Year, month, day.
integer             12, 45, -1245              The infinite set of integers {..., -2, -1, 0, 1, 2, ...}.
nonNegativeInteger  0, 34, 56                  Positive integers including 0.
positiveInteger     34, 56                     Positive integers not including 0.
nonPositiveInteger  0, -34, -56                Negative integers including 0.
negativeInteger     -34, -56                   Negative integers not including 0.
long                12344567, -25454           Max value = 9223372036854775807; min value = -9223372036854775808.
int                 12345, -123456             Max value = 2147483647; min value = -2147483648.
short               32767, -12                 Max value = 32767; min value = -32768.
byte                12, -128                   Max value = 127; min value = -128.

For a complete list of built-in types, see the XML Schema specification: http://www.w3c.org/TR/xmlschema-2/.

A Minimal Schema

• Schema documents are modeled in XML syntax.
  • The schema must be well-formed.
• Associate your schema with the namespace name: http://www.w3.org/2001/XMLSchema.
  • By convention, the prefix xsd or xs is used.
  • Any element or attribute within the document that uses the prefix will be associated with the schema vocabulary, as opposed to the document author's vocabulary.
• Every schema must define a root element named schema.
  • Within the root element, there will be a variety of subelements, including element, complexType, and simpleType.
  • All children of the schema element are called globals.
    • Any global element is eligible to be the root element of the XML document to be validated.
    • Only globals can be referenced from elsewhere within the document.

XML Schema definition files make use of prefixes on types, as well as on element names.

Schema documents are typically saved with an xsd extension.

patient1.xml

<?xml version='1.0'?>
<patient>
  Sam Smith
</patient>

patient1.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="xsd:string" />
</xsd:schema>

Associating XML with a Schema

• An XML document must explicitly indicate when it can be validated against an XML Schema.
  • The DOCTYPE declaration provides such a mechanism for DTDs.
  • The XML Schema is always an external document and is often referred to as the schema definition file.
• If the XML document to be validated (the instance document) does not use a namespace, then associating it with a schema is simple.
  • Declare the XMLSchema-instance namespace.
  • Set the noNamespaceSchemaLocation attribute to the URI of the schema document.

<rootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="SchemaDocument.xsd">
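When the instance document does use a namespace, the XMLSchema-instance namespace offers a different attribute, schemaLocation. The sketch below shows the general shape; the namespace URI http://example.com/patient and the file name patient-ns.xsd are made up for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance document that uses a namespace.
     xsi:schemaLocation holds pairs: a namespace name, then the schema's URI. -->
<patient xmlns="http://example.com/patient"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://example.com/patient patient-ns.xsd">
  Sam Smith
</patient>
```

For this to validate, the schema document would need to declare the same URI in a targetNamespace attribute on its schema element.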

patient2.xml

<?xml version='1.0'?>
<patient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="patient2.xsd">
  Sam Smith
</patient>

patient2.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="xsd:string" />
</xsd:schema>

Simple and Built-in Types

• A simple type can contain only text values, such as numbers and strings; it cannot contain subelements or attributes.
• A number of simple types are built in to the schema definition.
  • string, int, decimal, and date are all examples of built-in simple types.
  • An element that is of type int, for example, has a value between -2147483648 and 2147483647.
• Use the restriction element to derive a custom simple type from another simple type.
  • The base attribute specifies which datatype will be restricted.
    • The datatype can be a built-in type or another simple type defined elsewhere.
  • Add facet subelements to the restriction element.
    • A facet is a kind of restriction that is applied to a datatype.

patient2.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="xsd:string" />
</xsd:schema>

To limit the values for an integer to numbers between 100 and 1000, use the minInclusive and maxInclusive facets.

example-facets.xsd

...
<xsd:simpleType name="restrictedInt">
  <xsd:restriction base="xsd:int">
    <xsd:minInclusive value="100"/>
    <xsd:maxInclusive value="1000"/>
  </xsd:restriction>
</xsd:simpleType>
...

To limit the number of characters in a string, use the minLength and maxLength facets.

example-facets.xsd

...
<xsd:simpleType name="smallString">
  <xsd:restriction base="xsd:string">
    <xsd:minLength value="1"/>
    <xsd:maxLength value="30"/>
  </xsd:restriction>
</xsd:simpleType>
...

To limit a string to only a certain sequence of characters, use the pattern facet in combination with a regular expression. The regular expression syntax for schemas is similar to the Perl programming language syntax for regular expressions.

example-facets.xsd

...
<xsd:simpleType name="zipcode">
  <xsd:restriction base="xsd:string">
    <xsd:pattern value="\d{5}-\d{4}"/>
  </xsd:restriction>
</xsd:simpleType>
...

For more complete coverage of facets and regular expressions, see the XML Schema specification: http://www.w3c.org/TR/xmlschema-2/.
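Once a custom simple type has a name, an element declaration can refer to it through the type attribute. The sketch below repeats the smallString and zipcode types from the facet examples and uses them in a hypothetical address element; the address, city, and zip names are assumptions.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Named simple types, as in the facet examples -->
  <xsd:simpleType name="smallString">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1"/>
      <xsd:maxLength value="30"/>
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:simpleType name="zipcode">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="\d{5}-\d{4}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Hypothetical element that uses them; the type names carry no xsd
       prefix because they belong to the schema author's vocabulary -->
  <xsd:element name="address">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="city" type="smallString"/>
        <xsd:element name="zip" type="zipcode"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

With this schema, a zip value of 80202-1234 would be accepted, while 80202 alone would be rejected: a pattern must match the entire value, and this one requires the hyphen and four further digits.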

Complex Types

• A complex type can contain subelements and/or attributes.
• Use the complexType element to model a complex type.

<xsd:complexType name="PatientType">
  <xsd:sequence>
    <xsd:element name="fname" type="xsd:string" />
    <xsd:element name="lname" type="xsd:string" />
  </xsd:sequence>
  <xsd:attribute name="status" type="xsd:string" />
</xsd:complexType>

• Add each subelement within the complex type as the child of a sequence, choice, or all element.
  • The order of each element within the sequence is the order in which they must appear in the resulting document.
  • If order does not matter, use all instead of sequence.
  • Use choice to model a choice of one element.
    • A valid instance document is limited to only one of the child elements of the choice tag.
• Model attributes by using the attribute element.
  • Attributes are defined after all element definitions.
• To model an empty element, omit the sequence/choice/all tags.
  • Use the attribute element as before to specify attributes for the empty element.
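The choice, all, and empty-element variations described above can be sketched as three small complex types. All of the type and element names here are hypothetical.

```xml
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- choice: a contact holds either a phone or an email, never both -->
  <xsd:complexType name="ContactType">
    <xsd:choice>
      <xsd:element name="phone" type="xsd:string"/>
      <xsd:element name="email" type="xsd:string"/>
    </xsd:choice>
  </xsd:complexType>

  <!-- all: fname and lname are both required, in either order -->
  <xsd:complexType name="NameType">
    <xsd:all>
      <xsd:element name="fname" type="xsd:string"/>
      <xsd:element name="lname" type="xsd:string"/>
    </xsd:all>
  </xsd:complexType>

  <!-- empty element: no sequence, choice, or all; only an attribute -->
  <xsd:complexType name="FlagType">
    <xsd:attribute name="status" type="xsd:string"/>
  </xsd:complexType>
</xsd:schema>
```

An element declared with FlagType could appear in a document as an empty tag carrying only its status attribute.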

patient3.xml

<?xml version='1.0'?>
<patient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="patient3.xsd">
  <id>511-23-5632</id>
  <name>Sam Smith</name>
  <medication>
    <id>med4512</id>
    <drug>aspirin</drug>
  </medication>
</patient>

patient3.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="PatientType" />
  <xsd:complexType name="PatientType">
    <xsd:sequence>
      <xsd:element name="id" type="xsd:string" />
      <xsd:element name="name" type="xsd:string" />
      <xsd:element name="medication" type="MedicationType"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="MedicationType">
    <xsd:sequence>
      <xsd:element name="id" type="xsd:string" />
      <xsd:element name="drug" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Element Declarations

• Declare an element using the schema element named element.

  <xsd:element name="zipcode" type="xsd:string" />

  • The type can be any simple or complex type.
• Use the ref attribute to refer to a global element that was defined elsewhere.

  <xsd:element ref="zipcode" />

  • Global elements are declared as direct children of the schema element.
• To specify how many times an element may occur, use minOccurs or maxOccurs.

  <xsd:element name="street" type="xsd:string"
               minOccurs="1" maxOccurs="2"/>

  • By default, minOccurs is 1 and maxOccurs is 1.
  • You can use unbounded for maxOccurs to specify that there may be unlimited occurrences of this element.
• Use the default attribute to specify a default value for the element if it appears in the document without any content (an empty element).

  <xsd:element name="name" type="xsd:string"
               default="John Doe"/>

• Use fixed to specify that an element must contain a particular value.

  <xsd:element name="state" type="xsd:string" fixed="CO"/>
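To see how these declarations affect an instance document, consider the fragment below. It assumes a hypothetical address element whose content is name, then street, then state, declared exactly as in the examples above; the street values are invented.

```xml
<!-- Hypothetical instance fragment, assuming an address element whose content
     is name, street (minOccurs="1" maxOccurs="2"), and state, in that order -->
<address>
  <name/>                       <!-- empty, so the default "John Doe" applies -->
  <street>123 Main Street</street>
  <street>Apartment 4</street>  <!-- second occurrence; a third would be invalid -->
  <state>CO</state>             <!-- any value other than the fixed "CO" is invalid -->
</address>
```

Note that the default only fills in content for a name element that is present but empty; it does not make the element itself optional.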

patient4.xml

<?xml version='1.0'?>
<patient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="patient4.xsd">
  <id>511-23-5632</id>
  <name>Sam Smith</name>
  <medication>
    <id>med4512</id>
    <drug>aspirin</drug>
  </medication>
  <medication>
    <id>med4598</id>
    <drug>ibuprofen</drug>
  </medication>
</patient>

patient4.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="PatientType" />
  <xsd:element name="id" type="xsd:string" />
  <xsd:complexType name="PatientType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="name" type="xsd:string" />
      <xsd:element name="medication" type="MedicationType"
                   minOccurs="0" maxOccurs="unbounded" />
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="MedicationType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="drug" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

• Declare an attribute using the attribute element.

    <xsd:attribute name="info" type="xsd:string" />

    • Since attributes cannot contain subelements or other attributes, the type can only be a simple type.

    • Place the attribute declaration at the end of a complexType declaration, after the content model.

• Use the default and fixed attributes much as you use them for elements.

    • Note that a default value for an attribute is inserted any time the attribute is omitted, whereas for an element it is only inserted when that element is present but empty.

• Instead of minOccurs or maxOccurs, declare the use attribute to handle occurrence constraints.

    • use="required" specifies that the attribute must be in the XML document.

    • use="optional" says that the attribute may or may not occur in the XML document; this is the default.

    • use="prohibited" means that the attribute may not be in the XML document.

    • If an attribute has a default value, then use can only be optional.

Attribute Declarations

patient5.xml

<?xml version='1.0'?>
<patient xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="patient5.xsd"
         condition="fair">
  <id>511-23-5632</id>
  <name>Sam Smith</name>
  <medication>
    <id>med4512</id>
    <drug>aspirin</drug>
  </medication>
  <medication>
    <id>med4598</id>
    <drug>ibuprofen</drug>
  </medication>
</patient>

patient5.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="PatientType" />
  <xsd:element name="id" type="xsd:string" />
  <xsd:complexType name="PatientType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="name" type="xsd:string" />
      <xsd:element name="medication" type="MedicationType"
                   minOccurs="0" maxOccurs="unbounded" />
    </xsd:sequence>
    <xsd:attribute name="condition" type="xsd:string"
                   use="optional" default="critical" />
  </xsd:complexType>
  <xsd:complexType name="MedicationType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="drug" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
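
The effect of use and default on the condition attribute can be sketched in Python. This is an illustration under stated assumptions rather than real validation: ElementTree does not read schemas or apply their defaults, so the hypothetical helper attribute_value() mimics what a schema-aware processor would report, using the attribute declaration copied from patient5.xsd.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# The condition attribute declaration from patient5.xsd, in a trimmed schema.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="PatientType">
    <xsd:attribute name="condition" type="xsd:string"
                   use="optional" default="critical"/>
  </xsd:complexType>
</xsd:schema>"""


def attribute_value(element, decl):
    """Value a schema-aware processor would report for the declared attribute."""
    name = decl.get("name")
    use = decl.get("use", "optional")       # use defaults to optional
    if name in element.attrib:
        if use == "prohibited":
            raise ValueError(f"attribute {name!r} is prohibited")
        return element.get(name)            # an explicit value always wins
    if use == "required":
        raise ValueError(f"attribute {name!r} is required")
    return decl.get("default")              # None when no default is declared


decl = ET.fromstring(SCHEMA).find(f".//{XSD}attribute")
with_attr = ET.fromstring('<patient condition="fair"/>')
without_attr = ET.fromstring("<patient/>")

print(attribute_value(with_attr, decl))     # value given in the document
print(attribute_value(without_attr, decl))  # default supplied for the omitted attribute
```

For patient5.xml, which sets condition="fair", the explicit value is reported; a patient element that omits the attribute would be treated as critical.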

• It may be necessary to specify a set of choices for the value of an element or attribute.

• Use enumeration facets to specify the set of choices.

    <xsd:simpleType name="HouseType">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="Ranch"/>
        <xsd:enumeration value="Two-Story"/>
        <xsd:enumeration value="Bi-Level"/>
      </xsd:restriction>
    </xsd:simpleType>

Choices

patient6.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="patient" type="PatientType" />
  <xsd:element name="id" type="xsd:string" />
  <xsd:complexType name="PatientType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="name" type="xsd:string" />
      <xsd:element name="medication" type="MedicationType"
                   minOccurs="0" maxOccurs="unbounded" />
    </xsd:sequence>
    <xsd:attribute name="condition" type="ConditionsType"
                   use="optional" default="critical" />
  </xsd:complexType>
  <xsd:simpleType name="ConditionsType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="critical" />
      <xsd:enumeration value="serious" />
      <xsd:enumeration value="fair" />
      <xsd:enumeration value="good" />
    </xsd:restriction>
  </xsd:simpleType>
  <xsd:complexType name="MedicationType">
    <xsd:sequence>
      <xsd:element ref="id"/>
      <xsd:element name="drug" type="xsd:string" />
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
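
To see what the enumeration facets in ConditionsType permit, the short Python sketch below collects the allowed values and tests two candidates against them. It is not a validator (the standard library cannot validate against a schema); enumeration_values() is a hypothetical helper, and "stable" is just an example of a value the type does not list.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# ConditionsType as declared in patient6.xsd.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="ConditionsType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="critical"/>
      <xsd:enumeration value="serious"/>
      <xsd:enumeration value="fair"/>
      <xsd:enumeration value="good"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>"""


def enumeration_values(schema_root, type_name):
    """Collect the enumeration facet values of a named simple type."""
    for simple_type in schema_root.findall(XSD + "simpleType"):
        if simple_type.get("name") == type_name:
            return [facet.get("value")
                    for facet in simple_type.iter(XSD + "enumeration")]
    raise KeyError(type_name)


allowed = enumeration_values(ET.fromstring(SCHEMA), "ConditionsType")
for candidate in ("fair", "stable"):
    # Membership in the enumeration decides whether the value is acceptable.
    print(candidate, candidate in allowed)
```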

• You can create schemas so that each complex type or derived simple type has a name.

    • These are referred to as named types.

    • Each named type is referenced by using type= syntax in element and attribute declarations.

    • Named types can become unwieldy if a schema has many types that are referenced only once.

• You can use anonymous types as an alternative.

    • Anonymous types do not include names, nor do they need to be referenced.

    • The lack of a name or type= syntax identifies an anonymous type.

    • An anonymous type uses containment: the complexType or simpleType definition is nested directly inside the element or attribute declaration that uses it.

    • Since anonymous types don't have names, they can't be referenced from elsewhere.

Named Types and Anonymous Types

anonymouspatient.xsd

<?xml version='1.0'?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="id" type="xsd:string" />
  <xsd:element name="patient">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element ref="id"/>
        <xsd:element name="name" type="xsd:string" />
        <xsd:element name="medication"
                     minOccurs="0" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element ref="id"/>
              <xsd:element name="drug" type="xsd:string" />
            </xsd:sequence>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
      <xsd:attribute name="condition"
                     use="optional" default="critical">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:enumeration value="critical" />
            <xsd:enumeration value="serious" />
            <xsd:enumeration value="fair" />
            <xsd:enumeration value="good" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:attribute>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

Try It: This schema works with anonymouspatient.xml.

1. Update patient6.xml so that each medication also has dosage information. Include amount (number of pills to take), units (e.g., 100mg), start date, frequency (how many times to take the medication each day), and duration.
   (Solution: patient7.xml)

2. Modify patient6.xsd to validate your XML from lab 1.
   (Solution: patient7.xsd)

3. Create a schema to model a person. Include first name, last name, middle initial, address, and phone number.
   (Solution: person.xsd)

4. Write a valid XML document that uses your schema.
   (Solution: person.xml)

5. (Optional) Write a schema to describe an invoice.

   a) The invoice should have a company name, date, shipping method, customer information, payment information, and one or more items.

   b) For the shipping method, use attributes to specify the two options: UPS or FedEx. Make UPS the default.

   c) The customer information should include a customer id, optional firstname, lastname, address, and zero or more area code and phone number combinations. For every phone number, include an attribute that describes it, using one of the following: home, work, fax, pager, or cell. The default is home.

   d) The payment information should allow for either checks or credit cards. A check should have an optional attribute for the check number. A credit card should have an attribute for the credit card type and child elements for the credit card number and the expiration date. The credit card type must be given and can only be one of the following: Visa, AmEx, or MC.

   e) Finally, each item has a quantity, part number, and unit price. Also add an optional attribute to the item for a description of the item.

   (Solution: invoice1.xsd)

Labs

6. (Optional) Modify invoice.xml so that it validates against the invoice schema you created in the previous lab.
   (Solution: invoice1.xml)

Chapter 12 - XML in Applications

Objectives

• Recognize opportunities to use XML in application design.

• Describe two major types of XML parsers.

• Explain the difference between validating and non-validating parsers.

• Describe how web services use XML.

• Applications need a common format for easy data exchange.

• Databases provide a common format, as well as persistence.

    • These are expensive in terms of cost and time.

• Flat files and pipes are another approach to data exchange.

    • They do not address how data is formatted.

• XML is an ideal approach for data formatting.

    • Pipes, sockets, databases, and flat files can still be used for the exchange.

Reasons and Places for Using XML
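
As a minimal sketch of XML as the common format, the Python below has one side turn application data into an XML string and the other side turn it back. The function names to_xml() and from_xml() and the record layout are assumptions made for illustration; the string could travel over a pipe, socket, database column, or flat file.

```python
import xml.etree.ElementTree as ET


def to_xml(patient_id, name, drugs):
    """Sender side: turn application data into an XML string."""
    patient = ET.Element("patient")
    ET.SubElement(patient, "id").text = patient_id
    ET.SubElement(patient, "name").text = name
    for drug in drugs:
        medication = ET.SubElement(patient, "medication")
        ET.SubElement(medication, "drug").text = drug
    return ET.tostring(patient, encoding="unicode")


def from_xml(text):
    """Receiver side: rebuild a plain dictionary from the XML string."""
    root = ET.fromstring(text)
    return {
        "id": root.findtext("id"),
        "name": root.findtext("name"),
        "drugs": [m.findtext("drug") for m in root.findall("medication")],
    }


message = to_xml("511-23-5632", "Sam Smith", ["aspirin", "ibuprofen"])
print(message)             # the text that would be sent to the other application
print(from_xml(message))   # what the receiving application recovers
```

Neither side needs to know how the other stores the data internally; they only have to agree on the XML vocabulary.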

• The Document Object Model (DOM) is a standard way to represent an XML document as a tree of objects in memory.

• DOM parsers build this representation in memory.

    • This allows a programmer to easily manipulate the objects.

    • Different approaches to iterating through the data are possible.

• From a set of memory-based objects, DOM can generate an XML file.

• There are DOM parsers for a number of languages.

    • C, C++, Java, and Perl are the more commonly used languages.

    • Many of these are validating parsers that can check the XML against a DTD or schema.

DOM Parsers

[Figure: diagram relating the Application, the XML Data, and the DOM Parser]
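
A short Python sketch of the DOM approach, using the standard library's xml.dom.minidom (a small, non-validating DOM implementation chosen here only for illustration). The sample document is an assumption modeled on the patient examples: the parser builds the whole tree in memory, the program reads and changes the objects, and the tree is then written back out as XML.

```python
from xml.dom import minidom

DOCUMENT = ("<patient><name>Sam Smith</name>"
            "<medication><drug>aspirin</drug></medication></patient>")

# Parsing builds the complete document tree in memory.
dom = minidom.parseString(DOCUMENT)

# Read: walk the in-memory objects to collect every drug name.
drugs = [node.firstChild.data for node in dom.getElementsByTagName("drug")]
print(drugs)

# Manipulate: create new nodes and attach them to the tree.
medication = dom.createElement("medication")
drug = dom.createElement("drug")
drug.appendChild(dom.createTextNode("naproxen"))
medication.appendChild(drug)
dom.documentElement.appendChild(medication)

# Generate: serialize the modified tree back to XML text.
xml_text = dom.documentElement.toxml()
print(xml_text)
```

Because the entire document is held in memory, this style suits documents that must be navigated or modified freely; very large documents can make it costly.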

• Simple API for XML (SAX) is a fast, event-driven parsing API.

    • It does not build an internal memory representation.

    • As the parser reads the document, it calls functions or methods for events such as the start and end of an element.

    • The programmer codes these functions or methods in a SAX ContentHandler.

    • Selected elements are easy to search for or extract.

• Most SAX parsers can be selectively validating.

    • Programmers do not typically worry about validating with SAX.

• There are a number of SAX parsers.

    • Java's API was one of the originals and, as such, is considered the norm.

    • There is also SAX for C, C++, and Perl.

SAX Parsers

[Figure: diagram relating the Application, the XML Data, the SAX Parser, and the SAX ContentHandler]
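
The callback style can be sketched with Python's standard xml.sax module, which follows the SAX ContentHandler design. The DrugHandler class and the sample document are assumptions for illustration: the parser calls the handler's methods as it reads, and the handler keeps only the drug names it is interested in.

```python
import xml.sax

DOCUMENT = b"""<patient>
  <name>Sam Smith</name>
  <medication><id>med4512</id><drug>aspirin</drug></medication>
  <medication><id>med4598</id><drug>ibuprofen</drug></medication>
</patient>"""


class DrugHandler(xml.sax.ContentHandler):
    """Collects the text of every drug element; everything else is ignored."""

    def __init__(self):
        super().__init__()
        self.drugs = []
        self._in_drug = False
        self._text = []

    def startElement(self, name, attrs):
        # Called by the parser at each opening tag.
        if name == "drug":
            self._in_drug = True
            self._text = []

    def characters(self, content):
        # Text may arrive in several pieces, so accumulate it.
        if self._in_drug:
            self._text.append(content)

    def endElement(self, name):
        # Called by the parser at each closing tag.
        if name == "drug":
            self.drugs.append("".join(self._text))
            self._in_drug = False


handler = DrugHandler()
xml.sax.parseString(DOCUMENT, handler)   # no document tree is built
print(handler.drugs)
```

Only the handler's own list is kept in memory, which is why this approach works well for extracting a few items from large documents.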

• Web services are server applications that are accessed through standard protocols over the Internet.

    • Many distributed services architectures are either difficult (CORBA) or proprietary (J2EE).

    • XML is both simple and universal.

• The simplest web services use HTTP and HTML, with the clients running in web browsers.

    • Client-side processing is done in JavaScript and is limited.

    • Locating HTML-based web services is performed with search engines, which are tedious and imprecise.

• XML-based web services allow more intelligent client applications and more precise web service location.

    • Universal Description, Discovery, and Integration (UDDI) registries are used to locate web services.

    • Web Services Description Language (WSDL) describes the interface to the web service.

    • Simple Object Access Protocol (SOAP) is the XML-based protocol the client uses to send a request to the server.

Web Services
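
To give a feel for what a SOAP request looks like, the Python sketch below builds a SOAP 1.1 envelope as XML. The service namespace http://example.com/patients, the GetPatient operation, and the build_request() helper are all hypothetical; a real client would take the operation names from the service's WSDL and send the envelope to the service over HTTP, which this sketch does not do.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.com/patients"              # hypothetical service namespace


def build_request(patient_id):
    """Return a SOAP envelope asking the hypothetical service for one patient."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The Body carries the request itself, in the service's own vocabulary.
    operation = ET.SubElement(body, f"{{{SERVICE_NS}}}GetPatient")
    ET.SubElement(operation, f"{{{SERVICE_NS}}}id").text = patient_id
    return ET.tostring(envelope, encoding="unicode")


request = build_request("511-23-5632")
print(request)
```

The request is ordinary XML, so the server can process it with the same DOM or SAX parsers used for any other document.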
