g22 3033 002 c71.prn · n JAXP: Java API for XML Processing n Parsers comparison n Latest W3C APIs...


1

XML for Java Developers G22.3033-002

Session 7 - Main Theme: XML Information Rendering (Part I)

Dr. Jean-Claude Franchitti

New York University
Computer Science Department
Courant Institute of Mathematical Sciences

2

Agenda

- Summary of Previous Session
- Extensible Stylesheet Language Transformation (XSL-T)
- Extensible Stylesheet Language Formatting Object (XSL-FO)
- XML and Document/Content Management
- Assignment 4a+4b (due in two weeks)

3

Summary of Previous Session

- Advanced XML Parser Technology
  - JDOM: Java-Centric API for XML
  - JAXP: Java API for XML Processing
  - Parsers comparison
- Latest W3C APIs and Standards for Processing XML
  - XML Infoset, DOM Level 3, Canonical XML
  - XML Signatures, XBase, XInclude
  - XML Schema Adjuncts
- Java-Based XML Data Processing Frameworks
- Assignment 3a+3b (due next week)


4

XML-Based Rendering Development

- XML Software Development Methodology
  - Language + Stepwise Process + Tools
  - Rational Unified Process (RUP) vs. “XML Unified Process”
- XML Application Development Infrastructure
  - Metadata Management (e.g., XMI)
  - XSLT, XPath, XSL-FO APIs (JAXP, JAXB, JDOM, SAX, DOM)
  - XML Tools (e.g., XML Editors, Apache’s FOP, Antenna House’s XSL Formatter, HTML/CSS1/2/3, XHTML, XForms, WCAG)
- XML Applications Involved in the Rendering Phase:
  - Application(s) of XML
  - XML-based applications/services (markup language mediators)
    - MOM, POP, Other Services (e.g., persistence)
  - Application Infrastructure Frameworks

5

What is XSL?

- XSL is a language for expressing stylesheets. It consists of two parts:
  - A language for transforming XML documents
  - An XML vocabulary for specifying formatting semantics
  - See http://www.w3.org/Style/XSL for the XSLT 1.0/XPath 1.0 Recs, the XSL-FO 1.0 candidate rec, and working drafts of XSLT 1.1/2.0 and XPath 2.0
- An XSL stylesheet specifies the presentation of a class of XML documents. It describes how an instance of the class is transformed into an XML document that uses the formatting vocabulary

6

XML Data Rendering Patterns

- Manipulating and Rendering XML Structures Using Java
  - XSL-T
    - Transform
    - Sort
    - Output
  - XSL-T + XSL-FO
    - Format
    - Output
- Querying will be covered separately


7

eXtensible Stylesheet Language (XSL)

- DSSSL & DSSSL-O
- CSS 1, 2, 3 …
  - http://www.w3.org/Style/CSS/
- XSLT
  - XPath
- XSL-FO
- XSLT Processors
  - Stylus Studio XSL development environment
  - IBM XSL Editor
  - Saxon and Xalan XSLT processors
- XSL-FO Processors
  - Antenna House
  - fop

8

XSL Processing

- http://www.w3.org/Style/XSL/
- Processing Alternatives:
  - HTML + CSS -> Presentation
  - XML + CSS -> Presentation
  - XML + XSLT -> XSL-FO -> Presentation
  - XML + XSLT -> XML/HTML + CSS -> Presentation
- Client or Server Processing?
  - See Session 2 handout on IE5’s implementation of the XSL Spec.
- Examples
  - See Session 2 Sub-Topic 1 Presentation: Beginning XML
  - See Session 2 handouts on XSL Tree Transformation Language
  - See Session 2 handout on Cascading Stylesheets
  - See Session 2 handout on Styling Documents Using XSL

9

A Language for “Mapping XML” (LMX)

- LMX is a sample textbook application
- LMX can convert a document in one DTD into another DTD and vice versa
- LMX uses rules to describe bi-directional “MOM” conversions between two sets of documents
  - Rules have a “from-pattern” and a “to-pattern” to respectively match the source document, and construct the target document
  - Some restrictions exist w.r.t. the LMX patterns in order to simplify the program as much as possible
- LMX can also be used to convert an XML document to HTML (“POP” application)


10

How Does the LMX Processor Work?

- LMX makes heavy use of the DOM 1.0 API
- LMX uses XML4J internally to:
  - Parse a rule file
  - Parse a source document
  - Generate a target document
- See chapter 4.3 in the XML and Java textbook for a detailed description of the LMX implementation

11

LMX vs. the eXtensible Stylesheet Language (XSL)

- LMX and XSL both provide a syntax to encode “Style Sheets”
- Each XML document can be associated with a style sheet that describes how elements should be organized and formatted for presentation
- XSL style sheets provide custom appearances that give a web site a unified look and feel

12

How Does XSL Work?

- An XSL style sheet is an XML document
- XSL elements in an XSL style sheet correspond to a series of XSL “transformation” rules (i.e., XML tree transformation and/or formatting rules)
- XSL rules describe how particular XML tags are to be converted to “flow objects” as the document is read


13

Part I

Extensible Stylesheet Language Transformation (XSLT)

14

XSL Transformations

- Assume root element of style sheet is <xsl>
- Each <xsl> element contains one or more rule elements
- Each rule has a target and an action
  - Target is a pattern defining to which XML elements the rule applies
  - Action is the list of flow objects generated when the rule is applied:
    - Actions output a series of HTML tags in combination with the content of the element
    - Actions may output XML tags obtained via transformation of original XML data
    - Actions may output non-markup text, or run simple scripts or programs
    - Actions may use JavaScript to provide more complex and dynamic behaviors

15

XSL Transformations (continued)

n Conceptual Representation of XSL Transformations:

<xsl>
  <rule>
    <target-element type="tagname"/>
    action
  </rule>
  <rule>
    (…)
  </rule>
</xsl>
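
The conceptual rule/target/action shape can be compared with how a rule looks in XSLT 1.0 syntax, where a rule is an `xsl:template`, the target is its `match` pattern, and the action is the template body. The following is a minimal sketch using the standard `javax.xml.transform` API; the class name `RuleAsTemplateDemo`, the sample document, and the `<h1>` output are illustrative assumptions, not part of the course material.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: one rule whose target is "title" and whose action emits an <h1>.
final class RuleAsTemplateDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='html' indent='no'/>"
      + "  <xsl:template match='title'>"                  // target (match pattern)
      + "    <h1><xsl:value-of select='.'/></h1>"         // action (template body)
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input, assumed for illustration only.
    static final String XML = "<doc><title>Rendering</title></doc>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

No rule matches `doc` here; the processor's built-in rules simply descend into its children until the `title` rule fires.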


16

XSL-T and Templates

- XSLT rules are also called “Templates”
- There may not be rules to match every element
- Elements can be reordered on the output.
- XSL style sheet must be well-formed
  - e.g., an HTML empty tag specified as <br> must be written as <br/> within an XSL style sheet action
- XSLT elements used as a basis for a simple stylesheet are:
  - <xsl:stylesheet>, <xsl:template match …>, <xsl:apply-templates>, <xsl:for-each select ...>, and <xsl:sort select …>
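
As an illustration of reordering output with the elements listed above, here is a small sketch that combines `xsl:template`, `xsl:for-each`, and `xsl:sort` to emit book titles in alphabetical order. The class name `SortDemo` and the `books`/`book`/`title` vocabulary are assumptions made up for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: sorts <book> elements by title while rendering an HTML list.
final class SortDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='html' indent='no'/>"
      + "  <xsl:template match='/books'>"
      + "    <ul>"
      + "      <xsl:for-each select='book'>"
      + "        <xsl:sort select='title'/>"              // reorder before output
      + "        <li><xsl:value-of select='title'/></li>"
      + "      </xsl:for-each>"
      + "    </ul>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input in document order XSLT, DOM, SAX (assumed for illustration).
    static final String XML =
        "<books>"
      + "<book><title>XSLT</title></book>"
      + "<book><title>DOM</title></book>"
      + "<book><title>SAX</title></book>"
      + "</books>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

Note that the stylesheet itself stays well-formed XML even though it produces HTML.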

17

XSLT Elements and Functions

- Creating Elements and Attributes
  - xsl:element, xsl:attribute
- Iteration and Sorting (e.g., xsl:sort)
- Conditional Processing
  - xsl:apply-templates select="…", xsl:if, xsl:choose
- Copying Nodes (e.g., xsl:copy)
- Combining Stylesheets
  - xsl:import, xsl:include
- Defining Variables & Parameters (e.g., xsl:variable)
- Scripting with XPath functions
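
Several of the elements listed above can be seen working together in one short stylesheet. The sketch below uses `xsl:variable`, `xsl:element`, `xsl:attribute`, and `xsl:choose` to turn priced items into products tagged by tier. The class name `ElementsDemo`, the `items`/`item` input, the `catalog`/`product` output, and the threshold of 10 are all assumptions chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: builds elements and attributes conditionally.
final class ElementsDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "  <xsl:variable name='limit' select='10'/>"      // global variable
      + "  <xsl:template match='/items'>"
      + "    <catalog><xsl:apply-templates select='item'/></catalog>"
      + "  </xsl:template>"
      + "  <xsl:template match='item'>"
      + "    <xsl:element name='product'>"                // computed element
      + "      <xsl:attribute name='tier'>"               // computed attribute
      + "        <xsl:choose>"
      + "          <xsl:when test='@price &gt; $limit'>premium</xsl:when>"
      + "          <xsl:otherwise>basic</xsl:otherwise>"
      + "        </xsl:choose>"
      + "      </xsl:attribute>"
      + "      <xsl:value-of select='.'/>"
      + "    </xsl:element>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input, assumed for illustration only.
    static final String XML =
        "<items><item price='5'>pen</item><item price='50'>bag</item></items>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```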

18

Parsers with XSLT Support

- SAX 2.0 or DOM Level 2 1.0 Support Required
- Apache’s Xalan XSLT parser
  - org.apache.xalan.processor/templates/transformer
  - org.apache.xpath
- Saxon XSLT parser
- JAXP 1.1 (javax.xml.transform)
  - TraX
  - Supported by Xalan 2.0, and Saxon 6.1
- Sun’s XSLTC
  - Converts stylesheets to class files (“translets”)
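
The `javax.xml.transform` API lets a stylesheet be compiled once into a `Templates` object and then reused for many transformations, which is similar in spirit to the precompiled translets mentioned above. The following is a minimal sketch under that assumption; the class name `CompiledStylesheetDemo` and the `greeting` vocabulary are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: compile a stylesheet once, apply it to several inputs.
final class CompiledStylesheetDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@to'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    // Parses and compiles the stylesheet; the result can be shared and reused.
    static Templates compile(String xsl) throws TransformerConfigurationException {
        return TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Creates a fresh Transformer from the compiled stylesheet for each input.
    static String apply(Templates compiled, String xml) throws TransformerException {
        Transformer t = compiled.newTransformer();
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates compiled = compile(XSL);
        System.out.println(apply(compiled, "<greeting to='Ann'/>"));
        System.out.println(apply(compiled, "<greeting to='Bob'/>"));
    }
}
```

A `Transformer` holds per-run state, so the usual pattern is to share the `Templates` object and create one `Transformer` per transformation.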


19

Part II

Extensible Stylesheet Language Formatting Object (XSL-FO)

20

XSL Formatting

- XSL flow objects are markup text
- Markup language output flow objects can be HTML, DSSSL, VRML, etc.
- We will focus on HTML output flow objects (simpler, more widely understood, better supported by current tools, and do not require an extra level of translation)

21

XSL Formatting Characteristics

- XSL formatting is simpler than DSSSL (Document Style Semantics and Specification Language, pronounced “dissal”, ISO std 10179:1996)
- XSL formatting is more powerful than CSS (Cascading Style Sheets)
- XSL’s basic formatting syntax is understandable by anybody acquainted with DSSSL or CSS
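
For comparison with HTML output, here is a minimal sketch of a stylesheet that emits XSL-FO formatting objects instead, which a formatter such as FOP could then turn into pages. The class name `FoDemo`, the `memo`/`para` input, and the page and spacing values are assumptions for illustration; the attribute `master-reference` follows the XSL 1.0 Recommendation, and older formatter releases may expect a different attribute name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: transforms a memo into an XSL-FO document (not rendered here).
final class FoDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + "    xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "  <xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "  <xsl:template match='/memo'>"
      + "    <fo:root>"
      + "      <fo:layout-master-set>"
      + "        <fo:simple-page-master master-name='page'>"
      + "          <fo:region-body margin='1in'/>"
      + "        </fo:simple-page-master>"
      + "      </fo:layout-master-set>"
      + "      <fo:page-sequence master-reference='page'>"
      + "        <fo:flow flow-name='xsl-region-body'>"
      + "          <xsl:apply-templates select='para'/>"
      + "        </fo:flow>"
      + "      </fo:page-sequence>"
      + "    </fo:root>"
      + "  </xsl:template>"
      + "  <xsl:template match='para'>"                   // each paragraph becomes a block
      + "    <fo:block space-after='6pt'><xsl:value-of select='.'/></fo:block>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input, assumed for illustration only.
    static final String XML = "<memo><para>First</para><para>Second</para></memo>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```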


22

Part III

XML and Document/Content Management

23

What is an XSL Processor?

- An XML document and its associated style sheet are combined by an XSL processor to produce an HTML document
  - The XSL Processor applies the style sheet to the XML document and outputs static HTML
  - The process can be automated with CGI scripts, Java servlets, or ActiveX controls to convert XML to HTML on the fly
- An XSL processor is a standalone program or is part of a larger XML browser
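
On-the-fly conversion can be sketched without committing to a particular server technology: the core is a method that writes the transformed document to whatever output stream the caller supplies, such as a servlet response stream. The class `OnTheFlyRenderer`, its `render` method, the `page` input vocabulary, and the `site` stylesheet parameter are hypothetical names introduced for this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: renders XML to HTML on demand, one call per request.
final class OnTheFlyRenderer {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='html' indent='no'/>"
      + "  <xsl:param name='site'/>"                      // supplied per request
      + "  <xsl:template match='/page'>"
      + "    <html><body>"
      + "      <h1><xsl:value-of select='$site'/></h1>"
      + "      <p><xsl:value-of select='.'/></p>"
      + "    </body></html>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    private final Templates compiled;

    // Compiles the stylesheet once when the renderer is created.
    OnTheFlyRenderer(String xsl) throws TransformerConfigurationException {
        this.compiled = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Transforms one XML input and streams the HTML to the given output.
    void render(Reader xml, String siteName, OutputStream out) throws TransformerException {
        Transformer t = compiled.newTransformer();
        t.setParameter("site", siteName);
        t.transform(new StreamSource(xml), new StreamResult(out));
    }

    public static void main(String[] args) throws TransformerException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new OnTheFlyRenderer(XSL).render(new StringReader("<page>hello</page>"), "NYU", buf);
        System.out.println(new String(buf.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

In a servlet, the same `render` call would presumably receive the response's output stream in place of the in-memory buffer used here.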

24

How Does an XSL Processor Work?

- The XSL processor consults the style sheet to find the rule that matches the element
- The XSL processor takes whatever action is associated with the rule:
  - outputs element’s content plus assorted markup
  - performs more complicated operations (sorting XML data before outputting it, running a JavaScript program on the XML data, adding missing content to XML data, etc.)


25

How Does an XSL Processor Work? (continued)

- XSL processor formats each element upon receipt
- XSL processor may process elements recursively
- XSL processor receives input from XML processor and outputs formatted data based on the nature of the elements it receives
  - E.g., XSL processor receives <strong> element
    - XSL processor may output same content as bold text
    - If processor is an audio renderer, it may pump up the volume a notch...
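
The recursive processing and the `<strong>` example can be sketched with `xsl:apply-templates`, which hands each child back to the processor so that nested elements are matched in turn. The class name `RecursiveTemplatesDemo`, the `para` element, and the choice of `<b>` as the bold rendering are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: nested <strong> elements are rendered recursively as <b>.
final class RecursiveTemplatesDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:output method='xml' omit-xml-declaration='yes' indent='no'/>"
      + "  <xsl:template match='para'>"
      + "    <p><xsl:apply-templates/></p>"               // recurse into children
      + "  </xsl:template>"
      + "  <xsl:template match='strong'>"
      + "    <b><xsl:apply-templates/></b>"               // recurse again for nested markup
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input with one <strong> nested inside another (assumed for illustration).
    static final String XML =
        "<para>Stay <strong>very <strong>alert</strong></strong> today</para>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

Text nodes have no explicit rule in this stylesheet; the built-in rule for text copies them to the output.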


28

Mainstream XSL Processors

- See Microsoft’s XML and XSL Samples and Demos at http://msdn.microsoft.com/xml
- See IBM’s LotusXSL, Apache’s xalan, and fop. Look at Appendix E of the class textbook for relevant information on XSL
- A comprehensive list of XSL formatters, and XSLT engines/editors/utilities is available at http://www.xmlsoftware.com
  - Includes links to latest product pages
  - Includes Version numbers, Licensing information, and Platform details

29

DOM 1.0 XSL Processing Support

- The DOM Level 1 specification does not support XSL stylesheets
- Microsoft’s initial version of MSXML DOM included a DOM Level 1 extension that added support for XSL stylesheets
  - The function transformNode(…) was used to apply an XSL stylesheet to an existing XML document
- Similar extensions were emulated early on by other XSL processors (LotusXSL, xalan, fop, etc.)
- DOM Level 2 1.0 formalizes rendering support
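
In Java, a rough counterpart to applying a stylesheet to an existing DOM tree is JAXP's `DOMSource` and `DOMResult`. The sketch below is an analogy to the `transformNode(…)` idea, not a port of the MSXML API; the class name `DomTransformDemo`, the method `transformDom`, and the `note` input are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: DOM tree in, DOM tree out.
final class DomTransformDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "  <xsl:template match='/note'>"
      + "    <p><xsl:value-of select='.'/></p>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input, assumed for illustration only.
    static final String XML = "<note>Remember</note>";

    // Parses the XML into a DOM Document, transforms it, and returns the result as a DOM Document.
    static Document transformDom(String xsl, String xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document source = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        DOMResult result = new DOMResult();               // the transformer creates the output Document
        t.transform(new DOMSource(source), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document out = transformDom(XSL, XML);
        System.out.println(out.getDocumentElement().getNodeName()
            + ": " + out.getDocumentElement().getTextContent());
    }
}
```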


31

Xalan

- Xalan-J version 2.1.0 is the latest
- Provides XSL-T processing for transforming XML documents into HTML, text, or other XML document types
- Built on top of SAX 2.0, DOM Level 2 1.0, JAXP 1.1
- Implements the TraX subset of JAXP 1.1

32

FOP

- Latest version is 0.19
  - xml.apache.org/fop, www.jtauber.com
- Print formatter driven by XSL-FO objects
- Formatted output is in PDF format for now
- Can be embedded in a Java application by instantiating org.apache.fop.apps.Driver

33

Frameworks

- Cocoon 2
- Xang
- Batik

34

Part IV

Conclusions

35

Summary

- XSL style sheets describe how individual elements are displayed in HTML
- An XSL processor like LotusXSL converts an XML document and its associated style sheet into an HTML document that can be read by current web browsers
- Style instructions are stored in rule elements

36

Summary (continued)

- Each rule has a pattern and an action
  - The pattern defines the elements to which the rule applies
  - The action specifies the flow objects that the XSL processor outputs when the rule fires
- When multiple rules apply to one element, only the most specific rule is applied
- Flow objects usually include the content of the element, along with some combination of HTML markup
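
The "most specific rule" behavior can be observed with two templates that both match the same element. In XSLT 1.0 a pattern that names an element has a higher default priority than the wildcard `*`, so the named rule is the one that fires. The class name `PriorityDemo`, the `doc`/`title`/`para` input, and the labels written to the output are assumptions for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: a wildcard rule and a named rule compete for <title>.
final class PriorityDemo {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      // Generic rule: matches any element, labels it, then visits its child elements.
      + "<xsl:template match='*'>generic(<xsl:value-of select='name()'/>) "
      + "<xsl:apply-templates select='*'/></xsl:template>"
      // Specific rule: matches only <title> and wins over the wildcard for that element.
      + "<xsl:template match='title'>specific(title) </xsl:template>"
      + "</xsl:stylesheet>";

    // Sample input, assumed for illustration only.
    static final String XML = "<doc><title>T</title><para>P</para></doc>";

    // Applies the stylesheet text to the XML text and returns the serialized result.
    static String transform(String xsl, String xml) throws TransformerException {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(XSL, XML));
    }
}
```

The `title` element is reported once, by the specific rule only; `doc` and `para` fall through to the generic rule.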


37

Readings

- Readings
  - XML Development with Java 2: Chapter 5
  - Professional Java XML: Chapters 7, 8, and Appendix G
  - XML and Java: Chapter 4
  - Handouts posted on the course web site
  - Review WCAG status on W3C web site
- Project Frameworks Setup (ongoing)
  - Apache’s Web Server, Tomcat/JRun, and Cocoon
  - Apache’s Xerces, Xalan, Saxon
  - Antenna House XML Formatter, Apache’s FOP, X-smiles
  - Publishing Systems at http://www.xmlsoftware.com
  - Visibroker 4.5, WebLogic 6.1
  - POSE & KVM (See Session 3 handout)

38

Assignment

- Assignment #4:
  - This part of the project focuses on the application content model design/development using XML information rendering technology. The design/development process should adhere to the following steps: (a) Identifying rendering/transformation targets, (b) Defining the optimal rendering approach for each target, (c) Considering data rendering issues when designing an overall application data model
  - More specific project related information, and extra credit assignments will be provided during the session

39

Next Session: XML Information Rendering (Part II)

- XML/XSL and JSP/JavaBeans Rendering Technology
- Internationalization Issues
- Web Content Accessibility Guidelines (WCAG)