Cocoon
An XML Web Publishing Framework From the Apache Project
Roland Schweitzer
6 August 2002 OAR Web Shop 2
Today’s Topics:
• Definitions
• Motivation
• Required Tools (Java, Apache Tomcat and Cocoon)
• Basic Cocoon Operation
  – Matchers, Generators, Transforms and Serializers. Oh My!
  – sitemap.xmap glues it all together.
Cocoon
• An XML-based WWW publishing framework implemented as a Java Servlet.
  – Web site content stored in XML files (or an RDBMS, LDAP server or other source) is transformed (mostly via XSLT) into new XML files (to exclude certain information, for example) and then serialized into human-usable output (like an HTML or PDF file).
Reusable Content
Motivation for using Cocoon
• We distribute climate data
• Users (including scientists) find data via public search engines like Google
• Public search engines index HTML content
• NOAA and other scientific organizations use special-purpose search engines that use FGDC (or DIF derived from FGDC)
Motivation continued
• These facts add up to maintaining separate “documents” for each purpose
• XML and Cocoon offer (yet another) potential way out of the morass of many special-purpose document collections
Suppose info was stored as XML
<page>
  <title>Reynolds Sea Surface Temperature</title>
  <prefix>data.sst</prefix>
  <abstract>
    <para>The optimum interpolation (OI) SST analysis…</para>
  </abstract>
  <contact>
    <name>CDC Data Management Personnel</name>
    <address1>325 Broadway</address1>
    <phone>(303) 497-6244</phone>
    <email>[email protected]</email>
  </contact>
  …
</page>
The Power of XML Content
• Can be parsed with standard XML tools
  – Can be easily used for another purpose besides the Web
  – Can be written with powerful XML GUI tools (e.g. XML Spy)
  – (Might be) easier to maintain
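As one illustration of reuse outside the Web, the same <page> document could be run through a small XSLT 1.0 stylesheet that emits plain text, for example for an e-mail notice. This is only a sketch: it assumes the element names from the sample document (page, title, contact, name, phone) and is not part of the original talk.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Emit plain text instead of markup. -->
  <xsl:output method="text"/>

  <!-- One line for the title, one line for the contact. -->
  <xsl:template match="/page">
    <xsl:value-of select="normalize-space(title)"/>
    <xsl:text>&#10;Contact: </xsl:text>
    <xsl:value-of select="contact/name"/>
    <xsl:text>, </xsl:text>
    <xsl:value-of select="contact/phone"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Any standard XSLT processor should be able to apply this to the sample document; nothing in it is specific to Cocoon.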
Reusable Content
Schematic of the Solution Using Cocoon
[Diagram labels: RDBMS with netCDF metadata; XML Document; Cocoon; Some other process; Local standard HTML output for humans; Human-readable FGDC (output as HTML) with nice formatting, anchors and links; FGDC output for domain-specific search engines]
Required Tools
• On Solaris 7 and 8 I have used the binary distributions of:
  – Java 1.4.0 (java.sun.com)
  – Tomcat 4.0.4 (www.apache.org)
  – Cocoon 2.0.3 (xml.apache.org)
• At this time, these are the latest releases.
• Follow the installation instructions for each package.
Basic Operation
• Cocoon is based on pipelines:
XML File → A Bit of Software → New XML File → A Bit of Software → New XML File → A Bit of Software → Info to client (e.g. HTML to browser)
Basic Operation
• Cocoon is based on pipelines. An XML document is pushed through a pipeline that starts with one Generator (read a file, create a document from an LDAP server, etc.), continues through zero or more Transforms (for example, to leave out sensitive information for external users) and ends with a Serializer that turns the XML into binary or character data for consumption by the client (Web browser).
• The entire site could use only one pipeline.
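The Generator, Transform, Serializer sequence described above can be sketched as a single sitemap pipeline that serves a whole site. This is a hedged illustration, not the speaker's configuration: both stylesheet names are hypothetical, the component declarations are left out, and the ** wildcard and namespace URI are as I recall them from the Cocoon 2.0 distribution.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <!-- map:components omitted; assumes file generator, xslt transformer
       and html serializer are declared under those names. -->
  <map:pipelines>
    <map:pipeline>
      <!-- One pattern for every .html request on the site. -->
      <map:match pattern="**.html">
        <!-- Exactly one Generator: read the source XML. -->
        <map:generate src="{1}.xml"/>
        <!-- Zero or more Transforms, applied in order (hypothetical files). -->
        <map:transform src="stylesheets/strip-internal.xsl"/>
        <map:transform src="stylesheets/page2html.xsl"/>
        <!-- One Serializer ends the pipeline. -->
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Chaining two transforms keeps the "leave out sensitive information" step separate from the presentation step, so either can change without touching the other.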
Basic Operation
• If you need more than one pipeline…
• Matchers (wildcard and regular expression) and Selectors (Boolean expressions) can be used to control the pipeline used to process the XML content.
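A rough sketch of how a regular-expression Matcher and a Selector might work together is shown here. The class names, the browser test name and the useragent configuration are written from memory of the Cocoon 2.0 default sitemap and should be treated as assumptions to verify; HTMLstyle-ie.xsl is a hypothetical stylesheet.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:components>
    <!-- Generator, transformer and serializer declarations omitted. -->
    <map:matchers default="wildcard">
      <map:matcher name="wildcard"
          src="org.apache.cocoon.matching.WildcardURIMatcher"/>
      <map:matcher name="regexp"
          src="org.apache.cocoon.matching.RegexpURIMatcher"/>
    </map:matchers>
    <map:selectors default="browser">
      <map:selector name="browser"
          src="org.apache.cocoon.selection.BrowserSelector">
        <!-- Maps a test name to a User-Agent substring. -->
        <browser name="explorer" useragent="MSIE"/>
      </map:selector>
    </map:selectors>
  </map:components>
  <map:pipelines>
    <map:pipeline>
      <!-- Regular-expression Matcher: group 1 becomes {1}. -->
      <map:match type="regexp" pattern="^data/([a-z.]+)\.html$">
        <map:generate src="data/{1}.xml"/>
        <!-- Selector: pick a stylesheet from a test on the request. -->
        <map:select type="browser">
          <map:when test="explorer">
            <map:transform src="datastyle/HTMLstyle-ie.xsl"/>
          </map:when>
          <map:otherwise>
            <map:transform src="datastyle/HTMLstyle.xsl"/>
          </map:otherwise>
        </map:select>
        <map:serialize type="html"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

The Matcher decides whether the pipeline applies to a request at all; the Selector then branches inside the pipeline that matched.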
Components
• Matchers, Generators, Transforms and Serializers are all Cocoon Components.
• Pipelines are built out of Components.
• Components are declared and pipelines are constructed in the sitemap.xmap file.
• The “Bit of Software” needed for each Component is provided by Cocoon or built by you.
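The overall shape of a sitemap.xmap can be sketched as two parts: a map:components section that declares Components by name, and a map:pipelines section that uses them. The fully qualified class names are the ones I recall from the Cocoon 2.0 distribution; treat them as assumptions and compare them with the sitemap shipped with your release.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">

  <!-- Part 1: declare the Components by name. -->
  <map:components>
    <map:generators default="file">
      <map:generator name="file"
          src="org.apache.cocoon.generation.FileGenerator"/>
    </map:generators>
    <map:transformers default="xslt">
      <map:transformer name="xslt"
          src="org.apache.cocoon.transformation.TraxTransformer"/>
    </map:transformers>
    <map:serializers default="html">
      <map:serializer name="html" mime-type="text/html"
          src="org.apache.cocoon.serialization.HTMLSerializer"/>
    </map:serializers>
    <map:matchers default="wildcard">
      <map:matcher name="wildcard"
          src="org.apache.cocoon.matching.WildcardURIMatcher"/>
    </map:matchers>
  </map:components>

  <!-- Part 2: build pipelines out of the declared Components. -->
  <map:pipelines>
    <map:pipeline>
      <!-- map:match, map:generate, map:transform, map:serialize go here -->
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Each group names a default, which is why pipeline steps can omit the type attribute when they want the default Component.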
Components (Matchers)
• Suppose you wanted these URI patterns to be handled by Cocoon:
  – For example, the wildcard patterns
    http://www.cdc.noaa.gov/cocoon/data/*.html
    and
    http://www.cdc.noaa.gov/cocoon/data/*.pdf
    could result in two pipelines with two different output types.
Components (Matchers)
• Need a “bit of software” that looks at:
  – http://www.cdc.noaa.gov/cocoon/data/data.sst.html
  – Matches the URL www.cdc.noaa.gov/cocoon/data
  – And the extension “.html”
  – Extracts the wildcard part of the URL, data.sst
  – Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).
The WildCard Matcher
• We’re in luck!
• A Matcher Component already exists in Cocoon to do what we want.
• To use a Component we must declare it in the sitemap.xmap file that controls our Cocoon installation.
Declare the WildCard Matcher
In the sitemap.xmap configuration file:
<map:matchers default="wildcard">
  <map:matcher name="wildcard"
      src="org.apache.cocoon.matching.WildcardURIMatcher"/>
  …
</map:matchers>
Use the Matcher on a URI
• We’ve declared the Matcher Component
• Use the Matcher component in our pipeline to grab the * part of the pattern and use it to specify the source XML file that will be sent through the pipeline.
Use the Matcher in a Pipeline
• This pipeline uses the default Matcher, which is the WildCard Matcher we declared in the previous slide
<map:match pattern="data/*.html">
  <map:generate src="data/{1}.xml"/>
  …
</map:match>
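How the wildcard tokens map onto {1}, {2} and so on can be sketched with a few patterns side by side. The first match is the deck's own; the other two use hypothetical directory layouts. The behaviour described in the comments (a single * stops at "/", a double ** does not) is my understanding of the Cocoon 2.0 wildcard matcher, and the xml serializer is assumed to be declared.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <!-- map:components omitted; assumes the default wildcard matcher and
       file generator, plus a serializer named xml, are declared. -->
  <map:pipelines>
    <map:pipeline>
      <!-- data/data.sst.html : {1} = data.sst -->
      <map:match pattern="data/*.html">
        <map:generate src="data/{1}.xml"/>
        <map:serialize type="xml"/>
      </map:match>
      <!-- sst/monthly.html : {1} = sst, {2} = monthly -->
      <map:match pattern="*/*.html">
        <map:generate src="{1}/{2}.xml"/>
        <map:serialize type="xml"/>
      </map:match>
      <!-- archive/2002/08/x.html : {1} = 2002/08/x -->
      <map:match pattern="archive/**.html">
        <map:generate src="archive/{1}.xml"/>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Matchers are tried in document order, so the more specific patterns are listed first.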
Now What?
• We have successfully declared and used a Matcher to decide which pipeline we will use to process the first of our two example URIs.
• Now we need to declare and use a Generator, which is always the first step of the pipeline.
Components (Generators)
• Declare a generator in sitemap.xmap:
<map:generators default="file">
  <map:generator name="file"
      src="org.apache.cocoon.generation.FileGenerator"/>
  …
</map:generators>
Use the Generator in a Pipeline
• The File Generator was declared as the default.
• Its only job is to read a file from the file system.
<map:pipelines>
  <map:pipeline>
    <map:match pattern="data/*.html">
      <map:generate src="data/{1}.xml"/>
      …
    </map:match>
  </map:pipeline>
</map:pipelines>
Review: Matcher and Generator
• Components (Matchers)
• Need a “bit of software” that looks at:
  – http://www.cdc.noaa.gov/cocoon/data/data.sst.html
  – Matches the URL www.cdc.noaa.gov/cocoon/data
  – And the extension “.html”
  – Extracts the wildcard part of the URL, data.sst
  – Starts the pipeline to produce HTML output from the data.sst.xml file (the wildcard plus the .xml extension).
Review: Pipeline Components
• Conditional use of pipeline via the Matcher
• One Generator (FileGenerator)
• Zero or more Transforms (?)
• Ends with a Serializer (?)
Components (Transforms)
• Declare a Transform:
<map:transformers default="xslt">
  <map:transformer name="xslt"
      src="org.apache.cocoon.transformation.TraxTransformer">
    <use-request-parameters>false</use-request-parameters>
    <use-browser-capabilities-db>false</use-browser-capabilities-db>
  </map:transformer>
</map:transformers>
The XSLT Transformer
• Different from previous declarations we’ve seen.
• This declaration includes two additional configuration parameters:
  – <use-request-parameters>
  – <use-browser-capabilities-db>
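As I understand the Cocoon 2.0 documentation, these two settings control whether HTTP request parameters and browser-capability information are handed to the stylesheet as XSLT parameters. With both set to false, a value can still be passed explicitly from the pipeline, as in this sketch. The parameter name audience and its value are hypothetical, and the stylesheet would need a matching top-level xsl:param to receive it.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <!-- map:components omitted; assumes the xslt transformer is declared. -->
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="data/*.html">
        <map:generate src="data/{1}.xml"/>
        <map:transform type="xslt" src="datastyle/HTMLstyle.xsl">
          <!-- Hypothetical parameter; read in the stylesheet with
               a top-level xsl:param named audience. -->
          <map:parameter name="audience" value="external"/>
        </map:transform>
        <map:serialize/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Passing values explicitly keeps the set of inputs to the stylesheet small and predictable, which is one reason to leave the request-parameter option off.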
Add the Transformer to Pipeline
<map:match pattern="*.html">
  <map:generate src="{1}.xml"/>
  <map:transform src="datastyle/HTMLstyle.xsl"/>
  …
</map:match>
The Stylesheet written in XSLT:
<HTML>
  <HEAD>
    <TITLE><xsl:value-of select="/page/title"/></TITLE>
  </HEAD>
  <BODY>
…
<xsl:template match="/page/abstract">
  <h2>Abstract:</h2>
  <xsl:apply-templates select="para"/>
</xsl:template>
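For readers who want to see the fragments in context, here is a minimal, complete XSLT 1.0 stylesheet in the same spirit. It is a sketch rather than the speaker's HTMLstyle.xsl: it assumes only the elements shown in the sample <page> document, and the contact layout is invented for illustration.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Root template: builds the HTML shell around the page content. -->
  <xsl:template match="/page">
    <html>
      <head>
        <title><xsl:value-of select="title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="title"/></h1>
        <xsl:apply-templates select="abstract"/>
        <xsl:apply-templates select="contact"/>
      </body>
    </html>
  </xsl:template>

  <!-- Abstract: a heading followed by its paragraphs. -->
  <xsl:template match="abstract">
    <h2>Abstract:</h2>
    <xsl:apply-templates select="para"/>
  </xsl:template>

  <xsl:template match="para">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Contact block (layout is an assumption). -->
  <xsl:template match="contact">
    <h2>Contact:</h2>
    <p>
      <xsl:value-of select="name"/><br/>
      <xsl:value-of select="address1"/><br/>
      <xsl:value-of select="phone"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Because the root template applies templates only to abstract and contact, elements such as prefix are left out of the HTML without any extra rule.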
Components (Serializers)
• The last step of each Pipeline is a Serializer
• It consumes XML (in the form of SAX events) and generates a character or byte stream for a client (Web browser, Acrobat Reader, etc.).
Declare the Serializer
In sitemap.xmap:
<map:serializers default="html">
  <map:serializer mime-type="text/html" name="html"
      src="...HTMLSerializer">
    <buffer-size>1024</buffer-size>
  </map:serializer>
</map:serializers>
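The PDF pipeline later in the deck asks for a serializer named fo2pdf, so more than one serializer has to be declared. A sketch of what a fuller serializers group could look like follows. The class names and MIME types are as I recall them from the Cocoon 2.0 default sitemap and should be checked against your distribution; the FOP-based serializer also needs the FOP library on the classpath.

```xml
<?xml version="1.0"?>
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:components>
    <!-- Other component groups omitted from this sketch. -->
    <map:serializers default="html">
      <!-- Default: used by a bare map:serialize step. -->
      <map:serializer name="html" mime-type="text/html"
          src="org.apache.cocoon.serialization.HTMLSerializer"/>
      <!-- Raw XML, handy for inspecting a pipeline's output. -->
      <map:serializer name="xml" mime-type="text/xml"
          src="org.apache.cocoon.serialization.XMLSerializer"/>
      <!-- XSL-FO in, PDF out. -->
      <map:serializer name="fo2pdf" mime-type="application/pdf"
          src="org.apache.cocoon.serialization.FOPSerializer"/>
    </map:serializers>
  </map:components>
  <map:pipelines/>
</map:sitemap>
```

A pipeline then picks a non-default serializer by name through the type attribute of map:serialize.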
The Completed Pipeline
<map:match pattern="data/*.html">
  <map:generate src="data/{1}.xml"/>
  <map:transform src="datastyle/HTMLstyle.xsl"/>
  <map:serialize/>
</map:match>
Pipeline to make PDF output
<map:match pattern="data/*.pdf">
  <map:generate src="data/{1}.xml"/>
  <map:transform src="stylesheets/FOstyle.xsl"/>
  <map:serialize type="fo2pdf"/>
</map:match>
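The deck does not show FOstyle.xsl, so the following is only a guess at its general shape: an XSLT stylesheet that turns the sample <page> document into XSL-FO for the PDF serializer. Page size, margins and fonts are arbitrary choices. It follows the XSL 1.0 Recommendation, which uses master-reference on fo:page-sequence; some older FOP releases expected master-name there instead.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <xsl:template match="/page">
    <fo:root>
      <!-- One simple page layout (US Letter, one-inch margins). -->
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="11in" page-width="8.5in"
            margin-top="1in" margin-bottom="1in"
            margin-left="1in" margin-right="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <!-- Title as a bold heading block. -->
          <fo:block font-size="18pt" font-weight="bold" space-after="12pt">
            <xsl:value-of select="title"/>
          </fo:block>
          <xsl:apply-templates select="abstract/para"/>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>

  <!-- Each paragraph becomes one block. -->
  <xsl:template match="para">
    <fo:block space-after="6pt"><xsl:apply-templates/></fo:block>
  </xsl:template>
</xsl:stylesheet>
```

The HTML and PDF pipelines share the same source document and differ only in the stylesheet and the serializer, which is the reuse the talk is arguing for.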
http://www.cdc.noaa.gov/cocoon/data/data.sst.html
http://www.cdc.noaa.gov/cocoon/data/data.sst.pdf
The Dreaded Demo
• Demo Data Set Descriptions at CDC.
Cocoon is all this and more!
• Action Components to do complex initialization (e.g. get database connection pool) during pipeline setup.
• Resource Components are internal reusable pipeline fragments.
• XSP and Logic Sheets offer capabilities similar to JSP with further separation of the logic.
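To give a flavour of XSP, here is a minimal page that mixes fixed markup with a Java expression evaluated at request time. It is an illustrative sketch only: it reuses the deck's <page> vocabulary, and the xsp namespace URI is quoted from memory of the Cocoon 2.0 distribution.

```xml
<?xml version="1.0"?>
<xsp:page language="java" xmlns:xsp="http://apache.org/xsp">
  <!-- Everything outside the xsp namespace is copied to the output. -->
  <page>
    <title>Reynolds Sea Surface Temperature</title>
    <abstract>
      <!-- xsp:expr inserts the value of a Java expression as text. -->
      <para>Page generated on <xsp:expr>new java.util.Date()</xsp:expr></para>
    </abstract>
  </page>
</xsp:page>
```

As far as I recall, such a page is served by using the server pages generator (declared as serverpages in the default sitemap) in place of the file generator, after which the rest of the pipeline stays the same.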
Resources
• www.apache.org
• Inside XSLT by Steven Holzner (New Riders)
• Java and XSLT by Eric M. Burke (O’Reilly)
Reality Check!
• We have not (yet) put this system in production.
• Still designing the XML representation.
• Still learning about using Cocoon with a relational database.
• Considering using XSP pages.
Conclusions
• Cocoon offers the potential to use and reuse one bit of XML content for many purposes.
• Most operations for Web hosting the XML content are built into Cocoon.
• Unlimited customization by writing your own Components.
• Content is easily maintained and separated from presentation.