XML/XSL and Information continue with XML/XSL Ideas from Edward Tufte on data density & data junk...


XML/XSL and Information

continue with XML/XSL

Ideas from Edward Tufte on data density & data junk

Homework: find & report on data presentation, Tufte, statistics, graphs


Comments on XSLT

• declarative as opposed to procedural language
  – no side effects (variables can't be changed, order of application of templates is somewhat flexible)
• one main use is matching parts of the XML tree using patterns and 'declaring' results
  – push processing: pushes out results based on applying templates
  – pull processing: pulls in relevant information and produces results
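
A minimal sketch of the two styles, assuming a source document shaped like the World Cup data used later in these notes (a results element holding match elements, each with two team elements that carry a score attribute). The ul, li and p elements and the label text are illustrative choices, not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>

  <xsl:template match="/results">
    <html>
      <body>
        <!-- Push: send every match to whichever template matches it. -->
        <ul>
          <xsl:apply-templates select="match"/>
        </ul>

        <!-- Pull: fetch exactly the values wanted, in the order wanted. -->
        <p>
          <xsl:text>Teams listed first: </xsl:text>
          <xsl:for-each select="match">
            <xsl:value-of select="team[1]"/>
            <xsl:if test="position() != last()">, </xsl:if>
          </xsl:for-each>
        </p>
      </body>
    </html>
  </xsl:template>

  <!-- The rule that push processing triggers, once per match. -->
  <xsl:template match="match">
    <li>
      <xsl:value-of select="team[1]"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="team[1]/@score"/>
      <xsl:text> - </xsl:text>
      <xsl:value-of select="team[2]/@score"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="team[2]"/>
    </li>
  </xsl:template>
</xsl:transform>
```

In the push half the main template does not say what a match looks like in the output; the separate match template decides. In the pull half the main template states everything itself.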


Comments, cont.

• XML still under development (definition of next standard)
  – New version will have what is now done with so-called extensions.
  – Other options are server-side or client-side programming
• XML Schema possible replacement for DTDs
• XSL-Formatting Objects focus [more] on formatting


XSLT examples

• use of variables defined by [more intricate] XPath expressions
• use of recursive calls to named templates
  – template calling itself with new parameters

Other mechanisms (for you to look up as needed)
  – mode for template: facility to examine (transform) the same nodes under different conditions (= modes)
  – key function: facility to categorize nodes according to some calculated expression.
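
A small sketch of both mechanisms, again assuming the World Cup document shape (results, match, team). The mode names summary and detail and the key name team-by-name are made up for this illustration. It also assumes a team appears at most once per match, so counting team elements counts matches.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>

  <!-- key: index every team element by its string value (the team name) -->
  <xsl:key name="team-by-name" match="team" use="."/>

  <xsl:template match="/results">
    <html>
      <body>
        <!-- The same match nodes are transformed twice, under different modes. -->
        <ul>
          <xsl:apply-templates select="match" mode="summary"/>
        </ul>
        <xsl:apply-templates select="match" mode="detail"/>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="match" mode="summary">
    <li>
      <xsl:value-of select="team[1]"/>
      <xsl:text> v </xsl:text>
      <xsl:value-of select="team[2]"/>
    </li>
  </xsl:template>

  <xsl:template match="match" mode="detail">
    <p>
      <xsl:value-of select="team[1]"/>
      <xsl:text> appears in </xsl:text>
      <!-- key() returns all team elements whose name equals team[1] -->
      <xsl:value-of select="count(key('team-by-name', team[1]))"/>
      <xsl:text> match(es) in this file.</xsl:text>
    </p>
  </xsl:template>
</xsl:transform>
```

Without the mode attribute the two match templates would conflict; with it, each apply-templates call picks the rule for its own mode.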


World cup data

• Previous example transformed each match, depending on whether or not the match was marked as 'feature' in an attribute.

• What about producing a table of the results?


What do we want to do

• Transform logic:
  – produce HTML table, one row for each team
  – calculate for each team certain values that use all the matches that that team is 'in'.
• Implementation
  – use XSLT variables


XSLT mechanics

<xsl:variable name="teams" select="//team[not(.=preceding::team)]"/>

• The value of a variable is set by the select pattern. VARIABLES CANNOT BE CHANGED.
• The //team means find all the team nodes anywhere.
• The . means the node you are considering now.
• The square brackets define a condition for which teams are to be selected. This variable is a node set.
• The preceding:: is an example of what is called an axis. It is a qualifier.
• This says: make up a node set consisting of teams, but don't include any that have occurred previously.
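
The slides do not show the source XML. As an assumption, a small document consistent with the paths used here (/results, match, team, @score) might look like the following; the team names and scores are invented.

```xml
<?xml version="1.0"?>
<!-- Hypothetical data: team names and scores are invented for illustration. -->
<results>
  <match>
    <team score="2">Red</team>
    <team score="0">Blue</team>
  </match>
  <match>
    <team score="1">Red</team>
    <team score="1">Green</team>
  </match>
</results>
```

Under that assumption, this sketch uses the same select expression to list each team once. It is a stripped-down test harness, not the stylesheet from the slides.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Keep a team only if no earlier team element has the same text. -->
  <xsl:variable name="teams" select="//team[not(.=preceding::team)]"/>

  <xsl:template match="/">
    <xsl:for-each select="$teams">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:transform>
```

For the sample above this lists Red, Blue and Green, one per line: the second Red is dropped because an earlier team has the same string value. The comparison is on the exact text, so stray spaces inside a team element would make two spellings count as different teams.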


<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>

  <xsl:variable name="teams" select="//team[not(.=preceding::team)]"/>
  <xsl:variable name="matches" select="//match"/>

  <xsl:template match="/results">
    <html>
      <head>
        <title>Results of World Cup </title>
        <LINK REL="stylesheet" TYPE="text/css" HREF="results.css"/>
      </head>
      <body>
        <h2> Results of World Cup </h2>


        <table cellpadding="5">
          <tr>
            <th> Team </th>
            <th> Played </th>
            <th> Won </th>
            <th> Lost </th>
            <th> Tied </th>
            <th> For </th>
            <th> Against </th>
            <th> Points </th>
          </tr>


          <xsl:for-each select="$teams">
            <!-- $this: the team this table row is for -->
            <xsl:variable name="this" select="."/>
            <!-- matches this team took part in -->
            <xsl:variable name="played" select="count($matches[team=$this])"/>
            <!-- compare this team's score with its opponent's score -->
            <xsl:variable name="won" select="count($matches[team[.=$this]/@score &gt; team[.!=$this]/@score])"/>
            <xsl:variable name="lost" select="count($matches[team[.=$this]/@score &lt; team[.!=$this]/@score])"/>
            <xsl:variable name="tied" select="count($matches[team[.=$this]/@score = team[.!=$this]/@score])"/>
            <!-- goals scored by this team -->
            <xsl:variable name="for" select="sum($matches/team[.=current()]/@score)"/>
            <!-- all goals in this team's matches, minus its own -->
            <xsl:variable name="against" select="sum($matches[team=current()]/team/@score) - $for"/>
            <!-- 3 points per win, 1 per tie -->
            <xsl:variable name="points" select="3*$won+$tied"/>
            <!-- Note: $ indicates a variable reference -->


            <tr>
              <td><xsl:value-of select="."/></td>
              <td><xsl:value-of select="$played"/></td>
              <td><xsl:value-of select="$won"/></td>
              <td><xsl:value-of select="$lost"/></td>
              <td><xsl:value-of select="$tied"/></td>
              <td><xsl:value-of select="$for"/></td>
              <td><xsl:value-of select="$against"/></td>
              <td><xsl:value-of select="$points"/> </td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:transform>


New example

• http://99-bottles-of-beer.ls-la.net/

• A web site with programs in over 300 different programming languages to display all verses to …

• This is my version in XML/XSLT
  – They have another version that is stand-alone XSLT. Check it out.


Example

<?xml version="1.0" ?>
<?xml-stylesheet href="lyrics.xsl" type="text/xsl"?>
<!DOCTYPE lyrics [
  <!ELEMENT lyrics (line1, line2, line3)>
  <!ATTLIST lyrics start CDATA #REQUIRED>
  <!ELEMENT line1 (#PCDATA)>
  <!ELEMENT line2 (#PCDATA)>
  <!ELEMENT line3 (#PCDATA)>
]>
<lyrics start="3">
  <line1> bottles of beer on the wall</line1>
  <line2> bottles of beer</line2>
  <line3> take one down and pass it around</line3>
</lyrics>

Note on start="3": could be 99


Document Type Definition

• defines what is a valid XML document
• Validation can be done with external validator
  http://www.stg.brown.edu/service/xmlvalid/xmlvalid.var
• Alternative to DTD is XML Schema
  – XML Schemas are XML trees.
  – less advanced with respect to being official standard
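
To make "XML Schemas are XML trees" concrete, here is a sketch of a schema describing the same structure as the DTD in the lyrics example. It is an illustration, not a file from the course. Using xs:string for start mirrors the DTD's CDATA; a stricter type such as xs:positiveInteger would be a design choice beyond what the DTD can say.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- lyrics must contain line1, line2, line3 in that order -->
  <xs:element name="lyrics">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="line1" type="xs:string"/>
        <xs:element name="line2" type="xs:string"/>
        <xs:element name="line3" type="xs:string"/>
      </xs:sequence>
      <!-- counterpart of: start CDATA #REQUIRED -->
      <xs:attribute name="start" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Unlike the DTD, the schema is itself a well-formed XML document, so the same tools that read lyrics.xml can read it.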


Demonstration

• Go to http://www.stg.brown.edu/service/xmlvalid/xmlvalid.var
• (Since it is short), copy and paste lyrics.xml into text area
• Click on validate
  – returns ok
• Now, change xml to NOT match DTD
  – remove start attribute
  – add element
• Click on validate
  – indicates problems
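
For the second half of the demonstration, a deliberately invalid variant of lyrics.xml could look like this sketch. The added element name line4 and its text are made up; the exact wording of the validator's messages is not shown because it depends on the tool.

```xml
<?xml version="1.0" ?>
<!DOCTYPE lyrics [
  <!ELEMENT lyrics (line1, line2, line3)>
  <!ATTLIST lyrics start CDATA #REQUIRED>
  <!ELEMENT line1 (#PCDATA)>
  <!ELEMENT line2 (#PCDATA)>
  <!ELEMENT line3 (#PCDATA)>
]>
<!-- Deliberately invalid, but still well-formed. -->
<lyrics>
  <!-- Problem 1: the start attribute is #REQUIRED and has been removed. -->
  <line1> bottles of beer on the wall</line1>
  <line2> bottles of beer</line2>
  <line3> take one down and pass it around</line3>
  <!-- Problem 2: line4 is not declared and not allowed by the content model. -->
  <line4> hypothetical extra line</line4>
</lyrics>
```

The document still parses, which is the point of the exercise: well-formed and valid are separate checks.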


What do we want to do?

• Transformation
  – produce well-formed HTML
  – start with the 'start' attribute and, using it as a string, output it as start of first line.
  – output as HTML the line.
  – using start as number, subtract one to get new value. Using this value as a string, output with line1.
  – repeat process
• Implementation
  – The 'repeat' will be done as a recursive call, that is, a template will call itself.
  – The template will be a named template, with parameter the value, starting with the value of the start attribute.


Outline of the xsl file

• header/instructional stuff

• template that matches the main node (lyrics)

• the so-called named template to be called by the main template AND also called by itself


Technical note

• <xsl:copy> and <xsl:copy-of> copy information from the source document to the result.

• <xsl:copy> copies only the node whereas <xsl:copy-of> copies the node and any descendants (called a deep copy)

• In this example, either could be used.
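
A sketch of the difference, run against the lyrics document from the earlier example. The wrapper elements result, shallow and deep are invented names used only to label the two outputs.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/lyrics">
    <result>
      <!-- Shallow: copies the current lyrics element only,
           with no attributes and no children. -->
      <shallow>
        <xsl:copy/>
      </shallow>

      <!-- Deep: copies lyrics together with its start attribute,
           line1, line2, line3 and their text. -->
      <deep>
        <xsl:copy-of select="."/>
      </deep>
    </result>
  </xsl:template>
</xsl:transform>
```

Inside shallow the result is an empty lyrics element; inside deep it is the whole lyrics subtree. Note that xsl:copy has no select attribute and always acts on the current node, whereas xsl:copy-of copies whatever its select expression returns.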


<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="html"/>

  <xsl:template match="/lyrics">
    <html>
      <head><title>Singing </title></head>
      <body>
        <!-- Templates can have parameters. Here one parameter is set with the value of the start attribute. -->
        <xsl:call-template name="singverse">
          <xsl:with-param name="counter">
            <xsl:value-of select="@start"/>
          </xsl:with-param>
        </xsl:call-template>
      </body>
    </html>
  </xsl:template>


  <xsl:template name="singverse">
    <xsl:param name="counter" />
    <br/>
    <xsl:copy-of select="$counter"/> <xsl:value-of select="line1" />
    <br/>
    <xsl:copy-of select="$counter"/><xsl:value-of select="line2" />
    <br/>
    <xsl:value-of select="line3" />
    <br/>
    <xsl:variable name="next" select="$counter - 1" />
    <xsl:copy-of select="$next"/> <xsl:value-of select="line1" />
    <br/> **** <br/>
    <xsl:if test="$next >= 1">
      <xsl:call-template name="singverse">
        <xsl:with-param name="counter" select="$next" />
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:transform>


User centered design

• Build the interface/application for the person using it! This is generally not you.
• (Sometimes), it is important to distinguish between the system owner and system users.
• If possible, use a more descriptive name for users: client, customer, patient, player, museum visitor, tourist.
• Determine, for the user(s), what is best
  – organization
  – vocabulary
• Determine, for the user(s), what are the important/all
  – platform, access, etc.


Challenges

• More than one 'user' category
  – first time (novice) versus repeat (expert)
• System owners (the paying client) may want system to serve different audiences
  – intranet, employees at client locations, employees at hotels and on planes, perhaps using cell phones….
• Web sites: visitors may enter site in different ways.
  – Search engines may make it important for what you think of as inner pages to be stand-alone.


Data presentation

• Edward Tufte (and others) promote presentation of data that features the data as opposed to (what he calls) chartjunk.
  – You need data=content.
• Compare amount of space devoted to data versus everything else, including
  – descriptions, annotation, labels
  – illustration without content

• Also, make sure space for navigation is not overdone.


Tufte: Challenges of display

• 'life' is multi-dimensional, multivariate but paper and screens are two-dimensional.
  – How do you escape flatland?
• Your content may require more resolution than you have, especially if limited by computer screens
  – How do you manage data density?

Solution: thoughtful, inventive, creative design!
Design is clear thinking made visible.


Tufte advice

• What's the problem? Who cares? (why care)? What is solution?
• Particular – general – particular
• Teaching by example: study books
  – Minard march
  – Connecticut radar
  – Challenger disaster
  – (Columbia disaster)


Connecticut auto deaths

• Traffic deaths following (intervention) of radar
  – no context
  – Is it normal fluctuation, with 'normal' regression to the mean after an extreme (outlier) year or the effects of policy?
  – Need to look at context (in space and time)


Challenger Disaster

• Failure of technical presentation
• The Morton Thiokol engineers did not want to approve the launch because they thought that the O-rings would not work in cold weather.
  – They made a presentation, which did not succeed.
  – The launch went. Their prediction was correct!


Tufte's proposal


Principles

• Show visual comparisons. Try to make comparisons in space and not 'stacked in time'

• Show causality.
• Show multi-variables/dimensions.
• Integrate word, number and image.
• Document: where did data come from? Annotate.
• Everything depends on quality, relevance and integrity of content.


Screen

• Term: screen real estate used to indicate the value of each part of the screen.

• You (system owners) may need to share screen with other organizations, for example, ad space.

• White space, that is, space with nothing on it, is valuable for clarity.


Data dimension

• The data that is worth presenting in graphics form (as opposed to clear text) is generally complex: multi-dimensional. More on this next class.
• Tufte (and others): don't give data dimensions it doesn't have.
  – recall 3D bar graph
  – recall army marching in and out of Russia.
  – New York Times interactives on 9/11 focused on time, location in the towers, Fire Companies from different places in NYC. Audio was often real: police calls, calls on cell phones.


As with 3D bar charts when you only have points, avoid rainbow colors when the data is one-dimensional.

(Note: a chart in shades of blue is better for color-blind visitors.)


My defense

• Tufte recommends: no PowerPoint (no charts with bullets)
  – I try for whole sentences.
  – I avoid decorations.
  – Charts are my notes (which I share with you).


Homework

• Find and report via CourseInfo on one of these or related topics: Tufte, visual presentation, good/bad use of graphs, user-centered design.

• Continue taking on-line XML/XSLT tutorials.
• Do your own versions of XML/XSLT exercises
• (Try writing (simple) DTD and doing validation.)