Collisions, Chimera and Consonance in Web Content

description

Four formats wrestle with each other for web glory. By their acronyms we shall know them: HTML, XML, JSON, RDF. Sometimes they clash, and sometimes they merge, forming weird and wonderful hybrids. Is there any way for them to work together? I will talk about the problems of mixing models and describe how we are using these formats together in legislation.gov.uk.

Transcript of Collisions, Chimera and Consonance in Web Content

Page 1: Collisions, Chimera and Consonance in Web Content

Collisions, Chimera and Consonance in Web Content

Jeni Tennison

Sunday, 5 February 12

Suggested talking about microdata & RDFa, or about my work on legislation.gov.uk, got the reply "yes, all of that!" Kinda hard to see how to bring them together, so I've had to go large-scale...

Page 2: Collisions, Chimera and Consonance in Web Content

what is the web? hypermedia = HTML

http://www.flickr.com/photos/believekevin/6490737589 from believekevin

In the beginning, the web was about hypertext, and shortly afterwards hypermedia: individual pages of simple content whose revolutionary power was not a powerful, well-thought-out, semantic document structure, but the fact they contained links.

Page 3: Collisions, Chimera and Consonance in Web Content

what is the web? structured documents = XML

http://www.flickr.com/photos/marcus_hansson/87885327 from Marcus Hansson

People with SGML experience thought that the web could provide even more value if it was not limited to a single, not particularly meaningful, language. This led to the birth and energetic childhood of XML, one spent doing everything it could possibly do and more.

Page 4: Collisions, Chimera and Consonance in Web Content

what is the web? (meta)data = RDF

http://www.flickr.com/photos/proimos/6033969880 from Alex E. Proimos

Around the same time, others had the notion that the web was not just for providing documents, but for providing metadata about those documents, and data about things like people and traffic and buildings, which gave us RDF and another stack of technologies.

Page 5: Collisions, Chimera and Consonance in Web Content

what is the web? applications = JSON

Meanwhile computers got faster and web sites became about providing valuable services to their users rather than access to either documents or data. The focus of web sites turned to interaction, and to applications. Concise, application-specific messages, easy to use with JavaScript, meant JSON.

Page 6: Collisions, Chimera and Consonance in Web Content

four formats different answers

HTML JSON

RDF XML

So we have ended up with four formats with which you can deliver content on the web, each arising from a different view of what the web is.

Page 7: Collisions, Chimera and Consonance in Web Content

each format has strengths and weaknesses

HTML: lingua franca, hard to get wrong

JSON: concise, application-native data

XML: single source format, flexible

RDF: web-native data, graph model

Each format has advantages, and so each looks jealously at the others' advantages:
- HTML's ubiquity
- XML's flexibility and ease of parsing
- RDF's reach into the real world
- JSON's practicality

One result is ghettoisation: "you should not exist! you have no point! I am all that's needed!"

Another result is self-doubt: "what am I here for? what should I be?"

Page 8: Collisions, Chimera and Consonance in Web Content

I wanna be like you ... or you should be more like me

Another result is merged technologies: ones that seek to gain the benefits of two or more formats.

"If we make RDF more like HTML, perhaps people will use it"

"If you turned that crappy JSON into XML, perhaps I might use it"

Page 9: Collisions, Chimera and Consonance in Web Content

hybrid technologies chimera

HTML JSON

RDF XML RDF/XML

XHTML JSON-LD

microdata

RDFa

These hybrid technologies are chimera, constructed from constituent parts of two or more technologies.

How people judge chimera depends on their background and experience with the technologies that have been merged.

Page 10: Collisions, Chimera and Consonance in Web Content

looks a bit stupid but it's cute underneath

Page 11: Collisions, Chimera and Consonance in Web Content

you can put lipstick on a pig but it's still a pig

Page 12: Collisions, Chimera and Consonance in Web Content

serendipity something new and wonderful

Sometimes, of course, you might get something wonderful and new in its own right.

Like XSLT! :)

Page 13: Collisions, Chimera and Consonance in Web Content

chimera are usually ugly foolish or impossible fantasies

The original Chimera was a monster made from a lion, goat and snake.

The term now means a foolish or impossible fantasy.

Trouble with chimera is that when you dress up one format as another, the result seldom has the advantages of either. To pick the worst offender, RDF/XML is a horrible way to express RDF, because URLs aren't native in XML, and a horrible pattern for XML because its variability makes it difficult to process with XML tools.
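
To make the variability point concrete, here is a minimal sketch (not from the talk) of one RDF statement written in two of the forms RDF/XML allows: as a property element and as a property attribute. The subject URL `http://example.org/doc`, the title value and the helper `titles_by_path` are assumptions made up for illustration. A path-based lookup of the kind XML tools encourage finds the title in one form and misses it in the other.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# One triple, written with a property element...
AS_ELEMENT = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>Example</dc:title>
  </rdf:Description>
</rdf:RDF>
""".strip()

# ...and the same triple, written with a property attribute.
AS_ATTRIBUTE = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="http://example.org/doc" dc:title="Example"/>
</rdf:RDF>
""".strip()


def titles_by_path(rdfxml):
    """Hypothetical naive lookup: only finds dc:title child elements."""
    root = ET.fromstring(rdfxml)
    path = f"{{{RDF}}}Description/{{{DC}}}title"
    return [el.text for el in root.findall(path)]


element_titles = titles_by_path(AS_ELEMENT)
attribute_titles = titles_by_path(AS_ATTRIBUTE)
print(element_titles, attribute_titles)
```

Both documents carry the same RDF, but the second returns nothing from the path query. Covering every permitted form in XML tooling amounts to writing an RDF parser.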

Page 14: Collisions, Chimera and Consonance in Web Content

are chimera the only approach?

Are these hybrid technologies the only way of gaining the advantages that the different core technologies offer?

Page 15: Collisions, Chimera and Consonance in Web Content

being different is fine if you can work together

Or should we think of these four technologies as being like the members of the A-Team? (I'm not going to say which I think is who, except RDF is obviously Murdock.)

What does that mean?
- recognise and appreciate their respective strengths and weaknesses; don't try to make one do what another can do better
- also understand their similarities: a common language, a common goal

Page 16: Collisions, Chimera and Consonance in Web Content

legislation.gov.uk access and interaction

Public legislation.gov.uk built on XML stack: MarkLogic database, Orbeon pipelines & XSLT, producing HTML or XHTML.

Now working on editorial site to enable experts to help government team get and keep legislation up to date. New requirements:
- flexibility in expressing & querying data about relationships between parts of legislation: we need RDF
- dynamic and interactive site that supports a task: we need JSON

But we don't need chimera: we need JSON designed for JSON, and RDF as RDF, and XML as XML.

Page 17: Collisions, Chimera and Consonance in Web Content

leaves and branches named with URLs

What enables them to work together well is what the web really is: URLs that name and address resources.

URLs enable hand-off. When XML structures are named with URLs, JSON and RDF can point to document content stored in XML. They provide a common reference point, a common language.
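
As a rough sketch of that hand-off (not the actual legislation.gov.uk implementation), the snippet below has a JSON task message and a small set of RDF-style triples both refer to the same section by URL. All URLs, the `amendedBy` property, the message fields and the helper `statements_about` are hypothetical, chosen only to illustrate the idea.

```python
import json

# Hypothetical URL naming one section of a document stored as XML.
SECTION = "http://example.org/act/2010/15/section/1"

# JSON designed for the application: a task message pointing at the section.
task_json = json.dumps(
    {"task": "review", "target": SECTION, "assignee": "editor-1"}
)

# RDF as RDF: (subject, predicate, object) statements about the same section.
triples = [
    (
        SECTION,
        "http://example.org/vocab/amendedBy",
        "http://example.org/act/2011/3/section/7",
    ),
]


def statements_about(url, graph):
    """Return every triple whose subject is the given URL."""
    return [t for t in graph if t[0] == url]


# The application reads its own JSON, then follows the URL into the RDF data.
task = json.loads(task_json)
related = statements_about(task["target"], triples)
print(related)
```

Neither format embeds the other. The only thing they share is the URL, which is also where the XML content itself could be fetched.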

Page 18: Collisions, Chimera and Consonance in Web Content

consonance through URLs weak, flexible links

HTML JSON

RDF XML

URLs

URLs that address structures within formats help those formats to be used together. They can be used for their strengths, without being compromised.

Page 19: Collisions, Chimera and Consonance in Web Content

common micro-syntaxes consonance

URLs

languages

link relations

data types

content types

URLs are one example of a common language or micro-syntax, used within the core technologies.

The formats have problems working together when these common languages are not really common.
- URLs in HTML != IRIs used in XML or RDF
- datatypes in HTML != those defined in XML Schema != those used in RDF (particularly date/times)
- link relations in HTML != those used in Atom != those used in RDFa
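
A small sketch of the URL/IRI mismatch, assuming a made-up address `http://example.org/café`: once the IRI is percent-encoded into a plain URI, as happens when an address passes through tools that only accept ASCII, the two strings no longer match under simple string comparison, which is the kind of comparison RDF applies to identifiers.

```python
from urllib.parse import quote

# Hypothetical IRI containing a non-ASCII character.
iri = "http://example.org/café"

# Percent-encode it as UTF-8, leaving ':' and '/' untouched.
uri = quote(iri, safe=":/")

# Character-by-character comparison treats these as different names.
same_by_string = iri == uri

print(uri)
print(same_by_string)
```

Both strings are intended to name the same resource, yet data keyed on one will not join with data keyed on the other unless something normalises them first.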

These mismatches cause friction, and the most gnarly problems in dealing with microdata and RDFa differences are caused by them. But then, no team is perfect.

Page 20: Collisions, Chimera and Consonance in Web Content

closing thoughts

Strong theme of this conference is reflecting on the role of XML on the web.

XML had an over-achieving youth, where it thought it could do everything, and the realisation it can't is perhaps a little painful.

We are right to reflect on where we are, and what we want to become.

Page 21: Collisions, Chimera and Consonance in Web Content

the web is varied complex, dynamic, beautiful

A monoculture web would not survive. The web thrives because it is a diverse ecosystem, hosting 800lb gorillas and tiny mice with long long tails.

Page 22: Collisions, Chimera and Consonance in Web Content

so much beneath the crust core qualities != surface qualities

The web is also more than what you see, and it's a mistake to think that only the outwardly visible parts matter. Without the structures below the crust, it would implode.

Assess XML's role in that context.

Page 23: Collisions, Chimera and Consonance in Web Content

what changes make sense? chimera or consonance

http://www.flickr.com/photos/randyread/1007678907 from Randy Read

Another theme here is XML's relationship with other technologies, the use of XML technologies with non-XML formats and how XML might change in the future.

We should be asking:
- are these chimera? are they beautiful new things, or pigs in lipstick?
- do these changes make XML better at what it does, or not as bad at doing what something else already does better?
- does this help XML work better in concert with other technologies?

XML will not improve by trying to be someone else, but by working better in the team of web technologies: by doing its job well, and by communicating well with the others.

Page 24: Collisions, Chimera and Consonance in Web Content

thank you
