Web data exchange formats Introduction and Overview.


Transcript of Web data exchange formats Introduction and Overview.

Page 1: Web data exchange formats Introduction and Overview.

Web data exchange formats

Introduction and Overview

Page 2: Web data exchange formats Introduction and Overview.

Web data exchange formats

• XML

• JSON

• YAML

Page 3: Web data exchange formats Introduction and Overview.

XML outline

• What is XML & Why XML

• The rules of XML documents

• XML schema and validation

• XML processing: DOM, SAX, JAXP, JAXB, Digester

Page 4: Web data exchange formats Introduction and Overview.

Before XML

• HTML, Hyper-Text Markup Language, the most successful markup language of all time

• First definition, HTML 1.0 – 1992

• HTML 4.01 – 1999 (the last HTML 4 version, later superseded by HTML5)

• Fixed collection of markup tags

• <head>, <body>, <h1>, <br>, etc…

Page 5: Web data exchange formats Introduction and Overview.

What is XML?

• XML, Extensible Markup Language, is a framework for defining markup languages

• Created by the World Wide Web Consortium (W3C) to overcome the limitations of HTML

• Like HTML, XML is based on SGML - Standard Generalized Markup Language

• XML was designed with the Web in mind!

Page 6: Web data exchange formats Introduction and Overview.

XML design goals

1. XML shall be straightforwardly usable over the Internet

2. XML shall support a wide variety of applications

3. XML shall be compatible with SGML

4. It shall be easy to write programs which process XML documents

5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero

Page 7: Web data exchange formats Introduction and Overview.

XML design goals

6. XML documents should be human-legible and reasonably clear

7. The XML design should be prepared quickly

8. The design of XML shall be formal and concise

9. XML documents shall be easy to create

10. Terseness in XML markup is of minimal importance

Page 8: Web data exchange formats Introduction and Overview.

Typical XML usages

• Web development and content management

• Data exchange

• Data storage

• Configuration files

• Web services

Page 9: Web data exchange formats Introduction and Overview.

Historical outline

• The development of XML began in the mid-90s

• Initial XML draft – November 1996

• XML 1.0, W3C recommendation – February 1998

• XML 1.1 – February 2004

Page 10: Web data exchange formats Introduction and Overview.

More about XML

• XML lets us define our own tags

• Each XML language is targeted to a particular application domain

• XML specification says nothing about the semantics of the markup tags

• XML is internationalized and platform independent

Page 11: Web data exchange formats Introduction and Overview.

XML specification

• Is located at:

• XML 1.0: http://www.w3.org/TR/REC-xml/

• XML 1.1: http://www.w3.org/TR/xml11/

• Defines the basic rules for XML documents

Page 12: Web data exchange formats Introduction and Overview.

Sample XML document

<?xml version="1.0" encoding="UTF-8"?>

<people>

<person id="person_1">

<name>David</name>

<surname>Gilmour</surname>

</person>

<person id="person_2">

<name>Richard</name>

<surname>Wright</surname>

</person>

<person id="person_3">

<name>Nick</name>

<surname>Mason</surname>

</person>

</people>

Page 13: Web data exchange formats Introduction and Overview.

Examples of XML markups

• XHTML

• WML - Wireless Markup Language

• MathML – Mathematical Markup Language

• ebXML - Electronic Business XML

• CML - Chemical Markup Language

• MusicXML – Musical Scores Markup Language

• ThML - Theological Markup Language

See more at http://en.wikipedia.org/wiki/List_of_XML_markup_languages

Page 14: Web data exchange formats Introduction and Overview.

XHTML versus HTML

• XHTML 1.0 is W3C’s XMLification of HTML 4.01

• The most notable differences:

• HTML allows certain elements to omit the end tag (forbidden in XML)

• Element and attribute names must be lowercase

• Attribute values in XHTML must be present and they must be surrounded by quotes

Page 15: Web data exchange formats Introduction and Overview.

XML document rules

• The creators of XML decided to enforce document structure from the beginning

• The XML specification requires a parser to reject any XML document that doesn't follow the basic rules

• A parser is a piece of code that attempts to read a document and interpret its contents

Page 16: Web data exchange formats Introduction and Overview.

Three kinds of XML documents

• Invalid documents: don't follow the syntax rules defined by the XML specification or DTD/schema

• Valid documents: follow both the XML syntax rules and the rules defined in their DTD/schema

• Well-formed documents: follow the XML syntax rules but don't have a DTD/schema

Page 17: Web data exchange formats Introduction and Overview.

How to check XML document?

An easy way to check whether an XML document is well-formed:

• Simply open it in a browser
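The same check can also be done in code. The following is a minimal sketch, assuming the JDK's built-in JAXP parser; the class name WellFormedCheck and the method isWellFormed are illustrative names, not part of the original slides. It only tests well-formedness and does not validate against a DTD or schema.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reports whether a string is well-formed XML.
final class WellFormedCheck {

    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and stays silent otherwise,
            // which avoids the parser's default error printing.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The parser rejected the document: it is not well-formed.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not a verdict on the document.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<greeting>Hello, World!</greeting>"));
        System.out.println(isWellFormed("<p><b>My name is <i>John Brown</b>.</i></p>"));
    }
}
```

A parser-based check like this is stricter and more repeatable than a browser, and it can run as part of a build or test suite.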

Page 18: Web data exchange formats Introduction and Overview.

XML main notions

• There are three common terms used to describe parts of an XML document:

• tags

• elements

• attributes

<people>

<person id="person_1">

<name>David</name>

<surname>Gilmour</surname>

</person>

</people>


Page 19: Web data exchange formats Introduction and Overview.

Rule: The root element

An XML document must be contained in a single element

<?xml version="1.0"?>

<!-- A well-formed document -->

<greeting>

Hello, World!

</greeting>

<?xml version="1.0"?>

<!-- An invalid document -->

<greeting>

Hello, World!

</greeting>

<greeting>

Hola, el Mundo!

</greeting>

Page 20: Web data exchange formats Introduction and Overview.

Rule: Elements can't overlap

Invalid XML documents:

<?xml version="1.0"?>

<!-- An invalid document -->

<person><name>John Brown</person></name>

<?xml version="1.0"?>

<!-- An invalid document -->

<p>

<b>My name is <i>John Brown</b>.</i>

</p>

Page 21: Web data exchange formats Introduction and Overview.

Rule: End tags are required

• You can't leave out any end tags

• If an element contains no markup at all it is called an empty element

• In empty elements in XML documents, you can put the closing slash in the start tag

<!-- Two equivalent break elements -->

<br></br>

<br />

<!-- NOT legal XML markup -->

<p>My name is John Brown

<p>I am 25 years old

<p>...

Page 22: Web data exchange formats Introduction and Overview.

Rule: Elements are case sensitive

In HTML, <h1> and <H1> are the same; in XML, they're not

<!-- NOT legal XML markup -->

<Person>

Elements are case sensitive

</person>

<!-- legal XML markup -->

<person>Elements are case sensitive </person>

Page 23: Web data exchange formats Introduction and Overview.

Rule: Quoted attribute values

There are two rules for attributes in XML documents:

• Attributes must have values

• Those values must be enclosed within quotation marks (single or double)

<!-- NOT legal XML markup -->

<ol compact>

<!-- legal XML markup -->

<ol compact="yes">

Page 24: Web data exchange formats Introduction and Overview.

XML declarations

• Most XML documents start with an XML declaration that provides basic information about the document to the parser

• An XML declaration is recommended, but not required

<?xml version="1.0"

encoding="UTF-8" standalone="no"?>

Page 25: Web data exchange formats Introduction and Overview.

XML document as a tree

• Conceptually, an XML document is a hierarchical structure called an XML tree

• Although there is no consensus on the terminology used on XML trees, at least two standard terminologies exist:

• XPath Data Model

• XML Information Set

http://www.ibm.com/developerworks/xml/library/x-hands-on-xsl/
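To make the tree view concrete, here is a small sketch that addresses nodes of the sample people document with XPath expressions. It assumes the javax.xml.xpath API shipped with the JDK; the class name XPathSketch, the method evaluate, and the PEOPLE constant are illustrative only.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath expression and returns its string value.
final class XPathSketch {

    // A shortened copy of the sample document from the earlier slide.
    static final String PEOPLE =
        "<people>"
        + "<person id=\"person_1\"><name>David</name><surname>Gilmour</surname></person>"
        + "<person id=\"person_2\"><name>Richard</name><surname>Wright</surname></person>"
        + "</people>";

    static String evaluate(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The path is resolved against the tree built from the document.
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Walk root -> person (selected by attribute) -> surname.
        System.out.println(evaluate("/people/person[@id='person_2']/surname", PEOPLE));
        System.out.println(evaluate("count(/people/person)", PEOPLE));
    }
}
```

Each step of the path corresponds to moving one level down the tree, and the predicate in square brackets filters sibling nodes by an attribute value.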

Page 26: Web data exchange formats Introduction and Overview.

Namespaces

• Different XML languages may use the same tags

• Namespaces: a solution to the name-clash problem

<?xml version="1.0"?>

<customer_summary

xmlns:addr="http://www.xyz.com/addresses/"

xmlns:books="http://www.zyx.com/books/"

xmlns:mortgage="http://www.yyz.com/mortage/">

... <addr:title>Mrs.</addr:title> ...

... <books:title>Lord of the Rings</books:title> ...

... <mortgage:title>NC2948-388-1983</mortgage:title> ...

Page 27: Web data exchange formats Introduction and Overview.

Namespaces

• XML namespaces are similar to Java packages

• The string in a namespace definition looks like a URL, but it’s just a string!

• Unprefixed element names belong to the default namespace; when none is declared (xmlns=""), they are in no namespace

• The default namespace can be set or overridden with a declaration of the form xmlns="URI"
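As a rough illustration of how a parser tells apart equally named tags from different namespaces, the sketch below parses a shortened version of the customer_summary example with a namespace-aware DOM parser. The class name NamespaceSketch and the method titleIn are hypothetical; the namespace URIs are the ones from the slide.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: finds the <title> element that belongs to a given namespace.
final class NamespaceSketch {

    static final String XML =
        "<customer_summary"
        + " xmlns:addr=\"http://www.xyz.com/addresses/\""
        + " xmlns:books=\"http://www.zyx.com/books/\">"
        + "<addr:title>Mrs.</addr:title>"
        + "<books:title>Lord of the Rings</books:title>"
        + "</customer_summary>";

    static String titleIn(String namespaceUri) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // without this, prefixes are not resolved
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            // Lookup is by namespace URI plus local name; the prefix itself is irrelevant.
            Element title = (Element) doc.getElementsByTagNameNS(namespaceUri, "title").item(0);
            return title == null ? null : title.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titleIn("http://www.xyz.com/addresses/"));
        System.out.println(titleIn("http://www.zyx.com/books/"));
    }
}
```

The lookup uses the namespace string, not the prefix, which matches the point above that the URL-like value is only an identifier.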

Page 28: Web data exchange formats Introduction and Overview.

Defining document content

• The elements of particular XML language have to be defined in some way

• A schema is a formal definition of the syntax of an XML-based language

• Two main schema languages:• DTD• XML Schema

Page 29: Web data exchange formats Introduction and Overview.

DTD - Document Type Definition

• Built-in schema language since the first XML working draft

• A DTD is not itself written in XML notation

<!-- address.dtd -->

<!ELEMENT address (name, street, city, postal-code)>

<!ELEMENT name (title?, first-name, last-name)>

<!ELEMENT title (#PCDATA)>

<!ELEMENT first-name (#PCDATA)>

<!ELEMENT last-name (#PCDATA)>

<!ELEMENT street (#PCDATA)>

<!ELEMENT city (#PCDATA)>

<!ELEMENT postal-code (#PCDATA)>

Page 30: Web data exchange formats Introduction and Overview.

Document Type Declaration

• An XML document may contain a reference to a DTD schema

<?xml version="1.1"?>
<!DOCTYPE people
  SYSTEM "http://www.music.com/people.dtd">

• XHTML documents often contain:

<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Page 31: Web data exchange formats Introduction and Overview.

DTD – Element declaration

• An element declaration looks as follows:

<!ELEMENT element-name content-model>

• Content model defines the validity requirements of the contents (the sequence of its immediate child nodes) of all elements of the given name

Page 32: Web data exchange formats Introduction and Overview.

DTD – Content model

Constructs used in content model description:

EMPTY           Empty contents
ANY             Any contents
#PCDATA         Character data
element name    An element
,               Concatenation
|               Union
?               Optional
*               Zero or more repetitions
+               One or more repetitions

Page 33: Web data exchange formats Introduction and Overview.

DTD: example

<!ELEMENT people (person+)>

<!ELEMENT person (name, surname, birthdate?, address*)>

<!ELEMENT name (#PCDATA)>

<!ELEMENT surname (#PCDATA)>

<!ELEMENT birthdate (#PCDATA)>

<!ELEMENT address (#PCDATA)>
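The content models above can be exercised with a validating parser. The following sketch embeds the DTD as an internal subset and reports whether a document body satisfies it. It assumes the JDK's JAXP implementation; the class name DtdValidationSketch and the method isValid are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: validates a <people> document against the DTD from the slide.
final class DtdValidationSketch {

    // The DTD is supplied as an internal subset so the example needs no external file.
    static final String DOCTYPE =
        "<!DOCTYPE people ["
        + "<!ELEMENT people (person+)>"
        + "<!ELEMENT person (name, surname, birthdate?, address*)>"
        + "<!ELEMENT name (#PCDATA)>"
        + "<!ELEMENT surname (#PCDATA)>"
        + "<!ELEMENT birthdate (#PCDATA)>"
        + "<!ELEMENT address (#PCDATA)>"
        + "]>";

    static boolean isValid(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Validity violations arrive as recoverable errors; rethrow them.
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + body)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(
            "<people><person><name>David</name><surname>Gilmour</surname></person></people>"));
        // surname before name violates the (name, surname, ...) sequence
        System.out.println(isValid(
            "<people><person><surname>Gilmour</surname><name>David</name></person></people>"));
    }
}
```

Note that validity errors are not fatal by default, so without the custom error handler the parser would still return a tree for an invalid document.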

Page 34: Web data exchange formats Introduction and Overview.

DTD: Attribute-List declarations

• An attribute-list declaration looks as follows:

<!ATTLIST element-name attribute-definitions>

• attribute-definitions is a list in which each item has the form:

attribute-name attribute-type default-declaration

• Default declarations:

#REQUIRED          Required
#IMPLIED           Optional, no default
"value"            Optional, value is the default
#FIXED "value"     As the previous, but only this value is permitted

Page 35: Web data exchange formats Introduction and Overview.

DTD: examples

<!ELEMENT rectangle EMPTY>
<!ATTLIST rectangle length CDATA "0px"
                    width  CDATA "0px">

<rectangle width="80px" length="40px"/>

<!ELEMENT img EMPTY>
<!ATTLIST img alt    CDATA #REQUIRED
              src    CDATA #REQUIRED
              width  CDATA #IMPLIED
              height CDATA #IMPLIED>

<img src="xmlj.jpg" alt="XMLJ Image" width="300"/>

<!ELEMENT address (#PCDATA)>
<!ATTLIST address country CDATA #FIXED "USA">

<address country="USA">123 15th St. Troy NY 12180</address>

Page 36: Web data exchange formats Introduction and Overview.

XML Schema

• Shortly after XML 1.0, the W3C initiated the development of the next generation schema language to attack the problems with DTD

• Guiding design principles: the new schema language should be

• More expressive than XML DTD

• Expressed in XML

• Self-describing

• Simple enough

Page 37: Web data exchange formats Introduction and Overview.

XML Schema Specification

• Published in 2001

• The specification consists of the following parts:

• Part 0 - Primer: http://w3.org/TR/xmlschema-0

• Part 1 - Document structures: http://w3.org/TR/xmlschema-1

• Part 2 - Datatypes: http://w3.org/TR/xmlschema-2

Page 38: Web data exchange formats Introduction and Overview.

XML Schema

• Unfortunately, the resulting language does not fulfill the original requirements

• It provides good support for namespaces, modularization and datatypes, but:

• It is not simple – Part 1 alone is more than 160 pages, and even XML experts do not find it human-readable

• It is not fully self-describing – there is a schema for XML Schema, but it doesn’t capture all syntactical aspects of the language

Page 39: Web data exchange formats Introduction and Overview.

XML Schema advantages

Several advantages over DTDs

• XML schemas use XML syntax: you can process a schema just like any other document

• XML schemas support datatypes: integers, floating point numbers, dates, times, strings, URLs

• XML schemas are extensible: user-defined datatypes, derived datatypes

• XML schemas have more expressive power

• XML schemas support namespaces

Page 40: Web data exchange formats Introduction and Overview.

XSD

An XML Schema instance is an XML Schema Definition (XSD) and typically has the filename extension ".xsd"

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="country" type="Country"/>
  <xsd:complexType name="Country">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string"/>
      <xsd:element name="population" type="xsd:decimal"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

Page 41: Web data exchange formats Introduction and Overview.

Example: people.xsd

<?xml version="1.0" encoding="UTF-8"?>

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">

<xsd:element name="people" type="peopleType"/>

<xsd:complexType name="peopleType">

<xsd:sequence maxOccurs="unbounded">

<xsd:element name="person" type="personType"/>

</xsd:sequence>

</xsd:complexType>

<xsd:complexType name="personType">

<xsd:sequence>

<xsd:element name="name" type="xsd:string"/>

<xsd:element name="surname" type="xsd:string"/>

</xsd:sequence>

<xsd:attribute name="id" type="xsd:string"/>

</xsd:complexType>

</xsd:schema>

Page 42: Web data exchange formats Introduction and Overview.

Declaring XML Schema

To declare that people.xml uses the people.xsd schema, add the following to the root element:

<?xml version="1.0" encoding="UTF-8"?>
<!-- schema is located in the same folder -->
<people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="people.xsd">
  . . .
</people>

<?xml version="1.0" encoding="UTF-8"?>
<!-- schema location specified as URL -->
<people xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:noNamespaceSchemaLocation="http://www.ante.lv/lab01-music-serverside/data/people.xsd">
  . . .
</people>

Page 43: Web data exchange formats Introduction and Overview.

XML Schema: Defining elements

• To define an element is to define its name and content model (type)

• A type can be simple or complex

• A simple type cannot contain elements or attributes in its value

• A complex type can create the effect of embedding elements in other elements or it can associate attributes with an element

Page 44: Web data exchange formats Introduction and Overview.

Simple, non-nested elements

An element that does not contain attributes or other elements can be defined to be of a simple type, either predefined or user-defined

<element name='name' type='string'/>
<element name='birthday' type='date'/>
<element name='age' type='integer'/>
<element name='price' type='decimal'/>

http://www.ibm.com/developerworks/xml/library/xml-schema/sidetable2.html

Page 45: Web data exchange formats Introduction and Overview.

Complex types

• Elements with attributes must have a complex type

• Elements that embed other elements must have a complex type

<complexType name="personType">
  <sequence>
    <element name="name" type="string"/>
    <element name="surname" type="string"/>
  </sequence>
  <attribute name="id" type="string"/>
</complexType>

Page 46: Web data exchange formats Introduction and Overview.

Expressing constraints on elements

• XML Schema offers greater flexibility than DTD for expressing constraints on the content model of elements

• For example, element occurrence definition:

• DTD: * + ?

• XML Schema: maxOccurs, minOccurs

<element name='Book'>
  <complexType>
    <sequence>
      <element ref='Title' minOccurs='0'/>
      <element ref='Author' maxOccurs='2'/>
    </sequence>
  </complexType>
</element>

Page 47: Web data exchange formats Introduction and Overview.

XML validation

• Online XML validator against XML Schema: http://tools.decisionsoft.com/schemaValidate/

• The Java API also provides a way to make an XML parser validate a document
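One way to validate from Java is the javax.xml.validation package of the JDK. The sketch below is an assumption about how this could be wired up, not the slides' own code: it compiles the people.xsd schema shown earlier from a string and validates a document against it. The class name SchemaValidationSketch and the method isValid are illustrative.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates a document against the people.xsd schema.
final class SchemaValidationSketch {

    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\""
        + " elementFormDefault=\"qualified\">"
        + "<xsd:element name=\"people\" type=\"peopleType\"/>"
        + "<xsd:complexType name=\"peopleType\">"
        + "<xsd:sequence maxOccurs=\"unbounded\">"
        + "<xsd:element name=\"person\" type=\"personType\"/>"
        + "</xsd:sequence>"
        + "</xsd:complexType>"
        + "<xsd:complexType name=\"personType\">"
        + "<xsd:sequence>"
        + "<xsd:element name=\"name\" type=\"xsd:string\"/>"
        + "<xsd:element name=\"surname\" type=\"xsd:string\"/>"
        + "</xsd:sequence>"
        + "<xsd:attribute name=\"id\" type=\"xsd:string\"/>"
        + "</xsd:complexType>"
        + "</xsd:schema>";

    static boolean isValid(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no custom error handler, the validator throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<people><person id=\"person_1\">"
            + "<name>David</name><surname>Gilmour</surname></person></people>"));
        // missing <surname> violates personType
        System.out.println(isValid("<people><person id=\"person_1\">"
            + "<name>David</name></person></people>"));
    }
}
```

A compiled Schema object can be reused across many documents, so in a real application it would typically be created once rather than on every call.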

Page 48: Web data exchange formats Introduction and Overview.

XML processing APIs

• The three basic XML parsing interfaces are: Document Object Model (DOM), Simple API for XML (SAX), Streaming API for XML (StAX)

• Java API for XML Processing (JAXP): provides common interfaces for processing XML documents (using DOM, SAX or StAX)

• XML to Java classes binding: Java Architecture for XML Binding (JAXB), Digester
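DOM and SAX are covered in the following slides; StAX is only named here, so a brief sketch may help. It assumes the javax.xml.stream API bundled with the JDK, and the class name StaxSketch and the method surnames are illustrative. Unlike SAX, the application pulls events from the parser instead of receiving callbacks.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects the text of every <surname> element with StAX.
final class StaxSketch {

    static List<String> surnames(String xml) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // The application drives the loop and asks for the next event.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "surname".equals(reader.getLocalName())) {
                    // Reads the text and leaves the cursor on the matching end tag.
                    result.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(surnames(
            "<people>"
            + "<person><name>David</name><surname>Gilmour</surname></person>"
            + "<person><name>Nick</name><surname>Mason</surname></person>"
            + "</people>"));
    }
}
```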

Page 49: Web data exchange formats Introduction and Overview.

DOM

• The Document Object Model defines a set of interfaces to the parsed version of an XML document

• The parser reads in the entire document and builds an in-memory tree

• Your code can then use the DOM interfaces to manipulate the tree

Page 50: Web data exchange formats Introduction and Overview.

DOM

Using DOM API you can

• move through the tree to see what the original document contained

• delete sections of the tree

• rearrange the tree

• add new branches

• and so on . . .
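The operations listed above can be sketched with the standard org.w3c.dom interfaces. The class name DomEditSketch and the method editAndListSurnames are hypothetical, and the sketch assumes every person element is a direct child of the root, as in the sample document.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: deletes the first person, appends a new one, lists surnames.
final class DomEditSketch {

    static List<String> editAndListSurnames(String xml, String newName, String newSurname) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();

            // Delete a section of the tree: the first <person> under the root.
            root.removeChild(root.getElementsByTagName("person").item(0));

            // Add a new branch.
            Element person = doc.createElement("person");
            Element name = doc.createElement("name");
            name.setTextContent(newName);
            Element surname = doc.createElement("surname");
            surname.setTextContent(newSurname);
            person.appendChild(name);
            person.appendChild(surname);
            root.appendChild(person);

            // Move through the tree to see what it now contains.
            List<String> surnames = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName("surname");
            for (int i = 0; i < nodes.getLength(); i++) {
                surnames.add(nodes.item(i).getTextContent());
            }
            return surnames;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(editAndListSurnames(
            "<people>"
            + "<person><name>David</name><surname>Gilmour</surname></person>"
            + "<person><name>Richard</name><surname>Wright</surname></person>"
            + "</people>",
            "Nick", "Mason"));
    }
}
```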

Page 51: Web data exchange formats Introduction and Overview.

DOM issues

• The DOM builds an in-memory tree of an entire document. If the document is very large, this requires a significant amount of memory and can also cause a significant delay.

• The DOM creates objects that represent everything in the original document, including elements, text, attributes, and whitespace. It may be extremely wasteful to create all those objects that will never be used.

Page 52: Web data exchange formats Introduction and Overview.

SAX

• To get around the DOM issues, the XML-DEV participants created the SAX interface

• A SAX parser sends events to your code

• The parser tells you when it finds:

• the start of an element

• the end of an element

• text

• the start or end of the document

• and so on . . .

Page 53: Web data exchange formats Introduction and Overview.

SAX

• You decide which events are important to you

• A SAX parser doesn't build an in-memory representation of the document

• You decide what kind of data structures you want to create to hold the data from SAX events
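A minimal sketch of this event-driven style, assuming the JDK's SAX parser: the handler below reacts only to start-element events and ignores everything else. The class name SaxCountSketch and the method countElements are illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: counts elements with a given name using SAX events.
final class SaxCountSketch {

    static int countElements(String xml, String name) {
        final int[] count = {0}; // the only data structure this application needs
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                // Only this event matters here; all others use the empty defaults.
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        System.out.println(countElements(
            "<people><person><name>David</name></person>"
            + "<person><name>Nick</name></person></people>",
            "person"));
    }
}
```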

Page 54: Web data exchange formats Introduction and Overview.

SAX issues

• SAX events are stateless

• A SAX event simply gives you the text that was found; it does not tell you which element contains that text

• You have to write the state management code yourself

• SAX events are not permanent

• If your application needs a data structure that models the XML document, you have to write that code yourself

• SAX is not controlled by a centrally managed organization (such as the W3C)
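The state management mentioned above often amounts to keeping a stack of open elements. The following sketch, with the hypothetical names SaxStateSketch and textByPath, records the text found under each element path. It is one possible approach under these assumptions, not a prescribed pattern.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: maps each element path (e.g. people/person/name) to its text.
final class SaxStateSketch {

    static Map<String, String> textByPath(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        Deque<String> openElements = new ArrayDeque<>(); // state the parser does not keep for us
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                openElements.addLast(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                openElements.removeLast();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The event carries only text; the stack tells us where it belongs.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    // Text may arrive in several chunks, so append rather than overwrite.
                    // A later element with the same path also appends to the same entry.
                    result.merge(String.join("/", openElements), text, String::concat);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(textByPath(
            "<people><person><name>David</name><surname>Gilmour</surname></person></people>"));
    }
}
```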

Page 55: Web data exchange formats Introduction and Overview.

Third-party XML parsers in Java

• JDOM: http://www.jdom.org

• Xerces: http://xerces.apache.org/xerces-j/

• Woodstox: http://woodstox.codehaus.org/

It is recommended to use a standard: the Java API for XML Processing

Page 56: Web data exchange formats Introduction and Overview.

Java API for XML Processing

• Problem: the process of creating, for example, a DOMParser object in a Java program differs from one DOM parser to the next

• JAXP provides common interfaces for processing XML documents (using DOM, SAX, StAX or XSLT)

• JAXP provides interfaces such as the DocumentBuilderFactory and the DocumentBuilder that provide a standard interface to different parsers

Page 57: Web data exchange formats Introduction and Overview.

JAXP DOM API diagram

http://docs.oracle.com/javase/tutorial/jaxp/dom/index.html

Page 58: Web data exchange formats Introduction and Overview.

JAXP SAX API diagram

http://docs.oracle.com/javase/tutorial/jaxp/sax/index.html

Page 59: Web data exchange formats Introduction and Overview.

JAXP + DOM: parsing XML

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(true);
// ask the parser to validate against W3C XML Schema
factory.setAttribute(
    "http://java.sun.com/xml/jaxp/properties/schemaLanguage",
    "http://www.w3.org/2001/XMLSchema");

DocumentBuilder db = factory.newDocumentBuilder();
org.w3c.dom.Document doc = db.parse(input);

// process the DOM document
Element root = doc.getDocumentElement();
for (Node node = root.getFirstChild();
     node != null; node = node.getNextSibling()) {
    . . .
}

Page 60: Web data exchange formats Introduction and Overview.

JAXP + DOM: creating XML [1]

DocumentBuilderFactory factory =
    DocumentBuilderFactory.newInstance();

DocumentBuilder docBuilder = factory.newDocumentBuilder();

Document doc = docBuilder.newDocument();

// Populate XML document content

Element root = doc.createElement("music-summary");

doc.appendChild(root);

Element reportId = doc.createElement("report-id");

String reportIdString = generateUniqueId();

Text reportIdText = doc.createTextNode(reportIdString);

reportId.appendChild(reportIdText);

root.appendChild(reportId);

. . .

Page 61: Web data exchange formats Introduction and Overview.

JAXP + DOM: creating XML [2]

. . .

// Create and save XML file

TransformerFactory transfactory = TransformerFactory.newInstance();

Transformer transformer = transfactory.newTransformer();

transformer.setOutputProperty(OutputKeys.INDENT, "yes");

FileWriter fw = new FileWriter(outputFile);

StreamResult result = new StreamResult(fw);

DOMSource source = new DOMSource(doc);

transformer.transform(source, result);
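The two slides above can be combined into one self-contained sketch. To avoid touching the file system, this version serializes to a string instead of a FileWriter and takes the report id as a parameter instead of calling generateUniqueId(); the class name DomWriteSketch and the method buildReport are illustrative.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper: builds a small DOM tree and serializes it to a string.
final class DomWriteSketch {

    static String buildReport(String reportIdValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            // Populate XML document content.
            Element root = doc.createElement("music-summary");
            doc.appendChild(root);
            Element reportId = doc.createElement("report-id");
            reportId.appendChild(doc.createTextNode(reportIdValue));
            root.appendChild(reportId);

            // An identity transformer copies the DOM source to the output unchanged.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildReport("r-42"));
    }
}
```

Swapping the StringWriter for a FileWriter, as on the slide, writes the same content to disk.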

Page 62: Web data exchange formats Introduction and Overview.

Java Architecture for XML Binding

• JAXB lets you map Java classes to XML representations

• Steps using JAXB:

• Bind the schema for the XML document

• Unmarshal the document into Java content objects

http://docs.oracle.com/javase/tutorial/jaxb/

Page 63: Web data exchange formats Introduction and Overview.

Apache Jakarta Commons Digester

• Digester is a layer on top of the SAX API to make it easier to process XML input

• Digester makes it easy to create and initialise a tree of objects based on an XML input file

• The developer needs to write rules that tell Digester how to map input XML into Java objects

• Digester supports only one-way mapping: XML → Java objects

Page 64: Web data exchange formats Introduction and Overview.

Code sample: Digester

Digester digester = new Digester();

digester.setValidating(false);

digester.addObjectCreate("people", PeopleHolder.class);

digester.addObjectCreate("people/person", Person.class);

digester.addSetProperties("people/person", "id", "idString");

digester.addBeanPropertySetter("people/person/name", "name");

digester.addBeanPropertySetter("people/person/surname", "surname");

digester.addSetNext("people/person", "addPerson");

PeopleHolder peopleHolder = (PeopleHolder)digester.parse(input);

Vector<Person> people = peopleHolder.getPeople();

Page 65: Web data exchange formats Introduction and Overview.

Other XML standards

• XSL (Extensible Stylesheet Language)

• XSLT (XSL Transformations)

• XPath (XML Path Language)

• XLink, XPointer

• XML security

• Web Services: SOAP, WSDL, UDDI

• SVG, SMIL

• Many more. . .

Page 66: Web data exchange formats Introduction and Overview.

JSON

Page 67: Web data exchange formats Introduction and Overview.

JSON

• JSON (JavaScript Object Notation) is a lightweight computer data interchange format

• Text-based, human-readable format for representing simple data structures and associative arrays

• easy for humans to read and write

• easy for machines to parse and generate

• Is based on a subset of the JavaScript programming language

• MIME type: application/json

Page 68: Web data exchange formats Introduction and Overview.

JSON example

{
  "firstName": "John",
  "lastName": "Smith",
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": 10021
  },
  "phoneNumbers": [
    { "type": "home", "number": "212 555-1234" },
    { "type": "fax", "number": "646 555-4567" }
  ],
  "newSubscription": false,
  "companyName": null
}

JSON Web page: http://json.org/

Page 69: Web data exchange formats Introduction and Overview.

JSON structure

http://json.org/

Page 70: Web data exchange formats Introduction and Overview.

JSON from Facebook

Friends:

{
  "data": [
    { "name": "Ann Blue", "id": "100002771239557" },
    { "name": "David Green", "id": "100002808391341" }
  ]
}

Events:

{
  "data": [
    {
      "name": "Second event",
      "start_time": "2011-10-04T16:00:00",
      "end_time": "2011-10-04T18:00:00",
      "location": "14. auditorija",
      "id": "196365027094566",
      "rsvp_status": "attending"
    }
  ],
  "paging": {
    "previous": "https://graph.facebook.com/100002774971272/events?format=json&limit=25&since=1317744000",
    "next": "https://graph.facebook.com/100002774971272/events?format=json&limit=25&until=1317744000"
  }
}

Page 71: Web data exchange formats Introduction and Overview.

User Profile JSON from Google+

{
  "kind": "plus#person",
  "etag": "\"GZR2X3-UK6zXRwPjCsTmgE7l6CI/feNs8dXzP9_SaZJBtANkXqtESTI\"",
  "urls": [
    { "value": "https://plus.google.com/107192656717644038166" },
    { "value": "https://www.googleapis.com/plus/v1/people/107192656717644038166" }
  ],
  "id": "107192656717644038166",
  "displayName": "Brian Red",
  "name": {
    "familyName": "Red",
    "givenName": "Brian"
  },
  "url": "https://plus.google.com/107192656717644038166",
  "image": {
    "url": "https://lh4.googleusercontent.com/-S_Y0PMoBgT0/AAAAAAAAAAI/AAAAAAAAAAA/xzau2wEOUo8/photo.jpg?sz=50"
  }
}

Page 72: Web data exchange formats Introduction and Overview.

Venues JSON from Foursquare

{
  ...
  response: {
    groups: [
      {
        ...
        items: [
          {
            ...
            venue: {
              id: "4cf4102d899c6ea84fd0fec1"
              name: "Innocent Cafe"
              ...
            }
            venue: {
              id: "4c93a04a58d4b60c2b012129"
              name: "MiiT"
              ...
            }
            ...
          }
        ]
        ...
}

Page 73: Web data exchange formats Introduction and Overview.

JSON parsers

• JSON-lib: http://json-lib.sourceforge.net/

• Google’s GSON: http://code.google.com/p/google-gson/

• FlexJSON: http://flexjson.sourceforge.net/

Java API for JSON Processing: a part of Java EE 7 (standard)

Page 74: Web data exchange formats Introduction and Overview.

JSON-lib

• JSON-lib is a Java library for transforming beans, maps, collections, arrays and XML to JSON and back again to beans

• Very simple for core JSON tasks

• Maven dependency:

<dependency>
  <groupId>net.sf.json-lib</groupId>
  <artifactId>json-lib</artifactId>
  <version>2.4</version>
  <classifier>jdk15</classifier>
</dependency>

Page 75: Web data exchange formats Introduction and Overview.

JSON-lib

•JSON object == net.sf.json.JSONObject

•JSON array == net.sf.json.JSONArray

•net.sf.json.JSONSerializer can transform any Java object to JSON notation and back with a simple and clean interface, leveraging all the builders in JSONObject and JSONArray

Page 76: Web data exchange formats Introduction and Overview.

JSON-lib example

import net.sf.json.*;

InputStream response = httpClient.download(url);

String content = IOUtils.toString(response);

Object json = JSONSerializer.toJSON(content);

if (json instanceof JSONObject){

JSONObject jsonObject = (JSONObject)json;

Object data = jsonObject.get("data");

if (data instanceof JSONArray){

JSONArray jsonArray = (JSONArray)data;

for (Object jsonElement: jsonArray){

if (jsonElement instanceof JSONObject){

JSONObject jsonFriend = (JSONObject)jsonElement;

String friendId = (String)jsonFriend.get("id");

String friendName = (String)jsonFriend.get("name");

...

}

}

}

}

Apache Commons IO: http://commons.apache.org/proper/commons-io/javadocs/api-2.4/org/apache/commons/io/IOUtils.html

Page 77: Web data exchange formats Introduction and Overview.

GSON

• Library that can be used to convert:

• Java Objects into their JSON representation

• JSON string to an equivalent Java object

• Maven dependency:

<dependency>
  <groupId>com.google.code.gson</groupId>
  <artifactId>gson</artifactId>
  <version>2.2.4</version>
</dependency>

Page 78: Web data exchange formats Introduction and Overview.

GSON example

1. Define a class:

public class GooglePlusUser {
    private String id;
    private Name name;

    class Name {
        public String givenName;
        public String familyName;
        public Name() {}
    }
    ...
}

2. Parse JSON:

Gson gson = new Gson();
GooglePlusUser googleUser =
    gson.fromJson(json, GooglePlusUser.class);

Page 79: Web data exchange formats Introduction and Overview.

Java API for JSON Processing

• Provides portable APIs to parse, generate, transform, and query JSON using:

• Object model API (similar to DOM)

• creates a random-access, tree-like structure that represents the JSON data in memory

• Streaming API

• provides a way to parse and generate JSON in a streaming fashion

http://www.oracle.com/technetwork/articles/java/json-1973242.html

Page 80: Web data exchange formats Introduction and Overview.

YAML

• YAML is yet another human-readable data serialization format (first proposed in 2001)

• Takes concepts from programming languages such as C, Perl, and Python, and ideas from XML

• YAML is a recursive acronym for "YAML Ain't Markup Language" (data-oriented, rather than document markup)

• Early in its development: "Yet Another Markup Language"

Page 81: Web data exchange formats Introduction and Overview.

YAML example

---
receipt:     Oz-Ware Purchase Invoice
date:        2007-08-06
customer:
    given:   Dorothy
    family:  Gale

items:
    - part_no:   A4786
      descrip:   Water Bucket (Filled)
      price:     1.47
      quantity:  4

    - part_no:   E1628
      descrip:   High Heeled "Ruby" Slippers
      price:     100.27
      quantity:  1
...

Page 82: Web data exchange formats Introduction and Overview.

JSON versus YAML

• JSON syntax is a subset of YAML 1.2

• Most JSON documents can be parsed by a YAML parser

• JSON's semantic structure is equivalent to the optional "inline-style" of writing YAML

• The Official YAML Web Site: http://www.yaml.org/

Page 83: Web data exchange formats Introduction and Overview.

XML Resources

• Book “An Introduction to XML and Web Technologies”, A. Moller and M. Schwartzbach, 2006

• Articles, online tutorials, and other technical resources on XML standards and technologies: http://www.ibm.com/developerworks/xml

• IBM developerWorks: Introduction to XML: http://www.ibm.com/developerworks/edu/x-dw-xmlintro-i.html

Page 84: Web data exchange formats Introduction and Overview.

XML Resources

• Java Tutorial: Java API for XML Processing (JAXP): http://docs.oracle.com/javase/tutorial/jaxp/

• Java Tutorial: Introduction to JAXB: http://docs.oracle.com/javase/tutorial/jaxb/intro/

Page 85: Web data exchange formats Introduction and Overview.

JSON Resources

• Java API for JSON Processing: An Introduction to JSON: http://www.oracle.com/technetwork/articles/java/json-1973242.html

• Java EE 7 Tutorial: JSON Processing http://docs.oracle.com/javaee/7/tutorial/doc/jsonp.htm