Dr Alexiei Dingli: XML Technologies, XML

Post on 02-Jan-2016

1

Dr Alexiei Dingli

XML Technologies

XML

2

• XML stands for EXtensible Markup Language

• XML is a markup language much like HTML

• XML was designed to carry data, not to display data

• XML tags are not predefined. You must define your own tags

• XML is designed to be self-descriptive

• XML is a W3C Recommendation

What is XML?

3

• XML
  – Is a meta language
  – Focuses on the transport and storage of data
  – Is not a replacement for HTML!

• HTML
  – Is a vocabulary (an application) of SGML
  – Focuses on display (formatting)

XML vs HTML

4

• Was not designed to do anything ...

• Just to structure, store and transport information

<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>

XML is just pure information
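Because the sticky note is only data, nothing happens until some program writes or reads it. As a minimal sketch (Python and its standard `xml.etree.ElementTree` module are an assumption here, not something the slides prescribe), the same note can be built, turned into plain text, and read back:

```python
import xml.etree.ElementTree as ET

# Build the sticky note from the slide as a small tree of elements.
note = ET.Element("stickynote")
ET.SubElement(note, "to").text = "Joseph"
ET.SubElement(note, "from").text = "Tom"
ET.SubElement(note, "body").text = "Purchase tickets!"

# Serialise to plain text: this string is all that is stored or transported.
xml_text = ET.tostring(note, encoding="unicode")

# Any XML-aware program can turn the text back into structured data.
parsed = ET.fromstring(xml_text)
recipient = parsed.findtext("to")
message = parsed.findtext("body")
print(f"{recipient}: {message}")
```

What to do with the recipient and the message is left entirely to the application; the XML itself carries no behaviour.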

5

• The format of a .xml document is plain text

• Only XML-aware applications can interpret it correctly

• But it can be easily viewed/edited by anyone using a simple text editor

Tip: Internet Explorer can be used as a viewer and validator of XML (Eg1, Eg2)

Format

6

• The tags in the example above (like <to> and <from>) are not defined in any XML standard. These tags are "invented" by the author of the XML document

• That is because XML has no predefined tags: it is a meta language!

• The tags used in HTML (and the structure of HTML) are predefined. HTML documents can only use tags defined in the HTML standard (like <p>, <h1>, etc.)

• XML allows the author to define his own tags and his own document structure

Let’s be creative ...

7

XML is a software- and hardware-independent tool for carrying information

Note: the specification of the language can be found at http://www.w3.org/XML/

Definition

8

• If dynamic data is embedded directly in your HTML document, the HTML has to be edited each time the data changes, which is a lot of work

• With XML, data can be stored in separate XML files

• You can then concentrate on using HTML for layout and display, and be sure that changes in the underlying data will not require any changes to the HTML

• With a few lines of JavaScript, one can read an external XML file and update the data content of the HTML.

Content Vs. Layout
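The slide mentions JavaScript for this job; the sketch below uses Python instead, purely for illustration. The catalog contents, the `PAGE` template and the `render` function are made-up names and data. The point it tries to show is that the layout stays fixed while the data comes from the XML:

```python
import xml.etree.ElementTree as ET
from html import escape
from string import Template

# Hypothetical data; in practice this text would live in a separate .xml file.
CATALOG_XML = """
<catalog>
  <cd><title>First Album</title><artist>Some Artist</artist></cd>
  <cd><title>Second Album</title><artist>Another Artist</artist></cd>
</catalog>
"""

# The layout: edited by hand once, independent of the data.
PAGE = Template("<html><body><h1>CDs</h1><ul>\n$items\n</ul></body></html>")


def render(xml_text: str) -> str:
    """Fill the fixed HTML layout with whatever the XML currently holds."""
    root = ET.fromstring(xml_text)
    items = []
    for cd in root.findall("cd"):
        title = escape(cd.findtext("title", default=""))
        artist = escape(cd.findtext("artist", default=""))
        items.append(f"<li>{title} by {artist}</li>")
    return PAGE.substitute(items="\n".join(items))


html_page = render(CATALOG_XML)
print(html_page)
```

Adding a third `<cd>` to the XML changes the output without touching `PAGE`.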

9

• Most systems have data in incompatible formats

• XML is stored in plain text, thus it is software/hardware independent

• Much easier to share information

Simple data sharing

10

• As such, you can create new languages ...
  – XHTML, a reformulation of HTML as an XML language
  – WSDL for describing available web services
  – WML, the markup language used with WAP on handheld devices
  – RSS for news feeds
  – RDF and OWL for describing resources and ontologies
  – SMIL for describing multimedia for the web

XML is a meta language!

11

• All XML documents are in the form of a tree

<root>
  <child>
    <subchild>.....</subchild>
  </child>
</root>

XML Tree (1)

12

root

Child 1 Child 2 Child 3

SubChild 1

SubSubChild 1

SubChild 2

XML Tree (2)

13

• Simple example ...

<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>

XML Tree (3)

14

• Root element: <stickynote>

• Child elements: <to>, <from>, <body>

XML Tree (4)

15

xml
  stickynote
    to
      Joseph
    from
      Tom
    body
      Purchase tickets!

XML Tree (5)
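The tree view of the sticky note can also be produced by a program. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` parser (the `outline` helper is a hypothetical name), walks the parsed document recursively and indents each level:

```python
import xml.etree.ElementTree as ET

STICKYNOTE = """<stickynote>
  <to>Joseph</to>
  <from>Tom</from>
  <body>Purchase tickets!</body>
</stickynote>"""


def outline(element, depth=0):
    """Return one indented line per element, with its text content if any."""
    text = (element.text or "").strip()
    label = f"{element.tag}: {text}" if text else element.tag
    lines = ["  " * depth + label]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


tree_lines = outline(ET.fromstring(STICKYNOTE))
print("\n".join(tree_lines))
```

The same function works for deeper documents, since each child is handled exactly like the root.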

16

• Create the tree for ...

<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J. K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

34U exercise

17

XML Commandments

18

For every opening tag, there must be a closing tag

Incorrect: <p>This is a paragraph

Correct: <p>This is a paragraph</p>

Commandment 1

19

XML Tags are case sensitive

<Message>This is incorrect</message>

<message>This is correct</message>

Commandment 2

20

XML elements must be properly nested

Incorrect: <b><i>This text is bold and italic</b></i>

Correct: <b><i>This text is bold and italic</i></b>

Commandment 3

21

XML Documents must have a root element

<root>

<child>

<subchild>.....</subchild>

</child>

</root>

Commandment 4

22

XML attribute values must be quoted

Incorrect: <stickynote date=1/10/2008>

Correct: <stickynote date="12/11/2007">

Commandment 5

23

Some characters have special meaning in XML and should be replaced by an entity reference

<message>Meet me at Tom's place</message>

<message>Meet me at Tom&apos;s place</message>

Commandment 6

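Entity reference   Symbol   Meaning
&lt;               <        less than
&gt;               >        greater than
&amp;              &        ampersand
&apos;             '        apostrophe
&quot;             "        quotation mark

Note: only < and & are strictly illegal inside text content; replacing the other three is a good habit.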

24

Comments in XML

<!-- This is a comment -->

Commandment 7

25

<book category="CHILDREN">

<title>Harry Potter</title>

</book>

• book is an element
  – which can contain
    • other elements (such as title)
    • or text content (such as Harry Potter in title)

• category is an attribute
  – whose value is CHILDREN

Elements vs Attributes
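From a program's point of view the two are reached differently. A small sketch, assuming Python's standard `xml.etree.ElementTree` module, reads the attribute and the child element of the book above:

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book category="CHILDREN"><title>Harry Potter</title></book>'
)

# An attribute is a name/value pair attached to the element itself.
category = book.get("category")

# A child element is a node of its own, with its own text content.
title = book.findtext("title")

# Iterating over an element lists its child elements, not its attributes.
child_tags = [child.tag for child in book]

print(category, title, child_tags)
```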

26

• Naming rules ...

– Names can contain letters, numbers and other characters

– Names must not start with a number or punctuation character

– Names must not start with the letters xml (or XML, or Xml, etc)

– Names cannot contain spaces

What’s in a name?
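The naming rules can be approximated in a few lines. This is only a sketch under stated assumptions: it is ASCII-only (real XML allows many more letters), it permits a leading underscore (which XML allows even though it looks like punctuation), it rejects colons because they are reserved for namespaces, and `is_acceptable_name` is a hypothetical helper name:

```python
import re

# First character: a letter or underscore. Rest: letters, digits, "_", "." or "-".
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")


def is_acceptable_name(name: str) -> bool:
    """Check a tag or attribute name against the simplified rules above."""
    if not _NAME.fullmatch(name):
        return False  # empty, contains a space, or starts with a digit/punctuation
    # Names beginning with "xml" in any letter case are reserved.
    return not name.lower().startswith("xml")


for candidate in ["first_name", "1st_name", "first name", "XmlData"]:
    print(candidate, is_acceptable_name(candidate))
```

A name that passes this check is not necessarily a good name; it is merely one the slide's rules do not forbid.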

27

• Make names descriptive. Names with an underscore separator are nice: <first_name>, <last_name>.

• Names should be short and simple, like this: <book_title> not like this: <the_title_of_the_book_which_i_am_currently_reading>. 

• Avoid "-" characters. If you name something "first-name," some software may think you want to subtract name from first.

• Avoid "." characters. If you name something "first.name," some software may think that "name" is a property of the object "first."

• Avoid ":" characters. Colons are reserved to be used for something called namespaces.

• XML documents often have a corresponding database. A good practice is to use the naming rules of your database for the elements in the XML documents.

• Non-English letters like éòá are perfectly legal in XML, but watch out for problems if your software vendor doesn't support them.

Best (Name) Practices

28

• Attributes are generally used to provide additional information (metadata) that is not part of the data itself

• Attribute values must be quoted; for a double quote inside a double-quoted value, use &quot;

• Some limitations of attributes
  – attributes cannot contain multiple values
  – attributes cannot contain tree structures
  – attributes are not easily expandable in the future

• If in doubt, use elements

More into attributes ...
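Escaping is easy to get wrong by hand, so it is usually left to a library. A minimal sketch, assuming Python's standard `xml.sax.saxutils.escape` (the sample strings and the `remark` attribute are made up), escapes both element text and an attribute value that contains quotes, then checks that a parser gives the original strings back:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "Fish & chips < 5 euro at Tom's place"
remark = 'Tom said "hurry up"'

# escape() always handles &, < and >; further replacements are passed explicitly.
escaped_text = escape(text, {"'": "&apos;"})
escaped_remark = escape(remark, {'"': "&quot;"})

xml_text = (
    f'<stickynote remark="{escaped_remark}">'
    f"<body>{escaped_text}</body>"
    "</stickynote>"
)

# Parsing reverses the escaping, so the application sees the original strings.
note = ET.fromstring(xml_text)
print(note.get("remark"))
print(note.findtext("body"))
```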

29

<note date="10/01/2008">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<note>
  <date>10/01/2008</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<note>
  <date>
    <day>10</day>
    <month>01</month>
    <year>2008</year>
  </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

Spot the difference ...
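One way to see the difference between the three notes is to try to use the date. In this sketch (Python's standard library is an assumption, and the notes are shortened to the parts that matter), the first two forms give back an opaque string, while the third says which number is the day and which the month:

```python
import xml.etree.ElementTree as ET
from datetime import date

AS_ATTRIBUTE = '<note date="10/01/2008"><to>Tove</to></note>'
AS_ELEMENT = "<note><date>10/01/2008</date><to>Tove</to></note>"
AS_SUBELEMENTS = """<note>
  <date><day>10</day><month>01</month><year>2008</year></date>
  <to>Tove</to>
</note>"""

# Forms 1 and 2: the same string, only stored in a different place.
raw_from_attribute = ET.fromstring(AS_ATTRIBUTE).get("date")
raw_from_element = ET.fromstring(AS_ELEMENT).findtext("date")

# Form 3: each part is labelled, so no guessing between 10 January and 1 October.
parts = ET.fromstring(AS_SUBELEMENTS).find("date")
note_date = date(
    int(parts.findtext("year")),
    int(parts.findtext("month")),
    int(parts.findtext("day")),
)

print(raw_from_attribute, raw_from_element, note_date.isoformat())
```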

30

1. XML documents must have a root element

2. XML elements must have a closing tag

3. XML tags are case sensitive

4. XML elements must be properly nested

5. XML attribute values must be quoted

Well formed documents ...
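A parser enforces these five rules for you. The sketch below assumes Python's standard `xml.etree.ElementTree` parser, and `is_well_formed` is a hypothetical helper name; it feeds the parser one broken document per rule, plus one correct document:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


CASES = {
    "no single root element": "<to>Joseph</to><from>Tom</from>",
    "missing closing tag": "<p>This is a paragraph",
    "mismatched case": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>bold and italic</b></i>",
    "unquoted attribute": "<stickynote date=1/10/2008></stickynote>",
    "well formed": '<stickynote date="12/11/2007"><to>Joseph</to></stickynote>',
}

results = {label: is_well_formed(doc) for label, doc in CASES.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```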

31

• A valid document is a "well formed" XML document which also conforms to the rules of a Document Type Definition (DTD)

<!DOCTYPE note SYSTEM "Note.dtd">
<note>
</note>

Valid documents ...

32

• A DTD is used to define the structure of an XML document, but it is not itself written in XML!

<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>

Example DTD
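Python's standard `xml.etree.ElementTree` parser checks well-formedness only; it does not validate against a DTD. To make concrete what the DTD above demands, here is a hand-written stand-in for its rules. It is a simplified illustration, not a validating parser, and `matches_note_dtd` is a hypothetical name:

```python
import xml.etree.ElementTree as ET

# Mirrors <!ELEMENT note (to,from,heading,body)>: these children, in this order.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_dtd(xml_text: str) -> bool:
    """Partial check of the note DTD; raises ParseError if not well formed."""
    root = ET.fromstring(xml_text)
    if root.tag != "note":
        return False
    children = list(root)
    if [child.tag for child in children] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) means text only: no elements nested inside the children.
    return all(len(child) == 0 for child in children)


VALID_NOTE = (
    "<note><to>Tove</to><from>Jani</from>"
    "<heading>Reminder</heading><body>Don't forget me this weekend!</body></note>"
)
MISSING_HEADING = "<note><to>Tove</to><from>Jani</from><body>Hi</body></note>"

print(matches_note_dtd(VALID_NOTE), matches_note_dtd(MISSING_HEADING))
```

Both sample notes are well formed; only the first one is valid with respect to the DTD.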

33

• An XML Schema (XSD) is an XML-based alternative to a DTD

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Example XML Schema

34

• Errors in XML documents will stop your XML applications

• XML software should be small, fast, and compatible

• HTML browsers will display documents with errors (like missing end tags)

• HTML browsers are big and incompatible because they have a lot of unnecessary code to deal with (and display) HTML errors

Errors!
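The contrast can be observed directly. In this sketch (Python's standard `html.parser` and `xml.etree.ElementTree` modules are assumptions standing in for "an HTML browser" and "an XML application", and `TagCollector` is a hypothetical name), the same broken input is given to a lenient HTML parser and to a strict XML parser:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

BROKEN = "<p>This is a paragraph"  # the end tag is missing


class TagCollector(HTMLParser):
    """Record whatever the lenient HTML parser manages to recognise."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)


# The HTML parser carries on and reports what it found.
collector = TagCollector()
collector.feed(BROKEN)
collector.close()

# The XML parser stops at the first error.
try:
    ET.fromstring(BROKEN)
    xml_error = None
except ET.ParseError as exc:
    xml_error = str(exc)

print(collector.start_tags, "".join(collector.text))
print("XML parser error:", xml_error)
```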

35

• Just use a normal browser ...
  – simple.xml
  – cd_catalog.xml
  – plant_catalog.xml

XML Viewing

36

• Just use the Cascading Style Sheet (CSS)

• Without CSS

• With CSS

Better XML Viewing

37

CATALOG { background-color: #ffffff; width: 100%; }

CD { display: block; margin-bottom: 30pt; margin-left: 0; }

TITLE { color: #FF0000; font-size: 20pt; }

ARTIST { color: #0000FF; font-size: 20pt; }

COUNTRY,PRICE,YEAR,COMPANY { display: block; color: #000000; margin-left: 20pt; }

The CSS

38

• Use XSLT
  – XSLT is the recommended style sheet language of XML
  – XSLT (eXtensible Stylesheet Language Transformations) is far more sophisticated than CSS
  – One way to use XSLT is to transform XML into HTML before it is displayed

Even better XML viewing

39

• The XML

• The XSL

• The result

XSL Example

40

• Amazon just commissioned you to create an XML file for the following book:
  – Title: A.I. a modern approach
  – Author: Russell and Norvig
  – Publisher: Prentice Hall
  – Date of Publication: 2000
  – ISBN: 1234567
  – Dimensions: 10 x 5
  – Number of Pages: 500
  – Comments: 2 in store 1, 3 in store 2
  – Review: Quite interesting!
  – Image: http://www.amazon.com/AIBook

Exercise