XML Introduction

All rights reserved © 2005, Alcatel XML Introduction / 6 November, 2013 XML Introduction Ing. Marco BRESCIANI

Alcatel Italia 2006-04-13 - 2006-04-18 - Course held for Wireless Transmission Division, R&D Software Competence Center.

Transcript of XML Introduction

Page 1: XML Introduction

XML Introduction

Ing. Marco BRESCIANI

Page 2: XML Introduction

What Is XML?

XML is a W3C Recommendation (http://www.w3.org/XML/); the acronym stands for eXtensible Markup Language:

It is a markup language, just like HTML.

It was designed to describe data (metadata language):

It does not define or perform operations on data;

It needs a grammar to describe data:

XML Schema is a standard grammar;

DTD is an older standard grammar.

It does not define tags:

HTML is <html></html>;

XML is NOT <xml></xml>.

It is self-descriptive: an XML file with its grammar is self-contained.

Page 3: XML Introduction

Why XML?

XML describes data:

Keeps data separated from their graphical layout;

Allows automatic data management and exchange;

Can define new languages in order to produce specific data formats;

A single XML file and the data it contains can be managed in different

ways.

XML is precise and safe:

HTML allows this: <HtML> content </HTml> or <HTML> content (?);

XML enforces this: <tag> content </tag> or <tag />.

It refers to a grammar that can be used to perform further checks.
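
A minimal sketch of this strictness, assuming Python and its standard xml.etree.ElementTree parser (the helper name is_well_formed is ours, not part of any standard). It only checks well-formedness, not validity against a grammar:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Matching tags and the empty-element form are both accepted.
print(is_well_formed("<tag> content </tag>"))
print(is_well_formed("<tag />"))
# Mixed-case or missing closing tags are rejected by an XML parser.
print(is_well_formed("<HtML> content </HTml>"))
print(is_well_formed("<HTML> content"))
```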

Page 4: XML Introduction

How XML?

XML was designed to be processed by software in order to produce different outputs, to evaluate data and to transform them in many ways… simply using text files!

A simple XML file:

<article>

<paragraph title="Paragraph Title">

<text>This is the text</text>

</paragraph>

</article>
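
As an illustration of "managed through software", the sketch below reads the file above with Python's standard xml.etree.ElementTree module. The function name paragraphs is a hypothetical helper chosen for this example:

```python
import xml.etree.ElementTree as ET

ARTICLE = """<article>
  <paragraph title="Paragraph Title">
    <text>This is the text</text>
  </paragraph>
</article>"""


def paragraphs(xml_text):
    """Return a (title, text) pair for every <paragraph> in the article."""
    root = ET.fromstring(xml_text)
    return [(p.get("title"), p.findtext("text")) for p in root.iter("paragraph")]


for title, text in paragraphs(ARTICLE):
    print(title, "|", text)
```

The same parsed tree could just as well be turned into HTML, a report or a database row: the XML file itself does not say which.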

Page 5: XML Introduction

Some XML Details

There are some constraints that each XML file must observe:

An XML file should start with a prolog (the XML declaration), which is optional in XML 1.0 but required in XML 1.1:

<?xml version='1.0' encoding='utf-8'?>

where version can be “1.0” or “1.1” and encoding is the name of a character encoding (IANA-registered names are recommended): “utf-8”, “iso-8859-15”, “utf-16”, “iso-2022-jp”, …;

Element names are case-sensitive: <TAG>, <Tag> and <tag> represent three different elements. Lowercase names are a common convention;

Elements must be properly nested: tags are closed in the reverse order in which they were opened;

Attribute values must be enclosed in single (') or double (") quotes;

A document must contain a single root element that contains all the others:

<?xml version='1.0' encoding='utf-8'?>

<root> Document content (other element tags) </root>
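
The rules above can be tried out with a short sketch, assuming Python's standard xml.etree.ElementTree parser. The sample documents and the helper name parses are hypothetical, written only for this illustration:

```python
import xml.etree.ElementTree as ET


def parses(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Hypothetical sample documents: one correct, the others each break one rule.
SAMPLES = {
    "declaration and single root": "<?xml version='1.0' encoding='utf-8'?><root><child name='x'/></root>",
    "case mismatch": "<Tag>text</tag>",
    "wrong nesting": "<root><a><b></a></b></root>",
    "unquoted attribute": "<root name=x/>",
    "two root elements": "<root/><root/>",
}

for label, document in SAMPLES.items():
    print(label, "->", "accepted" if parses(document) else "rejected")
```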

Page 6: XML Introduction

XML Definition

As stated, XML does not define tags, so we cannot say (as we can for HTML) that the <em> tag marks emphasized text or that the <table> tag marks the beginning of a tabular data structure, and so on.

How can we define XML? XML can be defined by defining an XML application.

What is an XML application? An XML application is a language, based on the XML structure/grammar, that defines the structure of a data set.

What is XML? XML is a standard way to define data structures (and much more… we’ll see…).

Page 7: XML Introduction

Smart XML Example

As stated, XML does not define how to use data, only their logical representation. This is a set of data:

<?xml version="1.0" encoding="UTF-8"?>

<X3D profile="Immersive"

xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"

xsd:noNamespaceSchemaLocation="http://www.web3d.org/specifications/x3d-3.0.xsd">

<head>

<meta content="Scacchiera.x3d" name="filename"/>

<meta content="Scacchiera Tridimensionale di Star Trek" name="description"/>

<meta content="Marco Bresciani" name="author"/>

<meta content="Marco Bresciani" name="translator"/>

<meta content="1998" name="created"/>

<meta content="2004-09-01" name="translated"/>

<meta content="2005-02-23" name="revised"/>

<meta content="200502.23.1905" name="version"/>

<meta content="http://marcobresciani.altervista.org" name="reference"/>

How can this data be represented? One possible representation was shown on the right of the original slide.

Page 8: XML Introduction

Another XML Example

Remember: XML does not define

how to use data!

<project_statistics>

<master_url>http://setiathome.ssl.berkeley.edu/</master_url>

<daily_statistics>

<day>1138233600.000000</day>

<user_total_credit>25489.045579</user_total_credit>

<user_expavg_credit>89.772845</user_expavg_credit>

<host_total_credit>5259.731942</host_total_credit>

<host_expavg_credit>89.661763</host_expavg_credit>

</daily_statistics>

<daily_statistics>

<day>1138579200.000000</day>

<user_total_credit>25615.527457</user_total_credit>

<user_expavg_credit>70.733574</user_expavg_credit>

<host_total_credit>5386.213820</host_total_credit>

<host_expavg_credit>70.659423</host_expavg_credit>

</daily_statistics>

The data above could be represented as shown on the left of the original slide, or in any other way.
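
A sketch of one such "other way", assuming Python's standard library: it turns the statistics into a date-to-credit mapping. The excerpt is shortened, the closing </project_statistics> tag is added so that it is a complete document, and reading <day> as a Unix timestamp is our assumption. The function name credit_by_date is hypothetical:

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Shortened excerpt of the data above; the closing root tag is added here
# (an assumption) so that the fragment is a complete document.
STATS = """<project_statistics>
  <master_url>http://setiathome.ssl.berkeley.edu/</master_url>
  <daily_statistics>
    <day>1138233600.000000</day>
    <user_total_credit>25489.045579</user_total_credit>
  </daily_statistics>
  <daily_statistics>
    <day>1138579200.000000</day>
    <user_total_credit>25615.527457</user_total_credit>
  </daily_statistics>
</project_statistics>"""


def credit_by_date(xml_text):
    """Map each day (ISO date, UTC) to the user's total credit on that day."""
    root = ET.fromstring(xml_text)
    result = {}
    for stats in root.iter("daily_statistics"):
        seconds = float(stats.findtext("day"))  # assumed to be a Unix timestamp
        day = datetime.fromtimestamp(seconds, tz=timezone.utc).date().isoformat()
        result[day] = float(stats.findtext("user_total_credit"))
    return result


for day, credit in credit_by_date(STATS).items():
    print(day, credit)
```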

Page 9: XML Introduction

XML: « My Computer » Example (1 of 5)

Now we begin working with some basic XML through an (almost) complete example. After this, we will describe XML grammar(s) in detail.

This sample XML file (briefly) represents the structure and components of a generic PC (not completely true… we’ll see that): the parts that compose the PC, the details about those parts and so on.

The XML file, although simple, gives beginners a quick overview of the XML standard itself, introducing its features and possibilities through a simple structure based on a familiar object.

Let’s see the XML file: please note that elements (tags) are uppercase to enhance readability…

Page 10: XML Introduction

XML: « My Computer » Example (2 of 5)

<?xml version="1.0"?>

<!DOCTYPE PARTS SYSTEM "parts.dtd">

<?xml-stylesheet type="text/css" href="xmlpartsstyle.css"?>

<PARTS><TITLE>Computer Parts</TITLE>

<PART>

<ITEM>Motherboard</ITEM>

<MANUFACTURER>ASUS</MANUFACTURER>

<MODEL>P3B-F</MODEL>

<COST>123.00</COST>

</PART>

<PART>

<ITEM>Video Card</ITEM>

<MANUFACTURER>ATI</MANUFACTURER>

<MODEL>All-in-Wonder Pro</MODEL>

<COST>160.00</COST>

</PART>

<PART>

<ITEM>Sound Card</ITEM>

<MANUFACTURER>Creative Labs</MANUFACTURER>

<MODEL>Sound Blaster Live</MODEL>

<COST>80.00</COST>

</PART>

<PART>

<ITEM>19 inch Monitor</ITEM>

<MANUFACTURER>LG Electronics</MANUFACTURER>

<MODEL>995E</MODEL>

<COST>290.00</COST>

</PART>

</PARTS>

Page 11: XML Introduction

XML: « My Computer » Example (3 of 5)

On the previous page we saw that:

A computer is a list of <PARTS> with a name defined by a <TITLE>;

The list of <PARTS> is composed of a number of <PART> elements;

Each <PART> has its own details:

An <ITEM> that describes the kind of part we are writing about;

A <MANUFACTURER> that states the builder of the <PART>;

A <MODEL> that refines the description of the <PART> given by <ITEM>;

A <COST> that specifies the money you spent to buy that component.

We also saw a <!DOCTYPE PARTS SYSTEM "parts.dtd"> declaration: this is used to relate the XML file to its grammar. Let’s see it…
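
Before moving to the grammar, here is a sketch of how software could use the structure just described, assuming Python's standard library. The document is the one from the previous page without its DOCTYPE and stylesheet lines; the function name summarize is hypothetical:

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The "My Computer" document, without the DOCTYPE and stylesheet lines.
PARTS = """<PARTS><TITLE>Computer Parts</TITLE>
  <PART><ITEM>Motherboard</ITEM><MANUFACTURER>ASUS</MANUFACTURER>
        <MODEL>P3B-F</MODEL><COST>123.00</COST></PART>
  <PART><ITEM>Video Card</ITEM><MANUFACTURER>ATI</MANUFACTURER>
        <MODEL>All-in-Wonder Pro</MODEL><COST>160.00</COST></PART>
  <PART><ITEM>Sound Card</ITEM><MANUFACTURER>Creative Labs</MANUFACTURER>
        <MODEL>Sound Blaster Live</MODEL><COST>80.00</COST></PART>
  <PART><ITEM>19 inch Monitor</ITEM><MANUFACTURER>LG Electronics</MANUFACTURER>
        <MODEL>995E</MODEL><COST>290.00</COST></PART>
</PARTS>"""


def summarize(xml_text):
    """Return (title, list of item names, total cost) for a PARTS document."""
    root = ET.fromstring(xml_text)
    title = root.findtext("TITLE")
    items = [part.findtext("ITEM") for part in root.findall("PART")]
    # Decimal keeps the two-digit cents exactly as written in the file.
    total = sum((Decimal(part.findtext("COST")) for part in root.findall("PART")),
                Decimal("0"))
    return title, items, total


title, items, total = summarize(PARTS)
print(title, items, total)
```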

Page 12: XML Introduction

XML: « My Computer » Example (4 of 5)

Using the DTD standard (simpler, for beginners) we can define the

grammar of our “My Computer” description:

<!ELEMENT PARTS (TITLE?, PART*)>

<!ELEMENT TITLE (#PCDATA)>

<!ELEMENT PART (ITEM, MANUFACTURER, MODEL, COST)+>

<!ATTLIST PART type (computer|auto|airplane) #IMPLIED>

<!ELEMENT ITEM (#PCDATA)>

<!ELEMENT MANUFACTURER (#PCDATA)>

<!ELEMENT MODEL (#PCDATA)>

<!ELEMENT COST (#PCDATA)>

The DOCTYPE declaration described before identifies the language we are going to use in our XML file. The grammar file defines the details of that language. Let’s check the DTD grammar, step by step…

Page 13: XML Introduction

XML: « My Computer » Example (4a of h)

We said an XML file must contain a single element which contains all the other elements:

<!ELEMENT PARTS (TITLE?, PART*)>

This declaration describes the PARTS element (the root element named in the DOCTYPE declaration) and states that it contains a sequence (, symbol) of zero or one (? symbol) TITLE elements followed by zero or more (* symbol) PART elements.

So, a file that contains:

<PARTS><TITLE></TITLE><TITLE></TITLE>…

won’t be valid, because it has more than one TITLE element.
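
The ?, * and , symbols behave much like a regular expression over the names of the child elements. The sketch below, in Python, imitates the content model (TITLE?, PART*) in that way. It is only an illustration of the idea, not a DTD validator: the standard xml.etree.ElementTree module does not validate against a DTD, and the names PARTS_MODEL and children_match are hypothetical:

```python
import re
import xml.etree.ElementTree as ET

# (TITLE?, PART*) rewritten as a regular expression over child element names,
# each name followed by one space.
PARTS_MODEL = re.compile(r"(TITLE )?(PART )*")


def children_match(xml_text, model=PARTS_MODEL):
    """Check only the direct children of the root against the model."""
    root = ET.fromstring(xml_text)
    names = "".join(child.tag + " " for child in root)
    return model.fullmatch(names) is not None


print(children_match("<PARTS><TITLE/><PART/><PART/></PARTS>"))
print(children_match("<PARTS/>"))
print(children_match("<PARTS><TITLE></TITLE><TITLE></TITLE></PARTS>"))
print(children_match("<PARTS><PART/><TITLE/></PARTS>"))
```

The first two documents match the model; the last two do not, because of the repeated TITLE and the wrong order respectively.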

Page 14: XML Introduction

XML: « My Computer » Example (4b of h)

The second declaration in the DTD file describes the content type of the TITLE element:

<!ELEMENT TITLE (#PCDATA)>

This means that the TITLE element can contain any text that the XML standard treats as #PCDATA (parsed character data).

This is a generic sequence of valid characters: letters, numbers, symbols and so on (the characters < and & must be written as &lt; and &amp;). Valid #PCDATA data are:

Afe5o 8ghert

4534

-.f,.5,g

… and so on. The user who defines this element would probably use meaningful data such as: <TITLE>This is My Computer</TITLE>.
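
Because #PCDATA is parsed, text containing markup characters has to be escaped before it is written into an element. A small sketch, assuming Python's standard xml.sax.saxutils.escape function (the helper title_element is hypothetical):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def title_element(text):
    """Build a <TITLE> element as a string, escaping &, < and >."""
    return "<TITLE>" + escape(text) + "</TITLE>"


raw = "Parts & <spares>"
xml_text = title_element(raw)
print(xml_text)
# Parsing the element gives back the original text.
print(ET.fromstring(xml_text).text == raw)
```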

Page 15: XML Introduction

XML: « My Computer » Example (4c of h)

The third declaration describes the structure of the PART element:

<!ELEMENT PART (ITEM, MANUFACTURER, MODEL, COST)+>

as a sequence (, symbol) of four other elements: ITEM, MANUFACTURER, MODEL and COST.

The + symbol applies to the whole group: inside each PART element, the sequence of four elements must be present one or more times.

Page 16: XML Introduction

XML: « My Computer » Example (4d to h)

All the other elements are quite simple and state the fact that

the PART sub-elements are defined as simple text

(#PCDATA):

<!ELEMENT ITEM (#PCDATA)>

<!ELEMENT MANUFACTURER (#PCDATA)>

<!ELEMENT MODEL (#PCDATA)>

<!ELEMENT COST (#PCDATA)>

So, even those elements are defined as generic and not

constrained to a specific format.

Page 17: XML Introduction

XML: « My Computer » Example (5 of 5)

What have we learned?

XML can define structured data;

Anyone can define their own XML application;

Each XML application represents a new language related to the data it can represent and manage;

An XML-based language defines its capabilities through a grammar that states how data can be structured and related.

What are we going to learn?

Why a grammar?

Which kind of grammar?

Many other aspects of XML…

Page 18: XML Introduction

The Need for a Grammar

As stated, a grammar defines the “words” (elements or tags) we can use in our XML-based language. Is it needed?

XML is defined to be managed by software:

How can software know whether a tag (element) is allowed in my language? The BASIC programming language allows the PRINT instruction while Java does not;

How can software know whether elements are written in the correct order? PRINT “HELLO” is a valid instruction while “HELLO” PRINT is not;

How can software know how to relate data? Does an address belong to a house or to the owner of the house?

XML can also be interpreted and produced in different ways. A grammar helps software describe the data content;

Grammars are also used by automatic parsers, validators and editors in order to help the user write XML documents by hand.

Page 19: XML Introduction

Which Grammar?

XML can use and be defined by two standard grammars: DTD and XML Schema.

A few notes about DTD:

It’s older;

It uses a specific syntax, different from XML;

It cannot define or constrain data types.

A few notes about XML Schema:

It’s a newer standard;

It’s an XML application: it is defined using XML and uses XML to define XML languages!

It allows constraints on data types and contents.

Page 20: XML Introduction

XML Schema: a Brief Introduction (1 of 2)

The W3C XML Schema Definition Language is an XML language (W3C

Recommendation) for describing and constraining the content of XML

documents. The purpose of an XML Schema is to define the legal

building blocks of an XML document, just like a DTD. An XML Schema:

defines elements that can appear in a document;

defines attributes that can appear in a document;

defines which elements are child elements;

defines the order of child elements;

defines the number of child elements;

defines whether an element is empty or can include text;

defines data types for elements and attributes;

defines default and fixed values for elements and attributes.

Page 21: XML Introduction

XML Schema: a Brief Introduction (2 of 2)

XML Schema has more features than DTD:

It supports data types;

It uses XML syntax;

It makes data exchange less ambiguous:

A date like this: “03-11-2004” could be interpreted as 3 November or as 11 March. An XML element like: <date type="date">2004-03-11</date> ensures a mutual understanding because the XML Schema data type date requires the format YYYY-MM-DD.

It is extensible:

A schema can be used inside another schema or modified/extended/merged by other schemas.
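
The date example can be sketched in Python as follows. This is a simplified check of the YYYY-MM-DD form only: it ignores the optional time zone suffix and the other details of the XML Schema date type, and the function name parse_xs_date is hypothetical:

```python
import re
from datetime import date

# Simplified lexical form: four-digit year, two-digit month, two-digit day.
_DATE_FORM = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_xs_date(text):
    """Parse a YYYY-MM-DD string; raise ValueError for anything else."""
    match = _DATE_FORM.fullmatch(text)
    if match is None:
        raise ValueError("not in YYYY-MM-DD form: %r" % text)
    year, month, day = (int(group) for group in match.groups())
    return date(year, month, day)  # also rejects impossible dates such as month 13


parsed = parse_xs_date("2004-03-11")
print(parsed.month, parsed.day)

try:
    parse_xs_date("03-11-2004")
except ValueError as error:
    print("rejected:", error)
```

With a fixed format, both sides read 2004-03-11 as 11 March, and the ambiguous form is refused instead of being guessed.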

Page 22: XML Introduction

XML Schema Vs. DTD: Example (1 of 4)

This is a simple example of an XML file:

<?xml version="1.0"?>

<note>

<to>Tove</to>

<from>Jani</from>

<heading>Reminder</heading>

<body>Don't forget me this weekend!</body>

</note>

Its elements and (data) structure are very simple.

Page 23: XML Introduction

XML Schema Vs. DTD: Example (2 of 4)

The DTD file that describes the previous XML sample could

be:

<!ELEMENT note (to, from, heading, body)>

<!ELEMENT to (#PCDATA)>

<!ELEMENT from (#PCDATA)>

<!ELEMENT heading (#PCDATA)>

<!ELEMENT body (#PCDATA)>

We have already seen what these element declarations mean…

Page 24: XML Introduction

XML Schema Vs. DTD: Example (3 of 4)

The XML Schema that describes the XML sample data file could be:

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">

<xs:element name="note">

<xs:complexType><xs:sequence>

<xs:element name="to" type="xs:string"/>

<xs:element name="from" type="xs:string"/>

<xs:element name="heading" type="xs:string"/>

<xs:element name="body" type="xs:string"/>

</xs:sequence>

</xs:complexType>

</xs:element>

</xs:schema>

Page 25: XML Introduction

XML Schema Vs. DTD: Example (4 of 4)

Let’s focus on elements: the attributes of the <xs:schema> tag are details we don’t need (for now).

With respect to elements, the DTD <!ELEMENT> declaration becomes <xs:element>: the name of the element is an attribute of the tag itself;

An <xs:complexType> means that an element contains other elements;

The <xs:sequence> element describes a mandatory, ordered list of elements inside another element, just like the comma (,) operator did in DTD;

The xs:string value of the type attribute defines a type (!) and says that the related elements can contain strings: <to>content string</to>.

Page 26: XML Introduction

References to XML Schema or DTD

So, how can an XML file refer to the XML Schema or DTD that describes it? Let’s see the reference to an XML Schema:

<?xml version="1.0"?>

<note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">

… </note>

And then the DTD:

<?xml version="1.0"?>

<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">

<note> … </note>
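
A sketch of how software can read the schema reference, assuming Python's standard xml.etree.ElementTree module. The <to> child stands in for the elided content and the function name schema_locations is hypothetical. The parser only reads the hint: it neither fetches note.xsd nor validates the document:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The <to> child is a placeholder for the elided content of the note.
NOTE = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
</note>"""


def schema_locations(xml_text):
    """Return {namespace: schema location} read from xsi:schemaLocation."""
    root = ET.fromstring(xml_text)
    # The attribute value is a whitespace-separated list of pairs.
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


print(schema_locations(NOTE))
```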

Page 27: XML Introduction

XML Schema Data Types

This grammar language defines many data types:

xs:string – generic sequence of characters;

xs:boolean – true/false values that can be represented with 1/0 too;

xs:decimal – a sequence of digits, with an optional leading sign (+ or -) and an optional decimal separator (.);

xs:positiveInteger – integer values greater than zero;

xs:byte – a signed 8-bit integer, from –128 to 127;

xs:unsignedByte – an unsigned 8-bit integer, from 0 to 255;

and some structural elements used to build complex content:

xs:complexType – can contain other elements;

xs:sequence – describes a sequence of elements;

xs:choice – describes alternative elements.
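
To give a feeling for what a schema validator checks for the simple types, here is a simplified sketch in Python. It is not a validator: whitespace handling and other details of the XML Schema specification are left out, and all function names are hypothetical. Note that xs:positiveInteger starts at 1, so 0 is rejected:

```python
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")
_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_boolean(text):
    return text in ("true", "false", "1", "0")


def is_decimal(text):
    return _DECIMAL.fullmatch(text) is not None


def _integer_in_range(text, low=None, high=None):
    """True if `text` is an integer literal within the optional bounds."""
    if _INTEGER.fullmatch(text) is None:
        return False
    value = int(text)
    return (low is None or value >= low) and (high is None or value <= high)


def is_positive_integer(text):
    return _integer_in_range(text, low=1)


def is_byte(text):
    return _integer_in_range(text, low=-128, high=127)


def is_unsigned_byte(text):
    return _integer_in_range(text, low=0, high=255)


for check, sample in [(is_boolean, "1"), (is_decimal, "-12.5"),
                      (is_positive_integer, "0"), (is_byte, "128"),
                      (is_unsigned_byte, "255")]:
    print(check.__name__, repr(sample), check(sample))
```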

Page 28: XML Introduction

XML: Some (Very Few) References

The XML Standard: http://www.w3.org/XML/;

XML Schema: http://www.w3.org/XML/Schema;

A tutorial: http://www.w3schools.com/schema/default.asp;

DTD Tutorial: http://www.w3schools.com/dtd/default.asp;

Java Technology & XML: http://java.sun.com/xml/;

XML and C:

The XML C parser and toolkit of Gnome: http://xmlsoft.org/;

Apache Xerces API: http://xml.apache.org/index.html;

Web Syndication (with XML):

http://en.wikipedia.org/wiki/Web_syndication;

Page 29: XML Introduction

XML: Where From Here?

The XML family of standards has many other features and capabilities:

It allows querying a database through the XML Query (XQuery) standard;

It allows automatic transformations from XML to any other language ((X)HTML first of all), data structure, … through the XSL Transformations (XSLT) language family;

It can describe “well-known” data and information such as:

geometric drawings and diagrams, through SVG (Scalable Vector Graphics);

mathematical expressions, through MathML;

three-dimensional environments, with X3D;

chess games and player information, with LCARS-ML, ChessGML, ChessML, …;

It allows information distribution and syndication with RSS, Atom, …

To be continued… Let’s take a look at the “big picture”!

Page 30: XML Introduction

XML Family: The Big Picture

Page 31: XML Introduction

www.alcatel.com