All rights reserved © 2005, Alcatel XML Introduction / 6 November, 2013
XML Introduction
Ing. Marco BRESCIANI
What Is XML?
XML is a W3C Recommendation (http://www.w3.org/XML/); the acronym
means eXtensible Markup Language:
It is a markup language, just like HTML.
It was designed to describe data (it is a metalanguage):
It does not define or perform operations on data;
It uses a grammar to describe data:
XML Schema is a standard grammar;
DTD is an older standard grammar.
It does not define tags:
HTML is <html></html>;
XML is NOT <xml></xml>.
It is self-descriptive: an XML file with its grammar is self-contained.
Why XML?
XML describes data:
Keeps data separated from their graphical layout;
Allows automatic data management and exchange;
Can define new languages in order to produce specific data formats;
A single XML file and the data it contains can be managed in different
ways.
XML is precise and safe:
HTML allows this: <HtML> content </HTml> or <HTML> content (with no closing tag);
XML forces this: <tag> content </tag> or <tag />.
It can refer to a grammar, which makes further checks possible.
How XML?
XML has been defined to be managed through software in
order to produce different outputs, to evaluate data and to
transform them in many ways… simply using text files!
A simple XML file:
<article>
<paragraph title="Paragraph Title">
<text>This is the text</text>
</paragraph>
</article>
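As a rough illustration of XML being "managed through software", the sample file above can be read with a few lines of Python. This is only a sketch that uses the standard library module xml.etree.ElementTree; the variable names are made up for this example.

```python
import xml.etree.ElementTree as ET

# The sample document from the slide, held as a plain text string.
ARTICLE_XML = """
<article>
  <paragraph title="Paragraph Title">
    <text>This is the text</text>
  </paragraph>
</article>
"""

root = ET.fromstring(ARTICLE_XML)

# Walk every <paragraph>, reading its title attribute and its <text> child.
paragraphs = [
    (p.get("title"), p.findtext("text"))
    for p in root.findall("paragraph")
]

for title, text in paragraphs:
    print(f"{title}: {text}")
```

The same parsed tree could just as well feed an HTML page, a report or a database import: the text file itself does not say which.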
Some XML Details
There are some constraints that every XML file must observe:
An XML file should start with a prolog (the XML declaration):
<?xml version='1.0' encoding='utf-8'?>
where version can be “1.0” or “1.1” and encoding is the name of a
character encoding: “utf-8”, “iso-8859-15”, “utf-16”, “iso-2022-jp”, …;
the declaration is optional in XML 1.0 and mandatory in XML 1.1;
Element names are case-sensitive, and lowercase is a common
convention: <TAG>, <Tag> and <tag> are three different
elements;
Elements must be properly nested: tags are closed in the reverse
order in which they were opened;
Attribute values must be enclosed in ' or " quotes;
A document must contain a single root element that contains all the
others:
<?xml version='1.0' encoding='utf-8'?>
<root> Document content (other element tags) </root>
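The constraints above are what a parser checks when it decides whether a file is well-formed. A minimal sketch in Python, assuming the standard library parser xml.etree.ElementTree; the helper name is_well_formed and the sample strings are invented for this illustration:

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hypothetical inputs, roughly one per rule listed above.
samples = {
    "single root, properly nested": "<root><a><b/></a></root>",
    "case mismatch in tags": "<Tag>content</tag>",
    "wrong closing order": "<root><a><b></a></b></root>",
    "unquoted attribute value": "<root id=1/>",
    "two root elements": "<root/><root/>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'well-formed' if ok else 'rejected'}")
```

Only the first sample is accepted; each of the others breaks one of the rules.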
XML Definition
As stated, XML does not define tags, so we cannot say (as we can
for HTML) that the <em> tag marks emphasized text or that the
<table> tag starts a tabular data structure, and
so on.
How can we define XML? XML can be defined by defining an
XML application.
What is an XML application? An XML application is a
language, based on XML structure/grammar, that defines
the structure of a data set.
What is XML? XML is a standard way to define data
structures (and much more… we’ll see…).
Smart XML Example
As stated, XML does not define how
to use data, only their logical
representation. This is a set of data:
<?xml version="1.0" encoding="UTF-8"?>
<X3D profile="Immersive"
xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"
xsd:noNamespaceSchemaLocation="http://www.web3d.org/specifications/x3d-3.0.xsd">
<head>
<meta content="Scacchiera.x3d" name="filename"/>
<meta content="Scacchiera Tridimensionale di Star Trek"
name="description"/>
<meta content="Marco Bresciani" name="author"/>
<meta content="Marco Bresciani" name="translator"/>
<meta content="1998" name="created"/>
<meta content="2004-09-01" name="translated"/>
<meta content="2005-02-23" name="revised"/>
<meta content="200502.23.1905" name="version"/>
<meta content="http://marcobresciani.altervista.org"
name="reference"/>
…
How can this data be represented?
Look on the right!
Another XML Example
Remember: XML does not define
how to use data!
<project_statistics>
<master_url>http://setiathome.ssl.berkeley.edu/</master_url>
<daily_statistics>
<day>1138233600.000000</day>
<user_total_credit>25489.045579</user_total_credit>
<user_expavg_credit>89.772845</user_expavg_credit>
<host_total_credit>5259.731942</host_total_credit>
<host_expavg_credit>89.661763</host_expavg_credit>
</daily_statistics>
<daily_statistics>
<day>1138579200.000000</day>
<user_total_credit>25615.527457</user_total_credit>
<user_expavg_credit>70.733574</user_expavg_credit>
<host_total_credit>5386.213820</host_total_credit>
<host_expavg_credit>70.659423</host_expavg_credit>
</daily_statistics>
…
The data above could be represented
as shown on the left, or in any other way.
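To make the point concrete, here is one possible way of using this kind of data: a Python sketch that turns it into a small text table. The document below is a trimmed copy of the listing (two fields per record, with the closing tag added because the original is truncated), and reading <day> as seconds since the Unix epoch in UTC is an assumption of this example, not something the file states.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Trimmed copy of the slide's data; the closing root tag is added here.
STATS_XML = """
<project_statistics>
  <master_url>http://setiathome.ssl.berkeley.edu/</master_url>
  <daily_statistics>
    <day>1138233600.000000</day>
    <user_total_credit>25489.045579</user_total_credit>
  </daily_statistics>
  <daily_statistics>
    <day>1138579200.000000</day>
    <user_total_credit>25615.527457</user_total_credit>
  </daily_statistics>
</project_statistics>
"""

root = ET.fromstring(STATS_XML)

rows = []
for entry in root.findall("daily_statistics"):
    # Assumption: <day> holds seconds since the Unix epoch (UTC).
    seconds = float(entry.findtext("day"))
    day = datetime.fromtimestamp(seconds, tz=timezone.utc).date()
    credit = float(entry.findtext("user_total_credit"))
    rows.append((day.isoformat(), credit))

# One possible presentation among many: a plain text table.
for day, credit in rows:
    print(f"{day}  {credit:12.2f}")
```

A chart, a spreadsheet or a web page would be equally legitimate outputs of the same file.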
XML: « My Computer » Example (1 of 5)
Now we begin working with some basic XML through an (almost) complete
example. After this, we will describe XML grammar(s) in detail.
This sample XML file (briefly) represents the structure and components of
a generic PC (not completely true… we’ll see that): the parts that
compose the PC, the details about those parts and so on.
The XML file, though simple, gives beginners a quick
overview of the XML standard itself, by introducing its features and
possibilities with a simple structure based on a familiar object.
Let’s see the XML file: please note that element names (tags) are uppercase to
enhance readability…
XML: « My Computer » Example (2 of 5)
<?xml version="1.0"?>
<!DOCTYPE PARTS SYSTEM "parts.dtd">
<?xml-stylesheet type="text/css" href="xmlpartsstyle.css"?>
<PARTS><TITLE>Computer Parts</TITLE>
<PART>
<ITEM>Motherboard</ITEM>
<MANUFACTURER>ASUS</MANUFACTURER>
<MODEL>P3B-F</MODEL>
<COST>123.00</COST>
</PART>
<PART>
<ITEM>Video Card</ITEM>
<MANUFACTURER>ATI</MANUFACTURER>
<MODEL>All-in-Wonder Pro</MODEL>
<COST>160.00</COST>
</PART>
<PART>
<ITEM>Sound Card</ITEM>
<MANUFACTURER>Creative Labs</MANUFACTURER>
<MODEL>Sound Blaster Live</MODEL>
<COST>80.00</COST>
</PART>
<PART>
<ITEM>19 inch Monitor</ITEM>
<MANUFACTURER>LG Electronics</MANUFACTURER>
<MODEL>995E</MODEL>
<COST>290.00</COST>
</PART>
</PARTS>
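Since the file is plain structured text, a program can already do something useful with it. The following Python sketch lists the parts and adds up their cost; it uses a copy of the document above without the DOCTYPE and stylesheet lines (left out only to keep the example self-contained), and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Copy of the slide's document, without the DOCTYPE and stylesheet lines.
PARTS_XML = """
<PARTS><TITLE>Computer Parts</TITLE>
  <PART><ITEM>Motherboard</ITEM><MANUFACTURER>ASUS</MANUFACTURER>
    <MODEL>P3B-F</MODEL><COST>123.00</COST></PART>
  <PART><ITEM>Video Card</ITEM><MANUFACTURER>ATI</MANUFACTURER>
    <MODEL>All-in-Wonder Pro</MODEL><COST>160.00</COST></PART>
  <PART><ITEM>Sound Card</ITEM><MANUFACTURER>Creative Labs</MANUFACTURER>
    <MODEL>Sound Blaster Live</MODEL><COST>80.00</COST></PART>
  <PART><ITEM>19 inch Monitor</ITEM><MANUFACTURER>LG Electronics</MANUFACTURER>
    <MODEL>995E</MODEL><COST>290.00</COST></PART>
</PARTS>
"""

root = ET.fromstring(PARTS_XML)

title = root.findtext("TITLE")
items = [part.findtext("ITEM") for part in root.findall("PART")]

# COST is plain text in the file; the program decides to read it as a number.
total = sum((Decimal(part.findtext("COST")) for part in root.findall("PART")),
            Decimal("0"))

print(title)
for item in items:
    print(" -", item)
print("Total cost:", total)
```

Note that nothing in the XML says COST is a number: that interpretation belongs to the program, which is exactly the gap a grammar with data types can close.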
XML: « My Computer » Example (3 of 5)
In the previous page we saw:
A computer is a list of <PARTS> with a name defined by a <TITLE>;
The list of <PARTS> is composed of a number of <PART> elements;
Each <PART> has its own details:
An <ITEM> that describes the kind of part we are writing about;
A <MANUFACTURER> that states the builder of the <PART>;
A <MODEL> that completes the description of the <PART> given by
<ITEM>;
A <COST> that specifies the money spent to buy that component.
We also saw a <!DOCTYPE PARTS SYSTEM "parts.dtd">
declaration: this is used to relate the XML file to its grammar. Let’s
see it…
XML: « My Computer » Example (4 of 5)
Using the DTD standard (simpler, for beginners) we can define the
grammar of our “My Computer” description:
<!ELEMENT PARTS (TITLE?, PART*)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT PART (ITEM, MANUFACTURER, MODEL, COST)+>
<!ATTLIST PART type (computer|auto|airplane) #IMPLIED>
<!ELEMENT ITEM (#PCDATA)>
<!ELEMENT MANUFACTURER (#PCDATA)>
<!ELEMENT MODEL (#PCDATA)>
<!ELEMENT COST (#PCDATA)>
The DOCTYPE declaration described before names the language we are
going to use in our XML file. The grammar file defines the details of that
language. Let’s check the DTD grammar, step by step…
XML: « My Computer » Example (4a of h)
We said an XML file must contain a single root element, which
contains all the other elements:
<!ELEMENT PARTS (TITLE?, PART*)>
This description states that the root element is called PARTS
and that it will contain a sequence (, symbol) of zero-or-one
(? symbol) TITLE elements and zero-or-more (* symbol)
PART elements.
So, a file that contains:
<PARTS><TITLE></TITLE><TITLE></TITLE>…
won’t be valid, because it has the wrong number of TITLE
elements.
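A DTD content model behaves much like a regular expression over the names of the child elements. The Python sketch below uses that analogy to check the rule (TITLE?, PART*) on the children of the root. It is not a DTD validator (the standard library does not include one), and the function name is invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# The content model (TITLE?, PART*) rewritten as a regular expression
# over the space-terminated names of the root's children.
PARTS_MODEL = re.compile(r"(TITLE )?(PART )*")


def matches_parts_model(document: str) -> bool:
    """Check only the children of the PARTS root element."""
    root = ET.fromstring(document)
    if root.tag != "PARTS":
        return False
    child_names = "".join(child.tag + " " for child in root)
    return PARTS_MODEL.fullmatch(child_names) is not None


print(matches_parts_model("<PARTS><TITLE/><PART/><PART/></PARTS>"))
print(matches_parts_model("<PARTS><TITLE/><TITLE/></PARTS>"))
```

The first document respects the model; the second has two TITLE elements and is rejected, as is any document where a PART comes before the TITLE.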
XML: « My Computer » Example (4b of h)
The second declaration in the DTD file describes the content type of
the TITLE element:
<!ELEMENT TITLE (#PCDATA)>
This means that the TITLE element can contain any sequence of data
that the XML standard itself describes as #PCDATA (parsed character data).
This is a generic sequence of valid characters, numbers, symbols and so
on (only < and & must be escaped). Valid #PCDATA data are:
Afe5o 8ghert
4534
-.f,.5,g
… and so on. A user who fills in this element would probably use
meaningful data such as: <TITLE>This is My Computer</TITLE>.
XML: « My Computer » Example (4c of h)
The third declaration describes the structure of the PART
element:
<!ELEMENT PART (ITEM, MANUFACTURER, MODEL, COST)+>
as a sequence (, symbol) of four other elements: ITEM,
MANUFACTURER, MODEL and COST.
The whole sequence must appear one or more times (+ symbol)
inside each PART element.
XML: « My Computer » Example (4d to h)
The remaining declarations are quite simple: they state that
the PART sub-elements contain plain text
(#PCDATA):
<!ELEMENT ITEM (#PCDATA)>
<!ELEMENT MANUFACTURER (#PCDATA)>
<!ELEMENT MODEL (#PCDATA)>
<!ELEMENT COST (#PCDATA)>
So, even those elements are defined as generic and not
constrained to a specific format.
XML: « My Computer » Example (5 of 5)
What have we learned?
XML can define structured data;
Everyone can define their own XML application;
Each XML application represents a new language related to the data it can
represent and manage;
An XML-based language defines its capabilities through a grammar that
states how data can be structured and related.
What are we going to learn?
Why a grammar?
Which kind of grammar?
Many other aspects of XML…
The Needs of the Grammar
As stated, a grammar defines the “words” (elements or tags) we can use
in our XML-based language. Is it needed?
XML is designed to be managed by software:
How can software know whether a tag (element) is allowed in my
language? The BASIC programming language allows the PRINT
instruction while Java does not;
How can software know whether elements are written in the correct
order? PRINT “HELLO” is a valid instruction while “HELLO” PRINT is not;
How can software know how to relate data? Does an address belong
to a house or to the owner of the house?
XML can also be interpreted and produced in different ways. A grammar
helps software describe the data content;
Grammars are also used by automatic parsers, validators and editors
to help the user who writes XML documents by hand.
Which Grammar?
XML can use, and be defined by, two standard
grammars: DTD and XML Schema.
A few notes about DTD:
It’s older;
It uses a specific syntax, different from XML;
It cannot define or constrain the data types of element content.
A few notes about XML Schema:
It’s a newer standard;
It’s an XML application: it is defined using XML and uses XML to define XML
languages!
It allows constraints on data types and contents.
XML Schema: a Brief Introduction (1 of 2)
The W3C XML Schema Definition Language is an XML language (W3C
Recommendation) for describing and constraining the content of XML
documents. The purpose of an XML Schema is to define the legal
building blocks of an XML document, just like a DTD. An XML Schema:
defines elements that can appear in a document;
defines attributes that can appear in a document;
defines which elements are child elements;
defines the order of child elements;
defines the number of child elements;
defines whether an element is empty or can include text;
defines data types for elements and attributes;
defines default and fixed values for elements and attributes.
XML Schema: a Brief Introduction (2 of 2)
XML Schema has more features than DTD:
It supports data types;
It uses XML syntax;
It secures data communication (both parties read the data the same way):
A date written as “03-11-2004” could be read as 3 November
or as 11 March. An XML element such as <date
type="date">2004-03-11</date> ensures a mutual
understanding because the XML data type date requires the format
YYYY-MM-DD.
It is extensible:
A schema can be used inside another schema or
modified/extended/merged by other schemas.
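The date ambiguity is easy to reproduce. In this Python sketch the same text yields two different dates depending on the convention the reader assumes, while a fixed YYYY-MM-DD layout leaves no room for interpretation. The function parse_iso_date is a simplified stand-in written for this example: it ignores the optional time zone suffix that the schema date type also permits.

```python
import re
from datetime import date, datetime

ambiguous = "03-11-2004"

# The same text read under two different conventions gives two dates.
as_day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()
as_month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()

_ISO_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def parse_iso_date(text: str):
    """Accept only the YYYY-MM-DD layout; return None for anything else."""
    match = _ISO_DATE.fullmatch(text)
    if match is None:
        return None
    year, month, day = (int(group) for group in match.groups())
    try:
        return date(year, month, day)
    except ValueError:  # e.g. month 13 or day 32
        return None


print(as_day_first, as_month_first)
print(parse_iso_date("2004-03-11"), parse_iso_date(ambiguous))
```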
XML Schema Vs. DTD: Example (1 of 4)
This is a simple example of an XML file:
<?xml version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>
Its elements and data structure are very simple.
XML Schema Vs. DTD: Example (2 of 4)
The DTD file that describes the previous XML sample could
be:
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
We have already seen what these element declarations mean…
XML Schema Vs. DTD: Example (3 of 4)
The XML Schema that describes the XML sample data file could be:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://www.w3schools.com" xmlns="http://www.w3schools.com" elementFormDefault="qualified">
<xs:element name="note">
<xs:complexType><xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
XML Schema Vs. DTD: Example (4 of 4)
Let’s focus on elements: the attributes of the
<xs:schema> tag are a detail we don’t need
(for now).
With respect to elements, the DTD <!ELEMENT> declaration becomes
<xs:element>: the name of the element is an attribute of the tag
itself;
An <xs:complexType> means that an element contains other
elements;
The <xs:sequence> element describes an ordered list of elements
inside another element, just like the , operator did in the DTD;
The xs:string value of the type attribute defines a type (!) and says that the related
elements can contain strings: <to>content string</to>.
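Because an XML Schema is itself an XML document, the same tools that read data files can read the grammar. As a small sketch, the Python code below parses the note schema with the standard library and lists each declared element with its type; the constant names are illustrative, and the long <xs:schema> tag is simply wrapped over several lines.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# The note schema from the example, with the first tag wrapped for width.
NOTE_XSD = """<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType><xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>
"""

schema = ET.fromstring(NOTE_XSD)

# Every <xs:element> in document order, with its name and (optional) type.
declared = [
    (el.get("name"), el.get("type"))
    for el in schema.iter(f"{{{XS}}}element")
]

for name, type_name in declared:
    print(name, "->", type_name or "(complex type)")
```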
References to XML Schema or DTD
So, how can an XML file refer to a given XML Schema or DTD that
describes it? Let’s see the XML Schema reference first:
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.w3schools.com note.xsd">
… </note>
And then the DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note> … </note>
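The schema reference is an ordinary attribute, so software can read it like any other. A minimal Python sketch, assuming the standard library parser; the elided content of <note> is replaced here by a single <to> child purely for illustration:

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The elided body of <note> is filled with one child for this example only.
NOTE_XML = """<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
</note>
"""

root = ET.fromstring(NOTE_XML)

# schemaLocation holds pairs: a namespace name followed by a schema location.
location = root.get(f"{{{XSI}}}schemaLocation")
namespace, schema_file = location.split()

print("namespace:", namespace)
print("schema   :", schema_file)
```

The parser only reports the hint; fetching note.xsd and validating against it is a separate step that ElementTree does not perform.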
XML Schema Data Types
The XML Schema language defines many data types and structural components:
xs:string – a generic sequence of characters;
xs:boolean – true/false values, which can also be written as 1/0;
xs:decimal – a sequence of digits, with an optional leading sign
(+ or -) and an optional decimal point (.);
xs:positiveInteger – integer values strictly greater than zero;
xs:byte – a signed 8-bit integer, from –128 to 127;
xs:unsignedByte – an unsigned 8-bit integer, from 0 to 255;
xs:complexType – a structural component rather than a data type: an element that can contain other elements;
xs:sequence – describes an ordered sequence of elements;
xs:choice – describes alternative elements;
…
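To give a feel for what these types constrain, the Python sketch below checks a few values against the ranges listed above. The function names are invented for this example and the checks are deliberately simplified (for instance, they do not handle surrounding whitespace), so this is not a substitute for a real schema validator.

```python
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")


def _as_int(text: str):
    """Return the integer value of the text, or None if it is not an integer."""
    return int(text) if _INTEGER.fullmatch(text) else None


def is_boolean(text: str) -> bool:
    # xs:boolean accepts exactly these four spellings.
    return text in ("true", "false", "1", "0")


def is_byte(text: str) -> bool:
    value = _as_int(text)
    return value is not None and -128 <= value <= 127


def is_unsigned_byte(text: str) -> bool:
    value = _as_int(text)
    return value is not None and 0 <= value <= 255


def is_positive_integer(text: str) -> bool:
    value = _as_int(text)
    return value is not None and value >= 1


for candidate in ("1", "0", "128", "-1", "yes"):
    print(candidate, is_boolean(candidate), is_byte(candidate),
          is_unsigned_byte(candidate), is_positive_integer(candidate))
```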
XML: Some (Very Few) References
The XML Standard: http://www.w3.org/XML/;
XML Schema: http://www.w3.org/XML/Schema;
A tutorial: http://www.w3schools.com/schema/default.asp;
DTD Tutorial: http://www.w3schools.com/dtd/default.asp;
Java Technology & XML: http://java.sun.com/xml/;
XML and C:
The XML C parser and toolkit of Gnome: http://xmlsoft.org/;
Apache Xerces API: http://xml.apache.org/index.html;
Web Syndication (with XML):
http://en.wikipedia.org/wiki/Web_syndication;
XML: Where From Here?
The XML family of standards has many other features and capabilities:
It allows querying a database through the XML Query (XQuery) standard;
It allows automatic translation from XML to any other language
((X)HTML first of all), data structure, … through the use of the XSL
Transformation language family;
Can describe “well-known” data and information such as:
geometric drawings and diagrams through SVG (Scalable Vector
Graphics);
mathematical expressions, through MathML;
three-dimensional environments, with X3D;
chess games and players information, with LCARS-ML, ChessGML,
ChessML, …;
It allows information spreading and syndication with RSS, Atom, …
To be continued… Let’s take a look at the “big picture”!
XML Family: The Big Picture
www.alcatel.com