XML
Extensible Markup Language
What is XML?
● meta-markup language
● a language for defining a family of languages
● semantic/structured mark-up language
– defines structure and meaning, NOT formatting or presentation
Why XML?
● supports construction of domain specific markup languages
● creates a common data format
● facilitates data interchange
● structures large, complex documents
History
● Standard Generalized Markup Language (SGML)
– too complex
● Hyper-Text Markup Language (HTML)
– not extensible, limited to a small set of fixed tags
– polluted with non-semantic tags (e.g. <center>, <i>, and the dreaded <blink>)
● XML working group formed in 1996
● XML is really a slimmed-down SGML
XML Applications
● Chemical Markup Language (CML)
– originally an SGML application
– used to describe molecular structures and sequences, spectrographic analysis, crystallography, chemical databases, and so on
● Mathematical Markup Language (MathML)
– adequate for almost all education, scientific, engineering, business, economics, and statistics needs
– limited for advanced math/theoretical physics
MathML example
<apply>
  <power/>
  <apply>
    <plus/>
    <ci>a</ci>
    <ci>b</ci>
  </apply>
  <cn>2</cn>
</apply>
(a+b)^2
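Because the MathML fragment is ordinary XML, a generic parser can walk it. As a rough sketch (the `evaluate` helper and the bindings for a and b are made up for illustration, and only the `plus` and `power` operators from this slide are handled), Python's standard xml.etree.ElementTree module could evaluate it like this:

```python
import xml.etree.ElementTree as ET

# The content-MathML fragment from the slide: (a + b)^2
MATHML = (
    "<apply><power/>"
    "<apply><plus/><ci>a</ci><ci>b</ci></apply>"
    "<cn>2</cn>"
    "</apply>"
)

def evaluate(node, env):
    """Evaluate a tiny subset of content MathML (hypothetical helper)."""
    if node.tag == "cn":            # numeric literal
        return float(node.text)
    if node.tag == "ci":            # identifier, looked up in env
        return env[node.text.strip()]
    if node.tag == "apply":         # first child names the operator
        operator, *operands = list(node)
        values = [evaluate(child, env) for child in operands]
        if operator.tag == "plus":
            return sum(values)
        if operator.tag == "power":
            base, exponent = values
            return base ** exponent
    raise ValueError(f"unsupported element: {node.tag}")

# Assumed sample bindings: a = 1, b = 2
result = evaluate(ET.fromstring(MATHML), {"a": 1, "b": 2})
print(result)
```

The nesting of the apply elements carries the structure of the expression, which is the sense in which the markup is semantic rather than presentational.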
XML Document “Goodness”
● well-formed
– satisfies the basic rules of XML syntax
● valid
– satisfies the domain specific rules for the language as defined in the Document Type Definition (DTD)
Well-formed
1. Should start with an XML declaration (optional in XML 1.0, but if present it must be the very first thing in the document)
2. Elements with content must have matching start and end tags
3. Empty-element tags must end with />
4. The document must contain exactly one element that contains all other elements
5. Elements may nest but not overlap
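These rules can be checked mechanically, since any conforming parser must reject a document that is not well-formed. A minimal sketch, assuming Python's standard xml.etree.ElementTree module (the `is_well_formed` helper and the sample strings are illustrative):

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Illustrative samples, one per rule
samples = {
    "nested elements": "<a><b>text</b></a>",
    "empty element": "<a><b/></a>",
    "missing end tag": "<a><b>text</a>",
    "overlapping elements": "<a><b><c></b></c></a>",
    "two root elements": "<a/><b/>",
}

for label, text in samples.items():
    print(label, is_well_formed(text))
```

Only the first two samples are accepted; the others break rules 2, 5 and 4 respectively.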
XML Declaration
<?xml version="1.0" standalone="yes" ?>
– standalone – yes if this file contains a complete document (no external declarations affect its content)
Tags
● anything that begins with < and ends with >
● end tags begin with </
● empty tags end with />
● tag names
– start with a letter or underscore (_)
– remaining characters can be letters, numbers, _, hyphens or periods
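The naming rule above can be written as a pattern. The sketch below encodes only the simplified, ASCII-only rule from this slide; the full XML name rule also admits the colon and many non-ASCII letters, so treat `is_simple_name` as an illustrative helper, not a complete validator:

```python
import re

# Simplified rule from the slide: a letter or underscore first,
# then letters, digits, underscores, hyphens or periods.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def is_simple_name(name):
    """Check a tag or attribute name against the simplified rule."""
    return NAME.fullmatch(name) is not None

for candidate in ["greeting", "_private", "street.name-2", "2fast", "-x"]:
    print(candidate, is_simple_name(candidate))
```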
Attributes
● start tags can include zero or more attributes
● attributes are name/value pairs separated by an equals sign (=)
● the rules for attribute names are the same as for tag names
● the value is any string enclosed in quotes (single or double)
● if the string contains the quote character used to delimit it, entity references must be used: &apos; or &quot;
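A short sketch of attribute quoting in practice, using Python's standard library. The `note` attribute and its value are invented for illustration; `quoteattr` picks a delimiter and escapes the value where needed:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Reading: entity references in an attribute value are decoded by the parser.
element = ET.fromstring('<greeting language="english" note="say &quot;hi&quot;"/>')
note = element.get("note")
print(note)

# Writing: quoteattr returns the value already wrapped in suitable quotes.
value = 'say "hi" to O\'Brien'
quoted = quoteattr(value)
round_trip = ET.fromstring(f"<greeting note={quoted}/>").get("note")
print(quoted)
```

Parsing the generated tag gives back the original string, which is the property that matters when building attribute values from arbitrary text.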
Comments
<!-- important message -->
● can't be nested or contained inside start/end tags
Entity References
&amp; &
&lt; <
&gt; >
&quot; "
&apos; '
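In practice the five predefined entity references are usually produced and decoded by library functions, not by hand. A minimal sketch with Python's xml.sax.saxutils (the sample string is made up; `escape` and `unescape` handle &, < and > by default, so the two quote entities are passed in explicitly):

```python
from xml.sax.saxutils import escape, unescape

# Extra mappings for the two quote characters
QUOTES = {'"': "&quot;", "'": "&apos;"}
UNQUOTES = {"&quot;": '"', "&apos;": "'"}

raw = 'if a < b & b > c: print("ok")'
escaped = escape(raw, QUOTES)
restored = unescape(escaped, UNQUOTES)

print(escaped)
print(restored)
```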
CDATA
● used for content that resembles XML:
<![CDATA[
<?xml version="1.0" standalone="yes" ?>
<greeting>
Hello!
</greeting>
]]>
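To see what a parser does with a CDATA section, the sketch below wraps the section in a hypothetical `example` element (a CDATA section has to sit inside some element) and parses it with Python's xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

# The <example> wrapper is an assumption added for this sketch.
document = """<example><![CDATA[
<?xml version="1.0" standalone="yes" ?>
<greeting>
Hello!
</greeting>
]]></example>"""

example = ET.fromstring(document)
child_count = len(example)   # number of child elements
text = example.text          # the CDATA content, as plain character data

print(child_count)
print(text)
```

The markup inside the section comes back as literal text and no `greeting` child element is created.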
Valid Documents
● body matches Document Type Definition (DTD)
DTDs
<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE greeting [
<!ELEMENT greeting (#PCDATA)>
]>
<greeting>
Hello!
</greeting>
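Python's standard library parsers do not validate against a DTD (a validating parser from a third-party library would be needed for that), but they do read the document type declaration. As a small sketch, xml.dom.minidom can confirm that the name in the DOCTYPE matches the root element, which is one of the conditions for validity:

```python
from xml.dom import minidom

document = """<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
<greeting>
Hello!
</greeting>"""

dom = minidom.parseString(document)
doctype_name = dom.doctype.name           # name given after <!DOCTYPE
root_name = dom.documentElement.tagName   # actual root element

print(doctype_name, root_name)
```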
ELEMENTs
● name followed by a pattern
● pattern is similar to a regular expression
ATTLIST
● specifies attributes for a tag
<!ATTLIST greeting language CDATA "english">
Internal Document Type
<!DOCTYPE root_element_name [
  declarations
]>
<!DOCTYPE greeting [
  <!ELEMENT greeting (#PCDATA)>
]>
External Document Type (System)
<!DOCTYPE root_element_name SYSTEM DTD_URL>
<!DOCTYPE greeting SYSTEM "http://abc.com/greet.dtd">
External Document Type (Public)
<!DOCTYPE root_element_name PUBLIC DTD_name DTD_URL>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
Internal General Entity
<!ENTITY name "replacement text">
<!ENTITY bk "Brian K. Koehler">
<!ENTITY bkc "Copyright 2002, &bk;">
...
THE END &bkc;
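Internal general entities declared in the internal subset are expanded by the parser, including entities that refer to other entities. A minimal sketch with Python's xml.etree.ElementTree; the `doc` root element is an assumption, since the slide elides the document body:

```python
import xml.etree.ElementTree as ET

# The <doc> root element is hypothetical; the entities are from the slide.
document = """<!DOCTYPE doc [
  <!ENTITY bk "Brian K. Koehler">
  <!ENTITY bkc "Copyright 2002, &bk;">
]>
<doc>THE END &bkc;</doc>"""

text = ET.fromstring(document).text
print(text)
```

The reference &bkc; is replaced by its text, and the nested &bk; inside it is replaced in turn.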
Internal Parameter Entity
<!ENTITY % name "replacement text">
<!ENTITY % event "(place?, date?)">
...
<!ELEMENT ticket (vehicle,%event;)>
ELEMENTs
<!ELEMENT name content_type>
<!ELEMENT document ANY>
<!ELEMENT street (#PCDATA)>
<!ELEMENT accident (place,date)>
<!ELEMENT head (title,meta*)>
ELEMENT content_type
● ANY – anything
● #PCDATA – only character data – no contained elements
● EMPTY – element contains no content
● reg_exp – a regular expression denoting acceptable children
Regular expressions
● element_name
● re+ 1 or more
● re* 0 or more
● re? 0 or 1
● (re1 | re2 | ... | reN ) re1 or re2 ... or reN
● (re1, re2, ..., reN ) re1 followed by re2, ... reN
XHTML head element
● what is the declaration for the XHTML “head” element:
– must contain exactly one “title” element
– may contain at most one “base” element
– may contain 0 or more “meta” elements
– title, base, and meta elements can appear in any order
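One possible answer, limited to the three elements named on the slide, is sketched below. Because a content model behaves like a regular expression over child element names, the same pattern can be written as a Python regex and tried against sample child sequences. `HEAD_DECLARATION`, `HEAD_MODEL` and `matches_head` are illustrative names, and this is not claimed to be the declaration used in the actual XHTML DTD, which allows further elements:

```python
import re

# A candidate declaration: title and base in either order, meta anywhere.
HEAD_DECLARATION = (
    "<!ELEMENT head (meta*, ((title, meta*, (base, meta*)?)"
    " | (base, meta*, title, meta*)))>"
)

# The same model as a regex over space-terminated child names.
META = r"(?:meta )*"
HEAD_MODEL = re.compile(
    rf"{META}(?:title {META}(?:base {META})?|base {META}title {META})"
)

def matches_head(children):
    """Check a list of child element names against the model."""
    sequence = "".join(name + " " for name in children)
    return HEAD_MODEL.fullmatch(sequence) is not None

print(matches_head(["meta", "title", "base", "meta"]))
print(matches_head(["meta", "meta"]))
```

The "any order" requirement is what makes the declaration long: DTD content models have no unordered operator, so each permitted ordering of title and base has to be spelled out as an alternative.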
ATTLISTs
<!ATTLIST element_name attribute_name type "default value">
<!ATTLIST img alt CDATA #REQUIRED>
<!ATTLIST table border CDATA "1">
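Default values declared in an ATTLIST are visible even to a non-validating parser when the declaration sits in the internal subset. A small sketch with Python's xml.etree.ElementTree, which is built on the expat parser; the two tiny documents are made up for illustration:

```python
import xml.etree.ElementTree as ET

DTD = """<!DOCTYPE table [
  <!ATTLIST table border CDATA "1">
]>
"""

# No border attribute in the tag: the declared default is supplied.
defaulted = ET.fromstring(DTD + "<table/>").get("border")

# An explicit value takes precedence over the default.
explicit = ET.fromstring(DTD + '<table border="0"/>').get("border")

print(defaulted, explicit)
```

A #REQUIRED attribute such as alt on img is a validity constraint, so a missing one would only be reported by a validating parser.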